
Put your tables to REST with Postgres File FDWs

Read REST services as a local table using PostgreSQL

In a previous article I showed you how you can use the File Foreign Data Wrapper (file_fdw) Postgres extension to get server information in real time. If you recall, this is accomplished by using the PROGRAM option in the CREATE FOREIGN TABLE command to run a local program and return its output in CSV format. Using that same concept, let's create a foreign table that reads data from a REST service or API as if it were a local table.
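
If you do not still have that setup handy from the previous article, a minimal sketch looks roughly like this — the server name import is the one the examples below reference, so adjust it if yours differs:

-- file_fdw ships with the standard Postgres contrib packages
CREATE EXTENSION IF NOT EXISTS file_fdw;

-- a foreign server to hang our foreign tables on; note that using the
-- PROGRAM option later requires superuser or the pg_execute_server_program role
CREATE SERVER IF NOT EXISTS import FOREIGN DATA WRAPPER file_fdw;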

To get started, grab your favorite REST API and the credentials needed to connect to it. For this demo we will use a news API called https://newsapi.org. While they do have paid tiers, they offer a free tier as well. I have already signed up for a free account, so I will wait a moment for you to do the same…

Now that you have an API key, let's use it to pull a sample response. This will help us understand what the data looks like. Notice in the following example response that articles is the array we want to convert to CSV…
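
One quick way to do that is from the command line. I am assuming here that you have already saved your key in a text file (the path is just a placeholder, and we reuse the same file later in the article):

# pull a sample response; newsapi.org reads the key from the Authorization header
wget --header="Authorization: $(cat /path/to/newsapi_key.txt)" \
     -qO- "https://newsapi.org/v2/top-headlines?country=us"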

{
  "status": "ok",
  "totalResults": 34,
  "articles": [
    {
      "source": {
        "id": "the-washington-post",
        "name": "The Washington Post"
      },
      "author": "Ellen Francis",
      "title": "Mona Lisa to get a room of her own in Louvre museum renovation - The Washington Post",
      "description": "French President Emmanuel Macron announced plans to expand the Louvre and move the Mona Lisa to a new exhibition space, after the museum warned of overcrowding.",
      "url": "https://www.washingtonpost.com/world/2025/01/28/louve-museum-renovation-mona-lisa/",
      "urlToImage": "https://www.washingtonpost.com/wp-apps/imrs.php?src=https://arc-anglerfish-washpost-prod-washpost.s3.amazonaws.com/public/5WI5LFBRZZ5PK6FVJJK4POPIUE_size-normalized.jpg&w=1440",
      "publishedAt": "2025-01-29T00:33:57Z",
      "content": "Shes the star attraction in a museum filled with masterpieces, and soon she will be getting a room of her own.\r\nAs part of an ambitious overhaul of the Louvre in Paris, Leonardo da Vincis painting of… [+4089 chars]"
    },
    {
      "source": {
        "id": "ars-technica",
        "name": "Ars Technica"
      },
      "author": "Eric Berger",
      "title": "Why did Elon Musk just say Trump wants to bring two stranded astronauts home? - Ars Technica",
      "description": "“We will do so.”…",
      "url": "https://arstechnica.com/space/2025/01/why-did-elon-musk-just-say-trump-wants-to-bring-two-stranded-astronauts-home/",
      "urlToImage": "https://cdn.arstechnica.net/wp-content/uploads/2024/09/GYQYomvakAETlcS-1152x648.jpeg",
      "publishedAt": "2025-01-28T23:52:23Z",
      "content": null
    }
  ]
}

Now that we understand what the data looks like, we can use wget to fetch the JSON payload from the news service and pipe it to jq to convert it into nicely formatted CSV output. Since this REST endpoint requires an API key, we can store that key in a file and use cat to pull it into our wget command. This way your key is not exposed in the command itself. An expanded version of the command looks like the following:

wget --header="Authorization: $(cat /path/to/newsapi_key.txt)" \
     -qO- "https://newsapi.org/v2/top-headlines?country=us" | \
jq -r '.articles | map([
    .source.name,
    .author,
    .title,
    .description,
    .url,
    .urlToImage,
    .publishedAt,
    .content
]) | (["Source","Author","Title","Description","URL","Image URL","Published Date","Content"] | @csv), (.[] | @csv)'

Lastly, let's squish that command into a single line and feed it to Postgres using CREATE FOREIGN TABLE. Since the output from jq includes a CSV header row, we will set the HEADER option to 'on' so that it does not show up as a data line. NOTE: Postgres does not support CREATE OR REPLACE FOREIGN TABLE, so I am adding a DROP FOREIGN TABLE IF EXISTS first. This makes it easier to rerun your CREATE statement until it works.

DROP FOREIGN TABLE IF EXISTS public.news;
CREATE FOREIGN TABLE public.news (
    source text, 
    author text,
    title text, 
    description text,
    url text,
    image_url text,
    published_date text,
    content text
) 
SERVER import
OPTIONS (
    PROGRAM $$wget --header="Authorization: $(cat /path/to/newsapi_key.txt)" -qO- "https://newsapi.org/v2/top-headlines?country=us" | jq -r '.articles | map([.source.name,.author,.title,.description,.url,.urlToImage,.publishedAt,.content]) | (["Source","Author","Title","Description","URL","Image URL","Published Date","Content"] | @csv), (.[] | @csv)'$$, 
    FORMAT 'csv', 
    HEADER 'on'
);

That is it! Now you can simply query the table to get today's news.

select * from public.news;
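
Since the columns are plain text you can filter and sort it like any other table. For example (the source name here is just an illustration — it depends on what the API returns):

select title, published_date
from public.news
where source = 'Ars Technica'
order by published_date desc;

Keep in mind that every query against public.news runs the wget/jq pipeline again, so each SELECT is a live call to the API.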

P.S. You may want to create an archive table (public.news_archive). That way you can store previous headlines with a simple query:

INSERT INTO public.news_archive SELECT * FROM public.news;
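
If you have not created the archive table yet, a minimal sketch could look like the following. The archived_at column and its default are my addition; because the INSERT above does not list any target columns, Postgres fills it in automatically.

CREATE TABLE IF NOT EXISTS public.news_archive (
    source text,
    author text,
    title text,
    description text,
    url text,
    image_url text,
    published_date text,
    content text,
    archived_at timestamptz DEFAULT now()  -- records when each row was archived
);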

For me this opens up all kinds of possibilities. It also eliminates the need to have a Python script running as a cron job to fetch this data and load it into the database. Fewer moving parts usually means less chance for something to break.
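
And if you want the archiving itself to happen on a schedule without leaving the database, and you happen to have the pg_cron extension available, something along these lines could work — the job name and the hourly schedule are just examples:

-- requires the pg_cron extension to be installed and preloaded on the server
SELECT cron.schedule(
    'archive-news',                                           -- job name
    '0 * * * *',                                              -- top of every hour
    $$INSERT INTO public.news_archive SELECT * FROM public.news$$
);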

P.P.S. If you can run SQL commands but do not have SSH access to your Postgres server, here is a way to get your API key onto the server without SSH.

DROP FOREIGN TABLE IF EXISTS public.newstemp;
CREATE FOREIGN TABLE public.newstemp (
    source text
) 
SERVER import
OPTIONS (
    PROGRAM $$echo YOUR_APIKEY_HERE > /path/to/newsapi_key.txt $$, 
    FORMAT 'text' 
);
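-- Selecting from newstemp executes the PROGRAM, which writes the key file on the
-- database server; no rows come back because echo sends its output to the file, not stdout.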
SELECT * FROM public.newstemp;
DROP FOREIGN TABLE IF EXISTS public.newstemp;

If you found this article enlightening, then please subscribe AND SHARE! See you next week :)