Power BI / Looker / Sigma
When the standard reports aren't enough, you want your accounting data flowing into your BI stack — Power BI, Looker, Sigma, Tableau, Metabase, or whatever your team uses. The pattern is straightforward: a daily ETL job pulls from Solid's REST API and pushes to your BI tool's data source.
This recipe walks through the canonical version: a daily Python job that pulls account balances and journal entries, lands them in a warehouse (Postgres, BigQuery, Snowflake, etc.), and your BI tool reads from there.
Why not BI-tool-direct-to-Solid?
Most BI tools support REST API connectors, but going direct has real downsides:
- Every dashboard refresh hits Solid's API (fast for small files, slow for large ones)
- No historical snapshot — yesterday's "AR aging" can't be reproduced
- Solid API credentials end up in the BI tool's connector configuration, usually less secure and less auditable than a dedicated service account
- Schema drift between Solid versions can break dashboards mid-refresh
A daily ETL into a warehouse solves all four. The marginal cost is ~50 lines of Python and a cron schedule.
The data model
What flows from Solid into the warehouse, daily:
| Table | Source endpoint | Refresh strategy |
|---|---|---|
| accounts | GET /accounts | Full overwrite — small list (typically under 100) |
| contacts | GET /contacts | Full overwrite — also small |
| items | GET /items | Full overwrite |
| journal_entries | GET /entries?from=YYYY-MM-DD | Incremental since last run |
| account_balances_daily | GET /balances?as_of=YYYY-MM-DD | Snapshot per day; never overwritten |
| aging_ar_daily | GET /reports/ar-aging?as_of=YYYY-MM-DD | Snapshot per day |
| aging_ap_daily | GET /reports/ap-aging?as_of=YYYY-MM-DD | Snapshot per day |
The daily-snapshot tables are what enable questions like "show me how AR aged over the last 6 months" — every day the job stores a row, and your BI tool charts the trend.
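For example, once a few months of snapshots exist, the AR trend is a single GROUP BY over the snapshot table. A sketch against the aging_ar_daily schema defined in Step 2 below:

SELECT snapshot_date,
       bucket,
       SUM(balance_cents) / 100.0 AS balance
FROM solid.aging_ar_daily
WHERE snapshot_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY snapshot_date, bucket
ORDER BY snapshot_date, bucket;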
Step 1 — Set up the warehouse
Pick a warehouse:
| Warehouse | Best for |
|---|---|
| Postgres | Self-hosted; cheapest at low volume |
| Snowflake | Industry-standard for analytics; best for medium-to-large data |
| BigQuery | Pay-per-query (on-demand) pricing; good for sporadic dashboard use |
| DuckDB (local file) | Smallest possible; Python script writes to a .duckdb file your BI tool reads |
For a small business with under 100K transactions/year, DuckDB is enough. The whole "warehouse" is one .duckdb file in S3 (or anywhere). Your BI tool queries it like any other warehouse.
For a larger business or multi-tenant accounting firm, Snowflake or BigQuery is the right answer.
This recipe uses Postgres for concreteness; the SQL is portable.
Step 2 — Define the schemas
CREATE SCHEMA solid;

CREATE TABLE solid.accounts (
    id TEXT PRIMARY KEY,
    account_number TEXT,
    name TEXT NOT NULL,
    account_type TEXT NOT NULL,
    category TEXT NOT NULL,        -- asset, liability, equity, revenue, expense
    normal_balance TEXT NOT NULL,  -- debit or credit
    parent_id TEXT,
    currency_code TEXT NOT NULL,
    is_active BOOLEAN NOT NULL,
    loaded_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE solid.contacts (
    id TEXT PRIMARY KEY,
    contact_type TEXT NOT NULL,    -- customer, vendor, employee
    display_name TEXT NOT NULL,
    email TEXT,
    is_1099 BOOLEAN NOT NULL DEFAULT FALSE,
    currency_code TEXT NOT NULL,
    is_active BOOLEAN NOT NULL,
    loaded_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE solid.journal_entries (
    id TEXT PRIMARY KEY,
    entry_number TEXT NOT NULL,
    transaction_type TEXT NOT NULL,
    date DATE NOT NULL,
    status TEXT NOT NULL,
    contact_id TEXT,
    memo TEXT,
    source TEXT NOT NULL,
    total_cents BIGINT NOT NULL,
    loaded_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE solid.journal_entry_lines (
    id TEXT PRIMARY KEY,
    journal_entry_id TEXT NOT NULL REFERENCES solid.journal_entries(id),
    line_number INT NOT NULL,
    account_id TEXT NOT NULL REFERENCES solid.accounts(id),
    description TEXT,
    debit_cents BIGINT NOT NULL DEFAULT 0,
    credit_cents BIGINT NOT NULL DEFAULT 0,
    contact_id TEXT,
    class_id TEXT,
    location_id TEXT,
    project_id TEXT
);

-- Daily snapshot tables — never overwritten
CREATE TABLE solid.account_balances_daily (
    snapshot_date DATE NOT NULL,
    account_id TEXT NOT NULL,
    balance_cents BIGINT NOT NULL,
    PRIMARY KEY (snapshot_date, account_id)
);

CREATE TABLE solid.aging_ar_daily (
    snapshot_date DATE NOT NULL,
    customer_id TEXT NOT NULL,
    bucket TEXT NOT NULL,          -- current / 30 / 60 / 90 / over90
    balance_cents BIGINT NOT NULL,
    PRIMARY KEY (snapshot_date, customer_id, bucket)
);

Indexes on journal_entries.date, journal_entry_lines.account_id, and journal_entry_lines.contact_id will speed up the most common BI queries. Add them as you observe slow dashboards.
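If you'd rather create them up front, a minimal set (index names are illustrative):

CREATE INDEX idx_journal_entries_date ON solid.journal_entries (date);
CREATE INDEX idx_journal_entry_lines_account ON solid.journal_entry_lines (account_id);
CREATE INDEX idx_journal_entry_lines_contact ON solid.journal_entry_lines (contact_id);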
Step 3 — The ETL job
import os
from datetime import date, timedelta

import psycopg2
import requests
from psycopg2.extras import execute_batch

SOLID_API = os.environ["SOLID_API"]        # e.g. https://your-solid.example.com/api/v1
SOLID_API_KEY = os.environ["SOLID_API_KEY"]
PG_DSN = os.environ["PG_DSN"]


def solid_get(path, params=None):
    r = requests.get(
        f"{SOLID_API}{path}",
        headers={"X-API-Key": SOLID_API_KEY},
        params=params,
        timeout=60,
    )
    r.raise_for_status()
    return r.json()["data"]


def upsert_accounts(conn):
    accounts = solid_get("/accounts")
    with conn.cursor() as cur:
        # Upsert by id: journal_entry_lines references accounts, so TRUNCATE would fail
        execute_batch(cur, """
            INSERT INTO solid.accounts (id, account_number, name, account_type,
                category, normal_balance, parent_id, currency_code, is_active)
            VALUES (%(id)s, %(account_number)s, %(name)s, %(account_type)s,
                %(category)s, %(normal_balance)s, %(parent_id)s, %(currency_code)s,
                %(is_active)s)
            ON CONFLICT (id) DO UPDATE SET
                account_number = EXCLUDED.account_number,
                name = EXCLUDED.name,
                account_type = EXCLUDED.account_type,
                category = EXCLUDED.category,
                normal_balance = EXCLUDED.normal_balance,
                parent_id = EXCLUDED.parent_id,
                currency_code = EXCLUDED.currency_code,
                is_active = EXCLUDED.is_active
        """, accounts)
    conn.commit()


def upsert_contacts(conn):
    contacts = solid_get("/contacts")
    # ... similar pattern


def incremental_journal_entries(conn):
    with conn.cursor() as cur:
        cur.execute("SELECT MAX(date) FROM solid.journal_entries")
        last = cur.fetchone()[0] or date(2020, 1, 1)

    # Pull entries from a few days before last, in case of late-arriving entries
    fetch_from = last - timedelta(days=7)
    entries = solid_get("/entries", params={"from": fetch_from.isoformat()})

    # Upsert by id (entries are immutable but we re-pull to catch reversals)
    with conn.cursor() as cur:
        for e in entries:
            cur.execute("""
                INSERT INTO solid.journal_entries (id, entry_number, transaction_type,
                    date, status, contact_id, memo, source, total_cents)
                VALUES (%(id)s, %(entry_number)s, %(transaction_type)s, %(date)s,
                    %(status)s, %(contact_id)s, %(memo)s, %(source)s, %(total_cents)s)
                ON CONFLICT (id) DO UPDATE SET
                    status = EXCLUDED.status,
                    memo = EXCLUDED.memo
            """, e)
        # Lines too — pull and upsert
    conn.commit()


def snapshot_balances(conn):
    today = date.today()
    balances = solid_get("/balances", params={"as_of": today.isoformat()})
    with conn.cursor() as cur:
        execute_batch(cur, """
            INSERT INTO solid.account_balances_daily (snapshot_date, account_id, balance_cents)
            VALUES (%s, %s, %s)
            ON CONFLICT (snapshot_date, account_id) DO UPDATE
                SET balance_cents = EXCLUDED.balance_cents
        """, [(today, b["account_id"], b["balance_cents"]) for b in balances])
    conn.commit()


def main():
    conn = psycopg2.connect(PG_DSN)
    upsert_accounts(conn)
    upsert_contacts(conn)
    incremental_journal_entries(conn)
    snapshot_balances(conn)
    conn.close()
    print("ETL complete")


if __name__ == "__main__":
    main()

Step 4 — Schedule it
Cron, Airflow, GitHub Actions, AWS EventBridge — pick whatever your stack uses. Daily at 03:00 local time is conventional (after the previous business day, before anyone's at their desk).
0 3 * * * /usr/bin/python3 /opt/solid-etl/run.py >> /var/log/solid-etl.log 2>&1

Set up alerting on the cron job's exit code — silent failures are the worst kind of integration bug.
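One low-effort pattern is to wrap the script's main() so any failure also posts to an alerting webhook before cron records the non-zero exit, then call the wrapper from the __main__ block. ALERT_WEBHOOK_URL here is a hypothetical internal endpoint (Slack, PagerDuty, whatever your team watches):

import os
import requests

def run_with_alerting():
    try:
        main()
    except Exception as exc:
        # Post the failure somewhere a human will see it; the URL is whatever your team uses
        requests.post(
            os.environ["ALERT_WEBHOOK_URL"],
            json={"text": f"solid-etl failed: {exc!r}"},
            timeout=10,
        )
        raise  # re-raise so cron still sees a non-zero exit code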
Step 5 — Connect your BI tool
Each BI tool has its own way to add a Postgres connection. The shape:
| Tool | Connection type |
|---|---|
| Power BI | Get Data → Database → PostgreSQL |
| Looker | Database connection → PostgreSQL → service-account credentials |
| Sigma | Connections → New → PostgreSQL |
| Metabase | Add database → PostgreSQL |
| Tableau | Data → New Data Source → PostgreSQL |
Use a read-only service account for the BI tool — separate from the ETL writer account. Standard security hygiene.
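In Postgres terms that means two roles: the writer the ETL script connects as, and a SELECT-only role for the BI connection. A minimal sketch (role name and password are placeholders):

CREATE ROLE bi_reader LOGIN PASSWORD 'change-me';
GRANT USAGE ON SCHEMA solid TO bi_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA solid TO bi_reader;
-- Pick up tables the ETL job creates later
ALTER DEFAULT PRIVILEGES IN SCHEMA solid GRANT SELECT ON TABLES TO bi_reader;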
Common dashboards
What you'd build first:
- Revenue trend — journal_entry_lines filtered to revenue accounts, grouped by month (SQL sketch below)
- AR aging trend — aging_ar_daily grouped by snapshot_date and bucket; line chart over time
- Top customers — journal_entry_lines filtered to revenue, joined to contacts, summed and sorted
- Cash flow forecast — bills due in the next 30/60/90 days vs. AR collectible in the next 30/60/90 days
- Cost of customer acquisition — marketing spend (filtered) divided by new customers (contacts filtered to created-this-period)
- Margin by class/location/project — joins through journal_entry_lines.class_id etc. for dimensional reporting beyond what Solid's UI does
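A sketch of the revenue trend, assuming the Step 2 schema, treating net credits to revenue-category accounts as revenue, and assuming posted entries carry status = 'posted':

SELECT date_trunc('month', je.date) AS month,
       SUM(l.credit_cents - l.debit_cents) / 100.0 AS revenue
FROM solid.journal_entry_lines l
JOIN solid.journal_entries je ON je.id = l.journal_entry_id
JOIN solid.accounts a ON a.id = l.account_id
WHERE a.category = 'revenue'
  AND je.status = 'posted'   -- assumed status value
GROUP BY 1
ORDER BY 1;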
Performance considerations
For files with multi-million-line GLs, the daily ETL can take 5–30 minutes. Things to optimize:
- Incremental pulls — the from=YYYY-MM-DD query parameter limits journal-entry pulls to recent activity
- Pagination — Solid's API paginates at 500 entries; pull in pages and bulk-insert per page (sketch after this list)
- Index your warehouse tables — particularly journal_entry_lines.account_id and journal_entries.date
- Skip snapshots on closed periods — once a period is closed, its daily snapshots stop changing; you can stop snapshotting it
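A paginated variant of solid_get that can sit next to it in the Step 3 script. The page, per_page, and has_more names are assumptions about the pagination envelope; adjust them to whatever the API actually returns:

def solid_get_all(path, params=None):
    """Pull every page of a paginated endpoint and return one combined list."""
    params = dict(params or {})
    results, page = [], 1
    while True:
        r = requests.get(
            f"{SOLID_API}{path}",
            headers={"X-API-Key": SOLID_API_KEY},
            params={**params, "page": page, "per_page": 500},
            timeout=60,
        )
        r.raise_for_status()
        body = r.json()
        results.extend(body["data"])
        if not body.get("has_more"):  # assumed pagination flag
            break
        page += 1
    return results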
Variations
DuckDB instead of Postgres
For solo operators and small teams, replace psycopg2 with the duckdb Python library and write to a local .duckdb file. The same SQL works, and many BI tools can read DuckDB files directly or through DuckDB's ODBC/JDBC drivers.
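A sketch of the swapped-in write path, mirroring snapshot_balances from Step 3 (same table shape as Step 2; balances is the list the API call returns):

import duckdb
from datetime import date

def snapshot_balances_duckdb(balances, db_path="solid.duckdb"):
    today = date.today()
    con = duckdb.connect(db_path)  # the one file is the whole warehouse
    con.execute("CREATE SCHEMA IF NOT EXISTS solid")
    con.execute("""
        CREATE TABLE IF NOT EXISTS solid.account_balances_daily (
            snapshot_date DATE NOT NULL,
            account_id TEXT NOT NULL,
            balance_cents BIGINT NOT NULL,
            PRIMARY KEY (snapshot_date, account_id)
        )
    """)
    # executemany stands in for psycopg2's execute_batch; OR REPLACE makes reruns idempotent
    con.executemany(
        "INSERT OR REPLACE INTO solid.account_balances_daily VALUES (?, ?, ?)",
        [(today, b["account_id"], b["balance_cents"]) for b in balances],
    )
    con.close()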
Push to Snowflake / BigQuery
Same shape, different connection. Use snowflake-connector-python or google-cloud-bigquery. Both handle batched inserts efficiently.
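A sketch of the balances snapshot against Snowflake, assuming a SOLID schema already exists in the target database and that credentials come from environment variables (all names here are illustrative):

import os
from datetime import date
import snowflake.connector

def snapshot_balances_snowflake(balances):
    con = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse=os.environ["SNOWFLAKE_WAREHOUSE"],
        database=os.environ["SNOWFLAKE_DATABASE"],
        schema="SOLID",
    )
    try:
        cur = con.cursor()
        # Delete-then-insert keeps same-day reruns idempotent (Snowflake has no ON CONFLICT)
        cur.execute(
            "DELETE FROM account_balances_daily WHERE snapshot_date = %s",
            (date.today(),),
        )
        cur.executemany(
            "INSERT INTO account_balances_daily (snapshot_date, account_id, balance_cents) "
            "VALUES (%s, %s, %s)",
            [(date.today(), b["account_id"], b["balance_cents"]) for b in balances],
        )
        con.commit()
    finally:
        con.close()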
Real-time via webhooks
Replace cron with a webhook receiver: Solid POSTs to your service whenever a journal entry posts (when outbound webhooks ship). Your service updates the warehouse row, and the BI tool sees the data within seconds. This is more complex, and only worth it for genuinely time-sensitive dashboards.
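A minimal receiver sketch using Flask. The endpoint path and the payload shape (an envelope with the journal entry under data) are assumptions, since outbound webhooks haven't shipped yet:

import os
import psycopg2
from flask import Flask, request

PG_DSN = os.environ["PG_DSN"]
app = Flask(__name__)

@app.route("/webhooks/solid", methods=["POST"])
def journal_entry_posted():
    e = request.get_json()["data"]  # assumed payload: the posted journal entry
    conn = psycopg2.connect(PG_DSN)
    try:
        with conn, conn.cursor() as cur:  # commits on success, rolls back on error
            cur.execute("""
                INSERT INTO solid.journal_entries (id, entry_number, transaction_type,
                    date, status, contact_id, memo, source, total_cents)
                VALUES (%(id)s, %(entry_number)s, %(transaction_type)s, %(date)s,
                    %(status)s, %(contact_id)s, %(memo)s, %(source)s, %(total_cents)s)
                ON CONFLICT (id) DO UPDATE SET
                    status = EXCLUDED.status,
                    memo = EXCLUDED.memo
            """, e)
    finally:
        conn.close()
    return "", 204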
Cross-references
- REST API → Endpoint groups — full list of available endpoints
- Reports module — what's available built-in before you'd build a custom dashboard
- Stripe → Solid Accounting — inbound integration as a different shape