Documentation

Complete guide to CDC, Transformations, and Multi-Tenant Architecture

DataFlow Portal Overview

DataFlow Portal is a data lake management solution for building ETL pipelines from Oracle ERP databases to Google BigQuery, with multi-tenant support throughout.

Key Features

Real-time CDC

Multiple CDC methods for different use cases

3-Stage Pipeline

Bronze → Silver → Gold data lake architecture

Multi-Tenant

One pipeline, many tenants with data isolation

Row-Level Security

Automatic RLS policies per tenant

Data Lake Architecture

Oracle ERP → CDC Replication → Bronze (Raw) → Silver (Clean) → Gold (Business)

Bronze Layer

Raw data is stored exactly as it arrives from the source, with no transformations applied. Each tenant gets its own isolated dataset (e.g., bronze_tenant_a).
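
As a rough illustration of the per-tenant naming, a helper along these lines could derive the Bronze dataset name from a tenant identifier. Only the bronze_tenant_a pattern comes from the text above; the function name bronze_dataset_id and the sanitizing rule are assumptions made for this sketch, not DataFlow Portal's actual behavior.

```python
import re


def bronze_dataset_id(tenant_id: str) -> str:
    """Build a per-tenant Bronze dataset name such as 'bronze_tenant_a'.

    Hypothetical helper: the sanitizing rule below is an assumption,
    chosen so that the result only contains letters, digits, and
    underscores, which BigQuery dataset names allow.
    """
    slug = re.sub(r"[^A-Za-z0-9_]", "_", tenant_id.strip().lower())
    if not slug:
        raise ValueError("tenant_id must contain at least one character")
    return f"bronze_{slug}"


if __name__ == "__main__":
    print(bronze_dataset_id("tenant_a"))
    print(bronze_dataset_id("Tenant-B"))
```

Deriving the name from the tenant identifier in one place keeps every tenant's raw data in a predictable, separate dataset.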

Silver Layer

Cleaned and standardized data: data types are cast, nulls are handled, and records are filtered. Tenants share tables here, and a tenant_id column keeps each tenant's rows isolated.
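
The Silver steps (cast, handle nulls, filter, tag with tenant_id) can be sketched in plain Python. The column names ORDER_ID, AMOUNT, and STATUS, the default values, and the function to_silver are all hypothetical; the source does not specify the actual cleaning rules, so treat this as one possible shape rather than the real transformation.

```python
from decimal import Decimal, InvalidOperation
from typing import Any, Dict, Iterable, Iterator


def to_silver(raw_rows: Iterable[Dict[str, Any]], tenant_id: str) -> Iterator[Dict[str, Any]]:
    """Turn raw Bronze rows into cleaned Silver rows for one tenant."""
    for row in raw_rows:
        order_id = row.get("ORDER_ID")
        # Filter: rows without a key are dropped (assumed rule).
        if order_id is None or str(order_id).strip() == "":
            continue
        try:
            # Cast: key to int, amount to Decimal; a null amount becomes 0.
            clean_id = int(order_id)
            amount = Decimal(str(row.get("AMOUNT") or "0"))
        except (ValueError, InvalidOperation):
            # Filter: rows whose values cannot be cast are dropped.
            continue
        # Null handling: a missing or blank status becomes "UNKNOWN".
        status = (row.get("STATUS") or "").strip().upper() or "UNKNOWN"
        yield {
            "tenant_id": tenant_id,  # isolation column in the shared table
            "order_id": clean_id,
            "amount": amount,
            "status": status,
        }


if __name__ == "__main__":
    sample = [
        {"ORDER_ID": "1001", "AMOUNT": "12.50", "STATUS": " open "},
        {"ORDER_ID": None, "AMOUNT": "5"},
    ]
    for clean in to_silver(sample, "tenant_a"):
        print(clean)
```

Because every output row carries tenant_id, rows from different tenants can sit in the same Silver table and still be separated by a simple filter.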

Gold Layer

Business-ready aggregated data built from joins, calculations, and KPIs. Tables in this layer are protected by Row-Level Security (RLS) policies.
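
To make the per-tenant RLS idea concrete, the sketch below builds a BigQuery CREATE ROW ACCESS POLICY statement that filters a Gold table on tenant_id. The table name gold.sales_kpis, the grantee group, the policy naming scheme, and the helper rls_policy_ddl are assumptions for illustration; how DataFlow Portal actually generates its policies is not described here.

```python
import re


def rls_policy_ddl(table: str, tenant_id: str, grantee: str) -> str:
    """Return DDL for a row access policy restricting `table` to one tenant.

    Hypothetical helper. Inputs are checked against conservative patterns
    because they are interpolated directly into the SQL text.
    """
    if not re.fullmatch(r"[A-Za-z0-9_]+", tenant_id):
        raise ValueError(f"unsafe tenant_id: {tenant_id!r}")
    if not re.fullmatch(r"[A-Za-z0-9_.\-]+", table):
        raise ValueError(f"unsafe table name: {table!r}")
    if not re.fullmatch(r"[A-Za-z0-9_.:@\-]+", grantee):
        raise ValueError(f"unsafe grantee: {grantee!r}")
    return (
        f"CREATE OR REPLACE ROW ACCESS POLICY rls_{tenant_id}\n"
        f"ON `{table}`\n"
        f'GRANT TO ("{grantee}")\n'
        f"FILTER USING (tenant_id = '{tenant_id}')"
    )


if __name__ == "__main__":
    # Example values are placeholders, not real resources.
    print(rls_policy_ddl("gold.sales_kpis", "tenant_a", "group:tenant-a@example.com"))
```

With a policy of this form in place, members of the granted group see only the rows whose tenant_id matches, even though all tenants share the same Gold table.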