CM02

Financial Data Review

Journey: Customer Mobile
Duration: ~2 minutes
AI Agent: Data Aggregator
External APIs: TrueLayer, Xero, Companies House
1

User Interface Layer

Financial summary cards with validation and categorization controls

📊 Financial Summary Cards

  • Revenue Card: Annual revenue (£450K), YoY growth (+18%), verified badge
  • Expenses Card: Total expenses (£312K), profit margin (30.6%), breakdown available
  • Profit Card: Net profit (£137,700), margin visualization, trend indicators
  • Cash Flow Card: Current balance, average monthly balance, liquidity score
  • Data Source Badges: Shows Xero UK, TrueLayer, Companies House with timestamps

✅ Validation Controls

  • View Source Button: Opens modal showing raw data from TrueLayer/Xero
  • Details Button: Expands card to show monthly breakdown
  • AI Confidence Badge: Shows data quality score (98%)
  • Verification Checkmarks: Green badges for validated data points

🔄 Re-categorization Interface

  • Expense Breakdown Modal: Shows categorized transactions with AI suggestions
  • Flagged Items: £2,500 equipment purchase incorrectly categorized
  • Re-categorize Options: Dropdown to move from OpEx to CapEx
  • Reason Field: Text input for customer explanation
  • Real-time Update: Numbers recalculate immediately after change

💬 AI Chat Response

  • Success Message: "Great! I've gathered your financial data from Xero UK and TrueLayer"
  • Summary Statement: Brief overview of data sources and records pulled
  • Verification Prompt: "Please review and let me know if anything needs adjustment"

🎨 UI Events Captured

  • onViewSource() → Opens modal with raw data from external API
  • onViewDetails() → Expands card to show monthly trends
  • onRecategorize() → Opens categorization modal for transaction adjustment
  • onConfirmData() → Validates all fields and proceeds to CM03
2

API / Backend for Frontend (BFF)

Data aggregation endpoint with multi-source integration

🔌 Primary API Endpoint

  • Method: GET
  • URL: /api/v1/applications/{application_id}/financial-summary
  • Authentication: Bearer token (JWT)
  • Response Time: 2-3 seconds (multi-source aggregation)
Request
GET /api/v1/applications/app_20251222_143023_usr123/financial-summary
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
Accept: application/json

// Query Parameters (optional)
?include_transactions=true
&period_months=24
Response (200 OK)
{
  "request_id": "req_cm02_20251222_143125",
  "status": "success",
  "data_quality_score": 0.98,
  "financial_summary": {
    "revenue": {
      "annual": 450000,
      "formatted": "£450,000",
      "yoy_growth": 0.18,
      "monthly_average": 37500,
      "source": "xero_uk",
      "verified": true
    },
    "expenses": {
      "annual": 312300,
      "formatted": "£312,300",
      "breakdown": {
        "operating": 285000,
        "capital": 27300
      },
      "flagged_items": [
        {
          "transaction_id": "txn_espresso_2500",
          "amount": 2500,
          "description": "Commercial espresso machine",
          "current_category": "operating_expense",
          "suggested_category": "capital_expenditure",
          "confidence": 0.94
        }
      ],
      "source": "xero_uk"
    },
    "profit": {
      "net": 137700,
      "formatted": "£137,700",
      "margin": 0.306,
      "trend": "increasing"
    },
    "cashflow": {
      "current_balance": 48200,
      "average_balance_12m": 42500,
      "source": "truelayer",
      "account_id": "acc_***4532"
    }
  },
  "data_sources": {
    "xero": {
      "last_sync": "2025-12-22T09:18:43Z",
      "records_pulled": 3487,
      "period_months": 36
    },
    "truelayer": {
      "last_sync": "2025-12-22T09:18:52Z",
      "transactions": 1847,
      "period_months": 24
    },
    "companies_house": {
      "company_number": "12345678",
      "company_name": "Smith's Artisan Café Ltd",
      "status": "active",
      "incorporation_date": "2022-03-15"
    }
  },
  "processing_time_ms": 2847,
  "timestamp": "2025-12-22T14:31:25Z"
}

🔄 Re-categorization Endpoint

  • Method: PATCH
  • URL: /api/v1/applications/{application_id}/transactions/{transaction_id}/category
  • Purpose: Update transaction category based on customer input
PATCH Request
PATCH /api/v1/applications/.../transactions/txn_espresso_2500/category

{
  "new_category": "capital_expenditure",
  "reason": "New espresso machine purchase - capital asset",
  "updated_by": "customer"
}

// Response: 200 OK with updated financial summary

🔒 Security & Caching

  • JWT Validation: Verify customer owns this application
  • Redis Caching: Cache financial summary for 5 minutes to reduce API calls
  • Rate Limiting: Max 20 requests per minute per application
  • Data Encryption: All sensitive financial data encrypted at rest and in transit
3

Business Logic & AI Orchestration

Data Aggregator AI agent with multi-source normalization

🤖 AI Agent: Data Aggregator

  • Model Type: Custom ML pipeline + GPT-4 for categorization
  • Training Data: 100,000+ financial statements from UK SMBs
  • Processing Time: ~2.5 seconds for multi-source aggregation
  • Accuracy: 96.3% on transaction categorization
Data Aggregator Pipeline
// Data Aggregator Processing Steps

STEP 1: Parallel API Calls
├─ TrueLayer API    → Banking transactions (24 months)
├─ Xero API         → P&L, invoices, expenses (36 months)
└─ Companies House  → Company verification data

STEP 2: Data Normalization
├─ Standardize date formats (ISO 8601)
├─ Convert currencies to GBP
├─ Normalize account categories
└─ Remove duplicates across sources

STEP 3: Transaction Categorization
├─ ML model classifies each transaction
├─ Categories: Revenue, OpEx, CapEx, Cost of Goods, Taxes
├─ Confidence scoring per transaction
└─ Flag ambiguous items (confidence < 0.80)

STEP 4: Financial Metrics Calculation
├─ Annual revenue (sum all revenue transactions)
├─ Total expenses (OpEx + CapEx + COGS)
├─ Net profit (revenue - expenses)
├─ Profit margin (profit / revenue)
├─ YoY growth ((current - previous) / previous)
└─ Cashflow analysis (TrueLayer balance data)

STEP 5: Data Quality Scoring
├─ Completeness check (all required fields present?)
├─ Consistency check (Xero vs TrueLayer reconciliation)
├─ Recency check (data within last 90 days?)
└─ Overall score: 0.00 - 1.00

STEP 6: Anomaly Detection
├─ Flag unusual transactions (>3 std deviations)
├─ Identify potential miscategorizations
└─ Generate customer review suggestions

📊 Transaction Categorization Logic

  • Revenue: Keyword matching (invoice, payment received, sales) + merchant category codes
  • Operating Expenses: Recurring expenses (rent, utilities, payroll, supplies)
  • Capital Expenditure: Large one-time purchases of assets (equipment, vehicles, property improvements)
  • Cost of Goods: Direct production costs (ingredients, raw materials, inventory purchases)
  • Threshold Logic: Purchases >£1,000 analyzed for CapEx vs OpEx classification
ML Categorization Model
// Example: £2,500 espresso machine transaction

INPUT: {
  description: "Commercial espresso machine",
  amount: 2500,
  merchant: "CoffeeQuip Ltd",
  date: "2025-08-15"
}

ML ANALYSIS:
- Keywords: "commercial", "machine" → Equipment
- Amount: £2,500 → Above £1K threshold
- Merchant: Coffee equipment supplier
- Expected lifespan: >1 year (asset)
- Usage: Business operations (not resale)

PREDICTED CATEGORY: Capital Expenditure
CONFIDENCE: 0.94 (94%)

CURRENT XERO CATEGORY: Operating Expense
ACTION: Flag for customer review

✅ Data Quality Scoring

  • Completeness (40%): All required fields present? Missing data percentage?
  • Consistency (30%): Xero and TrueLayer data reconciled? Discrepancy <5%?
  • Recency (20%): Data updated within last 90 days? Real-time sync available?
  • Accuracy (10%): Validation checks passed? Known error patterns detected?
Quality Score Example
// Olivia's café data quality assessment

Completeness: 1.00 (100% - all fields present)
Consistency:  0.97 (97% - £3K discrepancy in 36 months)
Recency:      1.00 (100% - synced today)
Accuracy:     0.95 (95% - 3 flagged transactions)

OVERALL SCORE:
(1.00 × 0.40) + (0.97 × 0.30) + (1.00 × 0.20) + (0.95 × 0.10) = 0.98 (98%)

🔄 Re-categorization Workflow

  • Customer Review: Flagged items presented in modal with AI suggestion
  • Customer Choice: Accept suggestion or manually select new category
  • Reason Capture: Optional text field for customer explanation
  • Immediate Recalculation: Financial metrics updated in real-time
  • Audit Trail: All changes logged with timestamp and reason
4

Integration & Middleware Layer

API Gateway, caching, and data transformation

🌐 API Gateway Orchestration

  • Technology: AWS API Gateway with Lambda functions
  • Routing Logic: Routes to Data Aggregator service based on application_id
  • Request Queuing: SQS for handling multiple simultaneous data pulls
  • Timeout Handling: 30-second timeout with graceful fallback

💾 Redis Caching Strategy

  • Cache Key: financial_summary:{application_id}
  • TTL: 5 minutes (300 seconds)
  • Invalidation: Clear cache on customer re-categorization
  • Cache Hit Rate: ~85% (reduces external API calls significantly)
Redis Cache Structure
KEY: financial_summary:app_20251222_143023_usr123
VALUE: // Full JSON response from aggregator
TTL: 300 seconds

// Cache invalidation triggers:
- Customer re-categorizes transaction
- Customer requests manual refresh
- External API data updated (webhook)

🔐 Data Transformation Layer

  • Normalization: Converts all external API responses to standard format
  • Currency Conversion: Ensures all amounts in GBP
  • Date Standardization: ISO 8601 format across all sources
  • PII Masking: Account numbers partially masked (***4532)

⚡ Performance Optimization

  • Parallel API Calls: TrueLayer, Xero, Companies House called simultaneously
  • Response Streaming: Send partial data as soon as available
  • CDN Caching: Static assets (icons, labels) cached at edge
  • Connection Pooling: Reuse database and API connections
5

External Systems Integration

Third-party financial data providers

🏦
TrueLayer
Type: Open Banking API (PSD2 compliant)
Data: Bank account transactions, balances, direct debits
Endpoint: GET /data/v1/transactions
Auth: OAuth 2.0 with customer consent (90-day refresh token)
Rate Limit: 100 requests/minute per client
Response Time: 800-1200ms average
Example Call:
GET https://api.truelayer.com/data/v1/transactions
?from=2023-12-01&to=2025-12-22
Authorization: Bearer {access_token}

// Returns 1,847 transactions
📊
Xero UK
Type: Cloud accounting platform API
Data: P&L statements, invoices, expenses, balance sheet
Endpoints: /Reports/ProfitAndLoss, /Invoices
Auth: OAuth 2.0 with customer consent (60-minute tokens)
Rate Limit: 60 requests/minute per organization
Response Time: 1200-1800ms average
Example Call:
GET https://api.xero.com/api.xro/2.0/Reports/ProfitAndLoss
?fromDate=2022-12-01&toDate=2025-12-22
Authorization: Bearer {access_token}
Xero-tenant-id: {tenant_id}

// Returns 36 months P&L data
🏢
Companies House
Type: UK government company registry API
Data: Company details, directors, filing history, SIC codes
Endpoint: GET /company/{company_number}
Auth: API key (free tier: 600 requests/5 minutes)
Rate Limit: 600 requests per 5-minute window
Response Time: 400-600ms average
Example Call:
GET https://api.company-information.service.gov.uk/company/12345678
Authorization: Basic {api_key_base64}

// Returns: Smith's Artisan Café Ltd
// Status: Active, Incorporated: 2022-03-15

🔄 Webhook Integrations

  • TrueLayer Webhooks: Real-time notifications for new transactions
  • Xero Webhooks: Notifications when invoices or expenses are updated
  • Cache Invalidation: Webhooks trigger Redis cache clear for affected applications

⚠️ Error Handling & Fallbacks

  • API Unavailable: Use cached data if available, show warning to customer
  • Partial Failure: Show available data (e.g., Xero works, TrueLayer fails)
  • Timeout: Retry once after 5 seconds, then fail gracefully
  • Rate Limit Hit: Queue request for retry in 60 seconds
6

Data Persistence Layer

Financial data storage and audit logging

💾 Update Applications Table

  • Database: PostgreSQL
  • Table: applications
  • Operation: UPDATE with financial summary data
  • Status Change: "intent_captured" → "data_reviewed"
SQL Update Statement
UPDATE applications
SET 
  status = 'data_reviewed',
  annual_revenue = 450000,
  annual_expenses = 312300,
  net_profit = 137700,
  profit_margin = 0.306,
  data_quality_score = 0.98,
  data_sources = 'xero_uk,truelayer,companies_house',
  last_data_sync = '2025-12-22 14:31:25',
  updated_at = '2025-12-22 14:31:25'
WHERE application_id = 'app_20251222_143023_usr123';

📊 Store Financial Transactions

  • Database: PostgreSQL
  • Table: financial_transactions
  • Records: All 1,847 TrueLayer + 3,487 Xero transactions stored
  • Purpose: Enable detailed analysis, re-categorization, audit trail
Transactions Table Schema
CREATE TABLE financial_transactions (
  transaction_id      VARCHAR(50) PRIMARY KEY,
  application_id      VARCHAR(50) REFERENCES applications,
  source              VARCHAR(20),  -- 'xero' or 'truelayer'
  transaction_date    DATE,
  amount              DECIMAL(12,2),
  description         TEXT,
  category            VARCHAR(50),
  ai_category         VARCHAR(50),  -- AI suggested category
  confidence_score    DECIMAL(3,2),
  flagged             BOOLEAN DEFAULT FALSE,
  customer_reviewed   BOOLEAN DEFAULT FALSE,
  created_at          TIMESTAMP
);

📋 Audit Trail Logging

  • Database: Elasticsearch
  • Index: application-audit-logs
  • Events Logged: Data pull, categorization, customer review, re-categorization
Audit Log Entries
// Log 1: Data aggregation completed
{
  "log_id": "log_cm02_20251222_143125",
  "event_type": "financial_data_aggregated",
  "screen": "CM02",
  "user_id": "usr_olivia_thompson",
  "application_id": "app_20251222_143023_usr123",
  "timestamp": "2025-12-22T14:31:25Z",
  "details": {
    "sources": ["xero_uk", "truelayer", "companies_house"],
    "total_transactions": 5334,
    "data_quality_score": 0.98,
    "flagged_items": 1,
    "processing_time_ms": 2847
  }
}

// Log 2: Customer re-categorization
{
  "log_id": "log_cm02_20251222_143247",
  "event_type": "transaction_recategorized",
  "screen": "CM02",
  "user_id": "usr_olivia_thompson",
  "timestamp": "2025-12-22T14:32:47Z",
  "details": {
    "transaction_id": "txn_espresso_2500",
    "old_category": "operating_expense",
    "new_category": "capital_expenditure",
    "reason": "New espresso machine - capital asset",
    "amount": 2500
  }
}

🗄️ Data Retention Policy

  • Applications Table: Retained indefinitely (active records)
  • Financial Transactions: 7 years (regulatory requirement)
  • Audit Logs: 7 years in Elasticsearch
  • Redis Cache: 5 minutes TTL (ephemeral)

🔄 Multi-Source Data Aggregation Flow

📱
Customer
1. Navigate to CM02
🌐
Mobile App
🌐
Mobile App