Maintenance Schedule

This page contains the recurring maintenance checklists for the SONAN DIGITAL CRM. These tasks keep the platform secure, performant, and up to date. Assign each checklist to a named engineer at the start of each cycle.

Overview

Frequency	When	Approximate Time
Monthly	First Monday of each month	1–2 hours
Quarterly	First Monday of Jan, Apr, Jul, Oct	3–4 hours
Annual	January each year	1 day

Monthly Checklist

Perform on the first Monday of each month. Document the outcome of each item as a comment or in a shared log.

Error Tracking

[ ] Review Sentry issues — open Sentry → Issues → filter to "Last 30 days". For each open issue:
Assign to an engineer if unowned.
Resolve issues where the underlying bug has been fixed.
Mark as "Ignored" (with a comment) for known non-actionable errors.
Create a bug ticket for any issue that is recurring and not yet addressed.
[ ] Check Sentry error rate trend — is the 30-day error rate higher than last month? If yes, investigate the cause before closing the review.

Performance

[ ] Check Vercel function execution times — go to Vercel Dashboard → Analytics → Web Analytics or the Logs tab. Look for any routes where response times have increased compared to last month. Flag routes averaging > 2 seconds for investigation.
[ ] Check Supabase database size — Supabase Dashboard → Database → Database Settings → Database Size. If the database is growing faster than expected, review which tables are growing and whether old records need archiving.
[ ] Check Supabase query performance — Supabase Dashboard → Database → Query Performance (pg_stat_statements). Look for queries with high mean_exec_time (> 500ms). Create indexes or optimize queries as needed.

Security

[ ] Run npm audit in the project root:

npm audit

Review the output for high and critical severity vulnerabilities. Apply patches:

npm audit fix
# --force may install breaking (semver-major) upgrades; review the resulting changes manually:
npm audit fix --force

Commit and deploy any security patches immediately.

Operations

[ ] Verify cron jobs ran correctly — check Vercel Dashboard → Settings → Cron Jobs for each job. Confirm all runs in the past 30 days succeeded. For any failures, confirm they were addressed at the time or re-run manually.
[ ] Review and rotate secrets if schedule is due — check the Secrets Inventory table. If any secret has passed its rotation interval, perform rotation per the rotation schedule.
[ ] Test backup restore — Supabase Pro includes daily database backups. Once a month, spot-check the restore:
Supabase Dashboard → Database → Backups.
Note the most recent backup timestamp.
Use the Point in Time Recovery feature to restore a single table to a scratch environment (or simply confirm the backup is listed and Supabase reports it as valid).
Document: backup date verified, table checked, restore tested successfully.
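
The secret rotation check in this list can be scripted. The following is a minimal sketch, assuming the Secrets Inventory can be exported as records with a last-rotated date and a rotation interval in days. The field names, the intervals, and the sample entries are hypothetical and are not taken from the actual inventory.

```typescript
// Hypothetical shape of one row in the Secrets Inventory table.
interface SecretRecord {
  name: string;
  lastRotated: string; // ISO date, e.g. "2026-01-05"
  rotationIntervalDays: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the names of secrets whose rotation interval has elapsed as of `today`.
function secretsDueForRotation(inventory: SecretRecord[], today: Date): string[] {
  return inventory
    .filter((s) => {
      const dueAt = new Date(s.lastRotated).getTime() + s.rotationIntervalDays * MS_PER_DAY;
      return today.getTime() >= dueAt;
    })
    .map((s) => s.name);
}

// Sample data for illustration only.
const inventory: SecretRecord[] = [
  { name: "CRON_SECRET", lastRotated: "2026-01-05", rotationIntervalDays: 90 },
  { name: "SENTRY_AUTH_TOKEN", lastRotated: "2026-05-01", rotationIntervalDays: 180 },
];

console.log(secretsDueForRotation(inventory, new Date("2026-06-01")));
```

With the sample data, only CRON_SECRET is reported, because its 90-day interval elapsed in early April.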

Access Review

[ ] Review user access — go to the CRM admin panel → Settings → Team Management. Review the list of active staff accounts:
Deactivate any accounts for employees who have left the company.
Verify that all active accounts have the correct role (employee vs. admin).
Confirm no unexpected admin accounts exist.
[ ] Review Vercel team access — Vercel Dashboard → Team Settings → Members. Remove any former team members.
[ ] Review Supabase project access — Supabase Dashboard → Project Settings → Team. Remove any former team members.

Quarterly Checklist

Perform in January, April, July, and October. This is a deeper review than the monthly cadence.

Dependencies

[ ] Full dependency update. Review and update all npm packages:

npx npm-check-updates -u
npm install
npm run build
npx tsc --noEmit

Test thoroughly on the dev branch before merging to main. Pay particular attention to:
Next.js version (read the migration guide for any minor version bumps).
Supabase JS client (@supabase/supabase-js).
Stripe JS (stripe, @stripe/stripe-js).
Sentry (@sentry/nextjs).
[ ] Review breaking changes in updated packages. Check each package's changelog for deprecations or API changes that affect the codebase.

Security

[ ] Review RLS policies for new tables. For every new database table added since the last quarterly review, verify a row-level security policy exists:

-- List tables with RLS disabled
SELECT schemaname, tablename, rowsecurity
FROM pg_tables
WHERE schemaname = 'public'
  AND rowsecurity = false;

Any table listed here that is not intentionally public must have RLS enabled and appropriate policies applied.
[ ] Security audit of new API routes — for every new /api/ route added since the last review:
Confirm authentication check is present (requireAdminWithTenant() or equivalent).
Confirm input validation is applied to all request body fields.
Confirm export const runtime = 'edge' is present.
Confirm no secrets are logged or returned in responses.
[ ] Review Stripe webhook event handling. Check the Stripe changelog for new API versions or deprecated events. Ensure the webhook handler covers all event types the application subscribes to. Update the Stripe API version where it is pinned (typically the Stripe client initialization) if needed.
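
The API route audit in this list can be partly automated with a text scan. This is a rough heuristic sketch, not a substitute for reading the code: it only checks whether the expected strings appear in a route's source. requireAdminWithTenant and the runtime export come from the checklist. The function name auditRouteSource and the sample route are hypothetical.

```typescript
// Checks a route file's source text for the markers named in the checklist.
// Returns human-readable findings; an empty list means no gaps were detected.
function auditRouteSource(source: string): string[] {
  const findings: string[] = [];
  if (!/requireAdminWithTenant\s*\(/.test(source)) {
    findings.push("no requireAdminWithTenant() call found");
  }
  if (!/export\s+const\s+runtime\s*=\s*['"]edge['"]/.test(source)) {
    findings.push("missing export const runtime = 'edge'");
  }
  if (/console\.(log|error|warn)\([^)]*process\.env\./.test(source)) {
    findings.push("possible secret logged via process.env");
  }
  return findings;
}

// Hypothetical route source used only to exercise the scan.
const sampleRoute = `
export const runtime = 'edge'
export async function POST(req: Request) {
  console.log(process.env.CRON_SECRET)
  return new Response('ok')
}
`;

console.log(auditRouteSource(sampleRoute));
```

For the sample route the scan reports two findings: the missing authentication call and the logged environment variable. Routes that use an equivalent authentication helper would need their own pattern.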

Infrastructure

[ ] Review Supabase SDK version — check the current @supabase/supabase-js version in package.json against the latest release on npm. Supabase sometimes introduces behavior changes in minor versions — read the release notes.
[ ] Performance review — run a full Lighthouse audit:
Open the production app in Chrome Incognito.
Open DevTools → Lighthouse → run for Desktop and Mobile.
Document scores for: Performance, Accessibility, Best Practices, SEO.
Address any Performance or Accessibility regressions from the previous quarter.
[ ] API response time review — using Vercel Analytics or a manual sampling tool, record the p50 and p95 response times for the 5 most-used API routes. Flag any that have degraded since last quarter.
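
For the manual sampling route, p50 and p95 can be computed from a list of recorded response times. A minimal sketch using the nearest-rank method (one of several common percentile definitions); the sample timings are made up for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Illustrative response times in milliseconds for one route.
const samplesMs = [120, 95, 110, 480, 130, 105, 98, 2100, 115, 125];

console.log("p50:", percentile(samplesMs, 50));
console.log("p95:", percentile(samplesMs, 95));
```

With only ten samples, p95 is simply the slowest request, so collect enough samples per route for the tail figure to be meaningful.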

Documentation

[ ] Review this documentation portal — verify the docs reflect the current state of the system. Update any pages that describe outdated features, removed routes, or changed processes.

Annual Tasks

Perform in January each year. These tasks require dedicated time — block a full day.

Security

[ ] Full security review — conduct a systematic review of the entire application:
All API routes: authentication, authorization, input validation, output sanitization.
All RLS policies: do they correctly enforce tenant isolation?
All file upload paths: are file types validated? Are uploads scoped to the correct tenant?
All environment variables: are any secrets inadvertently exposed?
Review the OWASP Top 10 and verify the application addresses each category.
[ ] Consider external penetration test — for high-growth years or when handling particularly sensitive client data, engage an external security firm for a penetration test. Document findings and remediation actions.

Disaster Recovery

[ ] Review and update the Disaster Recovery (DR) plan — the DR plan should document:
RTO (Recovery Time Objective): target time to restore service after a major outage.
RPO (Recovery Point Objective): maximum acceptable data loss (time window).
Steps to restore from Supabase backup.
Steps to redeploy the application from scratch (fork → env vars → deploy).
Contact list for all critical vendors.
Update this plan to reflect any architectural changes made during the year.
[ ] Run a DR drill — simulate a catastrophic failure in a staging environment:
Restore the Supabase database from a backup to a new project.
Deploy the application to a new Vercel project pointing at the restored database.
Verify the application is functional.
Document the time taken and any gaps found in the DR plan.

Documentation and Compliance

[ ] Update all documentation — review every page in this documentation portal. Update for any architectural changes, new modules, removed features, or changed processes introduced during the year.
[ ] Review SLA commitments — compare the year's actual uptime data (from UptimeRobot monthly reports) against the 99.5% SLA target. If SLA was missed in any month, document the cause and ensure the prevention action from the relevant post-mortem has been implemented.
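
The SLA comparison can be done mechanically once the monthly uptime figures are collected. A minimal sketch, assuming the UptimeRobot monthly percentages have been copied into a plain object by hand. The month keys and values below are invented for illustration.

```typescript
const SLA_TARGET_PERCENT = 99.5; // the SLA target stated in this documentation

// Returns the months whose measured uptime fell below the target.
function monthsBelowSla(
  uptimeByMonth: Record<string, number>,
  target: number = SLA_TARGET_PERCENT,
): string[] {
  return Object.entries(uptimeByMonth)
    .filter(([, uptime]) => uptime < target)
    .map(([month]) => month);
}

// Illustrative figures only.
const uptime2025: Record<string, number> = {
  "2025-01": 99.98,
  "2025-02": 99.41,
  "2025-03": 99.5,
};

console.log(monthsBelowSla(uptime2025));
```

A month at exactly 99.5% counts as meeting the target, which matches the "≥ 99.5%" wording of the SLA.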

Infrastructure

[ ] SSL certificate review — Cloudflare handles SSL certificate auto-renewal. Verify auto-renewal is still enabled and that the certificate expiry dates are at least 60 days out: Cloudflare Dashboard → SSL/TLS → Edge Certificates.
[ ] Cost review — review monthly spend for all services:
Vercel Team plan
Supabase Pro plan
Resend plan
Sentry Team plan
UptimeRobot
Are costs in line with usage? Are there unused features in paid plans? Should plans be upgraded or downgraded?
[ ] Vendor contract review — review terms of service and pricing for all vendors. Note any upcoming price changes or plan discontinuations.

💡

Schedule ahead

Add the quarterly and annual maintenance tasks to the team calendar at the start of each year. Blocked time is the only reliable way to ensure these tasks don't get skipped during busy periods.

Cron Jobs

This page documents all scheduled background jobs running in production — their schedules, what they do, how they are authenticated, and how to test and troubleshoot them.

Overview

The CRM runs scheduled background jobs via Vercel Cron. These jobs are triggered by Vercel's internal scheduler, which calls a designated API route on the configured schedule.

Key points:
- All cron endpoints require a CRON_SECRET bearer token; requests without it are rejected with 401 Unauthorized.
- Jobs are designed to be idempotent: running them twice produces the same result as running them once.
- Cron jobs are defined in vercel.json at the project root.

Authentication

Every cron endpoint checks for the CRON_SECRET bearer token:

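// Pattern used in all cron routes (inside the route handler)
// Requires: import { NextResponse } from 'next/server'
const cronSecret = process.env.CRON_SECRET
const authHeader = req.headers.get('authorization')
// Fail closed: if CRON_SECRET is unset, "Bearer undefined" must not be accepted
if (!cronSecret || authHeader !== `Bearer ${cronSecret}`) {
  return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
}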

When Vercel calls a cron route, it automatically attaches this header. When triggering manually (for testing), you must include it yourself:

curl -X POST https://yourdomain.com/api/admin/invoices/cron \
  -H "Authorization: Bearer {{ CRON_SECRET }}" \
  -H "Content-Type: application/json"
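
The bearer-token check is easier to unit-test when factored into a pure function. A minimal sketch: isAuthorizedCronRequest is a hypothetical helper, not a function from the codebase, and it takes the header and secret as arguments so it can run outside a route handler.

```typescript
// Returns true only when a secret is configured and the header matches it exactly.
function isAuthorizedCronRequest(
  authHeader: string | null,
  cronSecret: string | undefined,
): boolean {
  // An unset secret must never authorize the literal string "Bearer undefined".
  if (!cronSecret) return false;
  return authHeader === `Bearer ${cronSecret}`;
}

console.log(isAuthorizedCronRequest("Bearer local-dev-secret", "local-dev-secret")); // true
console.log(isAuthorizedCronRequest("Bearer undefined", undefined)); // false
console.log(isAuthorizedCronRequest(null, "local-dev-secret")); // false
```

In a route handler this would be called with req.headers.get('authorization') and process.env.CRON_SECRET, returning the 401 response when it yields false.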

⚠️

Never expose CRON_SECRET

The CRON_SECRET is a server-only environment variable. Never log it, include it in client code, or share it outside the engineering team.

Job 1: Invoice Overdue Check

Property	Value
Schedule	`0 2 * * *` — 02:00 UTC daily
Route	`POST /api/admin/invoices/cron`
Action	`overdue` (passed as query param or body field)
Idempotent	Yes

What It Does

Queries the invoices table for all records where:
status = 'sent'
due_date < CURRENT_DATE
For each matching invoice, updates status to 'overdue'.
Sends an overdue reminder email to the client via Resend, using the client's primary contact email.
Creates an in-app notification for the assigned account manager.

Invoices already in 'overdue', 'paid', or 'cancelled' status are not touched — the query specifically filters for status = 'sent', making the job safe to re-run.
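
The idempotency argument can be illustrated with an in-memory model of the job. This is a sketch of the selection logic only, not the actual handler: the Invoice shape and the markOverdue name are hypothetical, and the real job runs the equivalent filter as a database query.

```typescript
type InvoiceStatus = "draft" | "sent" | "overdue" | "paid" | "cancelled";

interface Invoice {
  id: string;
  status: InvoiceStatus;
  dueDate: string; // ISO date (YYYY-MM-DD), so string comparison matches date order
}

// Marks 'sent' invoices past their due date as 'overdue' and reports which ids changed.
function markOverdue(
  invoices: Invoice[],
  today: string,
): { updated: Invoice[]; changedIds: string[] } {
  const changedIds: string[] = [];
  const updated = invoices.map((inv) => {
    if (inv.status === "sent" && inv.dueDate < today) {
      changedIds.push(inv.id);
      return { ...inv, status: "overdue" as const };
    }
    return inv;
  });
  return { updated, changedIds };
}

// Illustrative data only.
const invoices: Invoice[] = [
  { id: "inv-1", status: "sent", dueDate: "2026-06-01" },
  { id: "inv-2", status: "sent", dueDate: "2026-07-15" },
  { id: "inv-3", status: "paid", dueDate: "2026-05-01" },
];

const firstRun = markOverdue(invoices, "2026-06-30");
const secondRun = markOverdue(firstRun.updated, "2026-06-30");
console.log(firstRun.changedIds, secondRun.changedIds);
```

The second run changes nothing because inv-1 is no longer 'sent'. If reminder emails are sent only for the ids that changed, a re-run does not send duplicates.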

Example Manual Trigger

curl -X POST "https://yourdomain.com/api/admin/invoices/cron?action=overdue" \
  -H "Authorization: Bearer {{ CRON_SECRET }}"

Expected Response

{
  "success": true,
  "processed": 3,
  "message": "3 invoices marked as overdue"
}

Job 2: Recurring Invoice Generation

Property	Value
Schedule	`0 3 * * *` — 03:00 UTC daily
Route	`POST /api/admin/invoices/cron`
Action	`recurring` (passed as query param or body field)
Idempotent	Yes

What It Does

Queries the invoices table (or a recurring_invoice_templates table) for records where:
is_recurring = true
next_invoice_date <= CURRENT_DATE
status is not 'cancelled'
For each match, creates a new invoice as a copy of the template (same client, line items, amounts, payment terms).
Sets the new invoice's status to 'draft' (requires admin review before sending) or 'sent' depending on the template's auto_send flag.
Updates next_invoice_date on the template to the next occurrence (e.g., adds 1 month for monthly recurring).

The idempotency guarantee comes from the next_invoice_date update — once updated, the same template will not match the query again until the next cycle.
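
The date advance that provides this guarantee can be sketched as follows. This assumes monthly recurrence and ISO date strings. The addMonths and isTemplateDue helpers and the template shape are hypothetical, and the clamping rule for short months is one possible policy rather than the documented behavior.

```typescript
// Adds `months` calendar months to an ISO date (YYYY-MM-DD), clamping to the last
// day of the target month (e.g. Jan 31 + 1 month -> Feb 28 in a non-leap year).
function addMonths(isoDate: string, months: number): string {
  const [y, m, d] = isoDate.split("-").map(Number);
  const target = new Date(Date.UTC(y, m - 1 + months, 1));
  const lastDay = new Date(
    Date.UTC(target.getUTCFullYear(), target.getUTCMonth() + 1, 0),
  ).getUTCDate();
  target.setUTCDate(Math.min(d, lastDay));
  return target.toISOString().slice(0, 10);
}

// Hypothetical template shape mirroring the query conditions above.
interface RecurringTemplate {
  id: string;
  nextInvoiceDate: string; // ISO date
  cancelled: boolean;
}

function isTemplateDue(t: RecurringTemplate, today: string): boolean {
  return !t.cancelled && t.nextInvoiceDate <= today;
}

const template: RecurringTemplate = { id: "tpl-1", nextInvoiceDate: "2026-06-30", cancelled: false };
const today = "2026-06-30";

console.log(isTemplateDue(template, today)); // due before the run
const advanced = { ...template, nextInvoiceDate: addMonths(template.nextInvoiceDate, 1) };
console.log(advanced.nextInvoiceDate, isTemplateDue(advanced, today)); // no longer due
```

One design caveat with clamping: a template anchored on the 31st drifts to the 28th after February unless the original day of month is stored separately.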

Example Manual Trigger

curl -X POST "https://yourdomain.com/api/admin/invoices/cron?action=recurring" \
  -H "Authorization: Bearer {{ CRON_SECRET }}"

Expected Response

{
  "success": true,
  "generated": 2,
  "message": "2 recurring invoices generated"
}

Job 3: Appointments Cron (Reference)

Property	Value
Route	`POST /api/admin/appointments/cron`
Purpose	Sends appointment reminder emails and/or marks past appointments as completed
Auth	`Authorization: Bearer {{ CRON_SECRET }}`

This job follows the same authentication and response pattern as the invoice cron jobs. Its schedule in vercel.json is 0 6 * * * (06:00 UTC daily). Refer to the appointments module source code for its specific logic.

Vercel Cron Configuration

Cron jobs are declared in vercel.json at the project root:

{
  "crons": [
    {
      "path": "/api/admin/invoices/cron?action=overdue",
      "schedule": "0 2 * * *"
    },
    {
      "path": "/api/admin/invoices/cron?action=recurring",
      "schedule": "0 3 * * *"
    },
    {
      "path": "/api/admin/appointments/cron",
      "schedule": "0 6 * * *"
    }
  ]
}

ℹ️

Vercel Cron and Edge Runtime

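Vercel Cron invocations go through the standard HTTP request path: the cron route is a normal edge function. Vercel's scheduler calls the configured path with an HTTP GET request, so each cron route must accept GET for scheduled runs, in addition to POST if manual triggers use it. The CRON_SECRET authorization check is what distinguishes a scheduled invocation from an unauthorized external request.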

Schedule format: Standard 5-field cron (minute hour day-of-month month day-of-week). All times are UTC.

Field	Values
minute	0–59
hour	0–23 (UTC)
day-of-month	1–31
month	1–12
day-of-week	0–6 (0 = Sunday)
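
A schedule string can be sanity-checked before it is committed to vercel.json. The sketch below validates the five numeric fields against the ranges in the table. It handles only `*`, single numbers, ranges, lists, and `/step` suffixes, limits day-of-week to 0–6 as a conservative assumption, and is not a reimplementation of Vercel's own parser.

```typescript
// Allowed numeric range for each of the five fields, in order.
const FIELDS = [
  { name: "minute", min: 0, max: 59 },
  { name: "hour", min: 0, max: 23 },
  { name: "day-of-month", min: 1, max: 31 },
  { name: "month", min: 1, max: 12 },
  { name: "day-of-week", min: 0, max: 6 },
];

// Validates one comma-separated part: "*", "n", or "a-b", optionally followed by "/step".
function isValidPart(part: string, min: number, max: number): boolean {
  const match = /^(\*|\d+|\d+-\d+)(\/(\d+))?$/.exec(part);
  if (!match) return false;
  if (match[3] !== undefined && Number(match[3]) < 1) return false;
  if (match[1] === "*") return true;
  const bounds = match[1].split("-").map(Number);
  const inRange = bounds.every((n) => n >= min && n <= max);
  return inRange && (bounds.length === 1 || bounds[0] <= bounds[1]);
}

// Returns null when the expression is valid, otherwise a description of the first problem.
function validateCron(expression: string): string | null {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== FIELDS.length) return `expected 5 fields, got ${parts.length}`;
  for (let i = 0; i < FIELDS.length; i++) {
    const { name, min, max } = FIELDS[i];
    const ok = parts[i].split(",").every((p) => isValidPart(p, min, max));
    if (!ok) return `invalid ${name} field: ${parts[i]}`;
  }
  return null;
}

console.log(validateCron("0 2 * * *")); // null (valid)
console.log(validateCron("0 25 * * *")); // "invalid hour field: 25"
```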

Testing Crons Locally

To test a cron job during local development:

1. Set CRON_SECRET in .env.local:

CRON_SECRET=local-dev-secret-change-me

2. Start the dev server:

npm run dev

3. Trigger the cron manually via curl:

# Invoice overdue check
curl -X POST "http://localhost:3000/api/admin/invoices/cron?action=overdue" \
  -H "Authorization: Bearer local-dev-secret-change-me"

# Recurring invoice generation
curl -X POST "http://localhost:3000/api/admin/invoices/cron?action=recurring" \
  -H "Authorization: Bearer local-dev-secret-change-me"

4. Check the dev server terminal for the function's log output, and check the curl output for the returned JSON.

Monitoring Cron Jobs

After each expected run, verify:

Vercel Dashboard → Project → Settings → Cron Jobs — check the run history. Green = success, red = failure.
Run the query manually in Supabase SQL editor to confirm the expected DB changes:

-- For overdue check: how many were marked overdue today?
SELECT COUNT(*)
FROM invoices
WHERE status = 'overdue'
  AND updated_at::date = CURRENT_DATE;

Check Resend Dashboard for outbound overdue reminder emails sent around 02:00 UTC.

What to Do If a Cron Fails

1. Check the Vercel Cron logs

Go to Vercel Dashboard → Project → Settings → Cron Jobs → [job name] → View Logs. Look for the HTTP status code and response body.

Status	Likely Cause
`401 Unauthorized`	`CRON_SECRET` env var missing or changed — verify in Vercel environment variables
`500 Internal Server Error`	Application error in the cron handler — check Sentry and Vercel edge logs
`504 Gateway Timeout`	The cron job is processing too many records — may need pagination
No log at all	Vercel scheduler may have missed the run — check if the app was in a failed deployment state

2. Trigger the job manually

Once the root cause is identified and fixed, trigger the missed run manually via curl (see Testing Crons Locally above, using the production URL and production CRON_SECRET).

⚠️

Running missed crons on production

When manually triggering a cron on production, be aware of side effects: the overdue cron will send emails to real clients. If the missed run is being re-triggered after a delay of more than a day, verify whether the email sends are still appropriate before triggering.

3. Document the failure

Record in the incident log:
- Which cron job failed
- The scheduled run time
- The actual failure time and error
- Whether data was affected (e.g., invoices not marked overdue)
- Whether the manual re-trigger resolved the data inconsistency
- Root cause and fix applied

4. Verify data integrity after recovery

For the invoice overdue cron, run a reconciliation query in Supabase:

-- Find any invoices that should be overdue but are still 'sent'
SELECT id, client_id, due_date, status
FROM invoices
WHERE status = 'sent'
  AND due_date < CURRENT_DATE
ORDER BY due_date;

If rows are returned, the overdue cron missed them. Update them manually or re-trigger the cron after verifying the fix.

Monitoring & Logging

This page covers how to monitor the SONAN DIGITAL CRM in production — error tracking via Sentry, log access via Vercel, uptime monitoring, alert escalation, and the weekly review process.

1. Sentry — Error Tracking

Sentry captures all unhandled exceptions from both server-side (edge functions) and client-side (browser) code.

Dashboard: https://sentry.io/organizations/sonan-digital/

What to Check

Signal	Meaning	Action
New issue (unseen)	A new error class appeared since last review	Triage immediately — check if it is user-impacting
Issue spike	An existing error is occurring at higher than normal frequency	Investigate root cause; likely triggered by a recent deployment or data edge case
Unhandled promise rejection	Async code path without error handling	Low urgency unless frequency is high
`TypeError: Cannot read properties of undefined`	Usually a Supabase join typed as a single object — use `[0]` indexing	Fix in next sprint
500 errors on API routes	Edge function throwing — check function logs	P2 if frequent; P3 if rare

How to Triage an Issue

Click the issue in Sentry.
Read the stack trace — check if source maps are loaded (if you see minified code, SENTRY_AUTH_TOKEN may be missing or the build did not upload maps).
Check Breadcrumbs — the sequence of events leading to the error.
Check Tags: url, method, user.id — this tells you which route and which user was affected.
Check First seen / Last seen / Times seen — a fresh issue occurring once may be a data anomaly; 500 occurrences in the last hour is a live incident.
Cross-reference the timestamp with Vercel Deployments — if it started after a deployment, the deployment is the likely cause.

Resolving Issues

Fix and resolve: Merge the fix, confirm the error stops occurring in Sentry, then click Resolve. Sentry will reopen the issue automatically if the same error recurs.
Accept / Ignore: For known non-actionable errors (e.g., browser extension interference), click Ignore and document the reason in the Sentry comment field.
Assign: Assign unresolved issues to the responsible engineer so they are not forgotten.

💡

Sentry Alerts

Configure Sentry Alert Rules to send an email or Slack notification when:
- A new issue is first seen.
- Any issue's frequency exceeds 10 occurrences in 1 hour.
Go to Sentry → Alerts → Create Alert Rule.

2. Vercel Logs

Vercel provides real-time and historical logs for all edge function invocations.

Accessing logs:

Go to Vercel Dashboard → Project → Logs tab.
Select Runtime Logs (not Build Logs).
Filter by Edge (not Functions — all routes use edge runtime).

Filtering Logs

Filter	How
By route	Type the path in the search box, e.g. `/api/admin/invoices`
By time range	Use the time picker — logs are retained for 1 hour in real-time view; longer with Vercel Pro log draining
By status code	Search for `status:500` or `status:4` to find errors
By user report	If a user reports an error, ask them for the time it occurred and search around that timestamp

Downloading Logs

For incidents requiring detailed post-mortem analysis, export logs:

Vercel Dashboard → Logs → Export button (top right).
Download as .ndjson (newline-delimited JSON).
Process locally with jq:

jq 'select(.statusCode == 500)' logs.ndjson
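
When jq is not available, the same filter can be written in TypeScript. A minimal sketch, assuming each exported line is a JSON object with a numeric statusCode field as in the jq example; the path field in the sample is hypothetical.

```typescript
interface LogEntry {
  statusCode?: number;
  [key: string]: unknown;
}

// Parses newline-delimited JSON, skipping blank lines, and keeps entries with the given status.
function filterByStatus(ndjson: string, statusCode: number): LogEntry[] {
  return ndjson
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as LogEntry)
    .filter((entry) => entry.statusCode === statusCode);
}

// Illustrative export content; a real run would read logs.ndjson from disk.
const sampleNdjson = [
  '{"statusCode":200,"path":"/api/health"}',
  '{"statusCode":500,"path":"/api/admin/invoices/cron"}',
  "",
].join("\n");

console.log(filterByStatus(sampleNdjson, 500));
```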

ℹ️

Edge log latency

Edge logs in Vercel can have a 10–30 second delay before appearing in the dashboard. If you are investigating a live issue, wait a moment before concluding a log is absent.

3. Uptime Monitoring

Recommended: UptimeRobot

Set up the following monitors at uptimerobot.com:

Monitor	URL	Check Interval	Alert When
API Health Check	`https://yourdomain.com/api/health`	Every 1 minute	Response is not 200
Homepage	`https://yourdomain.com`	Every 5 minutes	Response is not 200
Client Portal Login	`https://yourdomain.com/auth/login`	Every 5 minutes	Response is not 200
Admin Dashboard	`https://yourdomain.com/admin`	Every 5 minutes	Response is not 200 or 302

Alert notifications: Configure UptimeRobot to send alerts to:
- Primary engineer's email
- The dedicated #crm-alerts Slack channel (via UptimeRobot Slack integration)
- A status page (UptimeRobot provides a hosted status page; share the URL with clients)

Health Check Endpoint

The /api/health route should return:

{
  "status": "ok",
  "timestamp": "2026-06-30T12:00:00.000Z",
  "version": "1.0.0"
}

If this endpoint returns a non-200 status or times out, the application is not serving traffic correctly.
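
A handler that produces this payload could look like the sketch below. It assumes a Next.js App Router route file (for example app/api/health/route.ts, a hypothetical path) and a hard-coded version string. The actual route may also probe the database or read the version from the build.

```typescript
// Builds the health payload; kept separate from the handler so it can be unit-tested.
function buildHealthPayload(now: Date, version: string) {
  return { status: "ok", timestamp: now.toISOString(), version };
}

// Run on the edge runtime, matching the other API routes.
export const runtime = "edge";

// GET /api/health: returns 200 with the JSON payload shown above.
export function GET(): Response {
  const body = JSON.stringify(buildHealthPayload(new Date(), "1.0.0"));
  return new Response(body, {
    status: 200,
    headers: { "Content-Type": "application/json" },
  });
}
```

Because this version checks nothing beyond the function itself responding, a 200 here confirms only that the deployment is serving requests, not that Supabase or other dependencies are reachable.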

4. Alert Escalation

When a monitoring alert fires, follow this escalation path:

UptimeRobot / Sentry Alert
        │
        ▼
1. Engineer on duty checks within 5 minutes
        │
        ├── Resolved in 15 min? → Document in incident log (P3/P4)
        │
        ▼
2. Issue persists > 15 min → Engineer escalates to Tech Lead
        │
        ├── Resolved in 60 min? → Document + post-mortem if P2
        │
        ▼
3. Issue persists > 1 hour (P1) → Tech Lead contacts CTO
   + Operations Manager notified for client communication
        │
        ▼
4. External vendor support engaged if platform-level (Vercel/Supabase/Stripe)
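
The time thresholds in the escalation path can be encoded as a small lookup, for example to drive an on-call reminder. A minimal sketch; the function and the step names are hypothetical labels for the roles in the diagram.

```typescript
type EscalationStep =
  | "engineer-on-duty"
  | "tech-lead"
  | "cto-and-operations-manager";

// Maps minutes since the alert fired to who should own the incident, per the path above.
function escalationStep(minutesSinceAlert: number): EscalationStep {
  if (minutesSinceAlert > 60) return "cto-and-operations-manager";
  if (minutesSinceAlert > 15) return "tech-lead";
  return "engineer-on-duty";
}

console.log(escalationStep(5)); // engineer-on-duty
console.log(escalationStep(20)); // tech-lead
console.log(escalationStep(90)); // cto-and-operations-manager
```

Vendor support (step 4) is left out because it depends on the nature of the failure rather than on elapsed time.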

💡

Slack Integration

Connect both Sentry and UptimeRobot to a dedicated #crm-alerts Slack channel. This ensures the whole team sees alerts without relying on one person checking email.

5. Cron Job Monitoring

Cron jobs run silently — there is no push notification when they succeed. Monitoring must be proactive.

After each expected cron run:

Go to Vercel Dashboard → Project → Settings → Cron Jobs tab.
Click the cron job to see its run history — green checkmarks = success, red X = failure.
Click a specific run to view the response body and status code.
Cross-check the database: for the overdue invoice cron (runs at 02:00 UTC), check in Supabase that invoices with due_date < today and old status = 'sent' now have status = 'overdue'.

If a cron shows a failure:
- Check the Vercel function logs around the scheduled run time.
- Trigger the cron manually (see Cron Jobs for curl commands).
- Verify the CRON_SECRET environment variable is correctly set if you see 401 responses.

6. Weekly Monitoring Review

Perform this review every Monday morning (or the first business day of each week):

[ ] Sentry: Open Sentry → Issues → filter to "Last 7 days". Review all new and unresolved issues. Assign owners to any unowned issues.
[ ] Sentry error rate: Check the Sentry project overview for error rate trends — is the rate increasing week over week?
[ ] Vercel deployments: Review the deployments from the past week. Any failed deployments? Any rollbacks?
[ ] Cron job logs: Confirm cron jobs ran successfully every day this week. Note any failures.
[ ] Uptime: Check UptimeRobot for any downtime incidents in the past 7 days. What was the total uptime percentage?
[ ] Stripe: Check Stripe Dashboard → Developers → Events for any failed webhook deliveries.
[ ] Resend: Check Resend Dashboard → Emails for any bounce or delivery failure spikes.
[ ] Supabase: Check Supabase Dashboard → Database → Reports for query performance — any slow queries (>500ms)?

Document the review outcome in the team's weekly ops note (Slack, Notion, or equivalent).

7. SLA Targets

Metric	Target	Measured How
Production uptime	≥ 99.5% per month	UptimeRobot monthly report
API health check response time	< 500ms (p95)	Vercel Analytics or UptimeRobot response time graph
Unhandled error rate	< 0.1% of requests	Sentry error rate vs. Vercel request count
Cron job success rate	100% (zero missed runs per month)	Vercel cron run history
P1 incident response time	< 15 minutes	Incident log

ℹ️

99.5% uptime

99.5% uptime allows approximately 3.65 hours of downtime per month. Vercel's SLA is 99.99% for the Edge network. Supabase Pro's SLA is 99.9%. The combined practical uptime target is set conservatively at 99.5% to account for application-level incidents.
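
The arithmetic behind these figures can be reproduced with a short calculation. The sketch assumes an average month of 730 hours (365 × 24 / 12) and treats the two vendor SLAs as independent dependencies in series, which is a simplification.

```typescript
// Allowed downtime in hours for a given uptime target over a period.
function downtimeBudgetHours(uptimePercent: number, periodHours: number): number {
  return (1 - uptimePercent / 100) * periodHours;
}

// Availability of a chain of independent dependencies that must all be up.
function combinedAvailabilityPercent(percents: number[]): number {
  return percents.reduce((acc, p) => acc * (p / 100), 1) * 100;
}

const AVERAGE_MONTH_HOURS = (365 * 24) / 12; // 730 hours

console.log(downtimeBudgetHours(99.5, AVERAGE_MONTH_HOURS).toFixed(2)); // "3.65"
console.log(downtimeBudgetHours(99.5, 30 * 24).toFixed(2)); // "3.60" for a 30-day month
console.log(combinedAvailabilityPercent([99.99, 99.9]).toFixed(2)); // "99.89"
```

The combined vendor figure of roughly 99.89% leaves about 0.39 percentage points of margin for application-level incidents within the 99.5% target.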
