Operator Runbook

Step-by-step playbooks for common platform operations.

Playbook 1

Promote a Lambda Alias

AWS Console | Requires T-JOSH approval

Update the prod Lambda alias to point to a new published version. Always verify the new version in staging before promoting.

  1. Open Lambda in AWS Console
    Navigate to AWS Console → Lambda → Functions → mission-control-api.
  2. Identify the target version
    Open the Versions tab. Note the version number you want to promote (e.g. v43). Confirm it has passed QA and is listed in coordination/DEPLOY-LOG.md.
  3. Edit the prod alias
    Open the Aliases tab, select prod, and click Edit. Set Function version to the target version number.
  4. Save and verify
    Click Save. Return to Mission Control and confirm the alias parity card shows Parity aligned with the new version number. To check from the CLI:
    aws lambda get-alias --function-name mission-control-api --name prod
  5. Record in deploy log
    Add an entry to coordination/DEPLOY-LOG.md with the version, operator name, and UTC timestamp. Commit and push.
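The verification and logging steps above can be scripted. The sketch below is a minimal illustration, not the team's tooling: it reads the JSON that `aws lambda get-alias` prints (which includes a `FunctionVersion` field) and formats a log line. The function names and the log line layout are assumptions; the runbook does not specify the DEPLOY-LOG.md entry format.

```python
import json
from datetime import datetime, timezone


def alias_version(get_alias_output: str) -> str:
    """Return the FunctionVersion field from `aws lambda get-alias` JSON output."""
    return json.loads(get_alias_output)["FunctionVersion"]


def alias_points_to(get_alias_output: str, expected_version: str) -> bool:
    """True when the alias resolves to the version that was promoted."""
    # Lambda reports versions as strings ("43"), so compare as strings.
    return alias_version(get_alias_output) == str(expected_version)


def deploy_log_entry(version: str, operator: str, when: datetime) -> str:
    """Format one log line. Hypothetical layout, not the real DEPLOY-LOG.md format."""
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"- {stamp} | prod -> version {version} | operator: {operator}"
```

Pass the saved output of the `get-alias` command to `alias_points_to` together with the version chosen in step 2. A `False` result means the alias was not updated and the deploy log entry should not be written yet.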
Playbook 2

Rotate a Duckling API Key

POST /beak/admin action=rotate_key

Invalidate and regenerate the API key for a specific duckling. Use when a key is suspected to be compromised or during a scheduled rotation cycle.

  1. Identify the duckling
    Find the duckling_id from the Admin panel or the support request. Confirm the account is still active before rotating.
  2. Issue the rotate_key action
    Send an authenticated POST request to the admin endpoint:
    POST /beak/admin {"action":"rotate_key","duckling_id":"<DUCKLING_ID>"}
  3. Confirm response
    Expected response: {"status":"ok","action":"rotate_key","duckling_id":"..."}. An HTTP 200 with status ok indicates the key has been rotated in DynamoDB.
  4. Notify the duckling
    If the rotation was operator-initiated (not self-service), contact the duckling at their registered email with the new key or instructions to retrieve it from their dashboard.
  5. Audit verification
    Confirm a duck.key_rotated audit event appears in the Mission Control Recent Events panel within 60 seconds.
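The request and response check in steps 2 and 3 can be sketched with the standard library. The host name and the bearer-token header below are placeholders: the runbook says only that the request is authenticated, so substitute the real base URL and auth scheme. Both function names are hypothetical.

```python
import json
import urllib.request

# Placeholder host: the runbook does not give the API base URL.
BASE_URL = "https://api.example.invalid"


def build_admin_request(action: str, token: str, **fields: str) -> urllib.request.Request:
    """Build (but do not send) a POST /beak/admin request for one admin action."""
    body = json.dumps({"action": action, **fields}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/beak/admin",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth. Replace with the real scheme.
            "Authorization": f"Bearer {token}",
        },
    )


def is_confirmed(response_body: str, action: str, **fields: str) -> bool:
    """True when the response echoes status ok, the action, and the same IDs."""
    reply = json.loads(response_body)
    expected = {"status": "ok", "action": action, **fields}
    return all(reply.get(key) == value for key, value in expected.items())
```

To send the request, pass the result of `build_admin_request("rotate_key", token, duckling_id=...)` to `urllib.request.urlopen` and feed the decoded body to `is_confirmed`. Checking that the echoed duckling_id matches guards against rotating the key for the wrong account.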
Playbook 3

Revoke a Compromised Connection

POST /beak/admin action=revoke_connection

Immediately terminate a spaceduck connection that is suspected to be compromised, abusive, or unauthorised. This is a destructive action: obtain T-JOSH approval before any production revocation.

  1. Identify the connection
    Locate the connection_id (or spaceduck_id) from the Admin deep-link search or the incident report. Note the associated duckling.
  2. Issue the revoke_connection action
    Send an authenticated POST to the admin endpoint:
    POST /beak/admin {"action":"revoke_connection","connection_id":"<CONNECTION_ID>"}
  3. Confirm response
    Expected: {"status":"ok","action":"revoke_connection","connection_id":"..."}. The connection record is marked REVOKED in DynamoDB and the spaceduck will be unable to pulse.
  4. Verify in Admin panel
    Open /admin.html and locate the duckling. The connection should no longer appear as active. Confirm the spaceduck_id is absent from the bonded list.
  5. Log the incident
    Add an entry to coordination/OPERATOR-NOTES.md with the connection_id, revocation reason, operator name, and UTC timestamp. Confirm the audit event is captured.
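Because revocation is destructive, a small guard in front of the request can make the approval requirement hard to skip. The sketch below is one possible shape, under stated assumptions: the environment name "prod", the `approval_ref` parameter, and the note layout are all hypothetical, since the runbook does not say how T-JOSH approval is recorded or how OPERATOR-NOTES.md entries are formatted.

```python
from datetime import datetime, timezone


class ApprovalRequired(Exception):
    """Raised when a production revocation has no recorded approval."""


def revoke_payload(connection_id: str, environment: str, approval_ref: str = "") -> dict:
    """Return the revoke_connection body, refusing unapproved prod requests."""
    # Assumption: production is identified by the string "prod".
    if environment == "prod" and not approval_ref:
        raise ApprovalRequired("production revocations need T-JOSH approval")
    return {"action": "revoke_connection", "connection_id": connection_id}


def operator_note(connection_id: str, reason: str, operator: str, when: datetime) -> str:
    """Format one incident line. Hypothetical layout for OPERATOR-NOTES.md."""
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"- {stamp} | revoked {connection_id} | reason: {reason} | operator: {operator}"
```

Raising an exception rather than returning a flag means a script that forgets to check the result still cannot send an unapproved production revocation.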
Playbook 4

Check DynamoDB Table Health

AWS Console | CloudWatch

Verify that all DynamoDB tables are present, ACTIVE, on the expected billing mode, and free of CloudWatch errors and alarms.

  1. Open DynamoDB in AWS Console
    Navigate to AWS Console → DynamoDB → Tables. Filter by region us-east-1. Confirm all expected tables are present: spaceduck-eggs, spaceduck-ducklings, spaceduck-certs, spaceduck-connections, spaceduck-audit.
  2. Check table status
    Each table should show Status: ACTIVE. Any table in CREATING, UPDATING, or DELETING state requires investigation. Note the billing mode: expect PAY_PER_REQUEST (shown as On-demand in the console) for all tables.
  3. Review CloudWatch metrics
    Open CloudWatch → Metrics → DynamoDB. Check ConsumedReadCapacityUnits, ConsumedWriteCapacityUnits, and SystemErrors for each table over the last 24 hours. Any SystemErrors > 0 requires escalation.
  4. Verify row counts via Mission Control
    Return to Mission Control. The Database Posture card should reflect live row counts. Compare the counts with the expected growth trajectory from the Database Growth Trend panel.
    GET /beak/system/status → database.*
  5. Check CloudWatch alarms
    Open CloudWatch → Alarms and filter for DynamoDB. All alarms should be in OK state. Acknowledge and escalate any ALARM state immediately.
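The presence, status, and billing-mode checks from steps 1 and 2 can be expressed as a pure function over DescribeTable results. This is a sketch, assuming the caller has already collected the `Table` dictionary for each table; the function name and the problem-message wording are hypothetical.

```python
EXPECTED_TABLES = (
    "spaceduck-eggs",
    "spaceduck-ducklings",
    "spaceduck-certs",
    "spaceduck-connections",
    "spaceduck-audit",
)


def table_problems(descriptions: dict) -> list:
    """List health problems given {table_name: DescribeTable "Table" dict}."""
    problems = []
    for name in EXPECTED_TABLES:
        table = descriptions.get(name)
        if table is None:
            problems.append(f"{name}: missing")
            continue
        status = table.get("TableStatus")
        if status != "ACTIVE":
            problems.append(f"{name}: status {status}")
        # BillingModeSummary may be absent on tables that were never on-demand,
        # so a missing summary is treated here as PROVISIONED.
        mode = table.get("BillingModeSummary", {}).get("BillingMode", "PROVISIONED")
        if mode != "PAY_PER_REQUEST":
            problems.append(f"{name}: billing mode {mode}")
    return problems
```

With boto3 installed, each entry could come from `boto3.client("dynamodb", region_name="us-east-1").describe_table(TableName=name)["Table"]`. An empty list means steps 1 and 2 pass; the CloudWatch metric and alarm checks in steps 3 and 5 are not covered by this sketch.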
Playbook 5

Force-Expire a Cert

POST /beak/admin action=expire_cert

Immediately mark a birth certificate as expired. Use when a cert was issued in error, when a duckling account is terminated, or when a compliance event requires it.

  1. Identify the cert
    Retrieve the cert_id from the Admin duckling investigation drawer or the birth-certificate.html lookup. Confirm this is the correct cert for the correct duckling.
  2. Issue the expire_cert action
    Send an authenticated POST to the admin endpoint:
    POST /beak/admin {"action":"expire_cert","cert_id":"<CERT_ID>"}
  3. Confirm response
    Expected: {"status":"ok","action":"expire_cert","cert_id":"..."}. The cert record is updated with status: EXPIRED and an expiry timestamp in DynamoDB.
  4. Verify cert inventory update
    Return to Mission Control and refresh. The Cert Inventory panel should reflect the updated pending/issued counts. The expired cert should no longer appear in the certified total.
  5. Notify and document
    If the cert expiry affects a live duckling, notify them at their registered email. Log the cert_id, reason, and operator in coordination/OPERATOR-NOTES.md with a UTC timestamp. Confirm the duck.cert_expired audit event is recorded.
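Confirming the duck.cert_expired audit event can be automated with a bounded poll. The sketch below is illustrative only: the runbook does not document an audit-event API, so `fetch_events` is a caller-supplied function and the event fields `type` and `subject_id` are assumed names. The 60-second default is borrowed from the key rotation playbook and is itself an assumption for cert expiry.

```python
import time


def wait_for_event(fetch_events, event_type, subject_id, timeout_s=60.0,
                   interval_s=5.0, clock=time.monotonic, sleep=time.sleep):
    """Poll fetch_events() until a matching audit event appears or time runs out.

    Returns the matching event dict, or None if the deadline passes first.
    """
    deadline = clock() + timeout_s
    while True:
        for event in fetch_events():
            # Assumed event shape: {"type": ..., "subject_id": ...}.
            if event.get("type") == event_type and event.get("subject_id") == subject_id:
                return event
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

The `clock` and `sleep` parameters exist so the loop can be exercised without real waiting. A `None` result means the audit event was not seen in time and the expiry should be investigated before the incident is closed.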