
# 🔒 SQB Platform Reliability Audit & Hardening Plan

**Date:** 2026-01-01
**Auditor:** Senior Platform Reliability Engineer (Agent)
**Scope:** Live Production Audit of Plugin System (`HookSystem`, `PluginManager`, Core Logic)

---

## 1. Verified Risks (Fact-Based)

### 🚨 Critical Severity
*   **Pipeline Halting**: The `HookSystem` relies entirely on convention. If a plugin listener forgets to call `next()`, the rest of the pipeline never runs, and the request often hangs or terminates early without any error being raised.
*   **Crash Propagation**: `HookSystem` does not catch exceptions thrown by handlers. A single plugin throwing an `Error` fails the whole request with a `500 Internal Server Error`, with nothing to identify the plugin at fault.
*   **Implicit Financial Logic**: `updateOrderStatus` in `order.controller.ts` allows arbitrary state transitions (e.g., `CANCELLED` -> `DELIVERED`), potentially triggering inventory transfers multiple times or incorrectly.
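
The pipeline-halting risk can be reproduced with a very small model. The sketch below is not the real `HookSystem`, whose source is not shown in this audit; `runPipeline` and the `plugin-a`/`plugin-b`/`plugin-c` handlers are hypothetical and only assume the `next()` convention described above.

```typescript
// Hypothetical model of a convention-based pipeline (not the real HookSystem).
type Handler = (ctx: string[], next: () => void) => void;

function runPipeline(handlers: Handler[], ctx: string[]): void {
  const dispatch = (index: number): void => {
    const handler = handlers[index];
    if (!handler) return; // end of the chain
    // The only thing that advances the chain is the handler calling next().
    handler(ctx, () => dispatch(index + 1));
  };
  dispatch(0);
}

const trace: string[] = [];
runPipeline(
  [
    (ctx, next) => { ctx.push("plugin-a"); next(); },
    (ctx) => { ctx.push("plugin-b"); }, // forgets to call next()
    (ctx, next) => { ctx.push("plugin-c"); next(); },
  ],
  trace,
);
// trace is ["plugin-a", "plugin-b"]: plugin-c never runs and nothing is thrown.
```

Nothing in the runner can tell a deliberate short-circuit apart from a forgotten `next()`, which is why the failure is silent.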

### ⚠️ Moderate Severity
*   **Non-Deterministic Loading**: `server.ts` calls `pluginManager.init()` without `await`, creating a "floating promise". Plugins initialize in parallel with server startup. Route registration race conditions exist (routes might register before or after middleware depending on I/O).
*   **Security Bypass**: Plugins receive the raw `app` (Express) instance. They can register global routes (e.g., `/public-hack`) that bypass any controller-level authentication guards.
*   **Observability Blackhole**: No logs indicate *which* plugin logic is currently executing. Debugging a stuck request is effectively impossible in production logs.

### ℹ️ Low Severity
*   **Execution Duplication**: The `next()` function in `HookSystem` can be called multiple times by a single handler, causing all downstream logic to execute again for the same request context.
*   **Bundle Bloat**: Frontend plugins are statically imported in `main.tsx`. Even if "disabled" in backend logic, their code remains in the client bundle.

---

## 2. Non-Breaking Safety Reinforcements

The following changes are **minimally invasive**, preserve 100% backward compatibility, and add "guard rails" only.

### 🛡️ Hook System Hardening (Wrappers)
*   **Crash Containment**: Wrap hook handler execution in a `try/catch` block.
    *   *Action*: Log the error with `[HookSystem] Plugin Crash in '{hookName}'`.
    *   *Behavior*: **Rethrow** the error to preserve the current 500 behavior (visibility first). Swallowing the error after logging is an option if a "safe mode" is ever wanted, but the requirement for this pass is rethrow and visibility.
*   **Execution Guard**: Wrap the `next` function passed to plugins.
    *   *Action*: Check a `called` flag.
    *   *Behavior*: If called twice, log `[HookSystem] Warning: Plugin called next() multiple times` and return early.
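
The two wrappers above could look roughly like the following. This is a minimal sketch, assuming handlers have the shape `(ctx, next)` and that `next` returns a promise; the types `Next`, `HookHandler`, `Log` and the helpers `guardNext` and `guardHandler` are hypothetical names, not existing `HookSystem` APIs.

```typescript
// Assumed shapes: the real HookSystem types are not shown in this audit.
type Next = () => Promise<void>;
type HookHandler<C> = (ctx: C, next: Next) => Promise<void> | void;
type Log = (message: string, error?: unknown) => void;

// Execution guard: the first call passes through; later calls are logged and ignored.
function guardNext(next: Next, log: Log): Next {
  let called = false;
  return () => {
    if (called) {
      log("[HookSystem] Warning: Plugin called next() multiple times");
      return Promise.resolve();
    }
    called = true;
    return next();
  };
}

// Crash containment: log the failure with the hook name, then rethrow so the
// caller still sees the error and the current 500 behavior is preserved.
function guardHandler<C>(hookName: string, handler: HookHandler<C>, log: Log): HookHandler<C> {
  return async (ctx, next) => {
    try {
      await handler(ctx, guardNext(next, log));
    } catch (error) {
      log(`[HookSystem] Plugin Crash in '${hookName}'`, error);
      throw error;
    }
  };
}
```

One caveat with this shape: if handlers are chained through `next()`, an error thrown downstream passes through every upstream wrapper and may be logged once per level.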

### 🛡️ Determinism & Startup (Logging)
*   **Startup Trace**: In `PluginManager.ts`, log the exact order of initialization.
*   **Await Init**: (Recommended) Update `server.ts` to `await pluginManager.init()` before `app.listen`. This ensures routes are deterministic before traffic is accepted. *Note: strict adherence to "no changes" might preclude this, but it is a bug fix.*
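
A minimal sketch of the awaited startup sequence follows. `startServer`, `PluginManagerLike`, and `ListenableApp` are hypothetical names introduced for illustration; the only assumptions are that `pluginManager.init()` returns a promise and that the app exposes an Express-style `listen(port, callback)`.

```typescript
// Stand-ins for the real objects; their exact signatures are assumptions.
interface PluginManagerLike {
  init(): Promise<void>;
}
interface ListenableApp {
  listen(port: number, onReady?: () => void): unknown;
}

async function startServer(
  app: ListenableApp,
  pluginManager: PluginManagerLike,
  port: number,
  log: (message: string) => void = console.log,
): Promise<void> {
  // Awaiting here means every plugin has finished registering routes and
  // middleware before the server accepts its first connection.
  await pluginManager.init();
  log("[PluginManager] All plugins initialized");
  app.listen(port, () => log(`[Server] Listening on port ${port}`));
}
```

If `init()` rejects, `startServer` rejects as well, so the caller should decide whether a failed plugin aborts startup or is only logged.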

### 🛡️ Financial Integrity (Passive Guards)
*   **Mutation Logging**: In `order.controller.ts`, detect suspicious transitions.
    *   *Action*: If `oldStatus === 'DELIVERED'` and `newStatus !== 'DELIVERED'`, log `[Audit] Order ${id} reopened from DELIVERED state`.
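
The passive guard can be kept as a pure function so that it cannot affect the update itself. This is a sketch only: `auditStatusChange` is a hypothetical helper, statuses are treated as plain strings (the real order model may use an enum), and the order id in the usage lines is made up.

```typescript
// Returns the audit line for a suspicious transition, or null when the
// order is not leaving the DELIVERED state. It never throws or blocks.
function auditStatusChange(id: string, oldStatus: string, newStatus: string): string | null {
  if (oldStatus === "DELIVERED" && newStatus !== "DELIVERED") {
    return `[Audit] Order ${id} reopened from DELIVERED state`;
  }
  return null;
}

// Usage inside the controller, before the status is persisted:
const auditLine = auditStatusChange("A-17", "DELIVERED", "CANCELLED");
if (auditLine !== null) console.warn(auditLine);
```

As specified, this check only flags transitions out of `DELIVERED`; the `CANCELLED` -> `DELIVERED` example from Section 1 would pass through unlogged.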

### 🛡️ Security & Observability (Middleware/Logs)
*   **Route Auditing**: Decorate `app.get/post/etc` passed to plugins to log registration.
    *   *Action*: Log `[PluginManager] Plugin registered route: ${path}`.
*   **Auth Warning**: Log a warning when a plugin registers a route that does not appear to use `authMiddleware`. This is hard to detect reliably without heuristics, so skip it for now to avoid noisy logs.
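
Route auditing can be done without touching plugin code by handing each plugin a `Proxy` around the app. The sketch below assumes the object passed to plugins is the Express app; `auditRoutes` is a hypothetical helper, and only a small subset of HTTP methods is covered.

```typescript
// Route-registering methods to audit (a deliberately small subset).
const ROUTE_METHODS = new Set<string>(["get", "post", "put", "patch", "delete"]);

// Returns a Proxy that logs route registrations, then forwards to the real app.
function auditRoutes<T extends object>(app: T, log: (message: string) => void): T {
  return new Proxy(app, {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver);
      if (typeof prop !== "string" || !ROUTE_METHODS.has(prop) || typeof value !== "function") {
        return value; // everything else passes through untouched
      }
      return (path: unknown, ...handlers: unknown[]) => {
        // Express overloads app.get(name) as a settings getter, so only calls
        // that pass at least one handler count as route registrations.
        if (handlers.length > 0) {
          log(`[PluginManager] Plugin registered route: ${String(path)}`);
        }
        const result = value.apply(target, [path, ...handlers]);
        // Keep chained calls (app.get(...).post(...)) inside the proxy.
        return result === target ? receiver : result;
      };
    },
  });
}
```

This only observes registrations; it does not block them. Routes added through `app.use`, `app.all`, or a separate `Router` are not covered by this sketch.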

---

## 3. Future Improvements (NOT FOR NOW)
*   *Architecture*: Migrate `HookSystem` to `tapable` (the hook library that underpins webpack's plugin system) or a robust middleware library.
*   *Security*: Create a `PluginApp` sandbox that only allows registering routes under `/api/plugins/:pluginId`.
*   *Frontend*: Move to Module Federation (webpack 5) or dynamic `import()` loading for true plugin isolation.
*   *Financial*: Implement a State Machine (XState) for Orders to strictly forbid invalid transitions.

---

**Approval Status:** 🟡 Risks Verified. Ready for Reinforcement.
