A Canadian direct-to-consumer insurance provider came to Armakuni for a DevOps Care POD to diagnose recurring Apache thread exhaustion on a LAMP stack, recommend an observability layer for a previously dark production environment, and deliver reliability fixes that protect atomic policy binding and payment transactions.
The challenge
Square One Insurance engaged Armakuni to diagnose and resolve a recurring production outage pattern that had resisted resolution for 6 to 10 months. The 60-hour Flexible Engineering POD put senior DevOps and full-stack engineers to work on the production environment under a retainer model that Square One can extend into further data and application engineering work.
Scale: Approximately 1,000 policies and 2,000+ quotes processed daily across 2 production servers running a LAMP stack on AWS, with approximately 800 MySQL queries per request.
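To see why roughly 800 queries per request puts pressure on a fixed pool of Apache workers, a back-of-envelope estimate with Little's law (average busy workers = arrival rate × time per request) can help. The sketch below is illustrative only: the per-query latency and request rate are assumed values, not measurements from Square One's environment, and the function names are hypothetical.

```python
QUERIES_PER_REQUEST = 800  # figure quoted in the case study


def request_seconds(query_latency_us, queries=QUERIES_PER_REQUEST):
    """Time one request spends on sequential MySQL queries, in seconds."""
    return queries * query_latency_us / 1_000_000


def busy_workers(requests_per_second, query_latency_us):
    """Little's law: average workers in use = arrival rate x service time."""
    return requests_per_second * request_seconds(query_latency_us)


if __name__ == "__main__":
    # Assumed inputs: 50 requests/s, with query latency of 1 ms, then 5 ms.
    for latency_us in (1_000, 5_000):
        print(latency_us, "us/query ->", busy_workers(50, latency_us), "busy workers")
```

Under these assumed numbers, a fivefold rise in query latency produces a fivefold rise in occupied workers, which is how a database slowdown can surface as thread exhaustion at the web tier.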
What we built
Armakuni delivered a DevOps Care POD to diagnose recurring Apache thread exhaustion on a LAMP stack, recommend an observability layer for a previously dark production environment, and deliver reliability fixes that protect atomic policy binding and payment transactions. The engagement ran as a single Flexible Engineering POD with a retainer-based consulting scope: 30 hours allocated upfront and 30 hours reserved for use at the customer's discretion. AWS funded the work at $2,400m.
The outcomes
4 measurable outcomes shipped across the engagement. The ones that moved the business the most:
Diagnostic evidence preserved from the August 1 outage rather than erased by restart, giving Square One a concrete explanation for a production pattern that had been opaque for 6 to 10 months. The team now has an artifact they can point to rather than a guess they have to defend at every recurrence.
Observability stack scoped specifically for the 800-queries-per-request pattern, including RDS Performance Insights at the database layer and application-level request tracing, closing the visibility gap that made every previous incident a guessing game.
Reliability recommendations engineered for atomic transaction safety, with connection pooling and session timeout approaches that protect policy binding and payment flows throughout the fix window. The platform gets reliability improvements without taking on transactional risk.
60-hour Flexible Engineering POD established as an ongoing retainer, with remaining hours and follow-on months available for any further DevOps, database, or full-stack engineering work Square One needs. The engagement is structured as a capacity relationship Square One can extend, not a fixed-scope project that ends at sign-off.
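As a rough illustration of how connection pooling, timeouts, and atomic writes can fit together, the sketch below pairs a bounded pool that fails fast with a policy-plus-payment write that commits or rolls back as a unit. It is a minimal sketch under stated assumptions, not the recommendations delivered to Square One: Python's built-in sqlite3 stands in for MySQL, and the class, function, and table names are all hypothetical.

```python
import queue
import sqlite3
from contextlib import contextmanager


class BoundedPool:
    """Fixed-size connection pool that fails fast instead of queueing forever."""

    def __init__(self, db_path, size=2, acquire_timeout=1.0):
        self._timeout = acquire_timeout
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # timeout= bounds how long a connection waits on a locked database;
            # check_same_thread=False lets pooled connections move between threads.
            conn = sqlite3.connect(
                db_path, timeout=acquire_timeout, check_same_thread=False
            )
            self._idle.put(conn)

    @contextmanager
    def connection(self):
        try:
            conn = self._idle.get(timeout=self._timeout)
        except queue.Empty:
            raise TimeoutError("no database connection available") from None
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


def init_schema(pool):
    """Create the two hypothetical tables used in this sketch."""
    with pool.connection() as conn:
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS policies (quote_id TEXT PRIMARY KEY)"
            )
            conn.execute(
                "CREATE TABLE IF NOT EXISTS payments ("
                "quote_id TEXT NOT NULL, "
                "amount_cents INTEGER CHECK (amount_cents > 0))"
            )


def bind_policy(pool, quote_id, amount_cents):
    """Write the policy and its payment together, or neither."""
    with pool.connection() as conn:
        with conn:  # commits on success, rolls back if any statement raises
            conn.execute("INSERT INTO policies (quote_id) VALUES (?)", (quote_id,))
            conn.execute(
                "INSERT INTO payments (quote_id, amount_cents) VALUES (?, ?)",
                (quote_id, amount_cents),
            )


def count_policies(pool):
    with pool.connection() as conn:
        return conn.execute("SELECT COUNT(*) FROM policies").fetchone()[0]
```

In this sketch a rejected payment row (here, a non-positive amount) rolls back the policy row written just before it, and a caller that cannot obtain a connection within the timeout gets an error rather than holding a web worker indefinitely.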
Built on AWS
The production environment runs on 4 first-party AWS services.
What's next
The next step is implementation of the recommended observability stack and reliability fixes. The retainer POD remains available for further data engineering, DevOps, and application engineering work at Square One's discretion.