Kubernetes · Advanced · Par time: 8:00

The OOM That Lies

Checkout pods are being OOMKilled, but the app's memory usage looks normal.

The Scenario

Checkout pods have been OOMKilled three times in the last hour. The application metrics show a steady 200MB of heap usage - well inside the limit. But the pods keep dying. A closer look reveals a logging sidecar that was recently configured to buffer request bodies. Under load, it accumulates hundreds of megabytes in memory. Because cgroup limits apply to the whole pod, the kernel OOM killer targets whichever process is using the most memory at the moment it fires - often the main app, not the real culprit.
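
To confirm what actually got killed, the termination reason recorded in the pod status is more trustworthy than app-level metrics. A minimal check, using a hypothetical pod name of checkout-7d9f8c6b4-x2ltp:

    # Print each container's name and the reason its last run terminated
    kubectl get pod checkout-7d9f8c6b4-x2ltp \
      -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'

    # Or the human-readable view: look for "Last State: Terminated, Reason: OOMKilled"
    kubectl describe pod checkout-7d9f8c6b4-x2ltp

Per the scenario above, the container reported as OOMKilled may well be the main app rather than the sidecar doing the buffering.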

What You'll Learn

1. Why pod-level cgroup limits mean any container can trigger an OOMKill for the whole pod
2. How to read kubectl top pod --containers to isolate per-container memory (see the sketch after this list)
3. Diagnosing sidecar memory leaks that hide behind healthy-looking app metrics
4. Setting per-container resource limits to contain the blast radius (see the manifest sketch after this list)
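
As a sketch of point 2, kubectl top pod --containers breaks memory down per container; the pod name, container names, and numbers below are illustrative:

    kubectl top pod checkout-7d9f8c6b4-x2ltp --containers

    POD                        NAME        CPU(cores)   MEMORY(bytes)
    checkout-7d9f8c6b4-x2ltp   checkout    120m         210Mi
    checkout-7d9f8c6b4-x2ltp   log-agent   30m          680Mi

The app sits near its steady 200MB while the sidecar quietly accounts for most of the pod's memory. For point 4, each container can then carry its own memory limit so a leak kills only the offending container. A minimal manifest sketch - container names, images, and sizes are assumptions for illustration:

    apiVersion: v1
    kind: Pod
    metadata:
      name: checkout
    spec:
      containers:
      - name: checkout
        image: registry.example.com/checkout:1.4    # hypothetical image
        resources:
          requests:
            memory: "256Mi"
          limits:
            memory: "512Mi"    # the app is killed only if it exceeds its own limit
      - name: log-agent
        image: registry.example.com/log-agent:2.1   # hypothetical sidecar image
        resources:
          requests:
            memory: "64Mi"
          limits:
            memory: "128Mi"    # a buffering leak now OOMKills the sidecar alone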

Tools You'll Use

kubectl · kubectl top · Container logs · cAdvisor metrics
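
When the metrics server isn't available, the same per-container breakdown comes from the kubelet's cAdvisor metrics. A hedged example, assuming Prometheus scrapes them and the pods are named checkout-*:

    # Working-set memory per container; the filters drop the pod-level
    # aggregate (container="") and the pause container (container="POD")
    sum by (container) (
      container_memory_working_set_bytes{pod=~"checkout-.*", container!="", container!="POD"}
    )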

Real-World Context

Sidecar-caused OOMKills are notoriously hard to diagnose because APM tools only instrument the main application. This pattern appears frequently with log shippers, service mesh proxies, and tracing agents.

Ready to debug this?


Play The OOM That Lies