1. Problem Description
A host node was suddenly found to be severely short of memory. At first this seemed implausible: the node was running only a small number of containers, and by the containers' own resource accounting total memory usage should have been at most a dozen or so GB, leaving tens of GB theoretically free. Given this discrepancy, the host's memory usage was investigated, with the following findings.
From the process view, memory usage by application and system processes was not high; the largest process used only about 5 GB, which clearly cannot account for the machine's overall memory consumption. A closer look at /proc/meminfo showed Slab usage as high as 38 GB, almost all of it unreclaimable (SUnreclaim), which is clearly abnormal.
According to ps, the top consumer was merely mysqld (about 5 GB); others such as Elasticsearch, kube-apiserver and Ceph OSD also used limited amounts, and the combined RSS of the top few dozen processes falls far short of explaining the 50+ GB consumed by the whole machine. The problem is clearly not at the application or container process level.
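Had it been captured at the time, the per-cache breakdown behind the Slab/SUnreclaim numbers could have been inspected with slabtop or /proc/slabinfo to see which kernel object cache was growing. A minimal sketch of such commands (not run during this incident; output omitted):

# show the largest slab caches, sorted by cache size
slabtop -o -s c | head -20
# rough per-cache memory (objects * object size, in bytes), largest first
awk 'NR > 2 {printf "%-28s %15d\n", $1, $3 * $4}' /proc/slabinfo | sort -k2 -nr | head -20
# note: echo 2 > /proc/sys/vm/drop_caches only frees reclaimable slab
# (dentries/inodes, i.e. SReclaimable); it cannot shrink SUnreclaim

The free, /proc/meminfo and ps output captured during the investigation follows.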
[root@tanqidi ~]# free -h
total used free shared buff/cache available
Mem: 62G 56G 1.5G 18M 4.4G 5.5G
Swap: 0B 0B 0B
[root@tanqidi ~]# cat /proc/meminfo
MemTotal: 65674112 kB
MemFree: 1586104 kB
MemAvailable: 5799208 kB
Buffers: 2570496 kB
Cached: 1102840 kB
SwapCached: 0 kB
Active: 21484124 kB
Inactive: 3171600 kB
Active(anon): 20635988 kB
Inactive(anon): 1964 kB
Active(file): 848136 kB
Inactive(file): 3169636 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 872 kB
Writeback: 0 kB
AnonPages: 20796084 kB
Mapped: 638320 kB
Shmem: 19080 kB
KReclaimable: 921308 kB
Slab: 38854192 kB
SReclaimable: 921308 kB
SUnreclaim: 37932884 kB
KernelStack: 96368 kB
PageTables: 106724 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 32837056 kB
Committed_AS: 57620592 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 117424 kB
VmallocChunk: 0 kB
Percpu: 46912 kB
HardwareCorrupted: 0 kB
AnonHugePages: 10602496 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 51476352 kB
DirectMap2M: 14583808 kB
DirectMap1G: 3145728 kB
[root@tanqidi ~]# ps -eo pid,ppid,user,cmd,%mem,rss --sort=-%mem | head -40
PID PPID USER CMD %MEM RSS
172023 171990 1001 mysqld --wsrep_start_positi 7.8 5166000
275014 274962 centos /app/elasticsearch/jdk/bin/ 3.0 1987664
134972 134880 root kube-apiserver --advertise- 2.6 1710564
3863 3841 167 ceph-osd --foreground --id 1.8 1187456
16160 16139 root java -server -Xms512m -Xmx1 1.7 1134292
109023 109003 167 ceph-mon --fsid=45492291-00 1.5 1024888
112063 112025 centos /opt/jdk-12/bin/java -Xms1g 1.3 868776
141425 141394 root ks-apiserver --logtostderr= 0.9 649948
267940 267893 root java -server -Xms512m -Xmx1 0.9 644640
501764 501731 root java -server -Xms512m -Xmx2 0.8 581900
134924 1 root ./titanagent -d -b /etc/tit 0.6 410000
102189 102168 root /home/weave/scope --mode=pr 0.5 361308
508992 508950 root kube-controller-manager --a 0.4 287928
134987 134855 root etcd --advertise-client-url 0.3 236280
179495 1 root /usr/bin/dockerd -H fd:// - 0.3 204248
182113 181958 root /usr/local/bin/cephcsi --no 0.2 178208
133496 1 root /usr/bin/kubelet --bootstra 0.2 176096
355222 355159 167 ceph-mds --fsid=45492291-00 0.2 149400
69171 69149 root /home/weave/scope --mode=ap 0.2 149220
183555 183480 root /usr/local/bin/cephcsi --no 0.1 124068
143161 143138 root /app/redis/src/redis-server 0.1 118976
158491 158460 root /home/weave/scope --mode=pr 0.1 105316
509044 509000 root kube-scheduler --authentica 0.1 89160
443163 1 root /usr/bin/monitor-agent -con 0.1 85264
418389 418361 root /usr/local/bin/cephcsi --no 0.1 78404
2654 1 root /bin/sh /opt/monitor/osw/el 0.1 69080
409574 409359 centos haproxy -W -db -f /usr/loca 0.1 67100
409359 409234 centos haproxy -W -db -f /usr/loca 0.1 67088
1562 1 root /usr/bin/containerd 0.1 66308
236769 236741 root ganesha.nfsd -F -L STDERR - 0.0 65384
69705 69682 65532 /velero server --features= 0.0 59688
204312 204300 root calico-node -felix 0.0 59424
161760 161722 root /usr/local/bin/cephcsi --no 0.0 56380
379559 379539 root /bin/thanos rule --data-dir 0.0 53900
433477 1 root /usr/bin/containerd-shim-ru 0.0 50836
433499 1 root /usr/bin/containerd-shim-ru 0.0 49568
36217 1 root /usr/bin/containerd-shim-ru 0.0 49184
433858 1 root /usr/bin/containerd-shim-ru 0.0 48044
159452 159329 root /vminsert-prod --storageNod 0.0 47580
2. Resolution
The machine is a Kubernetes worker node that had been running for a long time without ever being rebooted, so the point at which the abnormal memory growth started could no longer be determined. Rebooting the host to release the memory was considered, but the node runs storage components such as Ceph OSD and MySQL, and a direct reboot carries considerable risk; in the worst case the node might fail to come back up or the storage could be left in a bad state. The decision was therefore to reclaim resources gradually at the application level and give the host a relatively gentle restart, minimizing the impact on storage and cluster stability.
First, the replica counts of the block-storage-related containers such as Ceph OSD and MySQL were scaled to 0 so that they exited safely; then systemctl stop kubelet was executed and the container runtime was restarted with systemctl restart docker, giving the application layer a gentle restart; once Docker was back up, kubelet was started again and the workloads on the node were gradually restored.
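A minimal sketch of that sequence, assuming the storage workloads are managed as a StatefulSet/Deployment; the resource names and namespaces below are illustrative placeholders, not the ones actually used:

# scale the block-storage workloads down and wait for the pods to exit cleanly
kubectl -n storage scale statefulset mysql --replicas=0
kubectl -n rook-ceph scale deployment rook-ceph-osd-0 --replicas=0
# gentle restart of the application layer: stop kubelet, restart the runtime, start kubelet
systemctl stop kubelet
systemctl restart docker
systemctl start kubelet
# once the node is Ready again, scale the storage workloads back up
kubectl -n storage scale statefulset mysql --replicas=1
kubectl -n rook-ceph scale deployment rook-ceph-osd-0 --replicas=1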
After Docker and kubelet were restarted, the unreclaimable SUnreclaim memory was released and the node's available memory recovered to roughly 48 GB. Memory usage on this node will continue to be monitored (a simple periodic check is sketched after the output below); if SUnreclaim starts growing abnormally again, the specific component or kernel module responsible for the leak will be tracked down.
[root@tanqidi appuser]# free -h
total used free shared buff/cache available
Mem: 62G 13G 41G 17M 7.0G 48G
Swap: 0B 0B 0B
[root@tanqidi appuser]# cat /proc/meminfo
MemTotal: 65674112 kB
MemFree: 43915596 kB
MemAvailable: 50777080 kB
Buffers: 1056344 kB
Cached: 5524412 kB
SwapCached: 0 kB
Active: 14991032 kB
Inactive: 4338480 kB
Active(anon): 12506288 kB
Inactive(anon): 3892 kB
Active(file): 2484744 kB
Inactive(file): 4334588 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 520 kB
Writeback: 0 kB
AnonPages: 12068124 kB
Mapped: 1513104 kB
Shmem: 18100 kB
KReclaimable: 768128 kB
Slab: 1893868 kB
SReclaimable: 768128 kB
SUnreclaim: 1125740 kB
KernelStack: 58224 kB
PageTables: 60516 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 32837056 kB
Committed_AS: 36981664 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 75824 kB
VmallocChunk: 0 kB
Percpu: 46912 kB
HardwareCorrupted: 0 kB
AnonHugePages: 5595136 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 51476352 kB
DirectMap2M: 14583808 kB
DirectMap1G: 3145728 kB
[root@tanqidi appuser]#
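For the follow-up monitoring mentioned above, periodically sampling the slab counters is enough to spot a recurrence; a minimal sketch (the log path and interval are arbitrary choices):

# append Slab/SUnreclaim readings to a log every 10 minutes
while true; do
    echo "$(date '+%F %T') $(grep -E '^(Slab|SUnreclaim)' /proc/meminfo | tr -s ' ' | paste -sd' ')" >> /var/log/sunreclaim.log
    sleep 600
done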