If you use ARCCONF to monitor your Adaptec RAID controller, you may hit a bug where the Adaptec CIM provider does not fully clean up its temporary files and fills up the root RAM disk of the ESXi server.
- Controller: Adaptec 51245
- Controller firmware: Build 18948 (latest firmware for Adaptec 51245 as of 02-Oct-2013)
- VMware ESXi: 5.x
- Driver version: aacraid-esxi5.0-220.127.116.11100 (latest driver for Adaptec 51245 as of 02-Oct-2013)
- CIM provider version: v7.31.18856 (latest CIM provider for Adaptec 51245 as of 02-Oct-2013)
- ARCCONF Client version: Version 7.31 (B18856)
This is the configuration on which the bug was confirmed. It may also occur with other versions of the software or hardware. The primary suspect is the CIM provider.
You are using ARCCONF to monitor an Adaptec controller on your ESXi server and you start to see one or more of the following symptoms in ESXi:
- The VMRC console has disconnected…attempting to reconnect
- unable to connect to the MKS: a general error occurred: internal error
- ESXi logs contain "RAM disk is full" errors.
- The vdf -h command, run over SSH, shows the root RAM disk as 99%-100% used:
Ramdisk     Size  Used  Available  Use%  Mounted on
root         32M   32M         0M  100%  --
etc          28M  280K        27M    0%  --
tmp         192M  112K       191M    0%  --
hostdstats  249M    4M       244M    2%  --
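The check above can be scripted. The following is a minimal sketch, assuming vdf -h prints one ramdisk per line with the columns name, size, used, available and use%. The function names root_ramdisk_use and root_ramdisk_is_full and the default threshold of 90 are illustrative and not part of any VMware or Adaptec tool. It also assumes awk is available in the ESXi shell.

```shell
# Hypothetical helper: print the Use% value (without the % sign) of the
# "root" ramdisk, reading vdf -h style output from stdin.
# Assumes the columns are: name, size, used, available, use%.
root_ramdisk_use() {
    awk '$1 == "root" { sub(/%/, "", $5); print $5; exit }'
}

# Hypothetical helper: succeed (exit 0) when the root ramdisk usage read
# from stdin is at or above the threshold in $1 (default: 90).
root_ramdisk_is_full() {
    use=$(root_ramdisk_use)
    [ -n "$use" ] && [ "$use" -ge "${1:-90}" ]
}
```

On the host this could be used as: vdf -h | root_ramdisk_is_full 90 && echo "root ramdisk is nearly full"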
Each ARCCONF GETCONFIG query writes to a log file, /var/log/arcconf.log, on the ESXi server. This file is only ever appended to and is never cleaned up by the driver.
The default size of the root RAM disk is 32 MB. How fast it fills up depends on the monitoring interval and on the actual configuration of the controller. With our previous configuration, it took 60 days to fill the disk. Once our monitoring became more complex and used shorter intervals, it took 7 days. Keep in mind that the log is deleted when the server restarts, so depending on circumstances you may never notice the bug.
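To get a rough idea of how much time you have, the arithmetic can be sketched as below. This assumes the log grows at a constant rate and that the whole 32 MB is available to it, which overestimates the real margin because other files also live on the root RAM disk. The function name days_until_full and the growth figures are illustrative only.

```shell
# Hypothetical helper: estimate how many whole days remain before a log
# growing at a constant rate uses up the given space.
# $1 = available space in KB, $2 = log growth in KB per day
days_until_full() {
    if [ "$2" -le 0 ]; then
        echo "growth rate must be positive" >&2
        return 1
    fi
    echo $(( $1 / $2 ))
}
```

For example, days_until_full 32768 546 gives 60 and days_until_full 32768 4681 gives 7, so the two cases described above correspond to roughly 0.5 MB and 4.6 MB of log growth per day.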
There is no known official fix as of 02-Oct-2013.
The workaround is to empty arcconf.log manually or with a cron job. We use a cron job that empties arcconf.log every two minutes.
*/2 * * * * /bin/echo > /var/log/arcconf.log
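A possible variation, not the workaround we actually run, is to empty the log only once it grows past a size limit, so that recent entries stay available for troubleshooting. The sketch below assumes a POSIX shell with wc. The function name truncate_if_larger and the limit are hypothetical.

```shell
# Hypothetical helper: empty a log file only if it is larger than a limit.
# $1 = path to the log file, $2 = size limit in bytes
truncate_if_larger() {
    # Nothing to do if the file does not exist.
    [ -f "$1" ] || return 0
    # Arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$1") ))
    if [ "$size" -gt "$2" ]; then
        : > "$1"
    fi
}
```

Saved in a script that cron calls, this could be invoked as truncate_if_larger /var/log/arcconf.log 1048576 to keep the log under about 1 MB.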
For the cron job to persist across reboots, add the following lines to /etc/rc.local.d/local.sh:
/bin/kill $(cat /var/run/crond.pid)
/bin/echo "*/2 * * * * /bin/echo > /var/log/arcconf.log" >> /var/spool/cron/crontabs/root
/usr/lib/vmware/busybox/bin/busybox crond
The first command kills crond, the second appends our echo entry to root's crontab, and the third restarts crond.
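Because the entry is appended with >>, running local.sh more than once without a reboot in between would add it again. A guard against that can be sketched as follows, assuming the shell's grep supports the -q, -x and -F options. The function name add_cron_entry is hypothetical.

```shell
# Hypothetical helper: append a line to a crontab file unless an identical
# line is already there.
# $1 = crontab file, $2 = full cron entry
add_cron_entry() {
    # -x matches whole lines only, -F treats the entry as a fixed string.
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

In local.sh it would replace the plain echo line: add_cron_entry /var/spool/cron/crontabs/root "*/2 * * * * /bin/echo > /var/log/arcconf.log"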
UPD 18-10-2013: fixed typo in the crond schedule.
UPD 7-07-2014: fixed another typo in the crond schedule description.