vCenter Server Certificate Status Alarm
Earlier this week I began receiving alarms from my vCenter regarding a Certificate Status Alarm. I had not run into this particular error before and it took me a bit of investigation to get it solved.
An unusual certificate description
The alert kept repeating on an hourly interval and I didn’t have a clue what this certificate was for:
This email is to notify you that an alarm has been triggered in your vCenter:
[Critical] Alarm alarm.CertificateStatusAlarm on Folder Datacenters because Certificate 'CN=Synology,OU=Synology,O=Synology,L=Unknown,ST=Unknown,C=Unknown' from 'SMS' expires on 2022-08-01 08:35:11.000.
Alarm name: alarm.CertificateStatusAlarm
Description: alarm.CertificateStatusAlarm
Target: Folder Datacenters
Status: Critical (previous status: Unset)
Triggered time: 09/15/2022 08:45:42 PM
I started by looking at Certificate Management in the vSphere Web Client (top menu -> Administration -> Certificate Management).
I had recently replaced all of the vCenter self-signed certificates with new versions from my internal certificate authority. As I expected, all of these certificates were valid, with expiration dates well into the future.
While I did use a Synology NAS for storage, I was pretty sure that neither adding an NFS mount nor adding an iSCSI connection would install a certificate.
Finding the root cause
After a bit of research, I found KB 2015600, which describes procedures for identifying expired certificates in vCenter. The article includes a script that lists all of the certificates in vCenter, and it helped me find the root cause of my issue.
I connected to the VCSA (SSH as root) and executed the following script. This gave me a detailed list of all certificates stored on the system.
for store in $(/usr/lib/vmware-vmafd/bin/vecs-cli store list | grep -v TRUSTED_ROOT_CRLS); do echo "[*] Store :" $store; /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store $store --text | grep -ie "Alias" -ie "Not After";done;
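With many stores and entries, the expired ones are easy to miss in that listing. The following is a minimal sketch, not part of the KB article, of a filter that reads the "Alias" and "Not After" lines and prints only entries whose date is in the past. The function name flag_expired is hypothetical, and the sketch assumes GNU date (for the -d option) and the line layout shown in the comments.

```shell
#!/bin/bash
# Sketch: read "Alias" / "Not After" lines on stdin and print expired entries.
# Assumptions: GNU date is available, alias lines look like "Alias : <name>",
# and dates look like "Not After : Aug  1 08:35:11 2022 GMT".
flag_expired() {
    local now entry="" line when ts
    now=$(date +%s)
    while IFS= read -r line; do
        case "$line" in
            *"Not After"*)
                # Keep the text after the first colon and trim leading whitespace
                when=$(printf '%s' "${line#*:}" | sed 's/^[[:space:]]*//')
                # Skip lines whose date cannot be parsed
                ts=$(date -d "$when" +%s 2>/dev/null) || continue
                if [ "$ts" -lt "$now" ]; then
                    echo "EXPIRED: $entry ($when)"
                fi
                ;;
            *Alias*)
                # Remember the alias so it can be reported with its date
                entry=$(printf '%s' "${line#*:}" | sed 's/^[[:space:]]*//')
                ;;
        esac
    done
}

# Possible usage on the VCSA, for a single store:
#   /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store sms --text \
#       | grep -ie "Alias" -ie "Not After" | flag_expired
```

Because the filter only reads text on stdin, it can be tried against saved output first, without touching the certificate stores.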
The output showed two expired certificates located in certificate stores that are not visible in the main Certificate Management panel:
- The first culprit (HTTPS alias) was from a Synology Storage Console for VMware that I had tested out earlier.
- The second expired certificate was in a Backup Certificate Store which the certificate authority service in vCenter uses. This was most likely the original self-signed certificate before I replaced it with new CA-signed certificates.
Removing the expired and unused certificates
I felt comfortable removing these since I was 100% certain that neither certificate was in use and that neither was used by core vCenter services.
Before proceeding further
- Make sure you understand the process and risks involved with these tasks. Expired certificates can cause many different issues, and KB 2015600 provides a lot of detail on how to work through them.
- The VECS-CLI command reference helped me to find the commands I used to safely remove the expired and unused certificates.
- Make sure you take the appropriate steps to provide a rollback in case something doesn’t work as intended. Specifically, take a snapshot of the VCSA (the right way) and configure your vCenter Server backup.
- If you have doubts, reach out to VMware Global Support.
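In addition to the snapshot, it can be handy to keep a copy of a certificate before deleting it. The sketch below exports an entry to a file with vecs-cli entry getcert and its --output option. The helper name backup_entry, the VECS_CLI variable, and the /root/vecs-backup directory are assumptions made for illustration, not something taken from the KB article.

```shell
#!/bin/bash
# Sketch: export a VECS entry's certificate to a file before removing it.
# VECS_CLI can be overridden; the default is the path used on the VCSA.
VECS_CLI=${VECS_CLI:-/usr/lib/vmware-vmafd/bin/vecs-cli}

backup_entry() {
    # $1 = store name, $2 = alias, $3 = target directory (hypothetical default)
    local store=$1 alias=$2 dir=${3:-/root/vecs-backup} name
    mkdir -p "$dir" || return 1
    # Aliases can be URLs, so replace unsafe characters to get a file name
    name=$(printf '%s' "$alias" | tr -c 'A-Za-z0-9._-' '_')
    "$VECS_CLI" entry getcert --store "$store" --alias "$alias" \
        --output "$dir/$store-$name.crt" || return 1
    echo "$dir/$store-$name.crt"
}

# Possible usage on the VCSA:
#   backup_entry sms "https://10.129.4.196:8443/vasa/services/vasaService"
```

This only saves the certificate itself. It does not replace the snapshot or the file-based backup, which remain the real rollback path.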
While connected to the VCSA as root, I executed the following command to list the certificates in the “sms” store. This helped confirm that I was looking in the right place and had the proper names and commands:
/usr/lib/vmware-vmafd/bin/vecs-cli entry list --store sms
I then executed the following command to remove the expired certificate:
/usr/lib/vmware-vmafd/bin/vecs-cli entry delete --store sms --alias https://10.129.4.196:8443/vasa/services/vasaService
I re-ran the entry list command from earlier and sure enough the expired and unused certificate had been removed.
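That check can also be scripted. The following is a rough sketch that lists only the alias names in a store and tests whether a given alias is still present. The helper names list_aliases and entry_exists and the VECS_CLI variable are hypothetical, and the sketch assumes that alias lines in the listing start with "Alias" followed by a colon.

```shell
#!/bin/bash
# Sketch: confirm whether an alias is still present in a VECS store.
VECS_CLI=${VECS_CLI:-/usr/lib/vmware-vmafd/bin/vecs-cli}

# Print only the alias names in store $1, one per line.
# Assumes alias lines look like "Alias : <name>".
list_aliases() {
    "$VECS_CLI" entry list --store "$1" |
        sed -n 's/^Alias[[:space:]]*:[[:space:]]*//p'
}

# Exit 0 if the exact alias $2 is still present in store $1.
entry_exists() {
    list_aliases "$1" | grep -qxF -- "$2"
}

# Possible usage on the VCSA:
#   if entry_exists sms "https://10.129.4.196:8443/vasa/services/vasaService"; then
#       echo "entry is still present"
#   else
#       echo "entry has been removed"
#   fi
```

Matching the whole alias rather than a substring avoids a false positive when one alias is a prefix of another.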
I used the same entry list and entry delete commands to remove the expired certificate in the Backup store. The KB article mentioned that expired certificates there could cause alarms as well.
In vCenter I reset the alert for this issue to green and the alarms ceased.
To be safe, I restarted vCenter to ensure everything was in working order and then removed the snapshot.