Actuator Health & Nacos

Add Actuator

Add the spring-boot-starter-actuator dependency, then call the health endpoint:

GET http://localhost:8080/actuator/health
cache-control: no-cache

---
{"status":"UP"}
  

Show /health details

Add the following configuration so that /health returns detailed information for each component.

management:
  endpoint:
    health:
      show-details: always
  

Default Health Indicators

  • DataSourceHealthIndicator
  • MongoHealthIndicator
  • Neo4jHealthIndicator
  • CassandraHealthIndicator
  • RedisHealthIndicator
  • RabbitHealthIndicator
  • CouchbaseHealthIndicator
  • DiskSpaceHealthIndicator
  • ElasticsearchHealthIndicator
  • InfluxDbHealthIndicator
  • JmsHealthIndicator
  • MailHealthIndicator
  • SolrHealthIndicator

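Custom Health Indicator

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class ServiceAHealthIndicator implements HealthIndicator {
    // Key under which the detail appears in the /health response
    private static final String MESSAGE_KEY = "Service A";

    @Override
    public Health health() {
        if (!isRunningServiceA()) {
            return Health.down().withDetail(MESSAGE_KEY, "Not Available").build();
        }
        return Health.up().withDetail(MESSAGE_KEY, "Available").build();
    }

    private boolean isRunningServiceA() {
        boolean isRunning = true;
        // Logic Skipped

        return isRunning;
    }
}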
  

Custom Health Indicator: checking several dependencies

@Override
public Health health() {
    // check cache is available
    Cache cache = cacheManager.getCache("mycache");
    if (cache == null) {
        LOG.warn("Cache not available");
        return Health.down().withDetail("smoke test", "cache not available").build();
    }

    // check db available
    try (Connection connection = dataSource.getConnection()) {
        // obtaining a connection is the whole check; try-with-resources closes it
    } catch (SQLException e) {
        LOG.warn("DB not available");
        return Health.down().withDetail("smoke test", e.getMessage()).build();
    }

    // check some service url is reachable
    try {
        URL url = new URL(resUrl);
        int port = url.getPort();
        if (port == -1) {
            port = url.getDefaultPort();
        }

        try (Socket socket = new Socket(url.getHost(), port)) {
            // a successful connect is the whole check; try-with-resources closes the socket
        } catch (IOException e) {
            LOG.warn("Failed to open socket to " + resUrl);
            return Health.down().withDetail("smoke test", e.getMessage()).build();
        }
    } catch (MalformedURLException e1) {
        LOG.warn("Malformed URL: " + resUrl);
        return Health.down().withDetail("smoke test", e1.getMessage()).build();
    }

    return Health.up().build();
}
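
One thing to watch in the socket check above: new Socket(host, port) connects with no explicit timeout, so a host that silently drops packets can stall the health endpoint for a long time. A minimal sketch of the same reachability check with a bounded connect timeout follows. The class name ReachabilitySketch, the method isReachable, and the timeout parameter are hypothetical and not part of the original example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: TCP reachability check with a bounded connect timeout.
class ReachabilitySketch {

    // Returns true if a TCP connection to host:port is established within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() with an explicit timeout fails fast instead of blocking indefinitely
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // connection refused, timed out, or host could not be resolved
            return false;
        }
    }
}
```

Inside health(), the inner try block could then become a single call such as isReachable(url.getHost(), port, 1000), returning Health.down() when it yields false.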
  

NacosConfigHealthIndicator

Once the Nacos config center is added to a project, a health check for the nacosConfig component is registered automatically by default.

package com.alibaba.cloud.nacos.endpoint;

import com.alibaba.nacos.api.config.ConfigService;

import org.springframework.boot.actuate.health.AbstractHealthIndicator;
import org.springframework.boot.actuate.health.Health;

/**
 * @author xiaojing
 */
public class NacosConfigHealthIndicator extends AbstractHealthIndicator {

    private final ConfigService configService;

    public NacosConfigHealthIndicator(ConfigService configService) {
        this.configService = configService;
    }

    @Override
    protected void doHealthCheck(Health.Builder builder) throws Exception {
        builder.up();

        String status = configService.getServerStatus();
        builder.status(status);
    }
}
  

When Nacos is reachable:

{
    "status": "UP",
    "components": {
        "nacosConfig": {
            "status": "UP"
        }
    }
}
  

When Nacos is not reachable:

{
    "status": "DOWN",
    "components": {
        "nacosConfig": {
            "status": "DOWN"
        }
    }
}
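
The top-level status in the two responses above is aggregated from the component statuses: a single DOWN component is enough to turn the whole endpoint DOWN. The sketch below is a simplified, dependency-free model of that rule, not Spring Boot's actual implementation. It assumes the default severity order DOWN, OUT_OF_SERVICE, UP, UNKNOWN; the class name HealthAggregationSketch and its method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Simplified model of how an overall status is derived from component statuses.
class HealthAggregationSketch {

    // Assumed severity order, most severe first.
    static final List<String> ORDER = Arrays.asList("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    // Returns the most severe known status among the components, or UNKNOWN if there is none.
    static String aggregate(Map<String, String> components) {
        int worst = ORDER.indexOf("UNKNOWN");
        for (String status : components.values()) {
            int index = ORDER.indexOf(status);
            // statuses outside the known order are ignored in this sketch
            if (index >= 0 && index < worst) {
                worst = index;
            }
        }
        return ORDER.get(worst);
    }
}
```

With components {diskSpace=UP, nacosConfig=DOWN}, aggregate returns DOWN, which mirrors the second response: every other component can be healthy and the endpoint still reports DOWN.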
  

When the Nacos config server goes down, the long-polling connection inside the service fails and triggers retries:

com.alibaba.nacos.client.config.http.ServerHttpAgent [NACOS ConnectException httpPost] currentServerAddr: http://127.0.0.1:8001, err : Connection refused: connect

Once the maximum number of retries is reached, the following error is logged and the health check changes to DOWN:

com.alibaba.nacos.client.config.impl.ClientWorker longPolling error :
java.net.ConnectException: [NACOS HTTP-POST] The maximum number of tolerable server reconnection errors has been reached

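We use the /health endpoint in K8S to verify that a service is alive. I had assumed that a Nacos outage would only affect service containers created after the outage. With /health as the health check, however, the containers' health checks fail as soon as nacosConfig reports DOWN, and the services are restarted automatically. On startup each service cannot connect to the Nacos server, so startup fails as well, and the cycle repeats.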
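
The link between a DOWN status and a restart is the HTTP status code: by default Actuator answers /health with 503 when the aggregated status is DOWN, and a Kubernetes httpGet probe treats only codes from 200 up to but not including 400 as success. The following sketch imitates that probe decision with the JDK HTTP client (Java 11+). It is an illustration of the rule, not how the kubelet is implemented; the class name HealthProbeSketch, the method probe, and the one-second timeouts are assumptions.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical stand-in for an httpGet probe against the health endpoint.
class HealthProbeSketch {

    // Returns true when the endpoint answers with a status code in [200, 400).
    static boolean probe(URI healthUri) {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .connectTimeout(Duration.ofSeconds(1))
                .build();
        HttpRequest request = HttpRequest.newBuilder(healthUri)
                .timeout(Duration.ofSeconds(1))
                .GET()
                .build();
        try {
            int code = client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
            // 503 from a DOWN health endpoint falls outside this range, so the probe fails
            return code >= 200 && code < 400;
        } catch (IOException e) {
            // no answer at all also counts as a failed probe
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Calling probe(URI.create("http://localhost:8080/actuator/health")) while nacosConfig is DOWN would return false under these assumptions, and repeated failures of a liveness probe are what lead Kubernetes to restart the container.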

References

  1. Custom Health Check in Spring Boot Actuator
  2. Custom Health Checks Using Spring Boot Actuator