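Our server room runs a K8S cluster with one master and multiple nodes. After the machines were rebooted, `kubectl get nodes` kept reporting that it could not connect to the apiserver, so I logged in to the master node directly.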
$ kubectl get nodes
The connection to the server 127.0.0.1:8443 was refused - did you specify the right host or port?
It looks like the apiserver failed to start, so I checked its logs:
$ docker ps -a|grep apiserver
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
eb44e1f500f0 registry.aliyuncs.com/kubeadm-ha/pause:3.2 "/pause" 4 seconds ago Exited (0) 2 seconds ago k8s_POD_kube-apiserver-xxx_kube-system_613abe9b942b6bc621a9d098614e6fe8_2
$ tail -f /var/log/messages
Node NotReady: kubelet suddenly giving error Failed to list ****: an error on the server ("") has prevented the request from succeeding
E0305 11:14:38.953687 1000 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://127.0.0.1:8443/api/v1/services?limit=500&resourceVersion=0": dial tcp 127.0.0.1:8443: connect: connection refused
E0305 12:56:46.903299 1031 pod_workers.go:191] Error syncing pod 1916a059cb13894ce9385e67335904e2 ("etcd-xxx_kube-system(1916a059cb13894ce9385e67335904e2)"), skipping: failed to "StartContainer" for "etcd" with CrashLoopBackOff: "back-off 10s restarting failed container=etcd pod=etcd-xxx_kube-system(1916a059cb13894ce9385e67335904e2)"
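The kubelet messages above show etcd in CrashLoopBackOff, so it can help to list every control-plane container that is not running rather than grepping for one component at a time. This is a minimal sketch, assuming Docker is the container runtime (as in the `docker ps` call above) and that containers follow the `k8s_<container>_<pod>_...` naming seen in that output. The function name `exited_control_plane` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print control-plane containers whose status is not "Up".
# Expects tab-separated "name<TAB>status" lines on stdin.
exited_control_plane() {
  awk -F'\t' '
    $1 ~ /^k8s_(etcd|kube-apiserver|kube-controller-manager|kube-scheduler)_/ &&
    $2 !~ /^Up/ { print $1 ": " $2 }
  '
}

# Usage on the master node (requires Docker):
#   docker ps -a --format '{{.Names}}\t{{.Status}}' | exited_control_plane
```

Pause containers (`k8s_POD_...`) are skipped on purpose, since their exit status only mirrors the pod being torn down and restarted.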
I tried a restart first, but the errors persisted afterwards:
Unable to register node with API server: Post "https://127.0.0.1:8443/api/v1/nodes": EOF
kubelet won't restart after reboot - Unable to register node with API server: connection refused
It turned out that the apiserver was unreachable because etcd had failed to start.
Checking the etcd errors:
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2022-03-05 11:39:10.063604 I | etcdmain: etcd Version: 3.4.13
2022-03-05 11:39:10.063656 I | etcdmain: Git SHA: ae9734ed2
2022-03-05 11:39:10.063660 I | etcdmain: Go Version: go1.12.17
2022-03-05 11:39:10.063664 I | etcdmain: Go OS/Arch: linux/amd64
2022-03-05 11:39:10.063668 I | etcdmain: setting maximum number of CPUs to 6, total number of available CPUs is 6
2022-03-05 11:39:10.063716 N | etcdmain: the server is already initialized as member before, starting as etcd member...
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2022-03-05 11:39:10.063752 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file =
2022-03-05 11:39:10.064357 I | embed: name = etcd-xxx
2022-03-05 11:39:10.064368 I | embed: data dir = /var/lib/etcd
2022-03-05 11:39:10.064373 I | embed: member dir = /var/lib/etcd/member
2022-03-05 11:39:10.064383 I | embed: heartbeat = 100ms
2022-03-05 11:39:10.064386 I | embed: election = 1000ms
2022-03-05 11:39:10.064390 I | embed: snapshot count = 10000
2022-03-05 11:39:10.064399 I | embed: advertise client URLs = https://xxx:2379
2022-03-05 11:39:10.064404 I | embed: initial advertise peer URLs = https://xxx:2380
2022-03-05 11:39:10.064409 I | embed: initial cluster =
2022-03-05 11:39:10.250331 C | etcdmain: wal: crc mismatch

[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2022-03-05 11:58:46.691947 I | etcdmain: etcd Version: 3.4.13
2022-03-05 11:58:46.692039 I | etcdmain: Git SHA: ae9734ed2
2022-03-05 11:58:46.692051 I | etcdmain: Go Version: go1.12.17
2022-03-05 11:58:46.692063 I | etcdmain: Go OS/Arch: linux/amd64
2022-03-05 11:58:46.692074 I | etcdmain: setting maximum number of CPUs to 6, total number of available CPUs is 6
2022-03-05 11:58:46.727766 N | etcdmain: the server is already initialized as member before, starting as etcd member...
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2022-03-05 11:58:46.727900 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file =
2022-03-05 11:58:46.729577 I | embed: name = etcd-xxx
2022-03-05 11:58:46.729606 I | embed: data dir = /var/lib/etcd
2022-03-05 11:58:46.729619 I | embed: member dir = /var/lib/etcd/member
2022-03-05 11:58:46.729629 I | embed: heartbeat = 100ms
2022-03-05 11:58:46.729637 I | embed: election = 1000ms
2022-03-05 11:58:46.729647 I | embed: snapshot count = 10000
2022-03-05 11:58:46.729669 I | embed: advertise client URLs = https://xxx:2379
2022-03-05 11:58:46.765737 C | etcdmain: cannot fetch cluster info from peer urls: could not retrieve cluster information from the given URLs
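In both startup attempts, only the last line matters: the entry with level `C` is where etcd gives up (`wal: crc mismatch` in the first, `cannot fetch cluster info from peer urls` in the second). A small filter makes those lines easier to spot in a long log. This is a sketch only, assuming the capnslog line format shown above, where the letter after the timestamp is the log level. The function name `etcd_fatal_lines` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only critical (C) and error (E) entries
# from etcd logs in the "DATE TIME LEVEL | pkg: message" format.
etcd_fatal_lines() {
  grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:.]+ [CE] \| '
}

# Usage on the master node (requires Docker; the first ID listed by
# `docker ps` is assumed to be the most recent etcd container):
#   id=$(docker ps -a -q --filter name=k8s_etcd | head -n 1)
#   docker logs "$id" 2>&1 | etcd_fatal_lines
```

The `2>&1` is there because `docker logs` replays the container's stderr on stderr, which is where these etcd entries normally end up.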