As with any program, you might run into an error installing or running kubeadm. This page lists some common failure scenarios and provides steps that can help you understand and fix the problem.
If your problem is not listed below, please take the following steps:
If you think your problem is a bug with kubeadm:
If you are unsure about how kubeadm works, you can ask on Slack in #kubeadm, or open a question on StackOverflow. Please include relevant tags like #kubernetes and #kubeadm so folks can help you.
ebtables or some similar executable not found during installation
If you see the following warnings while running kubeadm init:
[preflight] WARNING: ebtables not found in system path
[preflight] WARNING: ethtool not found in system path
Then you may be missing ebtables, ethtool or a similar executable on your node. You can install them with the following commands:
- For Ubuntu/Debian users, run apt install ebtables ethtool.
- For CentOS/Fedora users, run yum install ebtables ethtool.
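Before running kubeadm init, it can help to check which of these tools are actually missing. The sketch below is illustrative only: missing_tools is a hypothetical helper name, and it relies on nothing beyond the POSIX command -v builtin.

```shell
# Print the name of every tool in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of kubeadm.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the two executables named in the preflight warnings.
missing_tools ebtables ethtool
```

If the command prints nothing, both executables were found; any name it prints still needs to be installed.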
If you notice that kubeadm init hangs after printing out the following line:
[apiclient] Created API client, waiting for the control plane to become ready
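This may be caused by a number of problems. The most common are:
- The default cgroup driver configuration for the kubelet differs from that used by Docker. Check the system log file (for example, /var/log/messages) or examine the output from journalctl -u kubelet. If you see something like the following:
error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
There are two common ways to fix the cgroup driver problem: make Docker use the same cgroup driver as the kubelet, or change the kubelet configuration to match the Docker cgroup driver.
- Control plane Docker containers are crashlooping or hanging. You can check this by running docker ps and investigating each container by running docker logs.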
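For the cgroup driver mismatch described above, the kubelet log line already names both drivers. The following is a minimal sketch, assuming the error text has the same shape as the example shown; cgroup_mismatch is a hypothetical helper name.

```shell
# Read kubelet log lines on stdin and print the two cgroup drivers named in
# the mismatch error, if any. cgroup_mismatch is a hypothetical helper.
cgroup_mismatch() {
  sed -n 's/.*kubelet cgroup driver: "\([^"]*\)" is different from docker cgroup driver: "\([^"]*\)".*/kubelet=\1 docker=\2/p'
}

# Usage on a node (assumes systemd and a kubelet unit named "kubelet"):
#   journalctl -u kubelet --no-pager | cgroup_mismatch | tail -n 1
# Docker's own view of its driver can be cross-checked with:
#   docker info --format '{{.CgroupDriver}}'
```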
The following could happen if Docker halts and does not remove any Kubernetes-managed containers:
sudo kubeadm reset
[preflight] Running pre-flight checks
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Removing kubernetes-managed containers
(block)
A possible solution is to restart the Docker service and then re-run kubeadm reset:
sudo systemctl restart docker.service
sudo kubeadm reset
Inspecting the logs for docker may also be useful:
journalctl -u docker
Pods in RunContainerError, CrashLoopBackOff or Error state
Right after kubeadm init there should not be any pods in these states.
- If there are pods in one of these states right after kubeadm init, please open an issue in the kubeadm repo. coredns (or kube-dns) should be in the Pending state until you have deployed the network solution.
- If you see Pods in the RunContainerError, CrashLoopBackOff or Error state after deploying the network solution and nothing happens to coredns (or kube-dns), it’s very likely that the Pod Network solution that you installed is somehow broken. You might have to grant it more RBAC privileges or use a newer version. Please file an issue in the Pod Network provider’s issue tracker and get the issue triaged there.
- If dockerd is booted by systemd with the MountFlags=slave option, remove that option and restart docker. You can see the MountFlags in /usr/lib/systemd/system/docker.service. MountFlags can interfere with volumes mounted by Kubernetes, and put the Pods in CrashLoopBackOff state. The error happens when Kubernetes does not find /var/run/secrets/kubernetes.io/serviceaccount files.
coredns (or kube-dns) is stuck in the Pending state
This is expected and part of the design. kubeadm is network provider-agnostic, so the admin should install the pod network solution of choice. You have to install a Pod Network before CoreDNS can be fully deployed. Hence the Pending state before the network is set up.
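To spot Pods in the RunContainerError, CrashLoopBackOff or Error states discussed earlier, while ignoring coredns Pods that are merely Pending, the output of kubectl get pods can be filtered. This is a rough sketch: failing_pods is a hypothetical helper, and it assumes the default column layout of kubectl get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Read `kubectl get pods --all-namespaces` output on stdin and print
# "<namespace>/<name> <status>" for Pods in one of the three failure states.
# failing_pods is a hypothetical helper, not a kubectl feature.
failing_pods() {
  awk 'NR > 1 && $4 ~ /^(RunContainerError|CrashLoopBackOff|Error)$/ { print $1 "/" $2 " " $4 }'
}

# Usage against a live cluster (assumes kubectl is already configured):
#   kubectl get pods --all-namespaces | failing_pods
```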
HostPort services do not work
The HostPort and HostIP functionality is available depending on your Pod Network provider. Please contact the author of the Pod Network solution to find out whether HostPort and HostIP functionality are available.
Calico, Canal, and Flannel CNI providers are verified to support HostPort.
For more information, see the CNI portmap documentation.
If your network provider does not support the portmap CNI plugin, you may need to use the NodePort feature of services or use HostNetwork=true.
Many network add-ons do not yet enable hairpin mode which allows pods to access themselves via their Service IP. This is an issue related to CNI. Please contact the network add-on provider to get the latest status of their support for hairpin mode.
If you are using VirtualBox (directly or via Vagrant), you will need to ensure that hostname -i returns a routable IP address. By default the first interface is connected to a non-routable host-only network. A workaround is to modify /etc/hosts; see this Vagrantfile for an example.
The following error indicates a possible certificate mismatch.
# kubectl get pods
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
- Verify that the $HOME/.kube/config file contains a valid certificate, and regenerate a certificate if necessary. The certificates in a kubeconfig file are base64 encoded. The base64 -d command can be used to decode the certificate and openssl x509 -text -noout can be used for viewing the certificate information.
- Another workaround is to overwrite the existing kubeconfig for the “admin” user:
mv $HOME/.kube $HOME/.kube.bak
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
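The certificate check described above (decode with base64 -d, then inspect with openssl x509) can be sketched as follows. The helper name kubeconfig_cert is hypothetical, and the sketch assumes the certificate is embedded inline in the kubeconfig under a key such as client-certificate-data rather than referenced by file path.

```shell
# Print the decoded value of a base64 field from a kubeconfig file.
# Usage: kubeconfig_cert <field-name> <kubeconfig-path>
# kubeconfig_cert is a hypothetical helper; it returns the first match only.
kubeconfig_cert() {
  field=$1
  file=$2
  awk -v key="$field:" '$1 == key { print $2; exit }' "$file" | base64 -d
}

# Inspect the admin client certificate (assumes openssl is installed):
#   kubeconfig_cert client-certificate-data "$HOME/.kube/config" | openssl x509 -text -noout
```

The same helper can be pointed at certificate-authority-data to inspect the cluster CA that the client is expected to trust.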
The following error might indicate that something was wrong in the pod network:
Error from server (NotFound): the server could not find the requested resource
Vagrant typically assigns two interfaces to all VMs. The first, for which all hosts are assigned the IP address 10.0.2.15, is for external traffic that gets NATed.
This may lead to problems with flannel, which defaults to the first interface on a host. This leads to all hosts thinking they have the same public IP address. To prevent this, pass the --iface eth1 flag to flannel so that the second interface is chosen.
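To confirm which address each interface carries, and whether the first one holds the shared NAT address 10.0.2.15, the one-line output of ip can be condensed. This is only a sketch: iface_addrs is a hypothetical helper, and the interface names on your VMs may differ from eth0 and eth1.

```shell
# Read `ip -o -4 addr show` output on stdin and print
# "<interface> <address/prefix>" for every IPv4 address.
# iface_addrs is a hypothetical helper name.
iface_addrs() {
  awk '{ print $2, $4 }'
}

# Usage on a VM:
#   ip -o -4 addr show | iface_addrs
```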
In some situations kubectl logs and kubectl run commands may return with the following errors in an otherwise functional cluster:
Error from server: Get https://10.19.0.41:10250/containerLogs/default/mysql-ddc65b868-glc5m/mysql: dial tcp 10.19.0.41:10250: getsockopt: no route to host
Digital Ocean assigns a public IP to eth0 as well as a private one to be used internally as anchor for their floating IP feature, yet kubelet will pick the latter as the node’s InternalIP instead of the public one.
Use ip addr show to check for this scenario instead of ifconfig because ifconfig will not display the offending alias IP address. Alternatively, an API endpoint specific to Digital Ocean allows you to query for the anchor IP from the droplet:
curl http://169.254.169.254/metadata/v1/interfaces/public/0/anchor_ipv4/address
The workaround is to tell kubelet which IP to use using --node-ip. When using Digital Ocean, it can be the public one (assigned to eth0) or the private one (assigned to eth1) should you want to use the optional private network. The KubeletExtraArgs section of the kubeadm NodeRegistrationOptions structure can be used for this.
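As an illustration of the KubeletExtraArgs approach, the sketch below prints a kubeadm configuration that sets node-ip. It assumes the kubeadm.k8s.io/v1beta3 schema, in which kubeletExtraArgs is a map of flag names to values; other kubeadm API versions may use a different layout. The helper name node_ip_config and the address 203.0.113.10 are placeholders.

```shell
# Print a kubeadm InitConfiguration that passes --node-ip to the kubelet.
# Assumption: kubeadm.k8s.io/v1beta3, where kubeletExtraArgs is a map.
# node_ip_config is a hypothetical helper; $1 is the IP the kubelet should use.
node_ip_config() {
  cat <<EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    node-ip: "$1"
EOF
}

# 203.0.113.10 is a documentation-only placeholder address.
node_ip_config 203.0.113.10
```

The generated YAML could be saved to a file and passed to kubeadm init with --config.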
Then restart kubelet:
systemctl daemon-reload
systemctl restart kubelet
On nodes where the hostname for the kubelet is overridden using the --hostname-override option, kube-proxy will default to treating 127.0.0.1 as the node IP, which results in rejecting connections for Services configured for externalTrafficPolicy=Local. This situation can be verified by checking the output of kubectl -n kube-system logs <kube-proxy pod name>:
W0507 22:33:10.372369 1 server.go:586] Failed to retrieve node info: nodes "ip-10-0-23-78" not found
W0507 22:33:10.372474 1 proxier.go:463] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP
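The log check above can be scripted so that it is easy to repeat for each kube-proxy Pod. This is a minimal sketch that assumes the warning text matches the example shown; has_invalid_node_ip is a hypothetical helper name.

```shell
# Succeed (exit 0) if the kube-proxy log on stdin contains the
# "invalid nodeIP" warning. has_invalid_node_ip is a hypothetical helper.
has_invalid_node_ip() {
  grep -q 'invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP'
}

# Usage (replace the placeholder with a real Pod name):
#   kubectl -n kube-system logs <kube-proxy pod name> | has_invalid_node_ip && echo affected
```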
A workaround for this is to modify the kube-proxy DaemonSet in the following way:
kubectl -n kube-system patch --type json daemonset kube-proxy -p "$(cat <<'EOF'
[
    {
        "op": "add",
        "path": "/spec/template/spec/containers/0/env",
        "value": [
            {
                "name": "NODE_NAME",
                "valueFrom": {
                    "fieldRef": {
                        "apiVersion": "v1",
                        "fieldPath": "spec.nodeName"
                    }
                }
            }
        ]
    },
    {
        "op": "add",
        "path": "/spec/template/spec/containers/0/command/-",
        "value": "--hostname-override=${NODE_NAME}"
    }
]
EOF
)"
coredns pods have CrashLoopBackOff or Error state
If you have nodes that are running SELinux with an older version of Docker you might experience a scenario where the coredns pods are not starting. To solve that you can try one of the following options:
- Upgrade to a newer version of Docker.
- Disable SELinux.
- Modify the coredns deployment to set allowPrivilegeEscalation to true:
kubectl -n kube-system get deployment coredns -o yaml | \
sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' | \
kubectl apply -f -
Another cause for CoreDNS to have CrashLoopBackOff is when a CoreDNS Pod deployed in Kubernetes detects a loop. A number of workarounds are available to avoid Kubernetes trying to restart the CoreDNS Pod every time CoreDNS detects the loop and exits.
Warning: Disabling SELinux or setting allowPrivilegeEscalation to true can compromise the security of your cluster.