Configure Runtime Hooks

Configure Runtime Hooks

See Tetragon Runtime Hooks, for an introduction to the topic.

Install Tetragon with Runtime Hooks

We use minikube as the example platform because it supports both cri-o and containerd, but the same steps can be applied to other platforms.

Setup Helm

helm repo add cilium https://helm.cilium.io
helm repo update

Setup cluster

minikube start --driver=kvm2 --container-runtime=cri-o
minikube start --driver=kvm2 --container-runtime=containerd

Tetragon Runtime Hooks use NRI. NRI is enabled by default starting from containerd version 2.0. For version 1.7, however, it needs to be enabled in the configuration.

This requires a section such as:

[plugins."io.containerd.nri.v1.nri"]
  disable = false
  disable_connections = false
  plugin_config_path = "/etc/nri/conf.d"
  plugin_path = "/opt/nri/plugins"
  plugin_registration_timeout = "5s"
  plugin_request_timeout = "2s"
  socket_path = "/var/run/nri/nri.sock"

To be present in containerd’s configuration (e.g., /etc/containerd/config.toml).

You can use the tetragon-oci-hook-setup to patch the configuration file:

minikube ssh cat /etc/containerd/config.toml > /tmp/old-config.toml
./contrib/tetragon-rthooks/tetragon-oci-hook-setup patch-containerd-conf enable-nri --config-file=/tmp/old-config.toml --output=/tmp/new-config.toml
diff -u /tmp/old-config.toml /tmp/new-config.toml

Output should be something like:

--- /tmp/old-config.toml        2024-07-02 11:51:23.893382357 +0200
+++ /tmp/new-config.toml        2024-07-02 11:51:52.841533035 +0200
@@ -67,3 +67,11 @@
     mutation_threshold = 100
     schedule_delay = "0s"
     startup_delay = "100ms"
+  [plugins."io.containerd.nri.v1.nri"]
+    disable = false
+    disable_connections = false
+    plugin_config_path = "/etc/nri/conf.d"
+    plugin_path = "/opt/nri/plugins"
+    plugin_registration_timeout = "5s"
+    plugin_request_timeout = "2s"
+    socket_path = "/var/run/nri/nri.sock"

Install the new configuration file and restart containerd

minikube cp /tmp/new-config.toml /etc/containerd/config.toml
minikube ssh sudo systemctl restart containerd

Install Tetragon

helm install \
   --namespace kube-system \
   --set rthooks.enabled=true \
   --set rthooks.interface=oci-hooks \
   tetragon ./install/kubernetes/tetragon
helm install \
   --namespace kube-system \
   --set rthooks.enabled=true \
   --set rthooks.interface=nri-hook \
   tetragon ./install/kubernetes/tetragon
kubecl -n kube-system get pods | grep tetragon

With output similar to:

tetragon-hpjwq                       2/2     Running   0          2m42s
tetragon-operator-664ddc8957-9lmd2   1/1     Running   0          2m42s
tetragon-rthooks-m24xr               1/1     Running   0          2m42s

Test Runtime Hooks

Start a pod:

kubectl run test --image=debian  --rm -it -- /bin/bash

Check logs:

minikube ssh 'tail -1 /opt/tetragon/tetragon-oci-hook.log'

Output:

{"time":"2024-07-01T10:57:21.435689144Z","level":"INFO","msg":"hook request to agent succeeded","hook":"create-container","start-time":"2024-07-01T10:57:21.433755984Z","req-cgroups":"/kubepods/besteffort/podd4e74de2-0db8-4143-ae55-695b2489c727/crio-828977b42e3149b502b31708778d0c057efbce038af80d0882ed3e0cb0ff8796","req-rootdir":"/run/containers/storage/overlay-containers/828977b42e3149b502b31708778d0c057efbce038af80d0882ed3e0cb0ff8796/userdata","req-containerName":"test"}

Configuring Runtime Hooks installation

Installation directory (installDir)

For tetragon runtime hooks to work, a binary (tetragon-oci-hook) needs to be installed on the host. Installation happens by the tetragon-rthooks daemonset and the binary is installed in /opt/tetragon by default.

In some systems, however, the /opt directory is mounted read-only. This will result in errors such as:

Warning  FailedMount  8s (x5 over 15s)  kubelet            MountVolume.SetUp failed for volume "oci-hook-install-path" : mkdir /opt/tetragon: read-only file system                                                                                                                                     (6 results) [48/6775]

You can use the rthooks.installDir helm variable to select a different location. For example:

--set rthooks.installDir=/run/tetragon

Failure check (failAllowNamespaces)

By default, tetragon-oci-hook logs information to /opt/tetragon/tetragon-oci-hook.log. Inspecting this file we get the following messages.

{"time":"2024-03-05T15:18:52.669044463Z","level":"WARN","msg":"hook request to the agent failed","hook":"create-container","start-time":"2024-03-05T15:18:42.667916779Z","req-cgroups":"/kubepods/besteffort/pod43ec7f32-3c9f-429f-a01c-fbaafff9f8e1/crio-1d18fd58f0879f6152a1c421f8f1e0987845394ee17001a16bee2df441c112f3","req-rootdir":"/run/containers/storage/overlay-containers/1d18fd58f0879f6152a1c421f8f1e0987845394ee17001a16bee2df441c112f3/userdata","err":"connecting to agent (context deadline exceeded) failed: unix:///var/run/cilium/tetragon/tetragon.sock"}
{"time":"2024-03-05T15:18:52.66912411Z","level":"INFO","msg":"failCheck determined that we should not fail this container, even if there was an error","hook":"create-container","start-time":"2024-03-05T15:18:42.667916779Z"}
{"time":"2024-03-05T15:18:53.01093915Z","level":"WARN","msg":"hook request to the agent failed","hook":"create-container","start-time":"2024-03-05T15:18:43.01005032Z","req-cgroups":"/kubepods/burstable/pod60f971e6-ac38-4aa0-b2d3-549333b2c803/crio-c0bf4e38bfa4ed5c58dd314d505f8b6a0f513d2f2de4dc4aa86a55c7c3e963ab","req-rootdir":"/run/containers/storage/overlay-containers/c0bf4e38bfa4ed5c58dd314d505f8b6a0f513d2f2de4dc4aa86a55c7c3e963ab/userdata","err":"connecting to agent (context deadline exceeded) failed: unix:///var/run/cilium/tetragon/tetragon.sock"}
{"time":"2024-03-05T15:18:53.010999098Z","level":"INFO","msg":"failCheck determined that we should not fail this container, even if there was an error","hook":"create-container","start-time":"2024-03-05T15:18:43.01005032Z"}
{"time":"2024-03-05T15:19:04.034580703Z","level":"WARN","msg":"hook request to the agent failed","hook":"create-container","start-time":"2024-03-05T15:18:54.033449685Z","req-cgroups":"/kubepods/besteffort/pod43ec7f32-3c9f-429f-a01c-fbaafff9f8e1/crio-d95e61f118557afdf3713362b9034231fee9bd7033fc8e7cc17d1efccac6f54f","req-rootdir":"/run/containers/storage/overlay-containers/d95e61f118557afdf3713362b9034231fee9bd7033fc8e7cc17d1efccac6f54f/userdata","err":"connecting to agent (context deadline exceeded) failed: unix:///var/run/cilium/tetragon/tetragon.sock"}
{"time":"2024-03-05T15:19:04.03463995Z","level":"INFO","msg":"failCheck determined that we should not fail this container, even if there was an error","hook":"create-container","start-time":"2024-03-05T15:18:54.033449685Z"}

To understand these messages, consider what tetragon-oci-hook should do if it cannot contact the Tetragon agent.

You may want to stop certain workloads from running. For other workloads (for example, the tetragon pod itself) you probably want to do the opposite and let the them start. To this end, tetragon-oci-hook checks the container annotations, and by default does not fail a container if it belongs in the same namespace as Tetragon. The previous messages concern the tetragon containers (tetragon-operator and tetragon) and they indicate that the choice was made not to fail this container from starting.

Furthermore, users may specify additional namespaces where the container will not fail if the tetragon agent cannot be contacted via the rthooks.failAllowNamespaces option.

For example:

rthooks:
  enabled: true
  failAllowNamespaces: namespace1,namespace2
Last modified November 8, 2024: [docs] Fix typo (de1803b59)