On 2018-01-13 08:12, Daniel Imberman <danielryan2...@gmail.com> wrote:
> @jordan can you turn delete mode off and post the kubectl describe results
> for the workers?

Already had delete mode turned off. This was a really useful command. I can see the basic logs in the k8s dashboard:

+ airflow run jordan_dag_3 run_this_1 2018-01-15T10:00:00 --local -sd /root/airflow/dags/jordan3.py
[2018-01-15 19:40:52,978] {__init__.py:46} INFO - Using executor LocalExecutor
[2018-01-15 19:40:53,012] {models.py:187} INFO - Filling up the DagBag from /root/airflow/dags/jordan3.py
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 27, in <module>
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 350, in run
    dag = get_dag(args)
  File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 128, in get_dag
    'parse.'.format(args.dag_id))
airflow.exceptions.AirflowException: dag_id could not be found: jordan_dag_3. Either the dag did not exist or it failed to parse.
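As a sanity check on the error itself (a sketch with made-up scratch paths, not our real layout): `airflow run ... -sd <path>` only fills the DagBag from <path>, so if git-sync checked the repo out somewhere else, you get exactly this "dag_id could not be found" failure:

```shell
# Hypothetical reconstruction: the clone lands in one place, -sd points at
# another, so the DagBag has nothing to parse.
scratch=$(mktemp -d)
mkdir -p "$scratch/tmp/dags" "$scratch/root/airflow/dags"
touch "$scratch/tmp/dags/jordan3.py"             # where the clone actually lands

sd_path="$scratch/root/airflow/dags/jordan3.py"  # what -sd points at
if [ -f "$sd_path" ]; then
  echo "DagBag would find the file"
else
  echo "nothing at $sd_path -> dag_id could not be found"
fi
```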
I know the DAG is there in the scheduler and the webserver. I have reason to believe the git-sync init container in the worker isn't checking out the files in a way that the worker can use. Here's the info you requested:

Name:         jordandag3runthis1-8df809f80c874d6ca50acb0d0480307c
Namespace:    default
Node:         minikube/192.168.99.100
Start Time:   Mon, 15 Jan 2018 11:59:57 -0800
Labels:       airflow-slave=
              dag_id=jordan_dag_3
              execution_date=2018-01-15T19_59_54.838835
              task_id=run_this_1
Annotations:  pod.alpha.kubernetes.io/init-container-statuses=[{"name":"git-sync-clone","state":{"terminated":{"exitCode":0,"reason":"Completed","startedAt":"2018-01-15T19:59:58Z","finishedAt":"2018-01-15T19:59:59Z...
              pod.alpha.kubernetes.io/init-containers=[{"name":"git-sync-clone","image":"gcr.io/google-containers/git-sync-amd64:v2.0.5","env":[{"name":"GIT_SYNC_REPO","value":"<our git repo>...
              pod.beta.kubernetes.io/init-container-statuses=[{"name":"git-sync-clone","state":{"terminated":{"exitCode":0,"reason":"Completed","startedAt":"2018-01-15T19:59:58Z","finishedAt":"2018-01-15T19:59:59Z"...
              pod.beta.kubernetes.io/init-containers=[{"name":"git-sync-clone","image":"gcr.io/google-containers/git-sync-amd64:v2.0.5","env":[{"name":"GIT_SYNC_REPO","value":"https://github.com/pubnub/caravan.git"...
Status:       Failed
IP:
Init Containers:
  git-sync-clone:
    Container ID:   docker://c3dcc435d18362271fe5ab8098275d082c01ab36fc451d695e6e0e54ad71132a
    Image:          gcr.io/google-containers/git-sync-amd64:v2.0.5
    Image ID:       docker-pullable://gcr.io/google-containers/git-sync-amd64@sha256:904833aedf3f14373e73296240ed44d54aecd4c02367b004452dfeca2465e5bf
    Port:           <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Jan 2018 11:59:58 -0800
      Finished:     Mon, 15 Jan 2018 11:59:59 -0800
    Ready:          True
    Restart Count:  0
    Environment:
      GIT_SYNC_REPO:      <dag repo>
      GIT_SYNC_BRANCH:    master
      GIT_SYNC_ROOT:      /tmp
      GIT_SYNC_DEST:      dags
      GIT_SYNC_ONE_TIME:  true
      GIT_SYNC_USERNAME:  jzucker2
      GIT_SYNC_PASSWORD:  <password>
    Mounts:
      /root/airflow/airflow.cfg from airflow-config (ro)
      /root/airflow/dags/ from airflow-dags (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-0bq1k (ro)
Containers:
  base:
    Container ID:  <container id>
    Image:         <image>
    Image ID:      <our image id>
    Port:          <none>
    Command:
      bash
      -cx
      --
    Args:
      airflow run jordan_dag_3 run_this_1 2018-01-15T19:59:54.838835 --local -sd /root/airflow/dags/jordan3.py
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 15 Jan 2018 12:00:00 -0800
      Finished:     Mon, 15 Jan 2018 12:00:01 -0800
    Ready:          False
    Restart Count:  0
    Environment:
      AIRFLOW__CORE__AIRFLOW_HOME:   /root/airflow
      AIRFLOW__CORE__EXECUTOR:       LocalExecutor
      AIRFLOW__CORE__DAGS_FOLDER:    /tmp/dags
      SQL_ALCHEMY_CONN:              <set to the key 'sql_alchemy_conn' in secret 'airflow-secrets'>  Optional: false
      GIT_SYNC_USERNAME:             <set to the key 'username' in secret 'gitsecret'>  Optional: false
      GIT_SYNC_PASSWORD:             <set to the key 'password' in secret 'gitsecret'>  Optional: false
      AIRFLOW_CONN_PORTAL_DB_URI:    <set to the key 'portal_mysql_conn' in secret 'portaldbsecret'>  Optional: false
      AIRFLOW_CONN_OVERMIND_DB_URI:  <set to the key 'overmind_mysql_conn' in secret 'overminddbsecret'>  Optional: false
    Mounts:
      /root/airflow/airflow.cfg from airflow-config (ro)
      /root/airflow/dags/ from airflow-dags (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-0bq1k (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  airflow-dags:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  airflow-config:
    Type:        ConfigMap (a volume populated by a ConfigMap)
    Name:        airflow-configmap
    Optional:    false
  default-token-0bq1k:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-0bq1k
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type    Reason                 Age  From               Message
  ----    ------                 ---  ----               -------
  Normal  Scheduled              3m   default-scheduler  Successfully assigned jordandag3runthis1-8df809f80c874d6ca50acb0d0480307c to minikube
  Normal  SuccessfulMountVolume  3m   kubelet, minikube  MountVolume.SetUp succeeded for volume "airflow-dags"
  Normal  SuccessfulMountVolume  3m   kubelet, minikube  MountVolume.SetUp succeeded for volume "airflow-config"
  Normal  SuccessfulMountVolume  3m   kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-0bq1k"
  Normal  Pulled                 3m   kubelet, minikube  Container image "gcr.io/google-containers/git-sync-amd64:v2.0.5" already present on machine
  Normal  Created                3m   kubelet, minikube  Created container
  Normal  Started                3m   kubelet, minikube  Started container
  Normal  Pulled                 3m   kubelet, minikube  Container image "artifactnub1-docker-local.jfrog.io/pubnub/pnairflow:0.1.6" already present on machine
  Normal  Created                3m   kubelet, minikube  Created container
  Normal  Started                3m   kubelet, minikube  Started container

> On Sat, Jan 13, 2018, 3:20 AM Koen Mevissen <kmevis...@travix.com> wrote:
>
> > Are you using kubernetes on Google Cloud Platform? (GKE)
> >
> > You should be able to capture the logs from your nodes. In case you run GKE
> > with logging automatically deployed, then daemonsets with fluentd will ship
> > logs from /var/log/containers on the node to Google Cloud Logging.
> >
> > Koen
> >
> > On Sat, 13 Jan 2018 at 01:18, Anirudh Ramanathan
> > <ramanath...@google.com.invalid> wrote:
> >
> > > > Any good way to debug this?
> > >
> > > One way might be reading the events from "kubectl get events". That
> > > should reveal some information about the pod removal event.
> > > This brings up another question - should errored pods be persisted for
> > > debugging?
> > >
> > > On Fri, Jan 12, 2018 at 3:07 PM, jordan.zuc...@gmail.com <
> > > jordan.zuc...@gmail.com> wrote:
> > >
> > > > I'm trying to use Airflow and Kubernetes and having trouble using git
> > > > sync to pull DAGs into workers.
> > > >
> > > > I use a git sync init container on the scheduler to pull in DAGs
> > > > initially, and that works. But when worker pods are spawned, the
> > > > workers terminate almost immediately because they cannot find the
> > > > DAGs. And since the workers terminate so quickly, I can't even inspect
> > > > the file structure to see where the DAGs ended up during the worker's
> > > > git sync init container.
> > > >
> > > > I noticed that the git sync init container for the workers is hard
> > > > coded to /tmp/dags and there is a git_subpath config setting as well.
> > > > But I can't understand how the git-synced DAGs ever end up in
> > > > /root/airflow/dags
> > > >
> > > > I am successfully using a git sync init container for the scheduler,
> > > > so I know my git credentials are valid. Any good way to debug this? Or
> > > > an example of how to set this up correctly?
> > >
> > > --
> > > Anirudh Ramanathan
> >
> > --
> > Kind regards, Met vriendelijke groet,
> >
> > *Koen Mevissen*
> > Principal BI Developer
> >
> > *Travix Nederland B.V.*
> > Piet Heinkade 55
> > 1019 GM Amsterdam
> > The Netherlands
> >
> > T. +31 (0)20 203 3241
> > E: kmevis...@travix.com
> > www.travix.com
> >
> > *Brands:* CheapTickets | Vliegwinkel | Vayama | BudgetAir | Flugladen
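For anyone following along: lining up the paths in the pod description earlier in the thread suggests the clone may never land on the shared volume. A minimal sketch of that comparison (env values copied from the describe output; the conclusion is my reading of it, not a confirmed fix):

```shell
# Values from the worker pod description above.
GIT_SYNC_ROOT=/tmp                # where git-sync is told to write
GIT_SYNC_DEST=dags
MOUNT_PATH=/root/airflow/dags     # where the airflow-dags emptyDir is mounted

# git-sync checks out into GIT_SYNC_ROOT/GIT_SYNC_DEST, but only files
# written under MOUNT_PATH survive into the main container; anything under
# /tmp stays inside the (exited) init container's own filesystem.
CHECKOUT="${GIT_SYNC_ROOT}/${GIT_SYNC_DEST}"
echo "git-sync checkout: ${CHECKOUT}"
echo "shared volume:     ${MOUNT_PATH}"

case "$CHECKOUT" in
  "$MOUNT_PATH"|"$MOUNT_PATH"/*)
    echo "checkout lands on the shared volume" ;;
  *)
    echo "checkout misses the shared volume; the worker sees an empty ${MOUNT_PATH}" ;;
esac
```

If that reading is right, pointing GIT_SYNC_ROOT at the mounted path (so the clone ends up under /root/airflow/dags) would be one thing to try, though I haven't checked how that interacts with the executor's hard-coded /tmp/dags.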