albertusk95 commented on issue #7075: [AIRFLOW-6212] SparkSubmitHook resolve connection
URL: https://github.com/apache/airflow/pull/7075#issuecomment-571470394

@tooptoop4 I think you might want to try this sample DAG to reproduce the issue.

a) Create an environment variable for the Spark connection.

```
export AIRFLOW_CONN_SPARK_DEFAULT='{"conn_type": "spark", "host":<host>, "port":<port>}'
```

b) Create a DAG file to run.

```python
from airflow import DAG
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator
from datetime import datetime, timedelta
import os

# Path to the Spark application to submit.
job_file = 'path/to/job/file'

# Replace every <fill_...> placeholder before running.
default_args = {
    'depends_on_past': False,
    'start_date': <fill_start_date>,
    'retries': <fill_retries>,
    'retry_delay': <fill_retry_delay>
}

dag = DAG('spark-submit-hook', default_args=default_args, schedule_interval=<fill_interval>)

# No conn_id is passed here, so the operator falls back to its default connection.
avg = SparkSubmitOperator(task_id=<fill_task_id>,
                          dag=dag,
                          application=job_file,
                          spark_binary='path/to/spark-submit')
```
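
For context on why the connection value in step a) matters, here is a minimal sketch of how a URI-style connection string splits into a scheme, host, and port. It uses only Python's standard `urllib.parse` and is not Airflow's actual parsing code. The helper name `split_conn_uri` and the value `spark://localhost:7077` are hypothetical, chosen purely for illustration.

```python
from urllib.parse import urlparse


def split_conn_uri(uri):
    """Split a URI-style connection string into its scheme, host, and port.

    Hypothetical helper for illustration only; not part of Airflow.
    """
    parts = urlparse(uri)
    return {
        "conn_type": parts.scheme,  # the text before "://"
        "host": parts.hostname,     # host name without the scheme
        "port": parts.port,         # integer port, or None if absent
    }


# Assumed example value, not taken from the comment above.
example = split_conn_uri("spark://localhost:7077")
print(example)
```

Note that once the string is split this way, the `host` field no longer carries the `spark://` prefix, so anything that rebuilds a master URL from `host` and `port` alone has to add the scheme back.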