Hi,

I am trying to set the HADOOP_CONF_DIR and YARN_CONF_DIR environment variables for a SparkJDBCOperator task. I have tried both exporting them with os.environ in the DAG file and passing them through the operator's env_vars argument:
    import datetime as dt
    import os

    import airflow.utils.dates
    from airflow import DAG
    from airflow.contrib.operators.spark_jdbc_operator import SparkJDBCOperator

    dag = DAG(
        dag_id='SparkJDBC',
        schedule_interval=dt.timedelta(hours=4),
        start_date=airflow.utils.dates.days_ago(2),
    )

    # First attempt: set the variables in the DAG file itself.
    os.environ['HADOOP_CONF_DIR'] = "/etc/hadoop/conf"
    os.environ['YARN_CONF_DIR'] = "/etc/hadoop/conf"

    task1 = SparkJDBCOperator(
        task_id='SparkJDBC',
        dag=dag,
        spark_app_name="TEST",
        cmd_type="jdbc_to_spark",
        conn_id="spark_default",
        spark_conn_id="spark_default",
        jdbc_conn_id="oracle_jdbc_test",
        jdbc_table="TEST",
        # Second attempt: pass the variables through the operator.
        env_vars='{"HADOOP_CONF_DIR":"/etc/hadoop/conf","YARN_CONF_DIR":"/etc/hadoop/conf"}',
        metastore_table="TEST",
        verbose=True)

but the task still fails with this error:

    [2019-02-06 14:42:01,043] {{base_task_runner.py:101}} INFO - Job 13534: Subtask SparkJDBC [2019-02-06 14:42:01,041] {{spark_submit_hook.py:283}} INFO - Spark-Submit cmd: ['spark-submit', '--master', 'yarn', '--name', 'TEST', '--verbose', '--queue', 'root.default', '/opt/conda/miniconda/envs/airflow-dask/lib/python3.6/site-packages/airflow/contrib/hooks/spark_jdbc_script.py', '-cmdType', 'jdbc_to_spark', '-url', 'jdbc:oracle:thin:@//*****/******/', '-user', '*****', '-password', '*******', '-metastoreTable', 'TEST', '-jdbcTable', 'TEST']
    [2019-02-06 14:42:01,951] {{base_task_runner.py:101}} INFO - Job 13534: Subtask SparkJDBC [2019-02-06 14:42:01,950] {{spark_submit_hook.py:415}} INFO - Using properties file: null
    [2019-02-06 14:42:01,969] {{base_task_runner.py:101}} INFO - Job 13534: Subtask SparkJDBC [2019-02-06 14:42:01,969] {{spark_submit_hook.py:415}} INFO - Exception in thread "main" org.apache.spark.SparkException: When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
    .................................
    .................................

Could you help me? Thanks!

Regards,
Iván Robla
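PS: To check whether the two variables are actually visible in the process that launches spark-submit, I was thinking of adding a small debug task to the same DAG file and reading its log. This is only a rough sketch (the task_id and the helper function are my own, not anything provided by the operator):

    import os

    from airflow.operators.python_operator import PythonOperator

    def _print_hadoop_env():
        # Print what the task process actually inherits from the worker environment.
        print("HADOOP_CONF_DIR =", os.environ.get("HADOOP_CONF_DIR"))
        print("YARN_CONF_DIR =", os.environ.get("YARN_CONF_DIR"))

    check_env = PythonOperator(
        task_id='check_hadoop_env',
        python_callable=_print_hadoop_env,
        dag=dag,  # the DAG defined above
    )

If both values come back as None there, I assume the os.environ assignments in the DAG file are not reaching the worker process that runs spark-submit, and the variables would have to be set in that process's own environment instead.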