Luiz Svoboda created AIRFLOW-4289:
-------------------------------------

             Summary: spark_binary argument in SparkSubmitHook is ignored when building the connection_cmd
                 Key: AIRFLOW-4289
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4289
             Project: Apache Airflow
          Issue Type: Bug
          Components: contrib, hooks
    Affects Versions: 1.10.3
            Reporter: Luiz Svoboda


When using the SparkSubmitOperator, it is possible to pass the _spark_binary_ 
parameter, but its value is ignored when the _connection_cmd_ is built. Instead, 
the binary is read from the connection parameters, falling back to 
_spark-submit_ when none is set there, as can be seen at 
[spark_submit_hook line 190|https://github.com/apache/airflow/blob/1.10.3/airflow/contrib/hooks/spark_submit_hook.py#L190].
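
The behaviour described above can be sketched roughly as follows. This is a simplified, standalone model and not the actual Airflow code: the function names _resolve_spark_binary_ and _build_connection_cmd_ are hypothetical, and the connection extra key _spark-binary_ is an assumption based on the linked hook source.

```python
import json

# Fallback named in the report when the connection does not set a binary.
DEFAULT_SPARK_BINARY = "spark-submit"


def resolve_spark_binary(operator_arg, connection_extra_json):
    """Model of the reported behaviour (hypothetical helper).

    operator_arg stands for the spark_binary value passed to the operator.
    It is accepted but never consulted, which is the bug being reported.
    """
    extra = json.loads(connection_extra_json) if connection_extra_json else {}
    # Only the connection extra (assumed key: "spark-binary") or the default is used.
    return extra.get("spark-binary", DEFAULT_SPARK_BINARY)


def build_connection_cmd(operator_arg, connection_extra_json, master="yarn"):
    """Build the first part of the command line (hypothetical helper)."""
    binary = resolve_spark_binary(operator_arg, connection_extra_json)
    return [binary, "--master", master]


# The operator asks for "spark2-submit", but the command does not reflect it.
print(build_connection_cmd("spark2-submit", None))
print(build_connection_cmd("spark2-submit", '{"spark-binary": "/opt/spark/bin/spark-submit"}'))
```

In both calls the operator-level value _spark2-submit_ never reaches the resulting command; only the connection extra or the default decides which binary is used.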

This configuration is also somewhat confusing, since the user can set the 
binary either via the _connection_ or directly when creating the operator 
instance. I suggest keeping only one option; in this case, IMHO, the 
connection approach seems better, as it is already used to configure several 
other options.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
