huage1994 opened a new pull request #19449:
URL: https://github.com/apache/airflow/pull/19449


   The SparkSql Operator log goes into an infinite loop, printing 
`INFO - b''` forever.
   
   The cause is at line 185 in SparkSql.py: 
   `subprocess.Popen(spark_sql_cmd, stdout=subprocess.PIPE, 
stderr=subprocess.STDOUT, **kwargs)`
   This call should pass `universal_newlines=True`, as spark_submit.py 
does. 
   
   By default, `universal_newlines=False`, meaning input/output is handled as 
bytes rather than as Unicode strings with universal newlines handling.
   
   [As discussed on 
Stack Overflow](https://stackoverflow.com/questions/38181494/what-is-the-difference-between-using-universal-newlines-true-with-bufsize-1-an):
 
   > If universal_newlines=False then `for line in iter(self._sp.stdout)` 
iterates over b'\n'-separated lines. If the process uses non-ascii encoding 
e.g., UTF-16 for its output then even if os.linesep == '\n' on your system; you 
may get a wrong result. If you want to consume text lines, use the text mode: 
pass universal_newlines=True or use io.TextIOWrapper(process.stdout) explicitly.
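
A minimal sketch of how the bytes default could produce the endless `b''` output. It assumes the log loop reads with `iter(stdout.readline, '')`; that loop shape is an assumption, not something quoted from the hook. `read_lines` and `max_reads` are hypothetical names, and the child process is a stand-in for the real Spark command. In bytes mode `readline()` returns `b''` at end of stream, which never equals the `''` sentinel, so the loop only stops because of the cap.

```python
import subprocess
import sys


def read_lines(text_mode, max_reads=10):
    """Read child output via iter(readline, ''), capped at max_reads reads."""
    proc = subprocess.Popen(
        [sys.executable, "-c", "print('hello')"],  # stand-in child process
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=text_mode,
    )
    lines = []
    # Assumed loop shape: stop when readline() returns the str sentinel ''.
    for line in iter(proc.stdout.readline, ''):
        lines.append(line)
        if len(lines) >= max_reads:  # safety cap so the sketch terminates
            break
    proc.stdout.close()
    proc.wait()
    return lines


text_lines = read_lines(text_mode=True)   # loop ends at EOF ('' matches sentinel)
raw_lines = read_lines(text_mode=False)   # b'' != '', so only the cap stops it
print(len(text_lines), len(raw_lines))
```

With `universal_newlines=True` the loop ends as soon as the child closes its output; without it, every read after end of stream yields another `b''`.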
   
   



