GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19665#discussion_r149030699
  
    --- Diff: dev/run-tests.py ---
    @@ -289,7 +289,7 @@ def exec_sbt(sbt_args=()):
                                     stdin=echo_proc.stdout,
                                     stdout=subprocess.PIPE)
         echo_proc.wait()
    -    for line in iter(sbt_proc.stdout.readline, ''):
    +    for line in iter(sbt_proc.stdout.readline, b''):
    --- End diff --
    
    
    The previous code causes an infinite loop in Python 3 because the sentinel `''` is a 
`str`, whereas `sbt_proc.stdout.readline()` returns `b''` (`bytes`) at end of stream, 
so the two never compare equal.
    
    This can be tested as below:
    
    ```python
    import subprocess
    sbt_proc = subprocess.Popen(["ls"], stdout=subprocess.PIPE)
    print(type(sbt_proc.stdout.readline()))
    ```
    
    In Python 2:
    
    ```
    >>> import subprocess
    >>> sbt_proc = subprocess.Popen(["ls"], stdout=subprocess.PIPE)
    >>> print(type(sbt_proc.stdout.readline()))
    <type 'str'>
    ```
    
    In Python 3:
    
    ```
    >>> import subprocess
    >>> sbt_proc = subprocess.Popen(["ls"], stdout=subprocess.PIPE)
    >>> print(type(sbt_proc.stdout.readline()))
    <class 'bytes'>
    ```
    
    However, the empty `bytes` and `str` literals compare differently across versions.
    
    In Python 2:
    
    ```python
    >>> b'' == ''
    True
    >>> print(type(b''), type(''))
    (<type 'str'>, <type 'str'>)
    ```
    
    In Python 3:
    
    ```python
    >>> b'' == ''
    False
    >>> print(type(b''), type(''))
    <class 'bytes'> <class 'str'>
    ```
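
This comparison matters because the two-argument form `iter(callable, sentinel)` stops only when the callable returns a value equal to the sentinel. Below is a minimal sketch of that behaviour, not the code in `dev/run-tests.py`: it assumes an in-memory `io.BytesIO` stream as a stand-in for `sbt_proc.stdout`, and it uses `itertools.islice` to cap a loop that would otherwise never end.

```python
import io
import itertools

# In-memory stand-in for sbt_proc.stdout (assumption: a binary stream, like a pipe).
data = b"first\nsecond\n"

# Matching sentinel: readline() returns b'' at end of stream, which equals b'',
# so the iterator stops there.
good = list(iter(io.BytesIO(data).readline, b''))

# Mismatched sentinel: in Python 3, b'' != '', so the iterator never stops by itself.
# islice caps it at 5 items so that this sketch terminates.
bad = list(itertools.islice(iter(io.BytesIO(data).readline, ''), 5))

print(good)
print(bad)
```

In Python 3, `good` holds only the two lines, while `bad` holds the same two lines followed by three `b''` values: `readline()` keeps returning `b''` after the end of the stream, and none of them equals `''`.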
    
    The infinite loop can be tested as below, in Python 3:
    
    ```python
    import subprocess
    
    sbt_proc = subprocess.Popen(["ls"], stdout=subprocess.PIPE)
    for line in iter(sbt_proc.stdout.readline, ''):
        print(line)
    ```
    
    In Python 2, the code above does not cause an infinite loop. Using `b''` as the 
sentinel is also fine there, because `bytes` is an alias for `str` in Python 2, and 
the loop then terminates in Python 3 as well:
    
    ```python
    import subprocess
    
    sbt_proc = subprocess.Popen(["ls"], stdout=subprocess.PIPE)
    for line in iter(sbt_proc.stdout.readline, b''):
        print(line)
    ```
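
For comparison, a pipe can also be iterated directly, which involves no sentinel at all. This is only a sketch of an alternative and is not what the diff does. The child command (a Python one-liner run through `sys.executable`) is an assumption chosen so the example runs on any platform, in place of `ls`.

```python
import subprocess
import sys

# Spawn a child Python that prints two lines (a portable stand-in for `ls`).
proc = subprocess.Popen(
    [sys.executable, "-c", "print('a'); print('b')"],
    stdout=subprocess.PIPE)

# Iterating the pipe yields one bytes line at a time until end of stream.
lines = [line for line in proc.stdout]
proc.stdout.close()
proc.wait()

print(lines)
```

Whether this suits `exec_sbt` is not settled by the diff. In Python 2, iterating a file object uses a read-ahead buffer, which can delay lines coming from a live process, so the `readline` form with a `b''` sentinel remains the safer choice for code that must run on both versions.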


