GitHub user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r71871037
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
    @@ -269,19 +273,11 @@ class StreamExecution(
        * batchId counter is incremented and a new log entry is written with the newest offsets.
        */
       private def constructNextBatch(): Unit = {
    -    // There is a potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622).
    -    // If we interrupt some thread running Shell.runCommand, we may hit this issue.
    -    // As "FileStreamSource.getOffset" will create a file using HDFS API and call "Shell.runCommand"
    -    // to set the file permission, we should not interrupt "microBatchThread" when running this
    -    // method. See SPARK-14131.
    -    //
         // Check to see what new data is available.
         val hasNewData = {
           awaitBatchLock.lock()
           try {
    -        val newData = microBatchThread.runUninterruptibly {
    -          uniqueSources.flatMap(s => s.getOffset.map(o => s -> o))
    -        }
    +        val newData = uniqueSources.flatMap(s => s.getOffset.map(o => s -> o))
    --- End diff --
    
    It's just a single line, but it takes a while to figure out what it does.
    I'd rewrite it to:
    
    ```
    uniqueSources.map(s => (s, s.getOffset))...
    ```
    
    and then apply further transformations depending on the types (I didn't
    check in an IDE). Just an idea to untangle the knots :)
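
For comparison, here is a minimal sketch of how the two shapes relate, assuming `getOffset` returns an `Option` (as its use inside `flatMap` in the diff suggests). `Source`, `Offset`, and `OffsetPairing` below are hypothetical stand-ins for illustration, not Spark's actual classes.

```scala
// Hypothetical stand-ins for the real source and offset types (illustration only).
final case class Offset(value: Long)
final case class Source(name: String, latest: Option[Offset]) {
  // Assumed shape of getOffset: None when the source has no data yet.
  def getOffset: Option[Offset] = latest
}

object OffsetPairing {
  // The form in the diff: flatMap silently drops sources whose offset is None.
  def viaFlatMap(sources: Seq[Source]): Seq[(Source, Offset)] =
    sources.flatMap(s => s.getOffset.map(o => s -> o))

  // The suggested starting point, completed with one possible follow-up step:
  // pair each source with its optional offset, then keep only the defined ones.
  def viaMapCollect(sources: Seq[Source]): Seq[(Source, Offset)] =
    sources.map(s => (s, s.getOffset)).collect { case (s, Some(o)) => (s, o) }

  // A for-comprehension expresses the same flatMap/map chain.
  def viaFor(sources: Seq[Source]): Seq[(Source, Offset)] =
    for {
      s <- sources
      o <- s.getOffset
    } yield s -> o
}
```

All three variants yield the same pairs in the same order; the choice is purely about readability. The `map`/`collect` form makes the intermediate `(Source, Option[Offset])` pairs explicit, at the cost of building one extra intermediate collection.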

