[ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493760#comment-15493760 ]

Gabor Szadovszky commented on HIVE-14714:
-----------------------------------------

The original problem was the exception listed in the description and that
beeline exited only after 10 seconds.

The root cause of the 10s delay was that in many cases the spark-submit process
does not end even after the RemoteDriver has ended on the other side.
Therefore, driverThread.join(10000) really waits the full 10 seconds before we
interrupt it. This interruption is also the root cause of the logged exception:
when child.waitFor() is interrupted, the redirector threads get IOExceptions in
the next readLine() because the related streams have been closed.
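
To illustrate, the pre-patch shutdown path looked roughly like this (a
simplified sketch, not the exact code in SparkClientImpl; stopDriver() is a
hypothetical helper name):
{code}
// Simplified sketch of the pre-patch shutdown path: wait at most 10s for the
// driver thread, then interrupt it.
private void stopDriver(Thread driverThread) throws InterruptedException {
  driverThread.join(10000);    // spark-submit often never exits, so this waits the full 10s
  if (driverThread.isAlive()) {
    // Interrupting breaks child.waitFor(); the child's streams get closed, so
    // the redirector's next readLine() throws "java.io.IOException: Stream closed".
    driverThread.interrupt();
  }
}
{code}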

I've redesigned the Redirector class so that it does not use any IO that might
hang the thread on interruption (e.g. BufferedReader.readLine() cannot be
interrupted; it waits forever if the related stream is open but no input
arrives). After this redesign we can simply interrupt the driver thread and let
it keep working in the background until there is output to be gathered or the
related timeout occurs. We no longer have to block the client side waiting for
all the threads to finish.
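
To sketch the idea behind the redesign (an illustration only, not the patched
code; the class name and the polling interval are made up for the example):
{code}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Avoid blocking reads like BufferedReader.readLine() by polling
// InputStream.available(), so the thread reacts to interruption instead of
// hanging inside an uninterruptible read.
class InterruptibleRedirector implements Runnable {
  private final InputStream in;
  private final ByteArrayOutputStream out = new ByteArrayOutputStream();

  InterruptibleRedirector(InputStream in) {
    this.in = in;
  }

  @Override
  public void run() {
    byte[] chunk = new byte[1024];
    try {
      while (!Thread.currentThread().isInterrupted()) {
        if (in.available() > 0) {
          int n = in.read(chunk, 0, Math.min(chunk.length, in.available()));
          if (n < 0) {
            break;                // stream ended normally
          }
          out.write(chunk, 0, n); // gather output to be logged later
        } else {
          Thread.sleep(50);       // interruptible wait; no blocking read is pending
        }
      }
    } catch (InterruptedException e) {
      // interrupted while idle: exit quietly instead of logging an IOException
    } catch (IOException e) {
      // the child process exited and its stream got closed; nothing to report
    }
  }
}
{code}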

Then came the unit test failure. The root cause was that protocol.endSession()
only sends a job asynchronously via RPC to close the session on the other side.
As there is no 10s delay anymore, unit tests executed one after another ran
into the issue that the previous session was not yet closed properly.
Therefore, I've implemented a small trick to make ending the session
synchronous.
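
Conceptually, the trick amounts to blocking on the RPC's completion with a
bounded wait. A hedged sketch (that endSession() hands back a Future is an
assumption made purely for this illustration):
{code}
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class SessionCloser {
  // Block until the remote side has acknowledged the session close, so the
  // next unit test starts only after the previous session is really gone.
  void closeSessionSynchronously(Future<Void> endSessionFuture)
      throws InterruptedException {
    try {
      endSessionFuture.get(10, TimeUnit.SECONDS);
    } catch (ExecutionException | TimeoutException e) {
      // best effort: the session may still close in the background
    }
  }
}
{code}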

I hope this describes my change properly and, together with my code comments,
makes it understandable.
Any comments here or on the review board are more than welcome. :)

> Finishing Hive on Spark causes "java.io.IOException: Stream closed"
> -------------------------------------------------------------------
>
>                 Key: HIVE-14714
>                 URL: https://issues.apache.org/jira/browse/HIVE-14714
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 1.1.0
>            Reporter: Gabor Szadovszky
>            Assignee: Gabor Szadovszky
>         Attachments: HIVE-14714.2.patch, HIVE-14714.patch
>
>
> After executing a hive command with Spark, finishing the beeline session or
> even switching the engine causes an IOException. The log below was produced by
> pressing Ctrl-D to finish the session, but "!quit" or even
> "set hive.execution.engine=mr;" causes the issue as well.
> From HS2 log:
> {code}
> 2016-09-06 16:15:12,291 WARN  org.apache.hive.spark.client.SparkClientImpl: [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote driver, interrupting...
> 2016-09-06 16:15:12,291 WARN  org.apache.hive.spark.client.SparkClientImpl: [Driver]: Waiting thread interrupted, killing child process.
> 2016-09-06 16:15:12,296 WARN  org.apache.hive.spark.client.SparkClientImpl: [stderr-redir-1]: Error in redirector thread.
> java.io.IOException: Stream closed
>         at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:272)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>         at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
>         at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
>         at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
>         at java.io.InputStreamReader.read(InputStreamReader.java:184)
>         at java.io.BufferedReader.fill(BufferedReader.java:154)
>         at java.io.BufferedReader.readLine(BufferedReader.java:317)
>         at java.io.BufferedReader.readLine(BufferedReader.java:382)
>         at org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}


