[ 
https://issues.apache.org/jira/browse/SPARK-15905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330482#comment-15330482
 ] 

Tejas Patil commented on SPARK-15905:
-------------------------------------

@zsxwing

>> Do you have the whole jstack output?

I cannot share it as is. Going through the entire 7k-line jstack file to 
remove things like IP addresses and company-internal details would be a lot 
of work.

>> Could you check your disk? Maybe some bad disks cause the hang.

At the time this happened, I did not notice any disk problems on the box. 
However, I will keep an eye on that next time.

>> By the way, how did you use Spark? Did you just run it or call it via some 
>> Process APIs?

We run Spark jobs directly via spark-shell.

> Driver hung while writing to console progress bar
> -------------------------------------------------
>
>                 Key: SPARK-15905
>                 URL: https://issues.apache.org/jira/browse/SPARK-15905
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.1
>            Reporter: Tejas Patil
>            Priority: Minor
>
> This leads to the driver being unable to receive heartbeats from its 
> executors and the job getting stuck. Based on the locking dependencies among 
> the driver threads in the jstack output, this is where the driver seems to 
> be stuck:
> {noformat}
> "refresh progress" #113 daemon prio=5 os_prio=0 tid=0x00007f7986cbc800 nid=0x7887d runnable [0x00007f6d3507a000]
>    java.lang.Thread.State: RUNNABLE
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:326)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>         - locked <0x00007f6eb81dd290> (a java.io.BufferedOutputStream)
>         at java.io.PrintStream.write(PrintStream.java:482)
>         - locked <0x00007f6eb81dd258> (a java.io.PrintStream)
>         at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
>         at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
>         at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
>         - locked <0x00007f6eb81dd400> (a java.io.OutputStreamWriter)
>         at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185)
>         at java.io.PrintStream.write(PrintStream.java:527)
>         - locked <0x00007f6eb81dd258> (a java.io.PrintStream)
>         at java.io.PrintStream.print(PrintStream.java:669)
>         at org.apache.spark.ui.ConsoleProgressBar.show(ConsoleProgressBar.scala:99)
>         at org.apache.spark.ui.ConsoleProgressBar.org$apache$spark$ui$ConsoleProgressBar$$refresh(ConsoleProgressBar.scala:69)
>         - locked <0x00007f6ed33b48a0> (a org.apache.spark.ui.ConsoleProgressBar)
>         at org.apache.spark.ui.ConsoleProgressBar$$anon$1.run(ConsoleProgressBar.scala:53)
>         at java.util.TimerThread.mainLoop(Timer.java:555)
>         at java.util.TimerThread.run(Timer.java:505)
> {noformat}
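
For readers following the trace above, here is a minimal Java sketch of the 
general pattern it suggests: a thread that stalls inside a stream write while 
holding an object's monitor keeps every other thread that needs that monitor 
in the BLOCKED state. This is not Spark's code. The names BlockedConsoleDemo, 
StalledStream, ProgressBar and finishAll are hypothetical, and the stalled 
console is simulated with a latch, so the writer shows up as WAITING here 
rather than RUNNABLE in a native write as in the jstack.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class BlockedConsoleDemo {

    /** Hypothetical stand-in for a console that has stopped accepting output. */
    static final class StalledStream extends OutputStream {
        final CountDownLatch entered = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        @Override
        public void write(int b) {
            entered.countDown();          // signal that a write is in progress
            try {
                release.await();          // simulate a write that does not return
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Hypothetical progress bar whose methods share one monitor. */
    static final class ProgressBar {
        private final PrintStream out;

        ProgressBar(PrintStream out) {
            this.out = out;
        }

        synchronized void refresh() {     // holds the monitor while printing
            out.print("x");
            out.flush();
        }

        synchronized void finishAll() {   // needs the same monitor
        }
    }

    /** Returns the state of a thread that needs the monitor during the stall. */
    static Thread.State observeWaiterState() {
        StalledStream stalled = new StalledStream();
        ProgressBar bar = new ProgressBar(new PrintStream(stalled, true));
        try {
            Thread refresher = new Thread(bar::refresh, "refresh progress");
            refresher.setDaemon(true);
            refresher.start();
            stalled.entered.await();      // refresher is now stuck inside write()

            Thread waiter = new Thread(bar::finishAll, "waiter");
            waiter.setDaemon(true);
            waiter.start();

            // Poll until the waiter is BLOCKED on the monitor, or give up after 5 s.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (waiter.getState() != Thread.State.BLOCKED
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            Thread.State state = waiter.getState();

            stalled.release.countDown();  // let the stalled write complete
            refresher.join();
            waiter.join();
            return state;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("waiter state during stall: " + observeWaiterState());
    }
}
```

In a thread dump of this sketch, the "waiter" thread would be reported as 
waiting to lock the ProgressBar monitor that "refresh progress" holds, which 
is the kind of dependency to look for among the other driver threads.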


