[jira] [Comment Edited] (SPARK-2677) BasicBlockFetchIterator#next can wait forever
[ https://issues.apache.org/jira/browse/SPARK-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076033#comment-14076033 ]

Guoqiang Li edited comment on SPARK-2677 at 7/28/14 3:00 PM:
------------------------------------------------------------

[~pwendell], [~sarutak] How about the following solution?
https://github.com/apache/spark/pull/1619

was (Author: gq):
[~pwendell], [~sarutak] How about the following solution?
https://github.com/witgo/spark/compare/SPARK-2677

BasicBlockFetchIterator#next can wait forever
---------------------------------------------

                Key: SPARK-2677
                URL: https://issues.apache.org/jira/browse/SPARK-2677
            Project: Spark
         Issue Type: Bug
         Components: Spark Core
   Affects Versions: 0.9.2, 1.0.0, 1.0.1
           Reporter: Kousuke Saruta
           Priority: Blocker

In BasicBlockFetchIterator#next, it waits for the fetch result on results.take().
{code}
override def next(): (BlockId, Option[Iterator[Any]]) = {
  resultsGotten += 1
  val startFetchWait = System.currentTimeMillis()
  val result = results.take()
  val stopFetchWait = System.currentTimeMillis()
  _fetchWaitTime += (stopFetchWait - startFetchWait)
  if (!result.failed) bytesInFlight -= result.size
  while (!fetchRequests.isEmpty &&
    (bytesInFlight == 0 || bytesInFlight + fetchRequests.front.size <= maxBytesInFlight)) {
    sendRequest(fetchRequests.dequeue())
  }
  (result.blockId, if (result.failed) None else Some(result.deserialize()))
}
{code}
But results is implemented as a LinkedBlockingQueue, so if the remote executor hangs up, the fetching executor waits forever.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
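A minimal sketch of the general idea behind bounding the wait: replace the unconditional {{results.take()}} with {{poll(timeout)}} so a hung remote executor surfaces as a fetch failure instead of blocking the iterator forever. All names below ({{FetchResult}}, {{BoundedFetchIterator}}, the timeout value) are illustrative assumptions, not the actual code in the linked PR.
{code}
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit, TimeoutException}

// Simplified stand-in for the result objects placed on the queue.
case class FetchResult(blockId: String, failed: Boolean)

class BoundedFetchIterator(results: LinkedBlockingQueue[FetchResult],
                           timeoutMs: Long) {
  def next(): FetchResult = {
    // poll() returns null after the timeout, unlike take(), which
    // blocks indefinitely when no result ever arrives.
    val result = results.poll(timeoutMs, TimeUnit.MILLISECONDS)
    if (result == null) {
      // Surface the hang as a failure rather than waiting forever.
      throw new TimeoutException(s"No fetch result within $timeoutMs ms")
    }
    result
  }
}
{code}
With this shape, a preempted or hung executor (e.g. under YARN fair-scheduler preemption) produces a fetch failure that the task can retry, instead of a permanently blocked {{next()}}.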
[jira] [Comment Edited] (SPARK-2677) BasicBlockFetchIterator#next can wait forever
[ https://issues.apache.org/jira/browse/SPARK-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075673#comment-14075673 ]

Guoqiang Li edited comment on SPARK-2677 at 7/27/14 6:17 PM:
------------------------------------------------------------

If {{yarn.scheduler.fair.preemption}} is set to {{true}} in YARN, this issue will appear frequently.

was (Author: gq):
If {{yarn.scheduler.fair.preemption}} is set to true in yarn, This issue will appear frequently.