Kousuke Saruta created SPARK-2677:
-------------------------------------

             Summary: BasicBlockFetchIterator#next can wait forever
                 Key: SPARK-2677
                 URL: https://issues.apache.org/jira/browse/SPARK-2677
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.0.0
            Reporter: Kousuke Saruta
            Priority: Critical


In BasicBlockFetchIterator#next, the iterator blocks on results.take() while waiting for a fetch result.

{code}
    override def next(): (BlockId, Option[Iterator[Any]]) = {
      resultsGotten += 1
      val startFetchWait = System.currentTimeMillis()
      val result = results.take()
      val stopFetchWait = System.currentTimeMillis()
      _fetchWaitTime += (stopFetchWait - startFetchWait)
      if (! result.failed) bytesInFlight -= result.size
      while (!fetchRequests.isEmpty &&
        (bytesInFlight == 0 || bytesInFlight + fetchRequests.front.size <= maxBytesInFlight)) {
        sendRequest(fetchRequests.dequeue())
      }
      (result.blockId, if (result.failed) None else Some(result.deserialize()))
    }
{code}

However, results is implemented as a LinkedBlockingQueue, and take() blocks until an element becomes available. So if the remote executor hangs, no result is ever enqueued and the fetching executor waits forever.
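
One possible way to bound the wait is to use LinkedBlockingQueue#poll with a timeout instead of take(). The sketch below only illustrates that idea and is not a proposed patch: FetchResult, BoundedFetchWait, takeWithTimeout and timeoutMs are hypothetical names introduced for this example, and how a timeout should be reported to the caller is left open.

```scala
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}

// Hypothetical stand-in for the iterator's fetch result type (not Spark's class).
case class FetchResult(blockId: String, size: Long)

object BoundedFetchWait {
  // Waits at most timeoutMs for a result.
  // poll(timeout, unit) returns null when the timeout elapses,
  // so Option(...) turns a timeout into None instead of blocking forever.
  def takeWithTimeout(
      results: LinkedBlockingQueue[FetchResult],
      timeoutMs: Long): Option[FetchResult] =
    Option(results.poll(timeoutMs, TimeUnit.MILLISECONDS))
}
```

With this shape, the caller can treat None as a failed fetch (for example, by retrying or failing the task) rather than hanging.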




--
This message was sent by Atlassian JIRA
(v6.2#6252)
