[ 
https://issues.apache.org/jira/browse/SPARK-10898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097671#comment-15097671
 ] 

Praveen Devarao commented on SPARK-10898:
-----------------------------------------

Hi [~mark.goodall]

I tried to repro the issue with the code as in this gist 
https://gist.github.com/praveend/2ca728ca3839a09875e1 but don't see any issue 
of data being lost/Block deleted.

Could you check the program in the gist and let me know if there is something 
that I am missing.

I am running this against SPARK 1.6

Thanks

Praveen

> Setting spark.streaming.concurrentJobs causes blocks to be deleted before read
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-10898
>                 URL: https://issues.apache.org/jira/browse/SPARK-10898
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.4.1
>         Environment: CentOS 6.6
>            Reporter: Mark Goodall
>
> The scheduler deletes the block literally just before it is used first time. 
> The input is set to mem and disk ser.
> 15/10/01 15:10:04 INFO scheduler.InputInfoTracker: remove old batch metadata: 
> 1443708599000 ms 1443708602000 ms 1443708601000 ms 1443708600000 ms
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708601800 on discos8.localdomain:45076 in memory (size: 8.7 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708602000 on discos8.localdomain:45076 in memory (size: 8.7 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708602200 on discos8.localdomain:45076 in memory (size: 7.3 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708602400 on discos8.localdomain:45076 in memory (size: 5.7 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708602600 on discos8.localdomain:45076 in memory (size: 2.6 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708599800 on discos8.localdomain:45076 in memory (size: 5.8 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708600000 on discos8.localdomain:45076 in memory (size: 6.4 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708600200 on discos8.localdomain:45076 in memory (size: 7.0 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708600400 on discos8.localdomain:45076 in memory (size: 6.9 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708600600 on discos8.localdomain:45076 in memory (size: 3.8 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708600800 on discos8.localdomain:45076 in memory (size: 4.2 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708601000 on discos8.localdomain:45076 in memory (size: 4.7 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708601200 on discos8.localdomain:45076 in memory (size: 5.4 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708601400 on discos8.localdomain:45076 in memory (size: 5.5 MB, 
> free: 3.0 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708601600 on discos8.localdomain:45076 in memory (size: 8.9 MB, 
> free: 3.1 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708598800 on discos8.localdomain:45076 in memory (size: 8.1 MB, 
> free: 3.1 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708599000 on discos8.localdomain:45076 in memory (size: 7.8 MB, 
> free: 3.1 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708599200 on discos8.localdomain:45076 in memory (size: 5.9 MB, 
> free: 3.1 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708599400 on discos8.localdomain:45076 in memory (size: 6.0 MB, 
> free: 3.1 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Removed 
> input-0-1443708599600 on discos8.localdomain:45076 in memory (size: 6.6 MB, 
> free: 3.1 GB)
> 15/10/01 15:10:04 INFO storage.BlockManagerInfo: Added input-0-1443708604600 
> in memory on discos8.localdomain:45076 (size: 8.7 MB, free: 3.1 GB)
> 15/10/01 15:10:04 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 6.0 
> (TID 84, discos1.localdomain): java.lang.Exception: Could not compute split, 
> block input-0-1443708599800 not found
>       at org.apache.spark.rdd.BlockRDD.compute(BlockRDD.scala:51)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
>       at org.apache.spark.scheduler.Task.run(Task.scala:70)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to