Could you post the codes of MessageRetriever? And by the way, could you post the screenshot of tasks for a batch and check the input size of these tasks? Considering there are so many events, there should be a lot of blocks as well as a lot of tasks.
On Thu, Jan 14, 2016 at 4:34 PM, Lin Zhao <l...@exabeam.com> wrote: > Hi Shixiong, > > I tried this but it still happens. If it helps, it's 1.6.0 and runs on > YARN. Batch duration is 20 seconds. > > Some logs seemingly related to block manager: > > 16/01/15 00:31:25 INFO receiver.BlockGenerator: Pushed block > input-0-1452817873000 > 16/01/15 00:31:27 INFO storage.MemoryStore: Block input-0-1452817879000 > stored as bytes in memory (estimated size 60.1 MB, free 1563.4 MB) > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 31 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 30 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 29 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 28 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 27 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 26 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 25 > 16/01/15 00:31:28 INFO receiver.BlockGenerator: Pushed block > input-0-1452817879000 > 16/01/15 00:31:28 INFO context.TimeSpanHDFSReader: descr="Waiting to read > [win] message file(s) for 2015-12-17T21:00:00.000." > module=TIMESPAN_HDFS_READER > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 38 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 37 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 36 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 35 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 34 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 33 > 16/01/15 00:31:28 INFO storage.BlockManager: Removing RDD 32 > 16/01/15 00:31:29 INFO context.MessageRetriever: descr="Processed 4,636,461 > lines (80 s); 0 event/s (42,636 incoming msg/s) at 2015-12-17T21" > module=MESSAGE_RETRIEVER > 16/01/15 00:31:30 INFO storage.MemoryStore: Block input-0-1452817879200 > stored as bytes in memory (estimated size 93.3 MB, free 924.9 MB) > 16/01/15 00:31:31 INFO storage.BlockManager: Removing RDD 45 > 16/01/15 00:31:31 INFO storage.BlockManager: Removing RDD 44 > 16/01/15 00:31:31 INFO storage.BlockManager: Removing RDD 43 > 16/01/15 00:31:31 INFO storage.BlockManager: Removing RDD 42 > 16/01/15 00:31:31 INFO storage.BlockManager: Removing RDD 41 > 16/01/15 00:31:31 INFO storage.BlockManager: Removing RDD 40 > 16/01/15 00:31:31 INFO storage.BlockManager: Removing RDD 39 > > > From: "Shixiong(Ryan) Zhu" <shixi...@databricks.com> > Date: Thursday, January 14, 2016 at 4:13 PM > To: Lin Zhao <l...@exabeam.com> > Cc: user <user@spark.apache.org> > Subject: Re: Spark Streaming: custom actor receiver losing vast majority > of data > > MEMORY_AND_DISK_SER_2 >