Hi,

My streaming job is writing a lot of data to local disk on the worker
nodes, as shown below. My checkpoint directory is set to HDFS, but these
shuffle files still fill up the local disks. Is there a way to have this
data stored in HDFS instead of locally?

-rw-r--r--  1       520 Sep 17 18:43 shuffle_23119_5_0.index
-rw-r--r--  1 180564255 Sep 17 18:43 shuffle_23129_2_0.data
-rw-r--r--  1 364850277 Sep 17 18:45 shuffle_23145_8_0.data
-rw-r--r--  1 267583750 Sep 17 18:46 shuffle_23105_4_0.data
-rw-r--r--  1 136178819 Sep 17 18:48 shuffle_23123_8_0.data
-rw-r--r--  1 159931184 Sep 17 18:48 shuffle_23167_8_0.data
-rw-r--r--  1       520 Sep 17 18:49 shuffle_23315_7_0.index
-rw-r--r--  1       520 Sep 17 18:50 shuffle_23319_3_0.index
-rw-r--r--  1  92240350 Sep 17 18:51 shuffle_23305_2_0.data
-rw-r--r--  1  40380158 Sep 17 18:51 shuffle_23323_6_0.data
-rw-r--r--  1 369653284 Sep 17 18:52 shuffle_23103_6_0.data
-rw-r--r--  1 371932812 Sep 17 18:52 shuffle_23125_6_0.data
-rw-r--r--  1  19857974 Sep 17 18:53 shuffle_23291_19_0.data
-rw-r--r--  1  55342005 Sep 17 18:53 shuffle_23305_8_0.data
-rw-r--r--  1  92920590 Sep 17 18:53 shuffle_23303_4_0.data
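
To put a number on how much space files like these take, here is a rough
sketch that totals the sizes from an ls-style listing in the format above.
The function name total_shuffle_bytes and the regular expression are my own
illustration, not anything from Spark, and it assumes the trimmed listing
format shown (permissions, link count, size, date, file name).

```python
import re
from collections import defaultdict

# Matches lines such as:
#   -rw-r--r--  1 180564255 Sep 17 18:43 shuffle_23129_2_0.data
# Assumed layout: permissions, link count, size in bytes, date, file name.
LINE_RE = re.compile(
    r"^\S+\s+\d+\s+(\d+)\s+.*\s(shuffle_\d+_\d+_\d+)\.(data|index)\s*$"
)


def total_shuffle_bytes(listing):
    """Sum file sizes per extension ('data' or 'index') from ls-style text."""
    totals = defaultdict(int)
    for line in listing.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            # group(1) is the size in bytes, group(3) is the extension
            totals[match.group(3)] += int(match.group(1))
    return dict(totals)


if __name__ == "__main__":
    sample = """\
-rw-r--r--  1       520 Sep 17 18:43 shuffle_23119_5_0.index
-rw-r--r--  1 180564255 Sep 17 18:43 shuffle_23129_2_0.data
-rw-r--r--  1 364850277 Sep 17 18:45 shuffle_23145_8_0.data
"""
    for ext, size in total_shuffle_bytes(sample).items():
        print(f"{ext}: {size} bytes")
```

Running it against the full directory listing from a node would show how
much of the local disk usage comes from the .data files versus the .index
files.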


Thanks,
Swetha



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-streaming-job-filling-a-lot-of-data-in-local-spark-nodes-tp24846.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
