Pause Spark Streaming reading or sampling streaming data

2015-08-05 Thread Heath Guo
Hi, I have a question about sampling Spark Streaming data, or getting part of the data. For every minute, I only want the data read in during the first 10 seconds, and discard all data in the next 50 seconds. Is there any way to pause reading and discard data in that period? I'm doing this to
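
One way to picture the sampling rule in the question (keep the first 10 seconds of every minute, drop the remaining 50) is as a predicate on a timestamp. The sketch below is only an illustration in plain Python, not Spark code and not an answer taken from the thread. The names `in_sampling_window` and `sample_records` are hypothetical, and it assumes each record carries its own arrival timestamp. It discards data after it has been read rather than pausing the reader.

```python
from datetime import datetime

# Assumption from the question: keep the first 10 seconds of each minute.
WINDOW_SECONDS = 10


def in_sampling_window(ts: datetime, window_seconds: int = WINDOW_SECONDS) -> bool:
    """Return True if ts falls within the first window_seconds of its minute."""
    return ts.second < window_seconds


def sample_records(records, window_seconds: int = WINDOW_SECONDS):
    """Keep only (timestamp, value) pairs whose timestamp is inside the window."""
    return [(ts, value) for ts, value in records
            if in_sampling_window(ts, window_seconds)]
```

In a streaming job, the same predicate could in principle be applied to each batch's time or to each record's timestamp before any further processing, so that data outside the window is dropped early.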

Re: Pause Spark Streaming reading or sampling streaming data

2015-08-05 Thread Heath Guo
... On Thu, Aug 6, 2015 at 12:50 AM, Heath Guo heath...@fb.com wrote: Hi, I have a question about sampling Spark Streaming data, or getting part of the data. For every minute, I only want the data read in during the first 10 seconds, and discard all data in the next 50 seconds

Re: Spark Streaming reads from stdin or output from command line utility

2015-06-12 Thread Heath Guo
Yes, it is a lot of data, and the utility I'm working with prints out an infinite real-time data stream. Thanks. From: Tathagata Das t...@databricks.com Date: Thursday, June 11, 2015 at 11:43 PM To: Heath Guo heath...@fb.com Cc: user user

Re: Spark Streaming reads from stdin or output from command line utility

2015-06-11 Thread Heath Guo
To: Heath Guo heath...@fb.com Cc: user user@spark.apache.org Subject: Re: Spark Streaming reads from stdin or output from command line utility Are you going to receive data from one stdin from one machine, or many stdins on many machines? On Thu, Jun