Hi, I have a question about sampling Spark Streaming data, or getting part of
the data. For every minute, I only want the data read in during the first 10
seconds, and discard all data in the next 50 seconds. Is there any way to pause
reading and discard data in that period? I'm doing this to
...
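
One possible approach, sketched here as an assumption rather than a confirmed answer from the thread: let the stream keep reading, and decide per batch whether its timestamp falls in the first 10 seconds of the minute. In PySpark, `DStream.transform` can be given a two-argument function `(time, rdd)`, so a predicate like the one below could be used to return either the batch or an empty result. The names `in_sampling_window`, `PERIOD_SECONDS`, and `KEEP_SECONDS` are hypothetical. Note that this discards data after it has been received; it does not pause the receiver.

```python
from datetime import datetime, timedelta

PERIOD_SECONDS = 60  # length of one sampling cycle (one minute, per the question)
KEEP_SECONDS = 10    # keep only data from the first 10 seconds of each cycle


def in_sampling_window(batch_time, period=PERIOD_SECONDS, keep=KEEP_SECONDS):
    """Return True if batch_time lies in the first `keep` seconds of its period.

    batch_time is a datetime; the position within the period is computed
    from the number of seconds elapsed since midnight.
    """
    seconds_since_midnight = (
        batch_time.hour * 3600 + batch_time.minute * 60 + batch_time.second
    )
    return seconds_since_midnight % period < keep


# Simulate two minutes of 5-second batches and see which ones would be kept.
start = datetime(2015, 8, 6, 12, 0, 0)
batch_times = [start + timedelta(seconds=5 * i) for i in range(24)]
kept = [t for t in batch_times if in_sampling_window(t)]

for t in kept:
    print(t.strftime("%H:%M:%S"))

# Hypothetical wiring into a stream (assumes an existing DStream named `lines`;
# not executed here):
#   sampled = lines.transform(
#       lambda t, rdd: rdd if in_sampling_window(t) else rdd.filter(lambda _: False)
#   )
```

With a 5-second batch interval this keeps the batches stamped at seconds 0 and 5 of every minute. The batch interval should evenly divide both the 10-second window and the 60-second period for the sampling to line up cleanly.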
On Thu, Aug 6, 2015 at 12:50 AM, Heath Guo
heath...@fb.com wrote:
Hi, I have a question about sampling Spark Streaming data, or getting part of
the data. For every minute, I only want the data read in during the first 10
seconds, and discard all data in the next 50 seconds
Yes, it is a lot of data, and the utility I'm working with prints out an
infinite real-time data stream. Thanks.
From: Tathagata Das t...@databricks.com
Date: Thursday, June 11, 2015 at 11:43 PM
To: Heath Guo heath...@fb.com
Cc: user user@spark.apache.org
Subject: Re: Spark Streaming reads from stdin or output from command line
utility
Are you going to receive data from one stdin from one machine, or many stdins
on many machines?
On Thu, Jun