val aDstream = ...
val distinctStream = aDstream.transform(_.distinct())
but the elements in distinctStream are not distinct.
Did I use it wrong?
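A likely explanation: transform applies its function to each batch RDD separately, so distinct() removes duplicates only within one batch, not across the whole stream. The sketch below models that with plain Scala collections instead of Spark; the object name, the method names and the sample data are hypothetical and only there for illustration.

```scala
// Plain-Scala model of a DStream: each inner Seq stands for one batch (one RDD).
// This is NOT the Spark API, only a sketch of the per-batch semantics.
object PerBatchDistinct {
  // Models aDstream.transform(_.distinct()): duplicates are removed
  // inside each batch independently of every other batch.
  def perBatch(batches: Seq[Seq[Int]]): Seq[Seq[Int]] =
    batches.map(_.distinct)

  // Models what a stream-wide distinct would need: all batches seen together.
  def acrossBatches(batches: Seq[Seq[Int]]): Seq[Int] =
    batches.flatten.distinct

  def main(args: Array[String]): Unit = {
    val batches = Seq(Seq(1, 1, 2), Seq(2, 3, 3)) // hypothetical sample data
    println(perBatch(batches))      // the value 2 still appears in both batches
    println(acrossBatches(batches)) // the value 2 appears only once
  }
}
```

If uniqueness across batches is what you need, some state spanning batches (a window or a stateful operation) would be required; which one fits depends on the job.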
Thanks, Shao
On Wed, Mar 18, 2015 at 3:34 PM, Shao, Saisai saisai.s...@intel.com wrote:
Yeah, as I said, your job's processing time is much larger than the sliding
window, and streaming jobs are executed one by one in sequence, so the next
job will wait until the first job is finished, so the
, Darren Hoo darren@gmail.com wrote:
On Wed, Mar 18, 2015 at 8:31 PM, Shao, Saisai saisai.s...@intel.com wrote:
From the log you pasted I think this (-rw-r--r-- 1 root root 80K Mar
18 16:54 shuffle_47_519_0.data) is not shuffle spilled data, but the
final shuffle result.
Why is the shuffle result written to disk?
As I
is larger
than the sliding window, so maybe your computation power cannot reach the
QPS you wanted.
I think you need to identify the bottleneck first, and then try to
tune your code, balance the data, or add more computation resources.
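To see why processing time larger than the interval is a problem, here is a rough sketch of how the backlog grows when jobs run strictly one after another. It is a simplified model, not Spark code: the object name, the method name and the numbers are hypothetical, and it assumes a new job is generated every interval and each job takes the same fixed time.

```scala
// Simplified model of sequential job execution, assuming a fixed interval
// between job submissions and a fixed processing time per job.
object BatchBacklog {
  // Returns how long each of the first n jobs waits before it can start.
  def schedulingDelays(intervalSec: Double, processingSec: Double, n: Int): Seq[Double] = {
    var freeAt = 0.0 // time at which the previous job finishes
    (0 until n).map { i =>
      val submitted = i * intervalSec
      val start = math.max(submitted, freeAt) // wait for the previous job
      freeAt = start + processingSec
      start - submitted
    }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical numbers: a job every 2 s, each taking 5 s to process.
    println(schedulingDelays(2.0, 5.0, 4)) // delay grows by 3 s per job
    // With processing faster than the interval, no job has to wait.
    println(schedulingDelays(5.0, 2.0, 4))
  }
}
```

In this model the delay increases by (processing time - interval) with every job, so it never recovers unless processing gets faster or more resources are added.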
Thanks
Jerry
*From:* Darren Hoo