I'm glad you solved this issue but have a followup question for you.
Wouldn't Akhil's solution be better for you after all? I run a similar
computation where a large set of data gets reduced to a much smaller
aggregate in an interval. If you do saveAsText without coalescing, I
believe you'd get the
Hi,
I applied it as follows:
eegStreams(a) = KafkaUtils.createStream(ssc, zkQuorum, group, Map(args(a) -> 1), StorageLevel.MEMORY_AND_DISK_SER).map(_._2)
val counts = eegStreams(a).map(x => math.round(x.toDouble)).countByValueAndWindow(Seconds(4), Seconds(4))
val sortedCounts =
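The message is cut off at `val sortedCounts =`. As a rough sketch of what the pipeline computes, here are the same per-window operations on a plain Scala Seq; the sample readings and the descending sort at the end are my assumptions, not the poster's code:

```scala
// One window of readings (made-up data standing in for the Kafka stream).
val samples = Seq("1.2", "0.9", "1.6", "2.1", "2.9")

// map(x => math.round(x.toDouble)) rounds each reading to a Long.
val rounded = samples.map(x => math.round(x.toDouble))

// countByValueAndWindow reduces one window to (value, count) pairs;
// on a plain collection that is a groupBy plus a size.
val counts = rounded.groupBy(identity).map { case (v, vs) => (v, vs.size.toLong) }

// A plausible completion of `val sortedCounts = ...`: sort by count, descending.
val sortedCounts = counts.toSeq.sortBy(-_._2)
println(sortedCounts)
```

On an actual DStream the last step would be something like a `transform` with `sortBy`; this only illustrates the shape of the data at each stage.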
Hi,
It worked out like this.
val topCounts = sortedCounts.transform(rdd => rdd.zipWithIndex().filter(x => x._2 <= 10))
Regards,
Laeeq
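The zipWithIndex trick above can be tried on a plain Scala collection, which indexes elements the same way as RDD.zipWithIndex (the data below is made up; it stands in for counts already sorted descending):

```scala
// Stand-in for counts sorted in descending order.
val sorted = (100 to 81 by -1).toList

// Pair each element with its position, then keep the leading rows by index.
val top = sorted.zipWithIndex.filter(x => x._2 <= 10).map(_._1)
println(top)
```

One detail worth checking: indices start at 0, so `x._2 <= 10` keeps eleven rows; `x._2 < 10` would keep exactly ten.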
On Wednesday, January 7, 2015 5:25 PM, Laeeq Ahmed
laeeqsp...@yahoo.com.INVALID wrote:
Hi Yana,
I also think that val top10 = your_stream.mapPartitions(rdd => rdd.take(10))
Oh yeah. In that case you can simply repartition it into 1 and do
mapPartitions.
val top10 = mysream.repartition(1).mapPartitions(rdd => rdd.take(10))
On 7 Jan 2015 21:08, Laeeq Ahmed laeeqsp...@yahoo.com wrote:
Hi,
I applied it as follows:
eegStreams(a) = KafkaUtils.createStream(ssc,
Hi Yana,
I also think that val top10 = your_stream.mapPartitions(rdd => rdd.take(10))
will give top 10 from each partition. I will try your code.
Regards,
Laeeq
On Wednesday, January 7, 2015 5:19 PM, Yana Kadiyska
yana.kadiy...@gmail.com wrote:
My understanding is that
val top10 =
You can try something like:
val top10 = your_stream.mapPartitions(rdd => rdd.take(10))
Thanks
Best Regards
On Mon, Jan 5, 2015 at 11:08 PM, Laeeq Ahmed laeeqsp...@yahoo.com.invalid
wrote:
Hi,
I am counting the values in each window, finding the top values, and want to
save only the 10 most frequent values of each window to HDFS rather than all
the values.
eegStreams(a) = KafkaUtils.createStream(ssc, zkQuorum, group, Map(args(a) -> 1), StorageLevel.MEMORY_AND_DISK_SER).map(_._2)
val