krishna ramachandran <ram...@s1776.com> wrote:
> ok i will share a simple example soon. meantime you will be able to see
> this behavior using example here,
>
>
> https://github.com/apache/spark/blob/branch-1.2/examples/src/main/scala/org/apache/spark/examples/mllib/Streami
> is happening from what you posted so far.
>
> On Fri, Feb 19, 2016 at 10:40 AM, krishna ramachandran <ram...@s1776.com>
> wrote:
Hi Bryan
Agreed. It is a single statement to print the centers once for *every*
streaming batch (4 secs) - remember, this is in streaming mode and the
receiver has fresh data every batch. That is, since the model is trained
continuously, I expect the centroids to change with incoming streams.
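Why the centers should move each batch follows from the per-batch update rule that MLlib's StreamingKMeansModel documents. Below is a Spark-free sketch of that rule on scalar centers; all the numbers (weights, batch means, decay factor) are made up for illustration:

```scala
// Spark-free sketch of the per-batch center update that MLlib's
// StreamingKMeansModel documents:
//   c' = (c * n * a + x * m) / (n * a + m),   n' = n + m
// c: old center, n: its accumulated weight, x: mean of the batch points
// assigned to it, m: their count, a: decay factor.
object CentroidUpdateSketch {
  def update(c: Double, n: Double, x: Double, m: Double, a: Double): (Double, Double) =
    ((c * n * a + x * m) / (n * a + m), n + m)

  def main(args: Array[String]): Unit = {
    var center = 0.0   // a center built from 100 old points at 0.0
    var weight = 100.0
    for (batch <- 1 to 5) {
      // each new batch delivers 50 points whose mean is 10.0
      val (c2, w2) = update(center, weight, x = 10.0, m = 50.0, a = 0.9)
      center = c2; weight = w2
      println(f"batch $batch: center = $center%.3f")
    }
    // The center drifts toward 10.0 batch after batch. Freshly received
    // data should therefore move the printed cluster centers.
  }
}
```

So if the centers printed once per batch never change while the receiver is getting new data, the model is likely not seeing that data at all.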
I tried these 2 global settings (and restarted the app) after enabling
cache for stream1:

conf.set("spark.streaming.unpersist", "true")
streamingContext.remember(Seconds(batchDuration * 4))

Batch duration is 4 sec. We are using spark-1.4.1. The application runs
for about 4-5 hrs, then we see an out-of-memory error.
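For context, here is a sketch of where those two settings live (variable names assumed, not from the original code), with two caveats worth knowing: spark.streaming.unpersist defaults to true in modern Spark, so setting it explicitly is normally a no-op, and remember() only *extends* how long Spark keeps the RDDs it generates each batch. Neither setting unpersists RDDs the application cached explicitly with persist()/cache(); those must be unpersisted by the application itself.

```scala
// Config sketch only (assumed app name and variable names).
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val batchDuration = 4
val conf = new SparkConf().setAppName("streaming-app")
// Eagerly drop streaming-generated RDDs (already the default).
conf.set("spark.streaming.unpersist", "true")
val streamingContext = new StreamingContext(conf, Seconds(batchDuration))
// Keep generated RDDs for at least 4 batches -- this *increases*
// retention (and memory use); it does not free anything sooner.
streamingContext.remember(Seconds(batchDuration * 4))
```

If stream1 is cached every batch and never unpersisted, memory growth over hours would be the expected outcome regardless of these settings.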
We have a streaming application containing approximately 12 stages every
batch, running in streaming mode (4 sec batches). Each stage persists its
output to Cassandra.

The pipeline stages:

stage 1
---> receive Stream A --> map --> filter --> (union with another stream B)
--> map --> groupByKey -->
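The stage-1 chain above can be sketched on plain Scala collections for a single batch, just to show its shape (the "key:value" record format and the filter predicate are made up for illustration; no Spark involved):

```scala
// Spark-free, single-batch sketch of the stage-1 shape described above:
// receive --> map --> filter --> union --> map --> groupByKey.
object StageOneSketch {
  def run(streamA: Seq[String], streamB: Seq[String]): Map[String, Seq[Int]] =
    (streamA
      .map(_.split(":"))                      // map: parse each record
      .filter(arr => arr(1).toInt > 0)        // filter: drop zero values
      ++ streamB.map(_.split(":")))           // union with stream B
      .map(arr => (arr(0), arr(1).toInt))     // map: to (key, value) pairs
      .groupBy(_._1)                          // groupByKey equivalent
      .map { case (k, vs) => (k, vs.map(_._2)) }

  def main(args: Array[String]): Unit =
    println(run(Seq("a:1", "b:2", "a:3", "c:0"), Seq("b:4", "a:5")))
}
```

One general observation: in a real Spark job, groupByKey materializes every value for a key in memory each batch, so a chain like this, run every 4 seconds with cached inputs, is a common place for memory pressure to accumulate.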