Re: StreamingKMeans does not update cluster centroid locations

2016-02-19 Thread krishna ramachandran
krishna ramachandran <ram...@s1776.com> wrote:
> OK, I will share a simple example soon. In the meantime you will be able to see this behavior using the example here:
> https://github.com/apache/spark/blob/branch-1.2/examples/src/main/scala/org/apache/spark/examples/mllib/Streami
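
For reference, a minimal sketch of the kind of StreamingKMeans setup that linked example wires up; the input path, k, decay factor, and vector dimension below are illustrative assumptions, not values taken from the thread:

  import org.apache.spark.SparkConf
  import org.apache.spark.mllib.clustering.StreamingKMeans
  import org.apache.spark.mllib.linalg.Vectors
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  object StreamingKMeansSketch {
    def main(args: Array[String]): Unit = {
      val conf = new SparkConf().setAppName("StreamingKMeansSketch")
      val ssc = new StreamingContext(conf, Seconds(4))

      // Assumed input: whitespace-delimited numeric vectors arriving as text files.
      val trainingData = ssc.textFileStream("/tmp/training")
        .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))

      val model = new StreamingKMeans()
        .setK(3)                     // illustrative k
        .setDecayFactor(1.0)         // 1.0 = weight all past data equally
        .setRandomCenters(2, 0.0)    // 2-dimensional vectors, zero initial weight

      // The model is updated with every incoming 4-second batch.
      model.trainOn(trainingData)

      ssc.start()
      ssc.awaitTermination()
    }
  }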

Re: StreamingKMeans does not update cluster centroid locations

2016-02-19 Thread krishna ramachandran
> is happening from what you posted so far.
>
> On Fri, Feb 19, 2016 at 10:40 AM, krishna ramachandran <ram...@s1776.com> wrote:
>> Hi Bryan
>> Agreed. It is a single statement to print the centers once for *every* streaming batch (4 secs) - remember this

Re: StreamingKMeans does not update cluster centroid locations

2016-02-19 Thread krishna ramachandran
Hi Bryan,
Agreed. It is a single statement to print the centers once for *every* streaming batch (4 secs) - remember, this is in streaming mode and the receiver has fresh data every batch. That is, since the model is trained continuously, I expect the centroids to change with incoming streams (at
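
A minimal sketch, reusing the names from the StreamingKMeans sketch earlier in this thread, of the single per-batch print statement being described; it simply reads model.latestModel().clusterCenters once per batch:

  // After model.trainOn(trainingData): print the current centers once per
  // 4-second batch so any movement (or lack of it) is visible.
  trainingData.foreachRDD { (_, time) =>
    val centers = model.latestModel().clusterCenters
    println(s"batch $time centers: ${centers.mkString(", ")}")
  }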

Re: adding a split and union to a streaming application causes a big performance hit

2016-02-18 Thread krishna ramachandran
I tried these 2 global settings (and restarted the app) after enabling cache for stream1:

  conf.set("spark.streaming.unpersist", "true")
  streamingContext.remember(Seconds(batchDuration * 4))

Batch duration is 4 sec, using spark-1.4.1. The application runs for about 4-5 hrs, then we see out of memory
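
For clarity, a minimal sketch of how those two settings sit in a driver program; the app name is a placeholder, and the values match the post:

  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  val batchDuration = 4  // seconds, as stated above

  val conf = new SparkConf()
    .setAppName("StreamingApp")  // placeholder name
    // Ask Spark Streaming to aggressively unpersist the RDDs it generates.
    .set("spark.streaming.unpersist", "true")

  val streamingContext = new StreamingContext(conf, Seconds(batchDuration))

  // Keep generated RDDs for roughly four batches before dropping them.
  streamingContext.remember(Seconds(batchDuration * 4))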

streaming application redundant dag stage execution/performance/caching

2016-02-16 Thread krishna ramachandran
We have a streaming application containing approximately 12 stages every batch, running in streaming mode (4 sec batches). Each stage persists output to Cassandra. The pipeline stages:

stage 1 ---> receive Stream A --> map --> filter --> (union with another stream B) --> map --> groupByKey -->
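
A minimal sketch of the usual remedy when a shared upstream branch is recomputed by more than one downstream consumer: cache the shared DStream before it fans out into the union and later stages. The sources, key function, and print() sinks below are hypothetical stand-ins for the post's Stream A / Stream B pipeline and its Cassandra writes:

  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  object RedundantStageSketch {
    def main(args: Array[String]): Unit = {
      val conf = new SparkConf().setAppName("RedundantStageSketch")
      val ssc = new StreamingContext(conf, Seconds(4))

      // Hypothetical sources standing in for Stream A and Stream B.
      val streamA = ssc.socketTextStream("localhost", 9999)
      val streamB = ssc.socketTextStream("localhost", 9998)

      // Shared upstream work: cache it because it is consumed twice below
      // (written out on its own and reused in the union), so its map/filter
      // chain is not recomputed for each consumer within a batch.
      val mappedA = streamA
        .map(line => (line.take(1), 1))
        .filter(_._1.nonEmpty)
        .cache()

      mappedA.print()   // stand-in for the intermediate write to Cassandra

      val unioned = mappedA.union(streamB.map(line => (line.take(1), 1)))
      unioned.groupByKey().print()   // stand-in for the final write to Cassandra

      ssc.start()
      ssc.awaitTermination()
    }
  }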