Re: Spark 2.3.0 and Custom Sink

2018-06-21 Thread Yogesh Mahajan
Since ForeachWriter works at the record level, you cannot do bulk ingest into KairosDB, which supports bulk inserts; ingesting record by record will be slow. Instead, you can write your own Sink implementation, which works at the batch (DataFrame) level. Thanks, http://www.snappydata.io/blog On Thu, Jun
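A batch-level sink in Spark 2.3 can be built on the Sink and StreamSinkProvider APIs (note these are internal APIs and subject to change between releases). A minimal sketch, assuming Spark 2.3 on the classpath; the bulk-insert call is a placeholder, not the real KairosDB client:

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.execution.streaming.Sink
import org.apache.spark.sql.sources.StreamSinkProvider
import org.apache.spark.sql.streaming.OutputMode

// Receives one DataFrame per micro-batch, so records can be bulk-inserted together.
class BulkIngestSink extends Sink {
  override def addBatch(batchId: Long, data: DataFrame): Unit = {
    val rows = data.collect()
    // kairosClient.bulkInsert(rows)  // placeholder: substitute the real bulk API
  }
}

class BulkIngestSinkProvider extends StreamSinkProvider {
  override def createSink(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      partitionColumns: Seq[String],
      outputMode: OutputMode): Sink = new BulkIngestSink
}
```

The provider class name is then passed to `writeStream.format(...)` on the streaming DataFrame.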

Inefficient state management in stream to stream join in 2.3

2018-02-13 Thread Yogesh Mahajan
In 2.3, stream-to-stream joins (both inner and outer) are implemented using the symmetric hash join (SHJ) algorithm, and that is a good choice. I am sure you compared it with other families of algorithms like XJoin and non-blocking sort-based algorithms like progressive merge join (PMJ
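For reference, the state the SHJ keeps on each side is bounded by watermarks plus a time-range join condition. A sketch of a Spark 2.3 inner stream-stream join (the stream and column names are hypothetical, and `impressions`/`clicks` are assumed to be streaming DataFrames defined elsewhere):

```scala
import org.apache.spark.sql.functions.expr

// Watermarks tell Spark how late data may arrive on each side
val left  = impressions.withWatermark("impressionTime", "10 minutes")
val right = clicks.withWatermark("clickTime", "20 minutes")

// The time-range predicate lets Spark expire old join state
val joined = left.join(
  right,
  expr("""
    clickAdId = impressionAdId AND
    clickTime >= impressionTime AND
    clickTime <= impressionTime + interval 1 hour
  """))
```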

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-02-13 Thread Yogesh Mahajan
I had a similar issue, and I think that's where the Structured Streaming design falls short. Question #2 in your email seems like a viable workaround for you. In my case, I have a custom Sink backed by an efficient in-memory column store suited for fast ingestion. I have a Kafka stream coming from

Re: Max number of streams supported ?

2018-01-31 Thread Yogesh Mahajan
Thanks Michael and TD for the quick reply. It was helpful. I will let you know the numbers (limit) based on my experiments. On Wed, Jan 31, 2018 at 3:10 PM, Tathagata Das wrote: > Just to clarify a subtle difference between DStreams and Structured > Streaming. Multiple

Re: Do I need to install Cassandra node on Spark Master node to work with Cassandra?

2016-05-04 Thread Yogesh Mahajan
You can have a Spark master where Cassandra is not running locally; I have tried this before. The Spark cluster and the Cassandra cluster can be on two different hosts, but to colocate them, you can run an executor and a Cassandra node on the same host. Thanks, http://www.snappydata.io/blog

Re: [Streaming] textFileStream has no events shown in web UI

2016-04-11 Thread Yogesh Mahajan
Yes, I have observed this in my case as well. The Input Rate is 0 even for rawSocketStream. Is there a way to enable the Input Rate for these types of streams? Thanks, http://www.snappydata.io/blog On Wed, Mar 16, 2016 at 4:21 PM, Hao Ren wrote: >

Re: Scala types to StructType

2016-02-11 Thread Yogesh Mahajan
CatatlystTypeConverters.scala has all kinds of utility methods to convert from Scala types to Row and vice versa. On Fri, Feb 12, 2016 at 12:21 AM, Rishabh Wadhawan wrote: > I had the same issue. I resolved it in Java, but I am pretty sure it would > work with Scala too. Its
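Note that CatalystTypeConverters is an internal class; a public-API way to derive a StructType from a Scala type is via Encoders. A sketch, assuming spark-sql is on the classpath:

```scala
import org.apache.spark.sql.Encoders
import org.apache.spark.sql.types.StructType

case class Person(name: String, age: Int)

// Derive the schema for a case class without touching internal APIs
val schema: StructType = Encoders.product[Person].schema
println(schema.treeString)
// root
//  |-- name: string (nullable = true)
//  |-- age: integer (nullable = false)
```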

Re: Scala types to StructType

2016-02-11 Thread Yogesh Mahajan
Right, Thanks Ted. On Fri, Feb 12, 2016 at 10:21 AM, Ted Yu <yuzhih...@gmail.com> wrote: > Minor correction: the class is CatalystTypeConverters.scala > > On Thu, Feb 11, 2016 at 8:46 PM, Yogesh Mahajan <ymaha...@snappydata.io> > wrote: > >> CatatlystTypeConverte

Re: Spark Streaming : Limiting number of receivers per executor

2016-02-10 Thread Yogesh Mahajan
Hi Ajay, have you overridden the Receiver#preferredLocation method in your custom Receiver? You can specify a hostname for your Receiver there. Check ReceiverSchedulingPolicy#scheduleReceivers; it should honor your preferredLocation value when scheduling receivers. On Wed, Feb 10, 2016 at 4:04 PM, ajay
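A minimal sketch of a custom Receiver that pins itself to a host via preferredLocation (the hostname and ingestion logic are placeholders):

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

class PinnedReceiver(host: String)
    extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  // ReceiverSchedulingPolicy#scheduleReceivers honors this hint
  override def preferredLocation: Option[String] = Some(host)

  override def onStart(): Unit = {
    // start a background thread that reads data and calls store(...)
  }

  override def onStop(): Unit = {
    // release any resources acquired in onStart
  }
}
```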

Re: Explaination for info shown in UI

2016-02-01 Thread Yogesh Mahajan
Spark jobs per batch: dstream1.foreachRDD { rdd => rdd.count }; dstream2.foreachRDD { rdd => rdd.count } // TWO Spark jobs per batch Regards, Yogesh Mahajan SnappyData Inc (snappydata.io) On Thu, Jan 28, 2016 at 4:30 PM, Sachin Aggarwal <different.sac...@gmail.com > wrote: >

Re: 答复: spark streaming context trigger invoke stop why?

2016-01-13 Thread Yogesh Mahajan
racefully) from shutdown hook") // Do not stop SparkContext, let its own shutdown hook stop it stop(stopSparkContext = false, stopGracefully = stopGracefully) } Regards, Yogesh Mahajan, SnappyData Inc, snappydata.io On Thu, Jan 14, 2016 at 8:55 AM, Triones,Deng(vip.com) < triones.
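The shutdown-hook behavior quoted above is controlled by a documented configuration flag. A sketch of enabling graceful stop so the hook calls stop(stopGracefully = true); the application name is a placeholder:

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("streaming-app")
  // Makes the streaming shutdown hook stop the context gracefully
  .set("spark.streaming.stopGracefullyOnShutdown", "true")
```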

Re: Manipulate Twitter Stream Filter on runtime

2016-01-13 Thread Yogesh Mahajan
Hi Alem, I haven't tried it, but can you give it a try: call TwitterStream.cleanUp, then add your modified filter, and see if it works? I am using twitter4j 4.0.4 with Spark Streaming. Regards, Yogesh Mahajan SnappyData Inc, snappydata.io On Mon, Jan 11, 2016 at 6:43 PM, Filli Alem <alem.fi...@ti8m.ch>

Re: 答复: spark streaming context trigger invoke stop why?

2016-01-13 Thread Yogesh Mahajan
All the action happens in ApplicationMaster, especially in the run method. Check ApplicationMaster#startUserApplication: the userThread (driver) invokes the ApplicationMaster#finish method. You can also try System.exit in your program. Regards, Yogesh Mahajan, SnappyData Inc, snappydata.io On Thu, Jan 14

New spark meetup

2015-09-30 Thread Yogesh Mahajan
Hi, can you please get this new Spark meetup listed on the Spark community page - http://spark.apache.org/community.html#events Here is the link for the meetup in Pune, India: http://www.meetup.com/Pune-Apache-Spark-Meetup/ Thanks, Yogesh Sent from my iPhone