Inefficient state management in stream to stream join in 2.3

2018-02-13 Thread Yogesh Mahajan
In 2.3, stream to stream joins(both Inner and Outer) are implemented using symmetric hash join(SHJ) algorithm, and that is a good choice and I am sure you had compared with other family of algorithms like XJoin and non-blocking sort based algorithms like progressive merge join (PMJ

StateStoreRestoreExec in case of executor/driver failures

2018-01-31 Thread Yogesh Mahajan
Hi, In case of structured streaming StateStore, if I plug-in an embedded store like, RocksDB to maintain per partition aggregation state inside my executors and state will be maintained per partitionId and operatorId. How do I reuse/map to the old state for the cases of executor failures or

Any plans for making StateStore pluggable?

2017-05-09 Thread Yogesh Mahajan
Hi Team, Any plans to make the StateStoreProvider/StateStore in structured streaming pluggable ? Currently StateStore#loadedProviders has only one HDFSBackedStateStoreProvider and it's not configurable. If we make this configurable, users can bring in their own implementation of StateStore.

Re: statistics collection and propagation for cost-based optimizer

2016-11-14 Thread Yogesh Mahajan
? “spark.sql.cbo" is false by default as it's just experimental ? 8. ANALYZE TABLE, analyzeColumns etc ... all look good. 9. From the release point of view, how this is planned ? Will all this be implemented in one go or in phases? Thanks, Yogesh Mahajan http://www.snappydata.io/blog <http://snapp

Re: [Streaming] textFileStream has no events shown in web UI

2016-04-11 Thread Yogesh Mahajan
Yes, this has observed in my case also. The Input Rate is 0 even in case of rawSocketStream. Is there a way we can enable the Input Rate for these types of streams ? Thanks, http://www.snappydata.io/blog On Wed, Mar 16, 2016 at 4:21 PM, Hao Ren wrote: >

Streaming : stopping output transformations explicitly

2015-11-24 Thread Yogesh Mahajan
Hi, Is there a way to stop output transformations on a stream without stopping streamingContext ? Yogesh Mahajan SnappayData Inc, OLTP+OLAP inside Spark for real time analytics

Re: CQs on WindowedStream created on running StreamingContext

2015-10-06 Thread Yogesh Mahajan
Anyone knows about this ? TD ? -yogesh > On 30-Sep-2015, at 1:25 pm, Yogs wrote: > > Hi, > > We intend to run adhoc windowed continuous queries on spark streaming data. > The queries could be registered/deregistered dynamically or can be submitted > through