Re: [Structured Streaming] Custom StateStoreProvider

2018-07-12 Thread Jungtaek Lim
Girish, I think reading through the implementation of HDFSBackedStateStoreProvider, along with the relevant traits, should give you an idea of how to implement a custom one. HDFSBackedStateStoreProvider is not that complicated to read and understand. You just need to deal with your underlying storage

Re: [Structured Streaming] Custom StateStoreProvider

2018-07-10 Thread Tathagata Das
Note that this is not a public API yet, so it is not well documented. Use it at your own risk :)

On Tue, Jul 10, 2018 at 11:04 AM, subramgr wrote:
> Hi,
>
> This looks very daunting *trait* is there some blog post or some articles
> which explains on how to implement this *trait*

Re: [Structured Streaming] Custom StateStoreProvider

2018-07-10 Thread subramgr
Hi, This *trait* looks very daunting. Is there a blog post or article that explains how to implement it? Thanks, Girish

Re: [Structured Streaming] Custom StateStoreProvider

2018-07-10 Thread Stefan Van Wouw
Hi Girish, You can implement a custom state store provider by implementing the StateStore trait ( https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala ) and setting the correct Spark configuration accordingly:
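The archived message is cut off after the colon, so the configuration it went on to show is not recoverable here. As a minimal sketch, assuming the setting meant is Spark's `spark.sql.streaming.stateStore.providerClass` option, a custom provider could be registered when building the session. The class name `com.example.CassandraStateStoreProvider` is hypothetical and stands in for your own implementation; in Spark the configured class is expected to implement `StateStoreProvider`, which in turn hands out `StateStore` instances.

```scala
import org.apache.spark.sql.SparkSession

object CustomStateStoreConfig {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("custom-state-store-example")
      .master("local[*]") // local mode, for illustration only
      // Assumption: this is the configuration key the message refers to.
      // The class name is hypothetical; replace it with your own provider.
      .config(
        "spark.sql.streaming.stateStore.providerClass",
        "com.example.CassandraStateStoreProvider")
      .getOrCreate()

    // Read the setting back to confirm it was applied to the session.
    println(spark.conf.get("spark.sql.streaming.stateStore.providerClass"))

    spark.stop()
  }
}
```

The provider class needs to be on the classpath of both the driver and the executors once a stateful streaming query runs. As noted elsewhere in this thread, this is not a public API, so the trait and its method signatures may differ between Spark versions.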

[Structured Streaming] Custom StateStoreProvider

2018-07-10 Thread subramgr
Hi, Currently we are using HDFS for our checkpointing, but we are having issues maintaining an HDFS cluster. We tried GlusterFS for checkpointing in the past, but it does not work well in our setup. We are evaluating Cassandra for storing the checkpoint data. Has anyone implemented