[ 
https://issues.apache.org/jira/browse/SPARK-34198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276893#comment-17276893
 ] 

L. C. Hsieh commented on SPARK-34198:
-------------------------------------

Thanks [~kabhwan] for your point.

Besides the maintenance cost of the extra code, I remember that one concern 
about adding it is the RocksDB dependency. I think the concern is valid, so it 
does make a difference whether we put it in the sql core module or in an 
external module. IIUC, that is why we have external modules.

If raising a discussion on the dev mailing list helps, I will do it.

The RocksDB StateStore we are working with is also based on the existing 
implementation, with our bug fix applied. So I think the review cost should be 
as low as possible even if we submit the changed code. Of course, if the 
original author can contribute the code, that would be great too. And sure, 
this depends on what consensus we eventually reach.

> Add RocksDB StateStore as external module
> -----------------------------------------
>
>                 Key: SPARK-34198
>                 URL: https://issues.apache.org/jira/browse/SPARK-34198
>             Project: Spark
>          Issue Type: New Feature
>          Components: Structured Streaming
>    Affects Versions: 3.2.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>
> Currently Spark SS has only one built-in StateStore implementation, 
> HDFSBackedStateStore, which actually uses an in-memory map to store state 
> rows. As the number of streaming applications grows, some of them require 
> large state in stateful operations such as streaming aggregation and join.
> Several other major streaming frameworks already use RocksDB for state 
> management, so it is proven to be a good choice for large state. But 
> Spark SS still lacks a built-in state store that meets this requirement.
> We would like to explore the possibility of adding a RocksDB-based StateStore 
> to Spark SS. To address the concern about adding RocksDB as a direct 
> dependency, our plan is to add this StateStore as an external module first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
