[ 
https://issues.apache.org/jira/browse/SPARK-34198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276875#comment-17276875
 ] 

Cheng Su commented on SPARK-34198:
----------------------------------

Sorry I think my question is not very clear. I am aware of 
[https://github.com/apache/spark/pull/24922] and what was the concern back 
then. I think I am not sure what's external module here. Do you mind explaining 
a bit more of what's external module meaning here (or is there an existing 
example in current codebase I can refer to)?

More context is we are working on RocksDB state store internally as well based 
on above PR. So would like to check if there's any good things here we need to 
watch out for backport, thanks.

> Add RocksDB StateStore as external module
> -----------------------------------------
>
>                 Key: SPARK-34198
>                 URL: https://issues.apache.org/jira/browse/SPARK-34198
>             Project: Spark
>          Issue Type: New Feature
>          Components: Structured Streaming
>    Affects Versions: 3.2.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>
> Currently Spark SS only has one built-in StateStore implementation 
> HDFSBackedStateStore. Actually it uses in-memory map to store state rows. As 
> there are more and more streaming applications, some of them requires to use 
> large state in stateful operations such as streaming aggregation and join.
> Several other major streaming frameworks already use RocksDB for state 
> management. So it is proven to be good choice for large state usage. But 
> Spark SS still lacks of a built-in state store for the requirement.
> We would like to explore the possibility to add RocksDB-based StateStore into 
> Spark SS. For the concern about adding RocksDB as a direct dependency, our 
> plan is to add this StateStore as an external module first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to