Thank you, Jungtaek, for your answers! It's clear now.

+1 from me. This looks like a prerequisite for further ops-related
improvements to state store management. I'm thinking in particular of
state rebalancing, which could rely on this read+write state store API. I
don't mean dynamic state rebalancing, which could probably be
implemented with lower latency directly in the stateful API. Instead, I'm
thinking of an offline job that rebalances the state, after which the
stateful pipeline is restarted with the changed number of shuffle partitions.
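
To make the offline idea a bit more concrete, here is a minimal sketch in
plain Python. It is not Spark code and does not use the proposed data
source. The names partition_for and rebalance are hypothetical, and the
CRC32-based hash is only a stand-in I picked for the engine's real
partitioner. It just shows the core step: read every state row and
reassign it to a partition computed from the new partition count.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in for the engine's hash partitioner. This is an assumption for
    # illustration only, not the hash function Spark actually uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def rebalance(old_partitions: dict, new_num_partitions: int) -> dict:
    # old_partitions maps a partition id to a {state key: state value} dict.
    # Every row is read once and written to the partition derived from the
    # new partition count; the old layout is ignored.
    new_partitions = defaultdict(dict)
    for rows in old_partitions.values():
        for key, value in rows.items():
            target = partition_for(key, new_num_partitions)
            new_partitions[target][key] = value
    return dict(new_partitions)


# State written by a pipeline that ran with 2 partitions...
old_state = {0: {"user-a": 1, "user-b": 2}, 1: {"user-c": 3}}
# ...rewritten offline for a restart with 4 partitions.
new_state = rebalance(old_state, 4)
```

A real job would of course have to use the same partitioning function as
the streaming engine, otherwise the restarted query would look for keys in
the wrong partitions.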

Best,
Bartosz.

On Mon, Oct 16, 2023 at 6:19 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
wrote:

> bump for better reach
>
> On Thu, Oct 12, 2023 at 4:26 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
>> Sorry, please use this link instead for SPIP doc:
>> https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing
>>
>>
>> On Thu, Oct 12, 2023 at 3:58 PM Jungtaek Lim <
>> kabhwan.opensou...@gmail.com> wrote:
>>
>>> Hi dev,
>>>
>>> I'd like to start a discussion on "State Data Source - Reader".
>>>
>>> This proposal aims to introduce a new data source "statestore" which
>>> enables reading the state rows from an existing checkpoint via an offline
>>> (batch) query. This will enable users to 1) create unit tests against a
>>> stateful query that verify the state value (especially
>>> flatMapGroupsWithState), 2) gather more context on the status when an
>>> incident occurs, especially for incorrect output.
>>>
>>> *SPIP*:
>>> https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing
>>> *JIRA*: https://issues.apache.org/jira/browse/SPARK-45511
>>>
>>> Looking forward to your feedback!
>>>
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>> ps. The scope of the project is narrowed to the reader in this SPIP,
>>> since the writer requires us to consider more cases. We are planning on it.
>>>
>>

-- 
Bartosz Konieczny
freelance data engineer
https://www.waitingforcode.com
https://github.com/bartosz25/
https://twitter.com/waitingforcode
