+1 overall and a big +1 to keeping offline state-rebalancing as a primary
use case.

Raghu.

On Mon, Oct 16, 2023 at 11:25 AM Bartosz Konieczny <bartkoniec...@gmail.com>
wrote:

> Thank you, Jungtaek, for your answers! It's clear now.
>
> +1 from me. It seems like a prerequisite for further ops-related
> improvements to state store management, especially state rebalancing,
> which could rely on this read+write state store API. I don't mean dynamic
> state rebalancing, which could probably be implemented with lower latency
> directly in the stateful API. Instead, I'm thinking of an offline job that
> rebalances the state, after which the stateful pipeline is restarted with
> the changed number of shuffle partitions.
>
> Best,
> Bartosz.
>
> On Mon, Oct 16, 2023 at 6:19 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
>> bump for better reach
>>
>> On Thu, Oct 12, 2023 at 4:26 PM Jungtaek Lim <
>> kabhwan.opensou...@gmail.com> wrote:
>>
>>> Sorry, please use this link instead for SPIP doc:
>>> https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing
>>>
>>>
>>> On Thu, Oct 12, 2023 at 3:58 PM Jungtaek Lim <
>>> kabhwan.opensou...@gmail.com> wrote:
>>>
>>>> Hi dev,
>>>>
>>>> I'd like to start a discussion on "State Data Source - Reader".
>>>>
>>>> This proposal aims to introduce a new data source, "statestore", which
>>>> enables reading the state rows from an existing checkpoint via an offline
>>>> (batch) query. This will enable users to 1) create unit tests against a
>>>> stateful query that verify the state value (especially for
>>>> flatMapGroupsWithState), and 2) gather more context on the status when an
>>>> incident occurs, especially for incorrect output.
>>>>
>>>> *SPIP*:
>>>> https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing
>>>> *JIRA*: https://issues.apache.org/jira/browse/SPARK-45511
>>>>
>>>> Looking forward to your feedback!
>>>>
>>>> Thanks,
>>>> Jungtaek Lim (HeartSaVioR)
>>>>
>>>> P.S. The scope of the project is narrowed to the reader in this SPIP,
>>>> since the writer requires us to consider more cases. We are planning to
>>>> work on the writer as well.
>>>>
>>>
>
> --
> Bartosz Konieczny
> freelance data engineer
> https://www.waitingforcode.com
> https://github.com/bartosz25/
> https://twitter.com/waitingforcode
>
>
