Re: [DISCUSS] Introduce a location-oriented two-stage query mechanism to improve the queryable state.

vino yang Fri, 25 Oct 2019 00:52:44 -0700

Hi Jiayi,

Thanks for your valuable feedback and suggestions.


In our production environment, we still have many applications written with
the DataStream API.

Currently, we have some requirements that call for ad hoc queries against
running Flink jobs. The existing query interface is very difficult to use,
and this improvement aims to enhance the usability of the queryable state.
For now, I have limited its scope to improving the design. It does not
explore the range of capabilities or use scenarios. Of course, these
directions are very interesting and I hope to think about them further.
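
To make the usability point concrete, here is a rough sketch of the two-stage
idea from the proposal quoted below: the client first asks a location service
where a state lives, then queries the proxy at that location. This is only an
illustration and not the code in my branch. All names in it (LocationService,
StateProxy, Client, the host and port values) are hypothetical, and it ignores
that real keyed state is spread over several task executors by key group.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class TwoStageQuerySketch {

    /** Hypothetical stand-in for the query proxy hosted on one task executor. */
    static final class StateProxy {
        final String host;
        final int port;
        private final Map<String, Long> state = new HashMap<>();

        StateProxy(String host, int port) {
            this.host = host;
            this.port = port;
        }

        void put(String key, long value) {
            state.put(key, value);
        }

        Optional<Long> query(String key) {
            return Optional.ofNullable(state.get(key));
        }
    }

    /** Hypothetical location service: maps (job, state name) to a proxy. */
    static final class LocationService {
        private final Map<String, StateProxy> locations = new HashMap<>();

        void register(String jobId, String stateName, StateProxy proxy) {
            locations.put(jobId + "/" + stateName, proxy);
        }

        Optional<StateProxy> locate(String jobId, String stateName) {
            return Optional.ofNullable(locations.get(jobId + "/" + stateName));
        }
    }

    /** Client that knows only the location service, never a proxy address. */
    static final class Client {
        private final LocationService locationService;

        Client(LocationService locationService) {
            this.locationService = locationService;
        }

        Optional<Long> query(String jobId, String stateName, String key) {
            return locationService.locate(jobId, stateName) // stage 1: find the proxy
                    .flatMap(proxy -> proxy.query(key));    // stage 2: query the state
        }
    }

    public static void main(String[] args) {
        StateProxy proxy = new StateProxy("taskmanager-1", 9069); // made-up address
        proxy.put("flink", 3L);

        LocationService locationService = new LocationService();
        locationService.register("job-1", "word-count", proxy);

        Client client = new Client(locationService);
        System.out.println(client.query("job-1", "word-count", "flink"));
    }
}
```

The point of the sketch is that the caller passes only a job id, a state name
and a key. The proxy address stays behind the location service.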

As you said, both the queryable state and the state-processing-api are used
to handle state. From this perspective, it seems they could be merged or
integrated in some way (to avoid a fast-growing number of Flink modules).


However, the queryable state queries the online KV state at a certain point
in time, while the state-processing-api uses the DataSet API to read,
write, and analyze offline savepoints. That is a big difference.

How to position the online queryable state relative to the offline
state-processing-api may require further discussion in the community. So, I
have pinged people who might be related to these modules to get more
valuable feedback.

Best,
Vino



On Thu, Oct 24, 2019 at 9:04 PM, bupt_ljy <[email protected]> wrote:

> Hi vino,
>
> +1 for improving the queryable state feature. This reminds me of the
> state-processing-api module, which is very helpful when we analyze state
> offline. However, we currently don’t have many ways to know what is
> happening to the state inside a running application, which makes me feel
> that this has good potential. Since these two modules are separate but
> do similar work (analyzing state), maybe we have to think more about
> their orientation, or maybe integrate them in a graceful way in the future.
>
> Anyway, this is great work and it’d be better if we can hear more
> thoughts and use cases.
>
> Best Regards,
> Jiayi Liao
>
>  Original Message
> *Sender:* vino yang<[email protected]>
> *Recipient:* [email protected]<[email protected]>
> *Date:* Tuesday, Oct 22, 2019 15:42
> *Subject:* [DISCUSS] Introduce a location-oriented two-stage query
> mechanism to improve the queryable state.
>
> Hi guys,
>
>
> Currently, the queryable state client is hard to use because it requires
> users to know the address of the TaskManager and the port of the proxy.
> In practice, most users do not have good knowledge of Flink's internals
> and runtime in production. The queryable state clients interact directly
> with the query state client proxies hosted on each TaskExecutor. This
> design requires users to know too much detail.
>
>
>
> We introduce a location service component to improve the architecture of
> the queryable state and hide the details of the task executors. We first
> give a brief introduction to our design in Section 2 and then detail the
> implementation in Section 3. Finally, we describe some future work that
> can be done.
>
>
> [image: Screen Shot 2019-10-22 at 10.05.11 AM.png]
>
>
> I have provided an initial implementation in my Flink repository[2]. One
> thing that needs to be stated is that we have not changed the existing
> solution, so it still works as before.
>
> The design documentation is here[3].
>
> Any suggestions and feedback are welcome and appreciated.
>
> [1]: https://statefun.io/
> [2]: https://github.com/yanghua/flink/tree/improve-queryable-state-master
> [3]:
> https://docs.google.com/document/d/181qYVIiHQGrc3hCj3QBn1iEHF4bUztdw4XO8VSaf_uI/edit?usp=sharing
>
> Best,
> Vino
>
