Thanks a lot, Pierre!

Kostas

> On Aug 27, 2018, at 2:16 PM, Pierre Zemb <pierre.zemb.i...@gmail.com> wrote:
> 
> Hi!
> I just created the JIRA (https://issues.apache.org/jira/browse/FLINK-10225).
> 
> Thanks for your reply,
> Pierre
> 
> On Thu, Aug 23, 2018 at 2:31 PM, Kostas Kloudas <k.klou...@data-artisans.com> wrote:
> Hi Pierre,
> 
> You are right that this should not happen.
> It seems like a bug.
> Could you open a JIRA and post it here?
> 
> Thanks,
> Kostas
> 
> 
>> On Aug 21, 2018, at 9:35 PM, Pierre Zemb <pierre.zemb.i...@gmail.com> wrote:
>> 
>> Hi!
>> 
>> I’ve started to deploy a small Flink cluster (4 TaskManagers (tm) and 1
>> JobManager (jm) for now, on 1.6.0) and deployed a small job on it. Because
>> of the current load, the job is handled entirely by a single tm. I’ve
>> created a small proxy that uses QueryableStateClient
>> <https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/queryablestate/client/QueryableStateClient.html>
>> to access the current state. It works nicely, except under certain
>> circumstances: it seems that I can only access the state through a tm that
>> holds part of the job. Here’s an example:
>> 
>> 1. Job on tm1. Pointing QueryableStateClient to tm1: state accessible.
>> 2. Job still on tm1. Pointing QueryableStateClient to tm2 (for example):
>>    state inaccessible.
>> 3. Killing tm1, so the job is now on tm2: state accessible.
>> 4. Job still on tm2. Pointing QueryableStateClient to tm3: state
>>    inaccessible.
>> 5. Adding some parallelism to spread the job across tm1 and tm2. Pointing
>>    QueryableStateClient to either tm1 or tm2 works.
>> 6. Job still on tm1 and tm2. Pointing QueryableStateClient to tm3: state
>>    inaccessible.
>> 
>> When the state is inaccessible, I can see this (generated here 
>> <https://github.com/apache/flink/blob/release-1.6/flink-queryable-state/flink-queryable-state-runtime/src/main/java/org/apache/flink/queryablestate/client/proxy/KvStateClientProxyHandler.java#L228>):
>> 
>> java.lang.RuntimeException: Failed request 0.
>>  Caused by: org.apache.flink.queryablestate.exceptions.UnknownLocationException: Could not retrieve location of state=repo-status of job=3ac3bc00b2d5bc0752917186a288d40a. Potential reasons are: i) the state is not ready, or ii) the job does not exist.
>>     at org.apache.flink.queryablestate.client.proxy.KvStateClientProxyHandler.getKvStateLookupInfo(KvStateClientProxyHandler.java:228)
>>     at org.apache.flink.queryablestate.client.proxy.KvStateClientProxyHandler.getState(KvStateClientProxyHandler.java:162)
>>     at org.apache.flink.queryablestate.client.proxy.KvStateClientProxyHandler.executeActionAsync(KvStateClientProxyHandler.java:129)
>>     at org.apache.flink.queryablestate.client.proxy.KvStateClientProxyHandler.handleRequest(KvStateClientProxyHandler.java:119)
>>     at org.apache.flink.queryablestate.client.proxy.KvStateClientProxyHandler.handleRequest(KvStateClientProxyHandler.java:63)
>>     at org.apache.flink.queryablestate.network.AbstractServerHandler$AsyncRequestTask.run(AbstractServerHandler.java:236)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:745)
>> 
>> From the documentation, I can see that:
>> 
>> The client connects to a Client Proxy running on a given Task Manager. The 
>> proxy is the entry point of the client to the Flink cluster. It forwards the 
>> requests of the client to the Job Manager and the required Task Manager, and 
>> forwards the final response back to the client.
>> 
>> Did I miss something? Does the QueryableStateClientProxy only fetch
>> information for jobs that are running on its local tm? If so, is there a
>> way to retrieve the job graph? Or is there another solution?
>> 
>> Thanks!
>> Pierre Zemb
>> 
>> -- 
>> Regards,
>> Pierre Zemb
>> pierrezemb.fr
>> Software Engineer, Metrics Data Platform @OVH
> 
> -- 
> Regards,
> Pierre Zemb
> pierrezemb.fr
> Software Engineer, Metrics Data Platform @OVH

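For reference, the lookup path described in the documentation passage quoted in the original report can be modelled roughly as below. This is only a toy sketch and not Flink code: JobManager, ClientProxy, register, locate and query are hypothetical stand-ins invented for illustration. It shows the behaviour the documentation suggests (a proxy on any TaskManager resolves the state location through the JobManager), which is what the report says does not happen on 1.6.0.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of the documented lookup path; none of these types are Flink classes. */
public class ProxyRoutingSketch {

    /** Hypothetical stand-in for the JobManager: maps a state name to the TaskManager holding it. */
    static class JobManager {
        private final Map<String, String> stateLocations = new HashMap<>();

        void register(String stateName, String taskManager) {
            stateLocations.put(stateName, taskManager);
        }

        Optional<String> locate(String stateName) {
            return Optional.ofNullable(stateLocations.get(stateName));
        }
    }

    /** Hypothetical stand-in for the client proxy running on one TaskManager. */
    static class ClientProxy {
        private final String host;
        private final JobManager jobManager;

        ClientProxy(String host, JobManager jobManager) {
            this.host = host;
            this.jobManager = jobManager;
        }

        /** Resolves the owner through the JobManager, independent of which host the proxy runs on. */
        String query(String stateName) {
            String owner = jobManager.locate(stateName)
                    .orElseThrow(() -> new IllegalStateException(
                            "Could not retrieve location of state=" + stateName));
            return "state=" + stateName + " served by " + owner + " via proxy on " + host;
        }
    }

    public static void main(String[] args) {
        JobManager jm = new JobManager();
        jm.register("repo-status", "tm1");

        // The job runs on tm1 only, but the proxy on tm3 should still find the state.
        System.out.println(new ClientProxy("tm1", jm).query("repo-status"));
        System.out.println(new ClientProxy("tm3", jm).query("repo-status"));
    }
}
```

In this model the proxy on tm3 answers just like the one on tm1, because the location comes from the JobManager rather than from anything local to the proxy. A state that was never registered surfaces as an exception, loosely mirroring the UnknownLocationException in the trace.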