Dear Flink Users,

I'm trying to upgrade to Flink 1.5.0, and so far everything works except for the Queryable State client. Now here's where it gets weird. I have the client sitting behind a web API so the rest of our non-Java ecosystem can consume it. I've got 2 tests: one calls my route directly as a Java method call, the other calls the deployed server via HTTP (the second test is intended to check that the server is properly started, etc).
The local call succeeds and the remote call fails, but the failure isn't what I'd expect (something around the server layer). Instead, it's complaining about contacting the oracle for the state location:

<very very long stack trace>
Caused by: org.apache.flink.queryablestate.exceptions.UnknownLocationException: Could not contact the state location oracle to retrieve the state location.
	at org.apache.flink.queryablestate.client.proxy.KvStateClientProxyHandler.getKvStateLookupInfo(KvStateClientProxyHandler.java:228)
<etc>

The client call is pretty straightforward:

    return client.getKvState(
        flinkJobID,
        stateDescriptor.queryableStateName,
        key,
        keyTypeInformation,
        stateDescriptor
    )

I've confirmed via logs that both tests use the exact same key, Flink job ID, and queryable state name.

So I'm going bonkers over what difference might exist. I'm wondering if I'm packing my jar wrong and there are some resources I need to look out for? (Back on Flink 1.3.x I had to handle the reference.conf file for Akka when you were depending on that for Queryable State. Is there something like that now?)

Is there /more logging/ somewhere on the server side that might give me a hint? Like "Tried to query state for X but couldn't find the BananaLever"? I'm pretty stuck right now and ready to try any random ideas to move forward.

The only change I've made aside from the jar version bump from 1.4.2 to 1.5.0 is handling queryableStateName (which is now nullable). No other code changes were required, and to confirm, this all works just fine with 1.4.2.

Thank you.