Re: RocksDB MapState debugging key serialization

2021-06-30 Thread Thomas Breloff
Thanks Yuval. Indeed it was a serialization issue. I followed the
instructions in the docs to set up a local test environment with RocksDB, in
which I could set a breakpoint and step through the code.
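
For anyone repeating this kind of local debugging session, a small helper that prints two serialized keys side by side and reports where they first diverge can save some stepping. The following is a minimal sketch in plain Java, not Flink code: the class and method names are hypothetical, and the byte arrays in `main` are placeholders for whatever the job's actual key serializer produces.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SerializedKeyDiff {

    /** Renders bytes as space-separated two-digit hex values. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Describes both byte arrays and the first offset at which they differ. */
    static String describe(byte[] stored, byte[] probe) {
        int at = Arrays.mismatch(stored, probe); // -1 means the arrays are identical
        String verdict = at < 0 ? "identical" : "first difference at byte offset " + at;
        return "stored: " + toHex(stored) + "\nprobe:  " + toHex(probe) + "\n" + verdict;
    }

    public static void main(String[] args) {
        // Placeholder bytes (assumption): in a real test these would be the
        // outputs of the key serializer for the "put" key and the "get" key.
        byte[] stored = "user-1|a|b".getBytes(StandardCharsets.UTF_8);
        byte[] probe = "user-1|b|a".getBytes(StandardCharsets.UTF_8);
        System.out.println(describe(stored, probe));
    }
}
```

If the two dumps differ even though the objects are equal according to `equals()`, the serializer is the place to look.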

I discovered that my key was not properly registered with the Kryo serializer 
and the default FieldSerializer was not producing byte-wise equal 
serializations.
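
Whichever serializer ends up handling the key, the property that matters is that equal keys always produce identical bytes. Below is a rough sketch of that property in plain Java. It is not Kryo or Flink code: the `serialize` helper and the id-plus-tags key are hypothetical, and the source of nondeterminism shown here (iteration order of a set) is only one possible cause, not necessarily the one in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class CanonicalKeyBytes {

    /** Writes the id, then the tags in sorted order, so equal tag sets give equal bytes. */
    static byte[] serialize(String id, Set<String> tags) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(id);
            out.writeInt(tags.size());
            for (String tag : new TreeSet<>(tags)) { // sorted copy = canonical order
                out.writeUTF(tag);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Same elements, different insertion (and therefore iteration) order.
        Set<String> first = new LinkedHashSet<>(List.of("a", "b"));
        Set<String> second = new LinkedHashSet<>(List.of("b", "a"));

        System.out.println(first.equals(second)); // true
        System.out.println(
                Arrays.equals(serialize("user-1", first), serialize("user-1", second))); // true
    }
}
```

Without the sorted copy, the two sets would be written in different orders and the resulting byte arrays would no longer match.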

Thanks for the prompt response!
Tom

From: Yuval Itzchakov 
Date: Wednesday, June 30, 2021 at 12:56 PM
To: Thomas Breloff 
Cc: user@flink.apache.org 
Subject: Re: RocksDB MapState debugging key serialization
Here is what the documentation on
RocksDBStateBackend<https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/state/state_backends/#the-embeddedrocksdbstatebackend>
says:

The EmbeddedRocksDBStateBackend holds in-flight data in a RocksDB database that 
is (per default) stored in the TaskManager local data directories.
Unlike storing java objects in HashMapStateBackend, data is stored as 
serialized byte arrays, which are mainly defined by the type serializer, 
resulting in key comparisons being byte-wise instead of using Java’s hashCode() 
and equals() methods.

This means that if your keys are not byte-wise equivalent, they won't be 
matched.

On Wed, Jun 30, 2021 at 7:37 PM Thomas Breloff <t...@ec.ai> wrote:
Hello,
I am having trouble with a Flink job which is configured using a RocksDB state 
backend.

Tl;dr: How can I debug the key serialization for RocksDB MapState for a 
deployed Flink job?

Details:

When I “put” a key/value pair into a MapState, and then later try to “get” 
using a key which has the same hashCode/equals as what I put in, I get back 
“null”.

Some things I have verified:


  *   Looping over the “keys()” or “entries()” of the MapState yields the
expected key (which matches both hashCode and equals)
  *   If I “put” the same key that I’m attempting to “get” with, and then look
at the “entries”, then both of the keys appear in the map.

I think I understand that the RocksDB version of MapState will use the
serialized keys; however, I have tested what I think is the serializer, and it
returns the same serialization for both objects.

How can I find out the serialized values that are being used for key 
comparison? Can you recommend any possible solutions or debugging strategies 
that would help?

Thank you,
Tom


--
Best Regards,
Yuval Itzchakov.


Re: RocksDB MapState debugging key serialization

2021-06-30 Thread Yuval Itzchakov
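Here is what the documentation on RocksDBStateBackend
<https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/state/state_backends/#the-embeddedrocksdbstatebackend>
says: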

The EmbeddedRocksDBStateBackend holds in-flight data in a RocksDB database
that is (per default) stored in the TaskManager local data directories.
Unlike storing java objects in HashMapStateBackend, data is stored as
serialized byte arrays, which are mainly defined by the type serializer,
*resulting in key comparisons being byte-wise instead of using Java’s
hashCode() and equals() methods.*

This means that if your keys are not byte-wise equivalent, they won't be
matched.
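
To make that concrete, here is a toy model in plain Java; it is not Flink or RocksDB code. It assumes a hypothetical `Key` class whose `equals()` ignores the order of its tags and a hypothetical `serialize` helper that writes the tags in iteration order. The `TreeMap` ordered by `Arrays.compare` only stands in for a store that compares keys byte-wise.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;

final class ByteWiseLookupDemo {

    /** Hypothetical key: equals()/hashCode() ignore the iteration order of the tags. */
    static final class Key {
        final String id;
        final Set<String> tags;

        Key(String id, List<String> tags) {
            this.id = id;
            this.tags = new LinkedHashSet<>(tags); // remembers insertion order
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && id.equals(((Key) o).id) && tags.equals(((Key) o).tags);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, tags);
        }
    }

    /** Stand-in serializer (an assumption): writes the tags in iteration order. */
    static byte[] serialize(Key key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(key.id);
            out.writeInt(key.tags.size());
            for (String tag : key.tags) {
                out.writeUTF(tag);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Key written = new Key("user-1", List.of("a", "b"));
        Key probe = new Key("user-1", List.of("b", "a"));

        // Object-keyed map: lookup goes through hashCode() and equals().
        Map<Key, String> onHeap = new HashMap<>();
        onHeap.put(written, "value");

        // Byte-keyed map: lookup compares the serialized bytes only.
        Comparator<byte[]> byteOrder = Arrays::compare;
        Map<byte[], String> byteWise = new TreeMap<>(byteOrder);
        byteWise.put(serialize(written), "value");

        System.out.println(written.equals(probe));          // true
        System.out.println(onHeap.get(probe));              // value
        System.out.println(byteWise.get(serialize(probe))); // null
    }
}
```

The last line is the same symptom as in the question: the probe key is equal to the stored key as far as Java is concerned, but its bytes differ, so the byte-wise store treats it as a different key.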

On Wed, Jun 30, 2021 at 7:37 PM Thomas Breloff wrote:

> Hello,
>
> I am having trouble with a Flink job which is configured using a RocksDB
> state backend.
>
> Tl;dr: How can I debug the key serialization for RocksDB MapState for a
> deployed Flink job?
>
> Details:
>
> When I “put” a key/value pair into a MapState, and then later try to “get”
> using a key which has the same hashCode/equals as what I put in, I get back
> “null”.
>
> Some things I have verified:
>
>    - Looping over the “keys()” or “entries()” of the MapState yields
>      the expected key (which matches both hashCode and equals)
>    - If I “put” the same key that I’m attempting to “get” with, and then
>      look at the “entries”, then both of the keys appear in the map.
>
> I think I understand that the RocksDB version of MapState will use the
> serialized keys; however, I have tested what I think is the serializer, and
> it returns the same serialization for both objects.
>
> How can I find out the serialized values that are being used for key
> comparison? Can you recommend any possible solutions or debugging
> strategies that would help?
>
> Thank you,
> Tom


-- 
Best Regards,
Yuval Itzchakov.


RocksDB MapState debugging key serialization

2021-06-30 Thread Thomas Breloff
Hello,
I am having trouble with a Flink job which is configured using a RocksDB state 
backend.

Tl;dr: How can I debug the key serialization for RocksDB MapState for a 
deployed Flink job?

Details:

When I “put” a key/value pair into a MapState, and then later try to “get” 
using a key which has the same hashCode/equals as what I put in, I get back 
“null”.

Some things I have verified:


  *   Looping over the “keys()” or “entries()” of the MapState yields the
expected key (which matches both hashCode and equals)
  *   If I “put” the same key that I’m attempting to “get” with, and then look
at the “entries”, then both of the keys appear in the map.

I think I understand that the RocksDB version of MapState will use the
serialized keys; however, I have tested what I think is the serializer, and it
returns the same serialization for both objects.

How can I find out the serialized values that are being used for key 
comparison? Can you recommend any possible solutions or debugging strategies 
that would help?

Thank you,
Tom