Re: Query regarding state backend for Custom Map Function

2016-12-01 Thread Stefan Richter
Hi,

using ValueState with RocksDB to store a map inside the value state means 
that you will have a different map for each key, which is automatically swapped 
on a per-record basis, depending on the record’s key. If you are using a map 
and Checkpointed, there is only one map per parallel instance of the function, 
and your code is responsible for dispatching state between different keys.
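
To make the scoping difference concrete, here is a small plain-Java model. It is a sketch only and uses no Flink classes: KeyedValueStateModel, setCurrentKey and the two count methods are hypothetical names invented for illustration. The first style mimics a keyed value state whose visible value is swapped per record key; the second keeps one map and leaves the per-key dispatching to user code.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the two scoping styles; no Flink classes are used.
class StateScopingSketch {

    // Models a keyed value state: a runtime would set the current key before
    // each record, and value()/update() only ever see that key's value.
    static class KeyedValueStateModel<K, V> {
        private final Map<K, V> perKey = new HashMap<>();
        private K currentKey;

        void setCurrentKey(K key) { currentKey = key; } // assumed to be done by the runtime
        V value() { return perKey.get(currentKey); }
        void update(V v) { perKey.put(currentKey, v); }
    }

    // Style 1: each record key has its own map stored in the value state.
    static void countWithKeyedState(
            KeyedValueStateModel<String, Map<String, Integer>> state,
            String recordKey, String word) {
        state.setCurrentKey(recordKey);
        Map<String, Integer> counts = state.value();
        if (counts == null) {
            counts = new HashMap<>();
        }
        counts.merge(word, 1, Integer::sum);
        state.update(counts);
    }

    // Style 2: a single map for the whole function; user code dispatches by key.
    static int countWithSingleMap(
            Map<String, Map<String, Integer>> all, String recordKey, String word) {
        return all.computeIfAbsent(recordKey, k -> new HashMap<>())
                  .merge(word, 1, Integer::sum);
    }

    public static void main(String[] args) {
        KeyedValueStateModel<String, Map<String, Integer>> state = new KeyedValueStateModel<>();
        countWithKeyedState(state, "user-a", "click");
        countWithKeyedState(state, "user-a", "click");
        countWithKeyedState(state, "user-b", "click");
        state.setCurrentKey("user-a");
        System.out.println(state.value()); // {click=2}

        Map<String, Map<String, Integer>> single = new HashMap<>();
        countWithSingleMap(single, "user-a", "click");
        System.out.println(countWithSingleMap(single, "user-a", "click")); // 2
    }
}
```

In the first style the function body never mentions the record key when reading or writing; in the second, every access has to carry the key explicitly.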

If you use a map and Checkpointed, the map lives on the JVM heap and the 
checkpoint is written directly to the filesystem; this is independent of the 
chosen backend, so RocksDB is not involved.

On a further note, we are working on an alternative to ValueState, namely a 
MapState. In contrast to a ValueState holding a map, MapState does not 
deserialize the whole map on each access, but can access individual key/value 
pairs. This might be what you are looking for.
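
The cost difference can be illustrated with another plain-Java sketch. It assumes a toy text encoding and a hypothetical counter, entriesDecoded, that stands in for deserialization work; it uses no Flink classes and is not meant to show how MapState is actually implemented, which this thread does not specify.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy model: whole-map blob versus one stored value per map entry.
class MapAccessSketch {

    // Counts how many entries were decoded, as a stand-in for deserialization cost.
    static int entriesDecoded = 0;

    // Encodes the whole map as one blob such as "a=1,b=2".
    // Toy format: keys must not contain ',' or '='.
    static byte[] encodeMap(Map<String, Integer> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Decodes every entry of the blob, whether it is needed or not.
    static Map<String, Integer> decodeMap(byte[] blob) {
        Map<String, Integer> map = new HashMap<>();
        String text = new String(blob, StandardCharsets.UTF_8);
        if (text.isEmpty()) {
            return map;
        }
        for (String pair : text.split(",")) {
            String[] kv = pair.split("=");
            map.put(kv[0], Integer.parseInt(kv[1]));
            entriesDecoded++;
        }
        return map;
    }

    // Map-inside-a-value style: a lookup has to decode the whole blob first.
    static Integer lookupInBlob(byte[] blob, String mapKey) {
        return decodeMap(blob).get(mapKey);
    }

    // Per-entry style: each map entry is stored separately, so a lookup
    // decodes only the entry it asks for.
    static Integer lookupPerEntry(Map<String, byte[]> store, String mapKey) {
        byte[] raw = store.get(mapKey);
        if (raw == null) {
            return null;
        }
        entriesDecoded++;
        return Integer.parseInt(new String(raw, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("a", 1);
        counts.put("b", 2);
        counts.put("c", 3);

        entriesDecoded = 0;
        lookupInBlob(encodeMap(counts), "b");
        System.out.println("blob lookup decoded " + entriesDecoded + " entries");

        Map<String, byte[]> perEntry = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            perEntry.put(e.getKey(), e.getValue().toString().getBytes(StandardCharsets.UTF_8));
        }
        entriesDecoded = 0;
        lookupPerEntry(perEntry, "b");
        System.out.println("per-entry lookup decoded " + entriesDecoded + " entries");
    }
}
```

With three entries, the blob lookup decodes all three to read one value, while the per-entry lookup decodes a single entry; the gap grows with the size of the map.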

Best,
Stefan


> On 01.12.2016 at 09:35, Anirudh Mallem wrote:
> 
> Hi Everyone,
> I am trying to understand the Working With State feature page of the Flink 
> documentation.
> My question: if I use a ValueState in my CustomMap class to store my state, 
> with RocksDB as my state backend, then it is clear that every state value is 
> stored in RocksDB. 
> Now, if instead of a ValueState I just use a normal Java HashMap to store my 
> state and implement the Checkpointed interface, will the entire HashMap 
> reside in the RocksDB backend, or will the HashMap be in memory and just the 
> snapshots sent to RocksDB? I am trying to see what I will lose or gain if I 
> maintain state in my own data structure. Thanks. 
> 
> Regards,
> Anirudh 



Re: Query regarding state backend for Custom Map Function

2016-12-01 Thread Anirudh Mallem
Thanks a lot Stefan. I got what I was looking for. Is the MapState 
functionality coming as a part of the 1.2 release?




Re: Query regarding state backend for Custom Map Function

2016-12-02 Thread Stefan Richter
Hi,

unfortunately, I think it is unlikely that MapState will still make it into 
1.2.

Best,
Stefan
