[ https://issues.apache.org/jira/browse/GEODE-9147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17323871#comment-17323871 ]

ASF subversion and git services commented on GEODE-9147:
--------------------------------------------------------

Commit 1be2ee31f592ef99cf588753529ec62f5e7cddfd in geode-native's branch 
refs/heads/develop from Blake Bender
[ https://gitbox.apache.org/repos/asf?p=geode-native.git;h=1be2ee3 ]

GEODE-9147: Revert to multi-hop PUTALL in the face of missing metadata (#784)

- This matches the behavior of the Java client
- Our current code to "tack on" values that we don't have metadata
  for will sometimes result in EventIds reaching a server out-of-order,
  causing them to be dropped and resulting in data loss.  Resorting
  to multi-hop avoids this altogether.
- Add test to repro issue of missing keys on single-hop putAll
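
The reverted-to behavior can be pictured as a guard in front of the batching step: if any key's bucket has no owner in the client's metadata, give up on single-hop entirely and let the server fan the operation out. A minimal Python sketch with hypothetical names (the real client is C++; `bucket_of` merely stands in for Geode's bucket hashing):

```python
# Hypothetical sketch of the fix's control flow (not the actual
# geode-native C++ code): if bucket metadata is missing for any key,
# fall back to one multi-hop PUTALL instead of guessing where keys go.

def bucket_of(key, num_buckets):
    # Stand-in for Geode's bucket hashing; deterministic for int keys.
    return hash(key) % num_buckets

def plan_putall(entries, bucket_to_server, num_buckets=4):
    """Return per-server batches, or None to signal multi-hop fallback."""
    batches = {}
    for key, value in entries.items():
        server = bucket_to_server.get(bucket_of(key, num_buckets))
        if server is None:
            # Missing metadata: revert to multi-hop so the receiving
            # server routes everything with consistent eventIds.
            return None
        batches.setdefault(server, {})[key] = value
    return batches
```

With complete metadata every entry lands in a per-server batch; remove one bucket's owner and `plan_putall` signals the multi-hop fallback by returning `None`.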


> Dropped keys in single-hop PUTALL request when one or more servers are 
> unreachable
> ---------------------------------------------------------------------------------
>
>                 Key: GEODE-9147
>                 URL: https://issues.apache.org/jira/browse/GEODE-9147
>             Project: Geode
>          Issue Type: Bug
>          Components: native client
>            Reporter: Blake Bender
>            Priority: Major
>
> For single-hop PUTALL, the request from the app is broken up in Geode native 
> as follows:
> i. Each key is hashed to a bucket, the server owning that bucket is looked up 
> in the metadata, and the entry is added to a per-server list.
> ii. Once every entry has been assigned to a list, Geode native spins up a 
> thread per list and sends a PUTALL to each server.
>  
> When a server can't be reached by Geode native, its entries are removed from 
> the metadata, so the bucket-to-server lookup fails for the affected keys.  
> This situation is handled as follows:
>  i. The size of the "leftover keys" list is divided by the number of reachable 
> servers, and 1 is added to compensate for any fractional piece.
> ii. That many keys are appended to each remaining list destined for a server 
> that is still reachable.
> iii. We then proceed normally, sending one list to each server on its own thread.
>  
> _Unfortunately_, this scenario can lead to data loss: the fractional pieces 
> of the unreachable server's list all carry eventIds with the same threadId 
> and incrementing sequenceIds, and the server tracks the highest sequenceId it 
> has seen per threadId, dropping anything lower as a duplicate.  Thus, if any 
> of our PUTALL threads send out-of-order, the earlier sequenceIds arrive late, 
> are marked as already "seen" on the server, and are _dropped_.
>  
>  
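
The failure mode above can be modeled in a few lines. This is an illustrative Python sketch with hypothetical names (the real client is C++, and the real duplicate detection lives in Geode's server-side event tracking): the leftover chunks share one threadId with increasing sequenceIds, and anything at or below the highest sequenceId already seen for that threadId is treated as a duplicate.

```python
# Illustrative model (hypothetical names, not geode-native code) of why
# redistributing an unreachable server's keys loses data: all leftover
# chunks share one threadId, and the server drops any eventId whose
# sequenceId is at or below the highest already seen for that threadId.

def split_leftovers(leftover_keys, num_reachable_servers):
    """Old scheme: ceil-ish chunking, len // servers + 1 keys per chunk."""
    chunk = len(leftover_keys) // num_reachable_servers + 1
    return [leftover_keys[i:i + chunk]
            for i in range(0, len(leftover_keys), chunk)]

class Server:
    """Minimal model of per-threadId duplicate detection."""
    def __init__(self):
        self.highest_seen = {}   # threadId -> highest sequenceId seen
        self.applied = []        # keys that actually got stored

    def put_all(self, thread_id, first_seq, keys):
        for offset, key in enumerate(keys):
            seq = first_seq + offset
            if seq <= self.highest_seen.get(thread_id, -1):
                continue  # looks like a duplicate: silently dropped
            self.highest_seen[thread_id] = seq
            self.applied.append(key)
```

If the chunk carrying sequenceIds 3-5 happens to arrive before the chunk carrying 0-2, the first three keys are silently discarded, which is exactly the missing-keys symptom the new test reproduces.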



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
