[ https://issues.apache.org/jira/browse/ACCUMULO-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018153#comment-15018153 ]

Dave Marion commented on ACCUMULO-4062:
---------------------------------------

Looking into this further, I traced the server-side code path:
{noformat}
TabletServer.flush() -> CommitSession.commit() -> Tablet.commit() -> 
TabletMemory.mutate() -> CommitSession.mutate() -> InMemoryMap.mutate()
{noformat}

At this point it calls one of the SimpleMap.mutate() implementations, passing a 
list of mutations and a counter that is incremented each time 
SimpleMap.mutate() is called. DefaultMap.mutate() creates a MemKey and adds it 
to a map that uses the MemKeyComparator. The MemKeyComparator falls back to the 
counter when two keys are otherwise identical.
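
As a rough illustration of the counter tie-break described above, here is a 
minimal sketch. It is not the Accumulo code: MutationCountSketch and CountedKey 
are hypothetical names, each mutate() call writes a single entry, and sorting 
the newer count first is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MutationCountSketch {

    /** Hypothetical stand-in for MemKey: the key text plus the counter of the call that wrote it. */
    public static final class CountedKey implements Comparable<CountedKey> {
        final String key;
        final int count;

        CountedKey(String key, int count) {
            this.key = key;
            this.count = count;
        }

        @Override
        public int compareTo(CountedKey other) {
            int cmp = key.compareTo(other.key);
            if (cmp != 0) {
                return cmp;
            }
            // Identical keys: fall back to the counter. Sorting the higher
            // (newer) count first is an assumption of this sketch.
            return Integer.compare(other.count, count);
        }
    }

    private final TreeMap<CountedKey, String> map = new TreeMap<>();
    private int nextCount = 0;

    /** Each call takes the next counter value, so identical keys stay distinct in the map. */
    public void mutate(String key, String value) {
        map.put(new CountedKey(key, nextCount++), value);
    }

    /** Returns the values stored for a key, in the map's sort order. */
    public List<String> valuesFor(String key) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<CountedKey, String> e : map.entrySet()) {
            if (e.getKey().key.equals(key)) {
                out.add(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        MutationCountSketch m = new MutationCountSketch();
        m.mutate("row1", "first");
        m.mutate("row1", "second");
        // Both writes are retained; the counter decides their relative order.
        System.out.println(m.valuesFor("row1"));
    }
}
```

Without the counter in the comparison, the second write to an identical key 
would simply replace the first entry in the map.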

Having said all of that, the order of the mutations does appear to be 
preserved, as you indicate. However, this only holds if a single client is 
writing in that key space. If more than one client were writing in that key 
space, then I think the tablet server would apply the mutations in the order it 
received them. 

Maybe some clients are counting on this behavior, but I don't think it has 
been explicitly stated as being guaranteed. I don't want to break any clients 
that are counting on this working, but I would like to see if there is a way 
to dedupe on the client side.
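
One possible shape for client-side dedupe is sketched below. It is only an 
illustration under stated assumptions: DedupeSketch and FakeMutation are 
hypothetical stand-ins rather than TabletServerBatchWriter or Mutation, the 
string key is treated as a table name, and LinkedHashSet is swapped in for the 
proposed HashSet so that the first-seen order of the remaining mutations is 
kept.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class DedupeSketch {

    /** Hypothetical stand-in for Mutation; two instances are equal if row and value match. */
    public static final class FakeMutation {
        final String row;
        final String value;

        public FakeMutation(String row, String value) {
            this.row = row;
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof FakeMutation)) {
                return false;
            }
            FakeMutation other = (FakeMutation) o;
            return row.equals(other.row) && value.equals(other.value);
        }

        @Override
        public int hashCode() {
            return Objects.hash(row, value);
        }

        @Override
        public String toString() {
            return row + "=" + value;
        }
    }

    // Keyed by table name (an assumption of this sketch).
    private final Map<String, Set<FakeMutation>> mutations = new HashMap<>();

    /** Queues a mutation; returns false if an equal one is already queued for that table. */
    public boolean add(String table, FakeMutation m) {
        return mutations.computeIfAbsent(table, t -> new LinkedHashSet<>()).add(m);
    }

    /** Returns the queued mutations for a table in first-seen order. */
    public List<FakeMutation> queued(String table) {
        return new ArrayList<>(mutations.getOrDefault(table, new LinkedHashSet<>()));
    }

    public static void main(String[] args) {
        DedupeSketch s = new DedupeSketch();
        s.add("t1", new FakeMutation("r1", "a"));
        s.add("t1", new FakeMutation("r2", "b"));
        s.add("t1", new FakeMutation("r1", "a")); // duplicate, not queued again
        System.out.println(s.queued("t1"));
    }
}
```

A plain HashSet would also drop the duplicate, but it makes no promise about 
iteration order. Either way, dropping a repeated mutation can change the 
result if a different mutation to the same key was queued between the two 
copies, which is the ordering concern discussed above.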


> Change MutationSet.mutations to use HashSet
> -------------------------------------------
>
>                 Key: ACCUMULO-4062
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4062
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client
>            Reporter: Dave Marion
>
> Change TabletServerBatchWriter.MutationSet.mutations from a
> {code}
>   HashMap<String,List<Mutation>>
> {code}
> to
> {code}
>   HashMap<String,HashSet<Mutation>>
> {code}
> so that duplicate mutations added by a client are not sent to the server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
