[ https://issues.apache.org/jira/browse/ACCUMULO-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018153#comment-15018153 ]
Dave Marion commented on ACCUMULO-4062:
---------------------------------------

Looking into this further, I traced the server-side code path:

{noformat}
TabletServer.flush() -> CommitSession.commit() -> Tablet.commit() -> TabletMemory.mutate() -> CommitSession.mutate() -> InMemoryMap.mutate()
{noformat}

At this point it calls one of the SimpleMap.mutate() implementations, passing a list of mutations and a counter that is incremented each time SimpleMap.mutate() is called. DefaultMap.mutate() creates a MemKey and adds it to a map that uses the MemKeyComparator. The MemKeyComparator uses the counter to order two keys that are otherwise identical.

Having said all of that, the order of the mutations does appear to be preserved, as you indicate. However, this only holds if a single client is writing in that key space. If more than one client were writing in that key space, then I think the tablet server would apply the mutations in the order it received them. Maybe some clients are counting on this behavior, but I don't think it has been explicitly stated as guaranteed.

I don't want to break any clients that are counting on this working, but I would like to see if there is a way to dedupe on the client side.

> Change MutationSet.mutations to use HashSet
> -------------------------------------------
>
>                 Key: ACCUMULO-4062
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4062
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client
>            Reporter: Dave Marion
>
> Change TabletServerBatchWriter.MutationSet.mutations from a
> {code}
> HashMap<String,List<Mutation>>
> {code}
> to
> {code}
> HashMap<String,HashSet<Mutation>>
> {code}
> so that duplicate mutations added by a client are not sent to the server.
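
To make the ordering concern concrete, here is a minimal standalone sketch. It does not use any Accumulo classes: CountedKey, BY_KEY_THEN_COUNT, apply() and dedupe() are hypothetical names invented for illustration, and the ascending direction of the counter tie-break is an assumption, not a statement about what MemKeyComparator actually does. It only shows how a per-call counter keeps identical keys in arrival order, and how a client-side dedupe could change the sequence the server sees.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeMap;

public class MutationOrderSketch {

    /** Hypothetical stand-in for a MemKey: a key plus the counter of the call that wrote it. */
    static final class CountedKey {
        final String key;
        final int count;

        CountedKey(String key, int count) {
            this.key = key;
            this.count = count;
        }
    }

    /** Compares by key first; the counter only breaks ties between identical keys. */
    static final Comparator<CountedKey> BY_KEY_THEN_COUNT = (a, b) -> {
        int c = a.key.compareTo(b.key);
        return c != 0 ? c : Integer.compare(a.count, b.count); // ascending is an assumption
    };

    /** Applies "key=value" updates one call at a time and returns the values in map order. */
    public static List<String> apply(List<String> updates) {
        TreeMap<CountedKey, String> map = new TreeMap<>(BY_KEY_THEN_COUNT);
        int counter = 0; // incremented once per simulated mutate() call
        for (String update : updates) {
            String[] kv = update.split("=", 2);
            map.put(new CountedKey(kv[0], counter++), kv[1]);
        }
        return new ArrayList<>(map.values());
    }

    /** Client-side dedupe that keeps only the first occurrence of each update. */
    public static List<String> dedupe(List<String> updates) {
        return new ArrayList<>(new LinkedHashSet<>(updates));
    }

    public static void main(String[] args) {
        List<String> sent = Arrays.asList("x=1", "x=2", "x=1");
        System.out.println(apply(sent));         // [1, 2, 1]
        System.out.println(apply(dedupe(sent))); // [1, 2]
    }
}
```

In this sketch the most recently applied value for x is 1 without the dedupe and 2 with it, which is the kind of difference a client relying on arrival order could notice. A plain HashSet would additionally give no iteration-order guarantee; LinkedHashSet is used here only so that the example's output is deterministic.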