[ https://issues.apache.org/jira/browse/ACCUMULO-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038225#comment-14038225 ]
ASF subversion and git services commented on ACCUMULO-2925:
-----------------------------------------------------------

Commit b062a0bd3ed388f89bc04dfa2903bf3cc951976c in accumulo's branch refs/heads/master from [~elserj]
[ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=b062a0b ]

ACCUMULO-2925 Create regular Mutations from ServerMutations when applying replication data on a peer

Mutations do not store unserialized ColumnUpdates, but only generate them on demand via the getter. This is intended to create an efficient implementation (both performance and size) while preserving immutability.

Server-assigned timestamps work around this immutability by wrapping normal Mutations in a ServerMutation and ColumnUpdates with ServerColumnUpdates. By doing this, ServerMutations can "fake" the timestamp on ColumnUpdates that otherwise do not have a timestamp set.

In the context of replication, this is still a problem as all Mutations that are sent to a peer are ServerMutations (as we read them from a WAL). These Mutations are deserialized and passed into a BatchWriter to apply to the local instance; however, the BatchWriter is ignorant of ServerMutations and the special timestamp handling.

When the BatchWriter makes a "copy" of the Mutation (see ACCUMULO-2915), despite this being a shallow copy, the server-assigned timestamp is lost because the copy is a regular Mutation created from what was a ServerMutation. Even if the copy could preserve the timestamp, the TMutation class, which the BatchWriter eventually uses to send the Mutations to a TabletServer, is also ignorant of the ServerMutation timestamp without modification of the serialization and the TMutation class.

As such, the only option left is that, when we encounter ServerMutations in the BatchWriterReplicationReplayer code, we *must* create new Mutations, applying the possibly present server timestamp to each new Mutation we create, to ensure that the timestamp is correctly propagated to this peer.

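To illustrate the behaviour described in the commit message, here is a minimal, self-contained sketch. It does not use the Accumulo classes: Update, SimpleMutation, SimpleServerMutation, shallowCopy and recreate are hypothetical stand-ins invented for this example, and the real Mutation/ServerMutation serialization is more involved. The sketch only assumes what the message states: the server timestamp lives on the wrapper and is filled in when updates are read through the getter, so a copy that keeps the stored updates but drops the wrapper loses it, while rebuilding each update through the getter keeps it.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplayTimestampSketch {

    /** Hypothetical stand-in for a column update; the timestamp is optional. */
    static final class Update {
        final String family;
        final String qualifier;
        final String value;
        final boolean hasTimestamp;
        final long timestamp;

        Update(String family, String qualifier, String value, boolean hasTimestamp, long timestamp) {
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
            this.hasTimestamp = hasTimestamp;
            this.timestamp = timestamp;
        }
    }

    /** Hypothetical stand-in for a regular mutation. */
    static class SimpleMutation {
        final String row;
        protected final List<Update> updates = new ArrayList<>();

        SimpleMutation(String row) {
            this.row = row;
        }

        /** Adds an update with no timestamp set. */
        void put(String family, String qualifier, String value) {
            updates.add(new Update(family, qualifier, value, false, 0L));
        }

        /** Adds an update with an explicit timestamp. */
        void put(String family, String qualifier, long timestamp, String value) {
            updates.add(new Update(family, qualifier, value, true, timestamp));
        }

        List<Update> getUpdates() {
            return new ArrayList<>(updates);
        }
    }

    /** Hypothetical stand-in for a server mutation: the timestamp lives on the wrapper. */
    static final class SimpleServerMutation extends SimpleMutation {
        final long systemTime;

        SimpleServerMutation(String row, long systemTime) {
            super(row);
            this.systemTime = systemTime;
        }

        /** Fills in the server timestamp on updates that have none, only when they are read. */
        @Override
        List<Update> getUpdates() {
            List<Update> view = new ArrayList<>();
            for (Update u : updates) {
                view.add(u.hasTimestamp ? u : new Update(u.family, u.qualifier, u.value, true, systemTime));
            }
            return view;
        }
    }

    /** Mimics the lossy copy: stored updates are carried over, the wrapper is dropped. */
    static SimpleMutation shallowCopy(SimpleMutation source) {
        SimpleMutation copy = new SimpleMutation(source.row);
        copy.updates.addAll(source.updates);
        return copy;
    }

    /** Rebuilds a regular mutation, writing whatever timestamp the getter reports. */
    static SimpleMutation recreate(SimpleMutation source) {
        SimpleMutation copy = new SimpleMutation(source.row);
        for (Update u : source.getUpdates()) {
            if (u.hasTimestamp) {
                copy.put(u.family, u.qualifier, u.timestamp, u.value);
            } else {
                copy.put(u.family, u.qualifier, u.value);
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        SimpleServerMutation fromWal = new SimpleServerMutation("row1", 1234L);
        fromWal.put("cf", "cq", "v");

        Update lossy = shallowCopy(fromWal).getUpdates().get(0);
        Update kept = recreate(fromWal).getUpdates().get(0);
        System.out.println("shallow copy has timestamp: " + lossy.hasTimestamp);
        System.out.println("recreated has timestamp: " + kept.hasTimestamp + " (" + kept.timestamp + ")");
    }
}
```

In this toy model, recreate plays the role the commit message assigns to the BatchWriterReplicationReplayer code: it hands the BatchWriter a plain mutation whose updates already carry the timestamp, so nothing downstream needs to know about the wrapper.
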
> Timestamp is not propagated to peer
> -----------------------------------
>
>                 Key: ACCUMULO-2925
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2925
>             Project: Accumulo
>          Issue Type: Bug
>          Components: replication
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 1.7.0
>
>
> Wrote a test that was doing some more intense verification of equality of two
> tables and I was surprised to find that the tables were in fact not equal.
> Digging into it some more, I eventually found that the keys and values were
> identical, save for the timestamp. Despite the Mutations coming from the
> local WAL having timestamps set by the server, these got lost.
> Specifically, the "real" timestamp is stored on the ServerMutation, not each
> ColumnUpdate. On the peer, when the BatchWriter makes a shallow copy of the
> (Server)Mutation to apply on the target table for replication, we lose that
> ServerMutation and get a "regular" Mutation which has updates that don't have
> any timestamp set. If the BatchWriter didn't make the shallow copy, this
> should work.

--
This message was sent by Atlassian JIRA
(v6.2#6252)