[ https://issues.apache.org/jira/browse/ACCUMULO-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176372#comment-15176372 ]
William Slacum commented on ACCUMULO-4156:
------------------------------------------

Yeah I wouldn't doubt it. I didn't see any special offsets for WAL files,
though I think you'd need some marker for a WAL that lives through a flush so
you don't do a double-insert in case of a failure after a flush.

> Tunable replication frequency
> -----------------------------
>
>                 Key: ACCUMULO-4156
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4156
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.7.1
>            Reporter: William Slacum
>             Fix For: 1.8.0
>
>
> Currently, replication happens when a write-ahead log (WAL) file is closed.
> The only parameter that controls when this event occurs is the WAL size, and
> it applies only to the tablet servers themselves.
> By default this means that when replication happens is tied not just to the
> table it is configured on, but also to exogenous factors such as total write
> load and failures. If a system receives ~100MB/day/TServer, and the WAL size
> is its default 1GB, it will take about 10 days for any replication event to
> occur. Another possibility is that an unreplicated table is receiving many
> writes, which will cause more frequent replication events, but
> proportionally the work will involve less data for the table being
> replicated.
> I don't have a specific implementation in mind, but I'd like to see a
> solution that involves isolating the work down to specific table events such
> as time-since-last-replication and data-added-since-last-replication.
> [~elserj] has had some ideas about doing things incrementally within WAL
> files (i.e., replicating between two sync points) that can also help with
> this.
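
For illustration, the arithmetic in the description and the per-table triggers it asks for could be sketched roughly as below. This is a Python sketch, not Accumulo code: `days_until_wal_closes`, `ReplicationTrigger`, `max_age_seconds`, and `max_pending_bytes` are hypothetical names, and the threshold values are made up for the example.

```python
from dataclasses import dataclass

MB = 10**6
GB = 10**9


def days_until_wal_closes(wal_size_bytes: int, ingest_bytes_per_day: int) -> float:
    """Days for one tablet server to fill a WAL, i.e. the wait before
    replication can start when it is tied to WAL closure."""
    return wal_size_bytes / ingest_bytes_per_day


@dataclass
class ReplicationTrigger:
    """Hypothetical per-table policy: replicate once enough time has
    passed or enough new data has accumulated, whichever comes first."""
    max_age_seconds: float
    max_pending_bytes: int

    def should_replicate(self, seconds_since_last: float, pending_bytes: int) -> bool:
        if pending_bytes == 0:
            return False  # nothing new to ship for this table
        return (seconds_since_last >= self.max_age_seconds
                or pending_bytes >= self.max_pending_bytes)


# The example from the description: ~100MB/day against a 1GB WAL.
print(days_until_wal_closes(1 * GB, 100 * MB))  # 10.0

# Assumed thresholds: replicate at least hourly, or after 64MB of new data.
trigger = ReplicationTrigger(max_age_seconds=3600, max_pending_bytes=64 * MB)
print(trigger.should_replicate(seconds_since_last=4000, pending_bytes=5 * MB))  # True
print(trigger.should_replicate(seconds_since_last=60, pending_bytes=5 * MB))    # False
```

With a policy like this, a lightly written table would replicate on the time bound instead of waiting for the WAL to fill, and heavy writes to an unreplicated table would not affect it, since only the table's own pending bytes are counted.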