> On March 31, 2014, 4:21 p.m., kturner wrote:
> > docs/src/main/resources/design/ACCUMULO-378-design.mdtext, line 34
> > <https://reviews.apache.org/r/19790/diff/1/?file=539855#file539855line34>
> >
> > Can a table be replicated to multiple clusters?
>
> kturner wrote:
>     More specifically, can a table on one cluster be replicated to multiple
>     clusters directly? The graph described seemed to imply only one outgoing
>     edge. I am just wondering about multiple outgoing edges from a single
>     cluster. It seems like this would impact the implementation of
>     bookkeeping for which files were replicated where.

No, the intent was to support replication from one cluster to N clusters. We could make this detail transparent by including the destination in the table where we store references to data to be replicated, at the cost of storing N*M records instead of just M records. Here N is the number of clusters the source is replicating to, and M is the number of references to data that needs to be replicated. The more I think about it, the more I think it's definitely worth it.


> On March 31, 2014, 4:21 p.m., kturner wrote:
> > docs/src/main/resources/design/ACCUMULO-378-design.mdtext, line 80
> > <https://reviews.apache.org/r/19790/diff/1/?file=539855#file539855line80>
> >
> > What's the rationale for replicating WALs as opposed to replicating minor
> > compacted rfiles? What are the pros and cons? One con w/ WALs is that they
> > could possibly contain a lot of data for tables that are not being
> > replicated. This data would need to be filtered.

The biggest reason for using WALs is that they drastically reduce the latency for data to *begin* the replication process. We certainly could use RFiles for everything, which would simplify things, but I'm worried about the latency that would incur. If we used RFiles, the only way I can come up with to reduce the latency before replication even begins would be to increase the minc frequency. Maybe that's sufficient for a first pass? I think I need to quantify these opinions with some numbers.

Right now, we tend to recommend a bigger in-memory map for increased ingest performance. The worry here would be that this recommendation now comes with increased replication latency.


- Josh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19790/#review39051
-----------------------------------------------------------


On March 28, 2014, 5:54 p.m., kturner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19790/
> -----------------------------------------------------------
>
> (Updated March 28, 2014, 5:54 p.m.)
>
>
> Review request for accumulo.
>
>
> Bugs: ACCUMULO-378
>     https://issues.apache.org/jira/browse/ACCUMULO-378
>
>
> Repository: accumulo
>
>
> Description
> -------
>
> ACCUMULO-378 Design document. Posting for review here, not meant for commit.
> Final version of document should be posted on issue.
>
>
> Diffs
> -----
>
>   docs/src/main/resources/design/ACCUMULO-378-design.mdtext PRE-CREATION
>
> Diff: https://reviews.apache.org/r/19790/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> kturner
>
>
