-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19790/#review38927
-----------------------------------------------------------

docs/src/main/resources/design/ACCUMULO-378-design.mdtext
<https://reviews.apache.org/r/19790/#comment71268>

Seems like Accumulo should have a public API for querying what needs to be replicated, notifying it when something has been replicated, and methods for importing replicated data. I am thinking of something different than a plugin, more like the import/export table API. How the replication happens is up to the user. We could provide a default implementation that does replication as you mentioned. Some users may want to occasionally replicate large batches using MapReduce. Others may want to continually replicate files using distributed queueing solutions.


docs/src/main/resources/design/ACCUMULO-378-design.mdtext
<https://reviews.apache.org/r/19790/#comment71259>

A walog or bulk imported file could be referenced by multiple tablets. I am wondering if it would be better to move this info out of the tablet and do something like ~del markers in the metadata table. Like a ~repl_hdfs://foo/a.rf row in the metadata table. This row could store replication status. If the ~repl row exists, then the file would not be deleted. The ~repl marker could not be removed until the file is replicated and there are no more refs in the tablet metadata (is this sufficient to prevent adding a repl marker for something that was already replicated?). Could possibly update repl markers using conditional mutations, since multiple tablets and the master may mutate it.


docs/src/main/resources/design/ACCUMULO-378-design.mdtext
<https://reviews.apache.org/r/19790/#comment71261>

Are you thinking a FATE operation per file? FATE uses ZooKeeper, and ZooKeeper keeps everything in memory.


docs/src/main/resources/design/ACCUMULO-378-design.mdtext
<https://reviews.apache.org/r/19790/#comment71263>

How will the locality group config changing be handled?


docs/src/main/resources/design/ACCUMULO-378-design.mdtext
<https://reviews.apache.org/r/19790/#comment71264>

Not gonna work.
It may be worthwhile to consider having the ConditionalWriter detect unsupported replication configurations and throw an exception.


docs/src/main/resources/design/ACCUMULO-378-design.mdtext
<https://reviews.apache.org/r/19790/#comment71260>

Tablets could support an atomic operation that marks all of a tablet's current files as needing replication and appropriately handles new data coming in. The master would go through all tablets in a table calling this operation. Tablets could write something to the metadata table when the operation is successful. This allows the master to know which tablets are done.


docs/src/main/resources/design/ACCUMULO-378-design.mdtext
<https://reviews.apache.org/r/19790/#comment71265>

Or maybe the replication info could be stored externally, if that information applies to the entire WAL file.


docs/src/main/resources/design/ACCUMULO-378-design.mdtext
<https://reviews.apache.org/r/19790/#comment71266>

This could be done with table permissions. Disallow granting the write permission to a slave table.


- kturner


On March 28, 2014, 5:54 p.m., kturner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19790/
> -----------------------------------------------------------
>
> (Updated March 28, 2014, 5:54 p.m.)
>
>
> Review request for accumulo.
>
>
> Bugs: ACCUMULO-378
>     https://issues.apache.org/jira/browse/ACCUMULO-378
>
>
> Repository: accumulo
>
>
> Description
> -------
>
> ACCUMULO-378 Design document. Posting for review here, not meant for commit.
> Final version of document should be posted on issue.
>
>
> Diffs
> -----
>
>   docs/src/main/resources/design/ACCUMULO-378-design.mdtext PRE-CREATION
>
> Diff: https://reviews.apache.org/r/19790/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> kturner
>
>
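
To make the ~repl marker comment above more concrete, here is a rough sketch of the lifecycle I have in mind. It is only an in-memory model and does not use any Accumulo API: the class `ReplMarkerSketch`, the `Status` enum, and all method names are made up for illustration, and a `ConcurrentHashMap` compare-and-replace stands in for a conditional mutation against the metadata table. It does not try to answer the open question about re-adding a marker for a file that was already replicated.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model of the "~repl" marker idea; not Accumulo code.
class ReplMarkerSketch {
    enum Status { PENDING, REPLICATED }

    // Stand-in for "~repl_<file>" rows in the metadata table.
    private final Map<String, Status> markers = new ConcurrentHashMap<>();
    // Stand-in for the number of tablets that still reference each file.
    private final Map<String, Integer> tabletRefs = new ConcurrentHashMap<>();

    static String row(String file) {
        return "~repl_" + file;
    }

    // A tablet starts referencing a walog or bulk imported file.
    void addReference(String file) {
        tabletRefs.merge(file, 1, Integer::sum);
    }

    // A tablet drops its reference; the entry disappears at zero.
    void removeReference(String file) {
        tabletRefs.computeIfPresent(file, (f, n) -> n > 1 ? n - 1 : null);
    }

    // Create the marker once, no matter how many tablets ask for it.
    void markForReplication(String file) {
        markers.putIfAbsent(row(file), Status.PENDING);
    }

    // Models a conditional mutation: succeeds only if the row is still PENDING.
    boolean markReplicated(String file) {
        return markers.replace(row(file), Status.PENDING, Status.REPLICATED);
    }

    // The marker may go away only once replicated and unreferenced.
    boolean tryRemoveMarker(String file) {
        if (tabletRefs.getOrDefault(file, 0) > 0) {
            return false;
        }
        return markers.remove(row(file), Status.REPLICATED);
    }

    // The file is deletable only with no marker and no tablet references.
    boolean canDeleteFile(String file) {
        return !markers.containsKey(row(file))
            && tabletRefs.getOrDefault(file, 0) == 0;
    }

    public static void main(String[] args) {
        ReplMarkerSketch s = new ReplMarkerSketch();
        String file = "hdfs://foo/a.rf";
        s.addReference(file);
        s.markForReplication(file);
        System.out.println("deletable while pending: " + s.canDeleteFile(file));
        s.markReplicated(file);
        s.removeReference(file);
        System.out.println("marker removed: " + s.tryRemoveMarker(file));
        System.out.println("deletable afterwards: " + s.canDeleteFile(file));
    }
}
```

The point of the compare-and-replace in `markReplicated` and `tryRemoveMarker` is that several tablets and the master can race on the same row without clobbering each other, which is the property conditional mutations would need to provide in a real implementation.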
