Hardly voodoo, but also not something that can be done casually. You need strong transactional guarantees from the file system layer to do this.
And yes, it does come down to something like groups of group commits. It didn't require patching the layer below dfsclient so much as correct and careful design of the layer below that layer. I should repeat that this only happens on MapR; we didn't touch the HDFS code. I would expect that getting this to work correctly in HDFS would be extremely difficult: you would face a huge proof-of-correctness task, because the lower layers of HDFS are not well specified in terms of temporal semantics.

On Mon, Jul 11, 2011 at 8:45 PM, Stack <st...@duboce.net> wrote:

> Ted, you seem to be describing voodoo? Are you talking of a group
> commit of the group commits? Bigger batches at the layer below
> dfsclient?
> St.Ack
>
> On Mon, Jul 11, 2011 at 11:57 AM, Ted Dunning <tdunn...@maprtech.com> wrote:
> > On Mon, Jul 11, 2011 at 11:22 AM, Joey Echeverria <j...@cloudera.com> wrote:
> >
> >> On Mon, Jul 11, 2011 at 12:47 PM, Ted Dunning <tdunn...@maprtech.com> wrote:
> >> > Also, on MapR, you get another level of group commit above the row level.
> >> > That takes the writes even further from the byte by byte level.
> >>
> >> Is this done with an HBASE patch? I don't see how this could be done
> >> merely at the FS layer.
> >>
> >
> > :-)
> >
> > No changes were required in HBase to enable this.
> >
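
For anyone unfamiliar with the term, here is a toy sketch of what "a group commit of the group commits" means in general. It is not MapR or HDFS code; the names GroupCommitter and CountingLog and the batch sizes are made up for illustration. Each level buffers what it receives and passes it down as one unit, so 64 edits end up costing only two syncs.

```python
from typing import Callable, List


class GroupCommitter:
    """Buffers items and hands them to `sink` in one call per batch."""

    def __init__(self, sink: Callable[[List], None], batch_size: int) -> None:
        self.sink = sink
        self.batch_size = batch_size
        self.pending: List = []

    def write(self, item) -> None:
        self.pending.append(item)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.pending:
            batch, self.pending = self.pending, []
            self.sink(batch)


class CountingLog:
    """Hypothetical stand-in for durable storage: counts syncs issued."""

    def __init__(self) -> None:
        self.syncs = 0
        self.records: List = []

    def sync(self, batches: List[List]) -> None:
        self.syncs += 1
        for batch in batches:
            self.records.extend(batch)


log = CountingLog()
# Upper level: groups whole row-level batches before a single sync.
upper = GroupCommitter(log.sync, batch_size=4)
# Row level: groups individual edits; each flush is one write to `upper`.
row_level = GroupCommitter(upper.write, batch_size=8)

for i in range(64):
    row_level.write(f"edit-{i}")
row_level.flush()
upper.flush()

print(log.syncs, len(log.records))
```

The sketch leaves out the hard part: a writer must not be told its edit is durable until the sync covering it has completed, and ordering has to survive failures. That is where the transactional guarantees from the file system layer come in.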