Also, on MapR you get another level of group commit above the row level, which moves writes even further from the byte-by-byte level.
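The group-commit idea discussed in this thread can be sketched as follows. This is a minimal illustration, not HBase's actual implementation: a batch of edits is appended first, then a single sync() makes all of them durable at once, instead of one sync per edit (let alone per byte). All class and method names here are hypothetical.

```python
class GroupCommitWAL:
    """Toy write-ahead log illustrating group commit (not HBase code)."""

    def __init__(self):
        self.log = []        # appended-but-unsynced edits
        self.synced = []     # edits made durable
        self.sync_count = 0  # how many expensive sync() calls were made

    def append(self, edit):
        # Cheap buffered append; durability comes from sync(), not append().
        self.log.append(edit)

    def sync(self):
        # One expensive call covers every pending edit at once.
        self.synced.extend(self.log)
        self.log.clear()
        self.sync_count += 1

    def commit_row(self, row, edits):
        # All changes grouped into one Put for a row share a single sync.
        for column, value in edits:
            self.append((row, column, value))
        self.sync()


wal = GroupCommitWAL()
wal.commit_row(b"row1", [(b"cf:a", b"1"), (b"cf:b", b"2"), (b"cf:c", b"3")])
# Three edits, but only one sync.
```

The point of the sketch is the ratio: the cost of the sync is amortized over every edit in the batch, which is why flushing per byte would be so much more expensive.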
On Mon, Jul 11, 2011 at 9:20 AM, Andrew Purtell <apurt...@apache.org> wrote:

> > Despite having support for append in HDFS, it is still expensive to
> > update it on every byte and here is where the wal flushing policies come
> > in.
>
> Right, but a minor correction here. HBase doesn't flush the WAL per byte.
> We do a "group commit" of all changes to a row, to the extent the user has
> grouped changes to the row into a Put. So at the least this is first a write
> of all the bytes of an edit, or it could be more than one edit if we can
> group them, and _then_ a sync.
>
> Also most who run HBase run a HDFS patched with HDFS-895, so multiple syncs
> can be in flight. This does not reduce the added latency of a sync for the
> current writer but it does significantly reduce the expense of the sync with
> respect to other parallel writers.
>
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
> ----- Original Message -----
> > From: Arvind Jayaprakash <w...@anomalizer.net>
> > To: user@hbase.apache.org; Andrew Purtell <apurt...@apache.org>
> > Cc:
> > Sent: Monday, July 11, 2011 6:34 AM
> > Subject: Re: Hbase performance with HDFS
> >
> > On Jul 07, Andrew Purtell wrote:
> >>> Since HDFS is mostly write once how are updates/deletes handled?
> >>
> >> Not mostly, only write once.
> >>
> >> Deletes are just another write, but one that writes tombstones
> >> "covering" data with older timestamps.
> >>
> >> When serving queries, HBase searches store files back in time until it
> >> finds data at the coordinates requested or a tombstone.
> >>
> >> The process of compaction not only merge sorts a bunch of accumulated
> >> store files (from flushes) into fewer store files (or one) for read
> >> efficiency, it also performs housekeeping, dropping data "covered" by
> >> the delete tombstones. Incidentally this is also how TTLs are
> >> supported: expired values are dropped as well.
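The read path Andrew describes, searching store files back in time until a value or a tombstone is found, can be sketched like this. The names and the dict-based store-file model are illustrative only, not the HBase API.

```python
# Sentinel standing in for a delete tombstone written by a newer edit.
TOMBSTONE = object()

def read(store_files, row, column):
    """Search store files newest-first until data or a tombstone is found.

    store_files is ordered newest to oldest; each one maps
    (row, column) coordinates to a value or to TOMBSTONE.
    """
    for sf in store_files:
        hit = sf.get((row, column))
        if hit is TOMBSTONE:
            return None          # a newer delete "covers" the older data
        if hit is not None:
            return hit           # newest value wins
    return None                  # no store file has this coordinate


# A newer flush holds a tombstone covering the older value of cf:a,
# while cf:b is untouched and still readable from the older file.
newer = {(b"r", b"cf:a"): TOMBSTONE}
older = {(b"r", b"cf:a"): b"old", (b"r", b"cf:b"): b"keep"}
```

Because the search stops at the first tombstone, the older `b"old"` value is never returned even though it still physically exists on disk; compaction is what eventually removes it.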
> >
> > Just wanted to talk about WAL. My understanding is that updates are
> > journalled onto HDFS by sequentially recording them as they happen per
> > region. This is where the need for HDFS append comes in, something that
> > I don't recollect seeing in the GFS paper.
> >
> > Despite having support for append in HDFS, it is still expensive to
> > update it on every byte and here is where the wal flushing policies come
> > in.
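The compaction housekeeping quoted earlier in the thread can also be sketched. This is a hypothetical model of a full (major) compaction, assuming store files are merged newest-first so a tombstone shadows older data, after which tombstones themselves and TTL-expired cells are dropped; all names are illustrative.

```python
TOMBSTONE = object()

def compact(store_files, ttl_seconds, now):
    """Merge-sort-style compaction sketch (not HBase code).

    store_files is ordered newest to oldest; each maps
    (row, column) -> (value, write_timestamp).
    """
    merged = {}
    for sf in store_files:
        for coord, cell in sf.items():
            # Newest cell per coordinate wins, so a newer tombstone
            # "covers" any older data at the same coordinate.
            if coord not in merged:
                merged[coord] = cell
    # Housekeeping: drop the tombstones themselves and expired values.
    return {
        coord: (value, ts)
        for coord, (value, ts) in merged.items()
        if value is not TOMBSTONE and now - ts < ttl_seconds
    }


now = 1_000_000.0
newer = {(b"r1", b"a"): (TOMBSTONE, now - 10)}
older = {
    (b"r1", b"a"): (b"old", now - 500),    # covered by the tombstone
    (b"r2", b"b"): (b"live", now - 20),    # survives
    (b"r3", b"c"): (b"stale", now - 9000), # past a 1-hour TTL
}
result = compact([newer, older], ttl_seconds=3600, now=now)
```

After compaction only `r2/b` remains: `r1/a` was dropped along with its covering tombstone, and `r3/c` expired, which matches the "dropping covered and expired data" behavior described in the thread.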