There are two methods to support the HBase table APIs.  The first is to simply
run HBase. That is just like, well, running HBase.

The more interesting alternative is to use a special client API that speaks
a table-oriented wire protocol directly to the file system, which implements
a column-family / column-oriented table API similar to the one HBase uses.
The big differences come from the fact that code inside the file system has
capabilities available to it that are not available to HBase.  For instance,
it can use a file-oriented transaction and recovery system.  It can also make
use of knowledge about the file system layout that is not available to HBase.
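
To give a rough sketch of what that looks like from the client side: the
usual HBase client calls are unchanged, and the main visible difference is
that the table is named by a path in the file system rather than by a bare
table name.  Assuming the MapR client jars are on the classpath (stock HBase
would reject a "/" in a table name) and using a made-up table path of
/apps/test-table, something along these lines should work without any HBase
servers running:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TablePathExample {
      public static void main(String[] args) throws Exception {
        // Ordinary HBase client configuration; no HBase master or
        // region servers are involved.
        Configuration conf = HBaseConfiguration.create();

        try (Connection conn = ConnectionFactory.createConnection(conf);
             // The table name is a file system path (hypothetical path,
             // for illustration only).
             Table table = conn.getTable(TableName.valueOf("/apps/test-table"))) {

          // Standard HBase Put / Get calls, exactly as against a real
          // HBase cluster.
          Put put = new Put(Bytes.toBytes("row1"));
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"),
                        Bytes.toBytes("value"));
          table.put(put);

          Result r = table.get(new Get(Bytes.toBytes("row1")));
          System.out.println(Bytes.toString(
              r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col"))));
        }
      }
    }

The point is not the specifics of the example but that existing application
code written against the HBase API keeps working; only the table naming and
the wiring underneath change.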

Because we can optimize the file layouts, we can also change the low-level
protocols for disk reorganization.  MapR tables have more levels of
sub-division than HBase and we use different low-level algorithms.  This
results in having lots of write-ahead logs, which would crush HDFS because
of the commit rate, but it allows very fast crash recovery (tens to low
hundreds of milliseconds after the basic file system is back).

Also, since the tables are built using standard file-system primitives, all
of the transactionally correct snapshots and mirrors carry over to tables as
well.

Oh, and it tends to be a lot faster and more failure-tolerant as well.



On Fri, May 29, 2015 at 7:00 AM, Yousef Lasi <yousef.l...@gmail.com> wrote:

> Could you expand on the HBase table integration? How does that work?
>
> On Fri, May 29, 2015 at 5:55 AM, Ted Dunning <ted.dunn...@gmail.com>
> wrote:
>
> >
> > 4) you get the use of the HBase API without having to run HBase.  Tables
> > are integrated directly into MapR FS.
> >
> >
> >
> >
> >
> > On Thu, May 28, 2015 at 9:37 AM, Matt <bsg...@gmail.com> wrote:
> >
> > > I know I can / should assign individual disks to HDFS, but as a test
> > > cluster there are apps that expect data volumes to work on. A dedicated
> > > Hadoop production cluster would have a disk layout specific to the
> task.
> >
>
