There are two methods to support the HBase table API. The first is to simply run HBase. That is just like, well, running HBase.
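Either way, applications program against the ordinary HBase client API. As a minimal sketch of what that looks like (the table name "mytable", column family "cf", and cell values here are hypothetical, and this needs the hbase-client library plus a reachable cluster to actually run):

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class TableSketch {
    public static void main(String[] args) throws IOException {
        // Picks up hbase-site.xml from the classpath.
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("mytable"))) {

            // Write one cell: row "row1", family "cf", qualifier "q".
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"),
                          Bytes.toBytes("v"));
            table.put(put);

            // Read it back.
            Get get = new Get(Bytes.toBytes("row1"));
            Result result = table.get(get);
            System.out.println(Bytes.toString(
                result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q"))));
        }
    }
}
```

The same code runs against either backend; with MapR-FS tables the table name can be a file-system path (for example "/tables/mytable") rather than a name resolved through HBase's own catalog.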
The more interesting alternative is to use a special client API that speaks a table-oriented wire protocol to the file system, which implements a column-family / column-oriented table API similar to what HBase uses. The big differences come from the fact that code inside the file system has capabilities available to it that are not available to HBase. For instance, it can use a file-oriented transaction and recovery system. It can also make use of knowledge about file-system layout that is not available to HBase.

Because we can optimize the file layouts, we can also change the low-level protocols for disk reorganization. MapR tables have more levels of sub-division than HBase, and we use different low-level algorithms. This results in having lots of write-ahead logs, which would crush HDFS because of the commit rate, but it allows very fast crash recovery (tens to low hundreds of milliseconds after the basic file system is back).

Also, since the tables are built using standard file-system primitives, all of the transactionally correct snapshots and mirrors carry over to tables as well. Oh, and it tends to be a lot faster and more failure tolerant as well.

On Fri, May 29, 2015 at 7:00 AM, Yousef Lasi <yousef.l...@gmail.com> wrote:

> Could you expand on the HBase table integration? How does that work?
>
> On Fri, May 29, 2015 at 5:55 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> > 4) you get the use of the HBase API without having to run HBase. Tables
> > are integrated directly into MapR FS.
> >
> > On Thu, May 28, 2015 at 9:37 AM, Matt <bsg...@gmail.com> wrote:
> >
> > > I know I can / should assign individual disks to HDFS, but as a test
> > > cluster there are apps that expect data volumes to work on. A dedicated
> > > Hadoop production cluster would have a disk layout specific to the task.