Hello Jean-Daniel,

OK, thank you for your response. I was worried that, when using Thrift, the client would have to route all communication with an HBase RegionServer through the master server. I still don't quite understand how this is solved with Thrift. As I understand it, the Thrift client code (that is, the code I embed in my application) will not query the master server "after it learns the location of the ROOT HRegion", and from then on will talk directly to the RegionServers, since the Thrift API actually fully implements the regular Java HBase client, even when working from a language such as C++. Is that correct?
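As a rough illustration of the lookup flow described above, here is a minimal Python sketch, not HBase or Thrift code. Every name in it (ToyCluster, CachingClient, ask_master_for_root, scan_catalog, locate) is hypothetical, and the catalog is collapsed into a single in-memory object for brevity. It only shows the assumed pattern: ask the master once for the ROOT location, look up a row's region in the catalog, then cache that region so later requests for nearby rows need no further lookups.

```python
import bisect


class ToyCluster:
    """In-memory stand-in for the master plus the catalog (hypothetical)."""

    def __init__(self, region_starts, servers):
        # region_starts is sorted and begins with "" so every row has a region;
        # region i covers [region_starts[i], region_starts[i + 1]).
        self.region_starts = region_starts
        self.servers = servers
        self.master_lookups = 0
        self.catalog_lookups = 0

    def ask_master_for_root(self):
        """Pretend to ask the master where the ROOT region lives."""
        self.master_lookups += 1
        return "root-server"

    def scan_catalog(self, row):
        """Pretend to read the catalog entry for the region holding `row`."""
        self.catalog_lookups += 1
        i = bisect.bisect_right(self.region_starts, row) - 1
        start = self.region_starts[i]
        is_last = i + 1 == len(self.region_starts)
        end = None if is_last else self.region_starts[i + 1]
        return start, end, self.servers[i]


class CachingClient:
    """Client that remembers region locations after the first lookup."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.root_location = None  # learned from the master at most once
        self.region_cache = []     # list of (start, end, server) tuples

    def locate(self, row):
        # Serve from the cache when a known region already covers the row.
        for start, end, server in self.region_cache:
            if start <= row and (end is None or row < end):
                return server
        # Cache miss: the master is contacted only if ROOT is still unknown.
        if self.root_location is None:
            self.root_location = self.cluster.ask_master_for_root()
        region = self.cluster.scan_catalog(row)
        self.region_cache.append(region)
        return region[2]
```

With two regions split at "m", locating the rows "apple", "banana" and "zebra" in that order costs one master lookup and two catalog lookups in this sketch; the second row is answered from the cache.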
I always thought Thrift was a simple way to serialize/unserialize data in an efficient and platform-independent manner, but it sounds like it's more advanced, which is good. :-)

Regards,

Leon Mergen

On Mon, Aug 4, 2008 at 3:56 AM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:

> Leon,
>
> The HBase Architecture page in the wiki does give this kind of information,
> specifically here:
> http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture#metadata and since
> HBase is a Bigtable clone, reading its paper also gives useful
> information:
> http://labs.google.com/papers/bigtable.html
>
> To make it short, the client queries the .META. table to find the user
> tables' regions to which it puts and gets data. Thrift only acts as a
> decorator on the Java HBase client.
>
> Until ZooKeeper is integrated in HBase (like Chubby for Bigtable), the
> Master is a SPOF but should not have any scalability-related problem.
>
> Hope this helps,
>
> J-D
>
> On Sun, Aug 3, 2008 at 7:22 PM, Leon Mergen <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > I'm looking for some information on HBase's architecture (out of pure
> > interest), but I wasn't able to find anything about it on the HBase
> > site (including the architecture description).
> >
> > Specifically, I am curious how writes/mutations are distributed amongst
> > the servers, and whether this is different when using an interface like
> > Thrift. Is the server for each mutateRow() operation "asked for" at the
> > master server, or is that cached at some level? If not, how is the
> > problem solved that a client only connects to the master server but
> > actually needs to talk to one of the slave servers? Or is the master
> > server a single weak spot that could introduce scalability problems at
> > large (huge) scale?
> >
> > Thanks in advance for any responses!
> >
> > Regards,
> >
> > Leon Mergen
> >

--
Leon Mergen
http://www.solatis.com
