Glad to help. About those "tons of Qs": as much as we enjoy answering them, we don't like to repeat ourselves too much (so few hours in a day, so many things we'd like to develop for HBase). So to give you a kick start, I'd like to point you to Google's Bigtable paper and Hadoop: The Definitive Guide as bedtime reading, since they may field a lot of those questions :)
Also if you like interactive discussions, come by #hbase on irc.freenode.net

J-D

On Thu, Jul 8, 2010 at 1:49 PM, <vramanatha...@aol.com> wrote:
> Thankyou..That is very helpfull..
> sorry about the thread hijack..i wanted to change subject before last send
> ..forgot..
>
> thanks again..
> i'll be one active user with tons of Qs in the next few months :)
>
> -----Original Message-----
> From: Jean-Daniel Cryans <jdcry...@apache.org>
> To: user@hbase.apache.org
> Sent: Thu, Jul 8, 2010 1:58 pm
> Subject: About data locality (Was: Re: HBase on same boxes as HDFS Data nodes)
>
> > (changing the subject, let's not hijack threads)
> >
> >> will the data move over time though...for example if i have lots of access to
> > data in DataNode A ? without the current work that is in progress..
> >
> > HBase has no control on that, but data will be moved if those regions
> > are used. Like the article explains, the first replica goes to the
> > local node, so through compactions/flushes one replica of each block
> > will be on the local node.
> >
> > Also keep in mind that the new datanode may already contain some
> > replicas of some of the blocks for that region, so it's not just black
> > and white. This is quite possible on a small cluster, but over 1k
> > nodes not that much ;)
> >
> > J-D
> >
> > On Thu, Jul 8, 2010 at 10:51 AM, <vramanatha...@aol.com> wrote:
> >>
> >> Thankyou..
> >> I've some more questions
> >> I'm spending quite a bit over last few weeks to develop one of our
> > applications using HBase/Hadoop
> >> & using 0.20.4
> >>
> >> Hbase - Table X
> >> rows - 1- 100 -> Region A -> RegionServer A --> DataNode A
> >> ....
> >> rows 1500 - 1600 -> Region M -> RegionServer B -> DataNode B
> >>
> >> So based on what I have read so far..I'm thinking of Region Server A & Data
> > Node A pairs on the same host to
> >> make use of locality..
> >>
> >> As per your answer ..If we restart the cluster, because of radom assigment,
> > locality is gone
> >> so..Region Server B -..> Region A ---> data blocks will be in Data Node A
> >> ...if I understand correctly..
> >> will the data move over time though...for example if i have lots of access to
> > data in DataNode A ? without the current work that is in progress..
> >>
> >> thanks again for your reply
> >>
> >> venkatesh
> >>