Thanks for the overview. It's helpful. Can you also help me understand why 2 region servers for the same row keys can't be running on the nodes where blocks are being replicated? I am assuming all the logs/HFiles etc are already being replicated so if one region server fails other region server is still taking reads/writes.
On Fri, Dec 2, 2011 at 12:15 PM, Ian Varley <ivar...@salesforce.com> wrote: > Mohit, > > Yeah, those are great places to go and learn. > > To fill in a bit more on this topic: "partition-tolerance" usually refers to > the idea that you could have a complete disconnection between N sets of > machines in your data center, but still be taking writes and serving reads > from all the servers. Some "NoSQL" databases can do this (to a degree), but > HBase cannot; the master and ZK quorum must be accessible from any machine > that's up and running the cluster. > > Individual machines can go down, as J-D said, and the master will reassign > those regions to another region server. So, imagine you had a network switch > fail that disconnected 10 machines in a 20-machine cluster; you wouldn't have > 2 baby 10-machine clusters, like you might with some other software; you'd > just have 10 machines "down" (and probably a significant interruption while > the master replays logs on the remaining 10). That would also require that > the underlying HDFS cluster (assuming it's on the same machines) was keeping > replicas of the blocks on different racks (which it does by default), > otherwise there's no hope. > > HBase makes this trade-off intentionally, because in real-world scenarios, > there aren't too many cases where a true network partition would be survived > by the rest of your stack, either (e.g. imagine a case where application > servers can't access a relational database server because of a partition; > you're just down). The focus of HBase fault tolerance is recovering from > isolated machine failures, not the collapse of your infrastructure. > > Ian > > > On Dec 2, 2011, at 2:03 PM, Jean-Daniel Cryans wrote: > > Get the HBase book: > http://www.amazon.com/HBase-Definitive-Guide-Lars-George/dp/1449396100 > > And/Or read the Bigtable paper. > > J-D > > On Fri, Dec 2, 2011 at 12:01 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote: > Where can I read more on this specific subject? > > Based on your answer I have more questions, but I want to read more > specific information about how it works and why it's designed that > way. > > On Fri, Dec 2, 2011 at 11:59 AM, Jean-Daniel Cryans <jdcry...@apache.org> > wrote: > No, data is only served by one region server (even if it resides on > multiple data nodes). If it dies, clients need to wait for the log > replay and region reassignment. > > J-D > > On Fri, Dec 2, 2011 at 11:57 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote: > Why is HBase consisdered high in consistency and that it gives up > parition tolerance? My understanding is that failure of one data node > still doesn't impact client as they would re-adjust the list of > available data nodes. >