Re: about index in HBase

2009-11-16 Thread 梁景明
You can build it on top of Solr, as I do; HBase just supplies the map data.
2009/11/12 Ryan Rawson
> The HFile "index" is an implementation detail; it doesn't affect how
> the top level API presents itself. At the highest level HBase provides
> an API that lets you store sorted order keys, and then seek

Re: regionserver disconnection

2009-11-16 Thread Zhenyu Zhong
Here is the disk I/O and CPU usage around the time we had the RS disconnection on one machine that runs a RegionServer. It doesn't seem to be high. Similar disk and CPU usage has been seen before while HBase was running fine. So far I haven't found why my 10-minute session timeout doesn't apply. Still digg

Re: regionserver disconnection

2009-11-16 Thread stack
On Mon, Nov 16, 2009 at 12:05 PM, Zhenyu Zhong wrote:
> I just realized that there was a MapReduce job running during the time the
> regionserver disconnected from the zookeeper.
> That MapReduce Job was processing 500GB data and took about 8 minutes to
> finish. It launched over 2000 map tasks.

Re: regionserver disconnection

2009-11-16 Thread Zhenyu Zhong
I just realized that there was a MapReduce job running during the time the regionserver disconnected from ZooKeeper. That MapReduce job was processing 500 GB of data and took about 8 minutes to finish. It launched over 2000 map tasks. I doubt that this introduced resource contention between DataNod

Re: Hbase on Amazon S3?

2009-11-16 Thread Andrew Purtell
The new scripts in trunk at src/contrib/ec2 will offer this approach soon. Right now they simply back HDFS with instance storage (volatile) and rely on not having more than the HDFS replication factor (default = 3) instances crash or terminate at one time. Using EBS is a big win for its persiste

Re: Hbase on Amazon S3?

2009-11-16 Thread Vaibhav Puranik
We have had HBase 0.20.0 running on EC2 with EBS volumes since July 2009. We are using m1.large machines for all 4 nodes. All of our data resides on EBS volumes. This helps us back up the data. It also helps us bring up a separate cluster with the same data for QA purposes. So far n

Re: Hbase on Amazon S3?

2009-11-16 Thread stack
A recent long thread herein was about HBase on S3. In the end, there were issues remaining (I believe -- someone correct me if I'm wrong please), leaving aside the base fact that S3 is an eventually consistent filesystem whereas HDFS wants to be strongly consistent -- at least when it's running und

Hbase on Amazon S3?

2009-11-16 Thread Something Something
Anyone installed HBase on S3 (or EC2 for that matter)? Any pointers would be greatly appreciated. Thanks.