Re: Region server failure question

2012-08-02 Thread Lars George
That is correct, the client blocks and retries a configurable amount of time until the regions are available again. Lars On Aug 2, 2012, at 7:01 AM, Mohit Anchlia wrote: > On Wed, Aug 1, 2012 at 12:52 PM, Mohammad Tariq wrote: > >> Hello Mohit, >> >> If replication factor is set to som
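
The blocking-and-retrying behaviour described in this reply can be illustrated with a minimal sketch. It is not the HBase client code: `call_with_retries`, `max_retries`, and `pause_seconds` are hypothetical names chosen for illustration, and `ConnectionError` stands in for whatever error the client sees while a region is unavailable.

```python
import time


def call_with_retries(operation, max_retries=10, pause_seconds=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_retries attempts have failed.

    Hypothetical helper: the names and defaults are illustrative only.
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            return operation()
        except ConnectionError as err:  # stand-in for "region not available yet"
            last_error = err
            if attempt < max_retries - 1:
                sleep(pause_seconds)  # the caller blocks here between attempts
    raise TimeoutError(f"gave up after {max_retries} attempts") from last_error
```

From the caller's point of view the call simply takes longer while the regions are being reassigned, and fails only once the configured number of attempts is exhausted.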

hbase can't start:KeeperErrorCode = NoNode for /hbase

2012-08-02 Thread abloz...@gmail.com
I even moved /hbase to hbase2, created a new dir /hbase1, and modified hbase-site.xml to set hbase.rootdir = hdfs://Hadoop48:54310/hbase1 and zookeeper.znode.parent = /hbase1. But the error message is still: KeeperErrorCode = NoNode for /hbase. Can anybody help? Thanks! Andy zhou 201

Re: hbase can't start:KeeperErrorCode = NoNode for /hbase

2012-08-02 Thread N Keywal
Hi, The issue is in ZooKeeper, not directly HBase. It seems its data is corrupted, so it cannot start. You can configure zookeeper to another data directory to make it start. N. On Thu, Aug 2, 2012 at 11:11 AM, abloz...@gmail.com wrote: > I even move /hbase to hbase2, and create a new dir /hba

RE: Region balancing question

2012-08-02 Thread Anoop Sam John
Hi, which version are you using? From 0.94, the balancer has two ways of balancing. One is by-table balancing, in which the balancer makes sure the regions of one table are balanced across the RSs. The other, generic way of balancing considers all the regions acros
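
The difference between the two balancing modes mentioned here can be made concrete with a small sketch. This is not the HBase balancer; it only counts regions. The server and table names (`rs1`, `rs2`, `hot`, `cold`) and all function names are made up for illustration.

```python
from collections import Counter, defaultdict


def regions_per_server(assignments):
    """Total region count per server, ignoring which table each region belongs to."""
    return Counter(server for server, _table in assignments)


def regions_per_server_by_table(assignments):
    """Region count per server, computed separately for each table."""
    counts = defaultdict(Counter)
    for server, table in assignments:
        counts[table][server] += 1
    return counts


def spread(counter, servers):
    """Difference between the most and least loaded server."""
    loads = [counter.get(server, 0) for server in servers]
    return max(loads) - min(loads)


# One (server, table) pair per region: hypothetical two-server cluster.
servers = ["rs1", "rs2"]
assignments = [("rs1", "hot")] * 4 + [("rs2", "cold")] * 4

overall = regions_per_server(assignments)
by_table = regions_per_server_by_table(assignments)
print("overall spread:", spread(overall, servers))
print("spread for table 'hot':", spread(by_table["hot"], servers))
```

In this made-up layout every server holds the same number of regions, so a balancer that looks only at totals sees nothing to do, while a per-table view shows that all regions of the busy table sit on one server.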

RE: Region balancing question

2012-08-02 Thread Anoop Sam John
Seems this is available from 0.92 version.. See HBASE-3373 -Anoop- From: Anoop Sam John [anoo...@huawei.com] Sent: Thursday, August 02, 2012 3:24 PM To: user@hbase.apache.org Subject: RE: Region balancing question Hi Which version you are using? From

RE: Region balancing question

2012-08-02 Thread Ramkrishna.S.Vasudevan
hbase.master.loadbalance.bytable - By default it is true. Regards Ram > -----Original Message----- > From: Anoop Sam John [mailto:anoo...@huawei.com] > Sent: Thursday, August 02, 2012 3:32 PM > To: user@hbase.apache.org > Subject: RE: Region balancing question > > Seems this is available from 0.

Re: How to query by rowKey-infix

2012-08-02 Thread Christian Schäfer
OK, at first I will try the scans. If that's too slow I will have to upgrade HBase (currently 0.90.4-cdh3u2) to be able to use coprocessors. Currently I'm stuck at the scans because they require two steps (and therefore some kind of filter chaining). The key: userId-dateInMillis-sessionId At first

Re: hbase can't start:KeeperErrorCode = NoNode for /hbase

2012-08-02 Thread abloz...@gmail.com
Thank you, Keywal and Mohammad. I also think the data is corrupted, but ZooKeeper is internal to HBase here, and I don't know how to change the ZooKeeper data directory. I'll try this way. So if the Java process is killed abruptly, the data may get corrupted. But sometimes the stop shell script does not work. Here

WG: How to query by rowKey-infix

2012-08-02 Thread Christian Schäfer
Excuse my double posting. Here is the complete mail: OK, at first I will try the scans. If that's too slow I will have to upgrade hbase (currently 0.90.4-cdh3u2) to be able to use coprocessors. Currently I'm stuck at the scans because it requires two steps (therefore maybe some kind of fi

Re: Filter with State

2012-08-02 Thread Jerry Lam
Hi Lars: That is useful. I appreciate it. The idea about cross row transaction is an interesting one. Can I have an iterator on the client side that get rows from a coprocessor? (i.e. Filtered rows are streamed into the client application and client can access them via iterator) Best Regards, J

KeyValue size too large

2012-08-02 Thread Bai Shen
I'm running Nutch 2 which uses HBase for storage. It is attempting to store files bigger than 10MB into HBase. This causes the KeyValue size too large exception. However, I've set the hbase.client.keyvalue.maxsize to 0, -1, and 100MB. None of those have had an effect. AFAIK, it's finding my hba

Re: Region balancing question

2012-08-02 Thread Ted Yu
0.92 doesn't have this feature. The change was rolled back. FYI On Thu, Aug 2, 2012 at 3:01 AM, Anoop Sam John wrote: > Seems this is available from 0.92 version.. See HBASE-3373 > > -Anoop- > > From: Anoop Sam John [anoo...@huawei.com] > Sent: Thursda

Re: Filter with State

2012-08-02 Thread lars hofhansl
Hi Jerry, you could create a RegionObserver implementation and have that implement the postScannerOpen hook and wrap the passed scanner with your own RegionScanner to do the filtering. Now, RegionObservers are still per region, so actually that would not help you either. Your best bet might b

RE: Retrieve Put timestamp

2012-08-02 Thread Wei Tan
+1. So far I think timestamp is very useful. I would imagine if we can configure the return, say in pre/post put, it would be even nicer. Thanks, Wei Wei Tan Research Staff Member IBM T. J. Watson Research Center 19 Skyline Dr, Hawthorne, NY 10532 w...@us.ibm.com; 914-784-6752 From: "Ramk

Re: Poor data locality of MR job

2012-08-02 Thread Bryan Keller
I pre-split the table. The regionservers have gone down on occasion but have been up for a while (weeks). How could that result in having no regions on one node? On Aug 1, 2012, at 11:39 PM, Adrien Mogenet wrote: > Did you pre split your table or did you let balancer assign regions to > regio

Re: Region balancing question

2012-08-02 Thread Bryan Keller
I'm using 0.92 (Cloudera CDH4). Yes I definitely do not want to balance all regions across all tables together, as some tables are much more active than others and thus some regions are barely being used. I was thinking this might be what the balancer was doing. The regions are balanced in terms

Re: Region balancing question

2012-08-02 Thread Kevin O'dell
Bryan, https://issues.apache.org/jira/browse/HBASE-3373 did not make it into CDH4, and there is no really easy way to do this on your own. I have attached some sample code to get you started with writing your own (a colleague of mine wrote it). On Thu, Aug 2, 2012 at 1:20 PM, Bryan Keller wrot

Re: Region balancing question

2012-08-02 Thread Elliott Clark
Even when balancing by table the current default balancer does not take into account region size, request rate, or memory usage. If you want those things there's a new balancer in trunk (slated for 0.96) that gives these things. However that patch is a little bit more involved and applying it to

Re: Poor data locality of MR job

2012-08-02 Thread Jean-Daniel Cryans
On Wed, Aug 1, 2012 at 11:31 PM, Bryan Keller wrote: > I have an 8 node cluster and a table that is pretty well balanced with on > average 36 regions/node. When I run a mapreduce job on the cluster against > this table, the data locality of the mappers is poor, e.g 100 rack local > mappers and

Re: KeyValue size too large

2012-08-02 Thread Jean-Daniel Cryans
It sounds like you are overlooking something, are you sure the client is really picking the hbase-site.xml you think it is? Also I don't get your reference to the hbase.rootdir, the client doesn't use it and hbase.client.keyvalue.maxsize is client-side only. J-D On Thu, Aug 2, 2012 at 7:26 AM, B

Re: KeyValue size too large

2012-08-02 Thread Bai Shen
That was what I was missing. I moved it and that fixed it. Thanks. On Thu, Aug 2, 2012 at 2:42 PM, Jean-Daniel Cryans wrote: > It sounds like you are overlooking something, are you sure the client > is really picking the hbase-site.xml you think it is? > > Also I don't get your reference to the
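
The size check discussed in this thread can be sketched as follows. This is an illustration, not the HBase client source. It assumes the check is skipped when the configured limit is zero or negative, and that the default limit is 10 MB (inferred from the 10MB figure in the original question). `validate_keyvalue_size` is a hypothetical name.

```python
DEFAULT_MAX_KEYVALUE_SIZE = 10 * 1024 * 1024  # assumed default of 10 MB


def validate_keyvalue_size(kv_length, max_size=DEFAULT_MAX_KEYVALUE_SIZE):
    """Reject a cell larger than max_size bytes.

    Assumption: a limit of zero or less disables the check entirely.
    """
    if max_size > 0 and kv_length > max_size:
        raise ValueError("KeyValue size too large")
```

Because a check like this runs in the client, a limit set in a configuration file that the client never loads has no effect, which is consistent with the resolution reported above.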

Re: Poor data locality of MR job

2012-08-02 Thread Bryan Keller
I see what you mean about block locality, that is at the regionserver level, transparent to the MR job. This doesn't happen only to the final mappers, some of the early mappers are rack local. The table is reasonably well distributed across the nodes but not perfectly (that is a question I have

Re: How to query by rowKey-infix

2012-08-02 Thread Alex Baranau
Hi Christian! If we set secondary indexes aside and assume you are going with "heavy scans", you can try the following two things to make them much faster, if they are appropriate to your situation, of course. 1. > Is there a more elegant way to collect rows within time range X? > (Unfortunately, the dat

Re: How to query by rowKey-infix

2012-08-02 Thread Matt Corgan
Also Christian, don't forget you can read all the rows back to the client and do the filtering there using whatever logic you like. HBase Filters can be thought of as an optimization (predicate push-down) over client-side filtering. Pulling all the rows over the network will be slower, but I don'
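
Client-side filtering of the kind suggested here could look roughly like the sketch below. It assumes the row key layout quoted earlier in this thread (user id, timestamp in milliseconds, session id, joined by hyphens) and additionally assumes that the user id itself contains no hyphen. The function names and sample keys are hypothetical, and the rows are plain strings rather than HBase results.

```python
def parse_row_key(row_key):
    """Split a key of the form userId-dateInMillis-sessionId into its parts.

    Assumption: userId contains no hyphen; sessionId may contain hyphens.
    """
    user_id, date_in_millis, session_id = row_key.split("-", 2)
    return user_id, int(date_in_millis), session_id


def rows_in_time_range(row_keys, start_millis, end_millis):
    """Yield keys whose timestamp infix lies in [start_millis, end_millis)."""
    for key in row_keys:
        _user_id, timestamp, _session_id = parse_row_key(key)
        if start_millis <= timestamp < end_millis:
            yield key


# Made-up keys as they might come back from a full scan.
scanned = [
    "alice-1343900000000-s1",
    "alice-1343990000000-s2",
    "bob-1343905000000-s3-extra",
]
matching = list(rows_in_time_range(scanned, 1343900000000, 1343910000000))
```

Every scanned row still crosses the network before it is discarded, which is the cost a server-side filter avoids.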

Re: How to query by rowKey-infix

2012-08-02 Thread Alex Baranau
I think this is exactly what Christian is trying to (and should be trying to) avoid ;). I can't imagine a use case where you need to filter something and can do it with (at least) a server-side filter, and yet you want to do it on the client side... Doing filtering on clien

Re: How to query by rowKey-infix

2012-08-02 Thread Matt Corgan
Yeah - just thought I'd point it out since people often have small tables in their cluster alongside the big ones, and when generating reports, sometimes you don't care if it finishes in 10 minutes vs an hour. On Thu, Aug 2, 2012 at 6:15 PM, Alex Baranau wrote: > I think this is exactly what Chr

Re: HBaseTestingUtility on windows

2012-08-02 Thread N Keywal
Hi Mohit, For simple cases, it works for me for hbase 0.94 at least. But I'm not sure it works for all features. I've never tried to run hbase unit tests on windows for example. N. On Fri, Aug 3, 2012 at 6:01 AM, Mohit Anchlia wrote: > I am trying to run mini cluster using HBaseTestingUtility C