Re: Load balancer repeatedly close and open region in the same regionserver.

2012-07-30 Thread yuzhihong
Can you trace the master log to see why there were two region servers on that IP with different start codes? Thanks On Jul 29, 2012, at 10:46 PM, deanforwever2010 wrote: > hi Ted, I am on the same team as Howard. > We didn't find two region server processes running on > 192.168.18.40

Re: Load balancer repeatedly close and open region in the same regionserver.

2012-07-30 Thread Howard
We checked the region server machine using ps aux and found only one regionserver instance. We checked ZooKeeper using ls and found only one regionserver entry. [zk: localhost:2181(CONNECTED) 3] ls /unimportance-hbase/rs [192.168.18.38,60020,1343569932859, 192.168.18.39,60020,1343569991763, 192.168.18.3

Re: Best practices for custom filter class distribution?

2012-07-30 Thread Ben Kim
Yes, testing custom filters in HBase is sort of a pain. On Thu, Jun 28, 2012 at 5:53 AM, Evan Pollan wrote: > Thanks Amandeep -- I hadn't seen the FilterList. That should be able to > get me most of the way there by simply "indexing" and chaining together > DependentColumnFilters. > > I think I'm
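A minimal sketch of the approach described above: chaining DependentColumnFilters inside a FilterList and applying it to a Scan. Table, family, and qualifier names here are hypothetical, since the actual schema is not shown in the thread.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.DependentColumnFilter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.util.Bytes;

public class FilterChainSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");           // hypothetical table name
    byte[] cf = Bytes.toBytes("cf");

    // All filters must pass; each DependentColumnFilter keeps only cells whose
    // timestamp matches the timestamp of the named "index" column in the same row.
    FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    filters.addFilter(new DependentColumnFilter(cf, Bytes.toBytes("idx1")));
    filters.addFilter(new DependentColumnFilter(cf, Bytes.toBytes("idx2")));

    Scan scan = new Scan();
    scan.setFilter(filters);
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        System.out.println(r);
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}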

Re: Load balancer repeatedly close and open region in the same regionserver.

2012-07-30 Thread deanforwever2010
By my understanding, even if we started two region servers on one machine, the master would check whether a region server with the same IP and port had already registered; if so, the later one would be ignored. You can see the source code in org.apache.hadoop.hbase.master.ServerManager 2012/7/30 > Ca
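An illustrative sketch of the check described in this message. This is not the actual ServerManager code (the real handling of duplicate host:port registrations may differ); the class and method names are made up and only show the idea of ignoring a second registration for an already-registered host:port.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ServerRegistrySketch {
  // key: "host:port" of a region server, value: start code of the registered instance
  private final ConcurrentMap<String, Long> onlineServers =
      new ConcurrentHashMap<String, Long>();

  /**
   * Returns true if the server was accepted; false if a region server with the
   * same host:port is already registered, in which case the later one is ignored.
   */
  public boolean register(String host, int port, long startCode) {
    return onlineServers.putIfAbsent(host + ":" + port, startCode) == null;
  }
}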

Re: Coprocessor POC

2012-07-30 Thread Cyril Scetbon
I've given the values returned by the scan 'table' command in the hbase shell in my first email. Regards Cyril SCETBON On Jul 30, 2012, at 12:50 AM, Himanshu Vashishtha wrote: > And also, what do your cell values look like? > > Himanshu > > On Sun, Jul 29, 2012 at 3:54 PM, wrote: >> Can you use

Re: Coprocessor POC

2012-07-30 Thread Cyril Scetbon
Here is the stack trace (with hbase-0.94.0.jar); row count is 1: 12/07/30 15:08:53 WARN client.HConnectionManager$HConnectionImplementation: Error executing for row java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions

Re: Coprocessor POC

2012-07-30 Thread Ted Yu
Did your client include the following fix? HBASE-5821 Incorrect handling of null value in Coprocessor aggregation function min() (Maryann Xue) From the stack trace you provided, it looks like the NPE came from this line: sumVal = ci.add(sumVal, ci.castToReturnType(ci.getValue(colFamily,

Re: Coprocessor POC

2012-07-30 Thread Himanshu Vashishtha
On Mon, Jul 30, 2012 at 6:55 AM, Cyril Scetbon wrote: > I've given the values returned by scan 'table' command in hbase shell in my > first email. Somehow I missed the scan result in your first email. So, can you pass a LongColumnInterpreter instance instead of null? See TestAggregateProtocol me
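For reference, a sketch of the suggested call with a LongColumnInterpreter instead of null. It assumes the AggregateImplementation coprocessor is loaded for the table (for example via hbase.coprocessor.region.classes) and that the cells hold 8-byte long values; the table, family, and qualifier names are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter;
import org.apache.hadoop.hbase.util.Bytes;

public class SumSketch {
  public static void main(String[] args) throws Throwable {
    Configuration conf = HBaseConfiguration.create();
    AggregationClient aggClient = new AggregationClient(conf);

    // The scan must name the column family (it cannot be null) and the
    // qualifier whose cells are serialized longs.
    Scan scan = new Scan();
    scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("value"));   // hypothetical names

    // Pass a LongColumnInterpreter instead of null, as suggested above.
    long sum = aggClient.sum(Bytes.toBytes("mytable"),             // hypothetical table name
        new LongColumnInterpreter(), scan);
    System.out.println("sum = " + sum);
  }
}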

Region Server failure due to remote data node errors

2012-07-30 Thread Jay T
A couple of our region servers (in a 16-node cluster) crashed due to underlying Data Node errors. I am trying to understand how errors on remote data nodes impact other region server processes. *To briefly describe what happened:* 1) The cluster was in operation. All 16 nodes were up, reads and w

Re: Query a version of a column efficiently

2012-07-30 Thread Suraj Varma
You may need to set up your Eclipse workspace and search using references etc. To get started, this is one class that uses TimeRange-based matching ... org.apache.hadoop.hbase.regionserver.ScanQueryMatcher Also - Get is internally implemented as a Scan over a single row. Hope this gets you started.
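A small client-side sketch of the same idea (separate from the server internals Suraj points at): a Get restricted to a time range, Get being a single-row Scan under the hood. Table, row, and column names are hypothetical, as are the timestamp bounds.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class VersionGetSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");               // hypothetical table name
    byte[] cf = Bytes.toBytes("cf");
    byte[] qual = Bytes.toBytes("q");

    // Restrict the Get to a time range so the server only returns versions
    // whose timestamps fall inside [min, max).
    Get get = new Get(Bytes.toBytes("row1"));
    get.addColumn(cf, qual);
    get.setTimeRange(1343600000000L, 1343700000000L);         // example bounds
    get.setMaxVersions(1);                                    // newest version inside the range

    Result result = table.get(get);
    KeyValue kv = result.getColumnLatest(cf, qual);
    if (kv != null) {
      System.out.println("ts=" + kv.getTimestamp() + " value=" + Bytes.toString(kv.getValue()));
    }
    table.close();
  }
}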

Re: Coprocessor POC

2012-07-30 Thread Cyril Scetbon
Thanks, it's really better! I've read that by default it supports only Long values; that's why I was using a null ColumnInterpreter. Regards. Cyril SCETBON On Jul 30, 2012, at 5:56 PM, Himanshu Vashishtha wrote: > On Mon, Jul 30, 2012 at 6:55 AM, Cyril Scetbon wrote: > >> I've given the

Re: Region Server failure due to remote data node errors

2012-07-30 Thread N Keywal
Hi Jay, Yes, the whole log would be interesting, plus the logs of the datanode on the same box as the dead RS. What are your HBase & HDFS versions? The RS should be immune to hdfs errors. There are known issues (see HDFS-3701), but it seems you have something different... This: > java.nio.channels.

Re: Coprocessor POC

2012-07-30 Thread Himanshu Vashishtha
We should fix the reference then. Where did you read it? On Mon, Jul 30, 2012 at 10:43 AM, Cyril Scetbon wrote: > Thanks, it's really better ! > > I've read that by default it supports only Long values, that's why I was > using a null ColumnInterpreter. > > Regards. > Cyril SCETBON > > On Jul 30

Re: Cluster load

2012-07-30 Thread Mohit Anchlia
On Fri, Jul 27, 2012 at 6:03 PM, Alex Baranau wrote: > Yeah, your row keys start with \x00 which is = (byte) 0. This is not the > same as "0" (which is = (byte) 48). You know what to fix now ;) > > I made the required changes and it seems to be load balancing pretty well. I do have a follow-up que

Re: Region Server failure due to remote data node errors

2012-07-30 Thread Jay T
Thanks for the quick reply, Nicolas. We are using HBase 0.94 on Hadoop 1.0.3. I have uploaded the logs here: Region Server log: http://pastebin.com/QEQ22UnU Data Node log: http://pastebin.com/DF0JNL8K Appreciate your help in figuring this out. Thanks, Jay On 7/30/12 1:02 PM, N Keywal wro

Re: Cluster load

2012-07-30 Thread Alex Baranau
Glad to hear that the answers & suggestions helped you! The format you are seeing is the output of the org.apache.hadoop.hbase.util.Bytes.toStringBinary(..) method [1]. As you can see below, for "printable characters" it outputs the character itself, while for "non-printable" characters it outputs data in
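A quick illustration of the point made earlier in the thread, that (byte) 0 is not the character "0" (which is (byte) 48), as rendered by Bytes.toStringBinary(..):

import org.apache.hadoop.hbase.util.Bytes;

public class ToStringBinarySketch {
  public static void main(String[] args) {
    byte[] nonPrintable = new byte[] { 0, 1, 2 };            // (byte) 0 is NOT the character '0'
    byte[] printable = Bytes.toBytes("012");                 // '0' is (byte) 48

    System.out.println(Bytes.toStringBinary(nonPrintable));  // prints: \x00\x01\x02
    System.out.println(Bytes.toStringBinary(printable));     // prints: 012
  }
}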

Re: Cluster load

2012-07-30 Thread Mohit Anchlia
On Mon, Jul 30, 2012 at 11:58 AM, Alex Baranau wrote: > Glad to hear that answers & suggestions helped you! > > The format you are seeing is the output of > org.apache.hadoop.hbase.util.Bytes.toStringBinary(..) method [1]. As you > can see below, for "printable characters" it outputs the character

Re: Region Server failure due to remote data node errors

2012-07-30 Thread N Keywal
Hi Jay, As you said already, the pipeline for blk_5116092240243398556_489796 contains only dead nodes, and this is likely the cause of the wrong behavior. This block is used by an HLog file created just before the error. I don't get why there are 3 nodes in the pipeline; I would expect only 2. D

Re: Coprocessor POC

2012-07-30 Thread Cyril Scetbon
Unfortunately I can't remember/find it :( and I see in AggregationClient's javadoc that "Column family can't be null", so I suppose I should have read that in the first place! Thanks again Cyril SCETBON On Jul 30, 2012, at 7:30 PM, Himanshu Vashishtha wrote: > We should fix the reference then. Where

Retrieve Put timestamp

2012-07-30 Thread Pablo Musa
Hey guys, in my application the HBase timestamp is used as the version in my logic. I would like to know the best way to insert a new record and get its timestamp. I have come up with two possibilities: /* I could force the timestamp, but it is not a good idea since different servers * write in
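A rough sketch of two ways this can be done from the client. The first forces the timestamp on the client, matching the option the mail starts to describe; the second lets the region server assign it and reads it back afterwards, which is one common alternative but not necessarily the second option the original mail had in mind. Table and column names are hypothetical, and the read-back is not atomic with the write.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class PutTimestampSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");               // hypothetical table name
    byte[] row = Bytes.toBytes("row1");
    byte[] cf = Bytes.toBytes("cf");
    byte[] qual = Bytes.toBytes("q");

    // Option 1: force the timestamp on the client, so it is known up front.
    long ts = System.currentTimeMillis();
    Put put = new Put(row);
    put.add(cf, qual, ts, Bytes.toBytes("some value"));
    table.put(put);

    // Option 2: let the region server assign the timestamp, then read it back.
    Put put2 = new Put(row);
    put2.add(cf, qual, Bytes.toBytes("another value"));
    table.put(put2);
    Result result = table.get(new Get(row).addColumn(cf, qual));
    long serverTs = result.getColumnLatest(cf, qual).getTimestamp();
    System.out.println("client ts=" + ts + ", server-assigned ts=" + serverTs);

    table.close();
  }
}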

Web Admin Pages & SSL

2012-07-30 Thread Patrick Schless
I like having access to the web admin pages that HBase, HDFS, etc. provide. I can't find a way to put them behind SSL, though. For the HMaster it's easy enough (nginx+SSL as a reverse proxy), but the HMaster generates links like data01.company.com:60030. Is there a way to change the scheme and port

Re: Parallel scans

2012-07-30 Thread Bertrand Dechoux
Hi, Are you talking about a coprocessor or MapReduce input? If it is the former, it is up to you (the client). If it is the latter, I am not sure that - if scans were changed to be parallel (assuming they are sequential now) - the whole job would be noticeably faster. But I am interested in an an
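For the "up to the client" case, one way to parallelize a scan from the client side is to split it along region boundaries (via HTable.getStartEndKeys()) and run each piece in a thread pool. A rough sketch with a hypothetical table name, using a simple row count as the per-region work:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Pair;

public class ParallelScanSketch {
  public static void main(String[] args) throws Exception {
    final Configuration conf = HBaseConfiguration.create();
    final String tableName = "mytable";                       // hypothetical table name

    // Find region boundaries, then run one scan per region in its own thread.
    HTable table = new HTable(conf, tableName);
    Pair<byte[][], byte[][]> keys = table.getStartEndKeys();
    table.close();

    ExecutorService pool = Executors.newFixedThreadPool(4);
    List<Future<Long>> futures = new ArrayList<Future<Long>>();
    for (int i = 0; i < keys.getFirst().length; i++) {
      final byte[] start = keys.getFirst()[i];
      final byte[] stop = keys.getSecond()[i];
      futures.add(pool.submit(new Callable<Long>() {
        public Long call() throws Exception {
          HTable t = new HTable(conf, tableName);             // HTable is not thread-safe: one per thread
          ResultScanner scanner = t.getScanner(new Scan(start, stop));
          long count = 0;
          for (Result r : scanner) {
            count++;                                          // do per-row work here
          }
          scanner.close();
          t.close();
          return count;
        }
      }));
    }

    long total = 0;
    for (Future<Long> f : futures) {
      total += f.get();
    }
    pool.shutdown();
    System.out.println("rows: " + total);
  }
}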