Re: FilterList and SingleColumnValueFilter

2009-12-15 Thread Paul Ambrose
Hey Michael, If hbase-2037 will make it into 0.20.3, I am fine. If not, I would greatly appreciate you breaking it out for 0.20.3. Thanks, Paul On Dec 15, 2009, at 10:28 PM, stack wrote: > Paul: > > I can apply the fix from hbase-2037... I can break it out of the posted > patch that's up ther

Re: FilterList and SingleColumnValueFilter

2009-12-15 Thread stack
Paul: I can apply the fix from hbase-2037... I can break it out of the posted patch that's up there. Just say the word. St.Ack On Tue, Dec 15, 2009 at 4:17 PM, Ram Kulbak wrote: > Hi Paul, > > I've encountered the same problem. I think it's fixed as part of > https://issues.apache.org/jira/bro

Re: HBase Utility functions (for Java 5+)

2009-12-15 Thread Andrew Purtell
Thanks for the feedback, Paul. I agree the Builder pattern is an interesting option. Please see https://issues.apache.org/jira/browse/HBASE-2051 - Andy From: Paul Smith To: hbase-user@hadoop.apache.org Sent: Tue, December 15, 2009 3:21:44 PM Subject: Re: H
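
To make the idea concrete, here is a rough sketch of what a Builder-style wrapper over the raw client could look like. The PutBuilder class below is hypothetical (it is not an existing HBase API and not necessarily what HBASE-2051 proposes); only Put, HTable and Bytes are stock 0.20-era client classes.

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    // Hypothetical convenience wrapper; only Put/HTable/Bytes are real HBase classes.
    public class PutBuilder {
      private final Put put;

      public PutBuilder(String row) {
        this.put = new Put(Bytes.toBytes(row));
      }

      // Chainable column setter taking plain strings for family, qualifier and value.
      public PutBuilder col(String family, String qualifier, String value) {
        put.add(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
        return this;
      }

      public void send(HTable table) throws IOException {
        table.put(put);
      }
    }

    // Usage (table is an already-opened HTable):
    //   new PutBuilder("row-1")
    //       .col("info", "name", "Paul")
    //       .col("info", "email", "paul@example.com")
    //       .send(table);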

Re: FilterList and SingleColumnValueFilter

2009-12-15 Thread Ram Kulbak
Hi Paul, I've encountered the same problem. I think it's fixed as part of https://issues.apache.org/jira/browse/HBASE-2037 Regards, Yoram On Wed, Dec 16, 2009 at 10:45 AM, Paul Ambrose wrote: > I ran into some problems with FilterList and SingleColumnValueFilter. > > I created a FilterList wi

FilterList and SingleColumnValueFilter

2009-12-15 Thread Paul Ambrose
I ran into some problems with FilterList and SingleColumnValueFilter. I created a FilterList with MUST_PASS_ONE and two SingleColumnValueFilters (each testing equality on a different column) and queried some trivial data: http://pastie.org/744890 The problems that I encountered were two-fold: Sin
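
For readers hitting the same thing, a minimal sketch of the setup described above, assuming the 0.20-era client API; the table, family, qualifier and value names are made up for illustration.

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
    import org.apache.hadoop.hbase.filter.FilterList;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class OrFilterExample {
      public static void main(String[] args) throws IOException {
        // MUST_PASS_ONE is a logical OR: a row passes if either filter passes.
        FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ONE);
        filters.addFilter(new SingleColumnValueFilter(
            Bytes.toBytes("fam"), Bytes.toBytes("col1"),
            CompareOp.EQUAL, Bytes.toBytes("value1")));
        filters.addFilter(new SingleColumnValueFilter(
            Bytes.toBytes("fam"), Bytes.toBytes("col2"),
            CompareOp.EQUAL, Bytes.toBytes("value2")));

        Scan scan = new Scan();
        scan.setFilter(filters);

        HTable table = new HTable(new HBaseConfiguration(), "mytable");
        ResultScanner scanner = table.getScanner(scan);
        try {
          for (Result r : scanner) {
            System.out.println(r);  // rows matching either column value
          }
        } finally {
          scanner.close();
        }
      }
    }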

Re: HBase Utility functions (for Java 5+)

2009-12-15 Thread Paul Smith
On 16/12/2009, at 7:04 AM, stack wrote: > On Tue, Dec 15, 2009 at 9:56 AM, Kevin Peterson wrote: > >> These kinds of cleaner APIs would be a good way to prevent the standard >> situation of one engineer on the team figuring out HBase, then others say >> "why is this so complicated" so they writ

Re: hlogs do not get cleared

2009-12-15 Thread Kevin Peterson
On Tue, Dec 15, 2009 at 10:43 AM, Jean-Daniel Cryans wrote: > > Too many hlogs means that the inserts are hitting a lot of regions, > that those regions aren't filled enough to flush so that we have to > force flush them to give some room. When you added region servers, it > spread the regions loa

Re: Help on HBase shell alter command usage

2009-12-15 Thread Ted Yu
That works. The scan command gives values for columns. Is there a shell command which lists unique row keys, such as 'com.onsoft.www:http/'? Thanks On Tue, Dec 15, 2009 at 12:09 PM, stack wrote: > Try: > > hbase(main):005:0> get 'crawltable', 'com.onsoft.www:http/', { COLUMNS => > 'stt:'} > > i

Re: HBase Utility functions (for Java 5+)

2009-12-15 Thread Tim Robertson
Seems like an intuitive option to me. Tim On Tue, Dec 15, 2009 at 9:04 PM, stack wrote: > On Tue, Dec 15, 2009 at 9:56 AM, Kevin Peterson wrote: > >> These kinds of cleaner APIs would be a good way to prevent the standard >> situation of one engineer on the team figuring out HBase, then others

Re: Help on HBase shell alter command usage

2009-12-15 Thread stack
Try: hbase(main):005:0> get 'crawltable', 'com.onsoft.www:http/', { COLUMNS => 'stt:'} i.e. '=>' rather than '='. Also, it's COLUMNS (uppercase, I believe) rather than column. Run 'help' in the shell for help and examples. St.Ack On Tue, Dec 15, 2009 at 11:53 AM, Ted Yu wrote: > Hi, > I saw t
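
For reference, a rough Java-client equivalent of that shell get, assuming the 0.20-era API (the table name, row key and 'stt:' family are taken from the thread; everything else is boilerplate).

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class GetSttExample {
      public static void main(String[] args) throws IOException {
        HTable table = new HTable(new HBaseConfiguration(), "crawltable");
        Get get = new Get(Bytes.toBytes("com.onsoft.www:http/"));
        get.addFamily(Bytes.toBytes("stt"));  // the 'stt:' family from the scan output
        Result result = table.get(get);
        // value() returns the value of the first column in the Result
        System.out.println(Bytes.toStringBinary(result.value()));
      }
    }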

Re: HBase Utility functions (for Java 5+)

2009-12-15 Thread stack
On Tue, Dec 15, 2009 at 9:56 AM, Kevin Peterson wrote: > These kinds of cleaner APIs would be a good way to prevent the standard > situation of one engineer on the team figuring out HBase, then others say > "why is this so complicated" so they write an internal set of wrappers and > utility metho

Re: Help on HBase shell alter command usage

2009-12-15 Thread Ted Yu
Hi, I saw the following from the scan 'crawltable' command in the hbase shell: ... com.onsoft.www:http/ column=stt:, timestamp=1260405530801, value=\003 3 row(s) in 0.2490 seconds How do I query the value for the stt column? hbase(main):005:0> get 'crawltable', 'com.onsoft.www:http/', { column='stt:

Re: hlogs do not get cleared

2009-12-15 Thread stack
I'd advise setting the upper limit for WALs back down to 32 rather than the 96 you have. Let's figure out why old logs are not being cleared up even with only 32. At 96, it means that on a crash, the log-splitting process has more logs to process (~96 rather than ~32). It'll take longer for the split p
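
The WAL cap being discussed maps to the hbase.regionserver.maxlogs setting (property name assumed from the 0.20-era code). A minimal hbase-site.xml entry pinning it at 32, as advised above, might look like this:

    <!-- hbase-site.xml on each region server -->
    <property>
      <name>hbase.regionserver.maxlogs</name>
      <value>32</value>
    </property>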

Re: running unit test based on HBaseClusterTestCase

2009-12-15 Thread stack
Order can be important. Don't forget to include the conf directories. Below is from an Eclipse .classpath that seems to work for me:

Re: Performance related question

2009-12-15 Thread Patrick Hunt
Btw, nothing says that ZK users (including HBase) _must_ run a multi-node ZK ensemble. For coordination tasks a single ZK server (standalone mode) is often sufficient; you just need to realize you are sacrificing reliability/availability. Going from 1 -> 3 -> 5 -> 7 ZK servers in an ensemble should
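
For HBase specifically, pointing the cluster at a single standalone ZooKeeper server is just a matter of listing one host in the quorum. A minimal hbase-site.xml sketch, with a hypothetical hostname:

    <!-- hbase-site.xml: a one-node ZooKeeper "ensemble" (standalone mode) -->
    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>zkhost.example.com</value>  <!-- hypothetical hostname -->
    </property>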

Re: running unit test based on HBaseClusterTestCase

2009-12-15 Thread Guohua Hao
Yes, I included all the necessary jar files, I think. I guess my problem is probably related to my Eclipse setup. I can create a MiniDFSCluster object by running my application on the command line (e.g., bin/hadoop myApplicationClass), and a MiniDFSCluster object is created inside the main function o

Re: Performance related question

2009-12-15 Thread Jean-Daniel Cryans
Given that m1.small has 1 CPU, 1.7GB of RAM, and 1/8 (or less) of the I/O of the host machine, and factoring in that those machines are networked, as a whole I expect it to be much, much slower than your local machine. Those machines are so under-powered that the overhead of hadoop/hbase probably over

Re: hlogs do not get cleared

2009-12-15 Thread Jean-Daniel Cryans
Kevin, Too many hlogs means that the inserts are hitting a lot of regions, and that those regions aren't filled enough to flush, so we have to force-flush them to free some room. When you added region servers, it spread the region load so that hlogs were getting filled at a slower rate. Could yo

hlogs do not get cleared

2009-12-15 Thread Kevin Peterson
We're running a 13-node HBase cluster. We had some problems a week ago with it being overloaded and errors related to not being able to find a block on HDFS, but adding four more nodes and increasing max heap from 3GB to 4.5GB on all nodes fixed those problems. Looking at the logs now, though, we se

Re: Fwd: What should we expect from Hama Examples Rand?

2009-12-15 Thread Andrew Purtell
You are missing some supporting jar. > java.io.IOException: java.io.IOException: java.lang.NullPointerException > at java.lang.Class.searchMethods(Unknown Source) Note that the exception is in a JVM method (java.lang.Class.searchMethods). This is not really an HBase problem per se, but instea

Re: running unit test based on HBaseClusterTestCase

2009-12-15 Thread Stack
Do you have the Hadoop jars in your Eclipse classpath? Stack On Dec 14, 2009, at 10:58 PM, Guohua Hao wrote: Hello All, In my own application, I have a unit test case which extends HBaseClusterTestCase in order to test some of my operations over the HBase cluster. I override the setUp function in my
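
For anyone following along, a bare-bones version of that kind of test, assuming the 0.20-era HBaseClusterTestCase (whose setUp spins up a mini DFS + HBase cluster); the table, family and row names below are made up.

    import org.apache.hadoop.hbase.HBaseClusterTestCase;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MyClusterTest extends HBaseClusterTestCase {

      @Override
      protected void setUp() throws Exception {
        super.setUp();  // starts the mini cluster
        HBaseAdmin admin = new HBaseAdmin(conf);  // 'conf' is inherited from the base class
        HTableDescriptor desc = new HTableDescriptor("testtable");
        desc.addFamily(new HColumnDescriptor("f1"));
        admin.createTable(desc);
      }

      public void testPut() throws Exception {
        HTable table = new HTable(conf, "testtable");
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("f1"), Bytes.toBytes("q1"), Bytes.toBytes("v1"));
        table.put(put);

        Result result = table.get(new Get(Bytes.toBytes("row1")));
        assertFalse(result.isEmpty());  // the row we just wrote should be there
      }
    }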

Re: HBase Utility functions (for Java 5+)

2009-12-15 Thread Kevin Peterson
On Tue, Dec 15, 2009 at 9:21 AM, Gary Helmling wrote: > I completely agree with the need to understand both the fundamental HBase > API, and how HBase stores data at a low level. Both are very important in > knowing how to structure your data for best performance. Which you should > figure out

Re: HBase Utility functions (for Java 5+)

2009-12-15 Thread Gary Helmling
I completely agree with the need to understand both the fundamental HBase API and how HBase stores data at a low level. Both are very important in knowing how to structure your data for best performance, which you should figure out before moving on to other niceties. As far as the actual data s

Re: Performance related question

2009-12-15 Thread Something Something
Thanks J-D & Mtohiko for the tips. There's a significant improvement in performance, but still room for more. In my local pseudo-distributed mode the two MapReduce jobs now run in less than 4 minutes (down from 32 minutes), and in a cluster of 10 nodes + 5 ZK nodes they run in 11 minutes (down from 1 ho

Re: HBase Utility functions (for Java 5+)

2009-12-15 Thread Edward Capriolo
On Tue, Dec 15, 2009 at 11:04 AM, Gary Helmling wrote: > This definitely seems to be a common initial hurdle, though I think each > project comes at it with their own specific needs. There are a variety of > frameworks or libraries you can check out on the Supporting Projects page: > http://wiki.

Re: HBase Utility functions (for Java 5+)

2009-12-15 Thread Gary Helmling
This definitely seems to be a common initial hurdle, though I think each project comes at it with its own specific needs. There are a variety of frameworks or libraries you can check out on the Supporting Projects page: http://wiki.apache.org/hadoop/SupportingProjects In my case, I wanted a sim

Re: HBase Utility functions (for Java 5+)

2009-12-15 Thread Edward Capriolo
On Tue, Dec 15, 2009 at 1:03 AM, stack wrote: > HBase requires java 6 (1.6) or above. > St.Ack > > On Mon, Dec 14, 2009 at 7:41 PM, Paul Smith wrote: > >> Just wondering if anyone knows of an existing Hbase utility library that is >> open sourced that can assist those that have Java5 and above.

Fwd: What should we expect from Hama Examples Rand?

2009-12-15 Thread Edward J. Yoon
Hi, the 'RAND' example of hama-examples.jar is basically a simple M/R job that creates a table filled with random numbers. So, before running Hama, please check whether you are able to create tables via the hbase shell. Can one of the HBase developers help with this problem? Thanks -- Forwarded message
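
If the shell isn't handy, the same sanity check can be done from Java; a sketch against the 0.20-era client API, with an arbitrary table and family name:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CreateTableCheck {
      public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
        HTableDescriptor desc = new HTableDescriptor("randcheck");
        desc.addFamily(new HColumnDescriptor("data"));
        admin.createTable(desc);  // fails fast if HBase/ZooKeeper isn't healthy
        System.out.println("table exists: " + admin.tableExists("randcheck"));
      }
    }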