Re: just open sourced Orderly -- a row key schema system (composite keys, etc) for use with HBase

2011-11-16 Thread Denis Kreis
Hi Mike, what is happening with the project? Was it moved to another location? Best regards, Denis

Re: just open sourced Orderly -- a row key schema system (composite keys, etc) for use with HBase

2011-11-16 Thread Michael Dalton
Hi Denis, Yeah, we got a corporate github account; it's now at https://github.com/zettaset/orderly . Sorry for the confusion. Best, Mike On Wed, Nov 16, 2011 at 12:24 AM, Denis Kreis wrote: > Hi Mike, > > whats about the project? Was it moved to another location? > > Best regards, > Denis > > >

Re: Query on analyze big data with Hbase

2011-11-16 Thread Cosmin Lehene
You should consider looking over the available HBase resources. There's an online book http://hbase.apache.org/book.html and there's Lars George's book from O'Reilly (http://shop.oreilly.com/product/0636920014348.do) On 11/16/11 6:39 AM, "Stuti Awasthi" wrote: >Hi all, > >I have a scenario in wh

balancer stopped working

2011-11-16 Thread Matthias Hofschen
Hi, we had a case today where the load balancer stopped working (cloudera-cdh3-u1, 52 nodes). Basically we had a hot region that we moved to another node. Shortly thereafter the regionserver of that region was stopped. In the master logs we see that the master is trying to contact this regionserver to m

Re: balancer stopped working

2011-11-16 Thread yuzhihong
Can you post the stack trace and relevant log snippets ? Thanks On Nov 16, 2011, at 2:40 AM, Matthias Hofschen wrote: > Hi, > we had a case today where the loadbalancer stopped working. > (cloudera-cdh3-u1, 52nodes). Basically we had a hot region that we moved to > another node. Shortly ther

Re: balancer stopped working

2011-11-16 Thread Matthias Hofschen
Hi this is the stacktrace of the master process till restart: 2011-11-16 09:28:20,007 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=51 regions=76793 average=1505.7451 mostloaded=1506 leastloaded=1506 2011-11-16 09:29:51,276 INFO org.apache.hadoop.hbase.master.

Not able to change the VERSION of hbase row

2011-11-16 Thread Vamshi Krishna
Hi, I am trying to create an HBase table and insert data into it. When I wanted to change the maximum number of versions of a row in HBase from the default to 1, it worked well, i.e. whenever I add a new row to the table, the latest value is seen. But when I tried to change it using the HBase shell like the following, I cou

Scans and lexical sorting

2011-11-16 Thread Mark
Section 5.7.3 of the HBase book displays a scan operation: HTable htable = ... // instantiate HTable Scan scan = new Scan(); scan.addColumn(Bytes.toBytes("cf"),Bytes.toBytes("attr")); scan.setStartRow( Bytes.toBytes("row")); scan.setStopRow( Bytes.toBytes("row" + new byte[] {0})); // note

RE: balancer stopped working

2011-11-16 Thread Ramkrishna S Vasudevan
Can you share the logs to confirm things? Regards Ram -Original Message- From: Matthias Hofschen [mailto:hofsc...@gmail.com] Sent: Wednesday, November 16, 2011 4:10 PM To: hbase-u...@hadoop.apache.org Subject: balancer stopped working Hi, we had a case today where the loadbalancer stop

Re: Scans and lexical sorting

2011-11-16 Thread lars hofhansl
Hi Mark, good find. I think that works by accident and the book is wrong. "row" +  new byte[] {0} will use byte[].toString() and actually result in something like: "row[B@152b6651", which (again accidentally) sorts past rowN. "row" + new byte[] {255} is not better, though. You'd have to construct
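Lars's point can be verified with plain Java: concatenating a `byte[]` onto a String invokes the array's default `toString()`, yielding `"row[B@<hash>"` rather than "row" followed by a 0x00 byte. A minimal sketch (class name and hex-dump helper are illustrative, not from the thread) contrasting the broken and the correct construction:

```java
public class StopRowDemo {
    public static void main(String[] args) {
        // Broken: "+" on a byte[] calls the array's default toString(),
        // producing "row[B@<hashcode>" -- not "row" plus a trailing 0x00.
        String naive = "row" + new byte[] {0};
        System.out.println(naive.startsWith("row[B@"));

        // Correct: build the stop row as raw bytes -- the UTF-8 bytes of
        // "row" followed by a single 0x00 byte.
        byte[] prefix = "row".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        byte[] stopRow = java.util.Arrays.copyOf(prefix, prefix.length + 1);
        StringBuilder hex = new StringBuilder();
        for (byte b : stopRow) hex.append(String.format("%02x", b));
        System.out.println(hex); // hex dump of the stop row bytes
    }
}
```

In HBase itself, `Bytes.add(Bytes.toBytes("row"), new byte[] {0})` produces the same byte array.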

RE: n00b trying to run HBase example code

2011-11-16 Thread Royston Sellman
Thanks for your suggestion J-D, I hadn't tried that. So, following your advice, below is the error log from one of the slaves in my cluster. Maybe those "connection refused" messages are the cause of the exception... Does it ring any bells for you? 2011-11-16 14:37:14,641 INFO org.apac

Re: n00b trying to run HBase example code

2011-11-16 Thread Jean-Daniel Cryans
Ah right: > 2011-11-16 14:37:14,670 INFO org.apache.zookeeper.ClientCnxn: Opening socket > connection to server localhost/127.0.0.1:2181 Unless you have a ZK server running on every node (you shouldn't), then it's not going to find it. Your job needs to know about your zookeeper configuration. C
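For context, a MapReduce client normally picks up the quorum from an hbase-site.xml on the job's classpath rather than falling back to localhost. A sketch of the relevant fragment (host names are placeholders):

```xml
<!-- hbase-site.xml fragment on the client/job classpath -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
```

Equivalently, the job can set `hbase.zookeeper.quorum` directly on its Configuration object before submitting.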

Re: Not able to change the VERSION of hbase row

2011-11-16 Thread Jean-Daniel Cryans
You need to tell the Get or Scan to fetch more versions. For example, the help for the get commands gives this example: hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4} In the API you would use http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.htm

Re: Facing Issues with RowCounter

2011-11-16 Thread Jean-Daniel Cryans
What I can decrypt from those outputs is that you have a total of 7 rows, and none of them have data in the "Set" column family. Is it the case or not? Without more info from you, it's hard to tell. J-D On Tue, Nov 15, 2011 at 11:41 PM, Stuti Awasthi wrote: > Hi, > I tried to use MR RowCounter t

Re: just open sourced Orderly -- a row key schema system (composite keys, etc) for use with HBase

2011-11-16 Thread Stack
On Wed, Nov 16, 2011 at 12:32 AM, Michael Dalton wrote: > Hi Denis, > > Yeah we got a corporate github account, it's now at > https://github.com/zettaset/orderly . Sorry for the confusion > Consider sticking a link to Orderly up here Michael: http://wiki.apache.org/hadoop/SupportingProjects Yours

Re: just open sourced Orderly -- a row key schema system (composite keys, etc) for use with HBase

2011-11-16 Thread Michael Dalton
Thanks, will do! Best, Mike On Wed, Nov 16, 2011 at 10:14 AM, Stack wrote: > On Wed, Nov 16, 2011 at 12:32 AM, Michael Dalton > wrote: > > Hi Denis, > > > > Yeah we got a corporate github account, it's now at > > https://github.com/zettaset/orderly . Sorry for the confusion > > > > Consider s

Fwd: Upgrading master hardware

2011-11-16 Thread Mark
We will be adding more memory to our master node in the near future. We generally don't mind if our map/reduce jobs are unable to run for a short period, but we are more concerned about the impact this may have on our HBase cluster. Will HBase continue to work with hadoop's name-node and/or HMa

Re: Upgrading master hardware

2011-11-16 Thread Karthik Ranganathan
Mark, If the NN goes down the entire HBase cluster will be halted. If the HMaster is down, you will not be able to get regions assigned (in case any regionserver goes down). So the best thing to do is start the NN and HMaster on another node, do your upgrade, and move them back. Thanks Karthik On 11

Re: Upgrading master hardware

2011-11-16 Thread Mark
So obviously there will be a short interruption while I shut down the NN on 1 machine and start it up on another? On 11/16/11 10:53 AM, Karthik Ranganathan wrote: Mark, If the NN goes down the entire HBase cluster will be halted. If the HMaster is down, you will not be able to get regions assi

Re: Not able to change the VERSION of hbase row

2011-11-16 Thread Doug Meil
Also, there is a versioned Get example here... http://hbase.apache.org/book.html#get On 11/16/11 12:34 PM, "Jean-Daniel Cryans" wrote: >You need to tell the Get or Scan to fetch more versions. For example, >the help for the get commands gives this example: > > hbase> get 't1', 'r1', {CO

[ANN]: HBase-Writer 0.90.3 available for download

2011-11-16 Thread Ryan Smith
The HBase-Writer team is happy to announce that HBase-Writer 0.90.3 is available for download: http://code.google.com/p/hbase-writer/downloads/list HBase-Writer 0.90.3 is a maintenance release that fixes library compatibility issues with older versions of Heritrix and HBase. More details may be fou

Metrics

2011-11-16 Thread Mark
I've enabled TimeStampingFileContext in /etc/hbase/conf/hadoop-metrics.conf, however nothing is being output. I've checked the logs but I don't see any configuration errors. How can I test that I've configured everything correctly? Any ideas? Thanks

Re: Metrics

2011-11-16 Thread Mark
The only way I can get any metrics to work is if I append them to HADOOP_HOME/conf/hadoop-metrics.properties. Is this expected? On 11/16/11 11:37 AM, Mark wrote: I've enabled TimeStampingFileContext in /etc/hbase/conf/hadoop-metrics.conf however nothing is being outputted. I've check the logs

December 2011 SF Hadoop User Group

2011-11-16 Thread Aaron Kimball
After a month's hiatus for Hadoop World, we're back! The December Hadoop meetup will be held Wednesday, December 14, from 6pm to 8pm. This meetup will be hosted by Splunk at their office on Brannan St. As usual, we will use the discussion-based "unconference" format. At the beginning of the meetup

Re: Metrics

2011-11-16 Thread Sam Seigal
I think that is expected: http://hbase.apache.org/metrics.html On Wed, Nov 16, 2011 at 1:10 PM, Mark wrote: > The only way I can get any metrics to work is if I append them to > HADOOP_HOME/conf/hadoop-metrics.properties. Is this expected? > > On 11/16/11 11:37 AM, Mark wrote: >> >> I've enabled

Re: Metrics

2011-11-16 Thread Mark
- To have HBase emit metrics, edit $HBASE_HOME/conf/hadoop-metrics.properties and enable metric 'contexts' per plugin. The only way it actually works is if I append the configuration to *hadoop's* hadoop-metrics.properties, not HBase's hadoop-metrics.properties. On 11/16/11 2:31 PM, Sam Seigal w
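For reference, the file-based metrics context discussed in this thread is enabled in hadoop-metrics.properties with lines like the following (the output path and period are illustrative); per the thread, make sure the copy of the file that the daemon actually reads is the one you edit:

```properties
# hadoop-metrics.properties fragment: emit HBase metrics to a timestamped file
hbase.class=org.apache.hadoop.metrics.file.TimeStampingFileContext
hbase.period=10
hbase.fileName=/tmp/metrics_hbase.log
```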

Help with continuous loading configuration

2011-11-16 Thread Amit Jain
Hello, We're doing a proof-of-concept study to see if HBase is a good fit for an application we're planning to build. The application will be recording a continuous stream of sensor data throughout the day and the data needs to be online immediately. Our test cluster consists of 16 machines, eac

Re: Help with continuous loading configuration

2011-11-16 Thread lars hofhansl
Hi Amit, 12MB write buffer might be a bit high. How are you generating your keys? You might hot spot a single region server if (for example) you create monotonically increasing keys. When you look at the HBase monitoring page, do you see a single region server getting all the requests? Anythi
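For readers following along: the hot-spotting Lars describes comes from monotonically increasing keys all landing in one region, and a common mitigation is to prefix each row key with a deterministic salt bucket so writes spread across regions. A minimal sketch in plain Java (the bucket count, key format, and `salt` helper are illustrative, not from the thread):

```java
public class SaltedKey {
    static final int BUCKETS = 16; // illustrative bucket count

    // Prefix the key with a deterministic bucket derived from its hash so
    // sequential keys scatter across regions; the same key always gets the
    // same prefix, so reads can recompute it.
    static String salt(String key) {
        int bucket = (key.hashCode() & 0x7fffffff) % BUCKETS;
        return String.format("%02d-%s", bucket, key);
    }

    public static void main(String[] args) {
        String a = salt("sensor42:2011-11-16T09:28:20");
        String b = salt("sensor42:2011-11-16T09:28:20");
        System.out.println(a.equals(b));            // deterministic: same key, same salt
        System.out.println(a.matches("\\d{2}-.*")); // salted prefix is present
    }
}
```

The trade-off is that range scans over the original key order must now fan out across all buckets.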

Re: Help with continuous loading configuration

2011-11-16 Thread Amit Jain
Hi Lars, The keys are arriving in random order. The HBase monitoring page shows evenly distributed load across all of the region servers. I didn't see anything weird in the gc logs, no mention of any failures. I'm a little unclear about what the optimal values for the following properties shoul

Re: Help with continuous loading configuration

2011-11-16 Thread Stack
On Wed, Nov 16, 2011 at 3:26 PM, Amit Jain wrote: > Hi Lars, > > The keys are arriving in random order.  The HBase monitoring page shows > evenly distributed load across all of the region servers. What kind of ops rates are you seeing? They are running nice and smooth across all servers? No st

Re: Help with continuous loading configuration

2011-11-16 Thread lars hofhansl
hbase.hstore.blockingStoreFiles is the maximum number of store files HBase will allow before it blocks writes in order to catch up with compacting files. Default is 7. If this is too low you'll see warnings about blocking writers in the logs. I found that for some test load I had, I needed to
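As a sketch, raising this limit is done in hbase-site.xml; the value 20 below is illustrative, not a recommendation from the thread (the default, as noted, is 7):

```xml
<!-- hbase-site.xml fragment: tolerate more store files before blocking writes -->
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>20</value>
</property>
```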

Re: Scans and lexical sorting

2011-11-16 Thread Doug Meil
I'll update this. Thanks. On 11/16/11 12:17 PM, "lars hofhansl" wrote: >Hi Mark, >good find. I think that works by accident and the book is wrong. >"row" + new byte[] {0} will use byte[].toString() and actually result in >something like: "row[B@152b6651", which (again accidentally) sorts

Re: Not able to change the VERSION of hbase row

2011-11-16 Thread Doug Meil
Also, there is an example of a versioned get here... http://hbase.apache.org/book.html#get On 11/16/11 12:34 PM, "Jean-Daniel Cryans" wrote: >You need to tell the Get or Scan to fetch more versions. For example, >the help for the get commands gives this example: > > hbase> get 't1', 'r1',

on HBase 1.0

2011-11-16 Thread Andrew Purtell
It's possible that a release of 0.20.20X (X=5, I think) as Hadoop 1.0 is imminent. The Hadoop 1.0 release is an acknowledgement of reality -- the 0.20 branch is in production at many places. I know we agreed to separate HBase versioning from Hadoop versioning, but if we continue to number HBase as 0.X af

Re: on HBase 1.0

2011-11-16 Thread Stack
On Wed, Nov 16, 2011 at 3:50 PM, Andrew Purtell wrote: > It's possible a release of 0.20.20X (X=5 I think) as Hadoop 1.0 is imminent. > > The Hadoop 1.0 release is an acknowledgement of reality -- 0.20 branch is in > production at many places. > > I know we agreed to separate HBase versioning fro

Re: on HBase 1.0

2011-11-16 Thread Andrew Purtell
> From: Stack > To: d...@hbase.apache.org; Andrew Purtell > Cc: "user@hbase.apache.org" > Sent: Wednesday, November 16, 2011 3:52 PM > Subject: Re: on HBase 1.0 > > On Wed, Nov 16, 2011 at 3:50 PM, Andrew Purtell > wrote: >> It's possible a release of 0.20.20X (X=5 I think) as Hadoop 1.0 is

Re: on HBase 1.0

2011-11-16 Thread Karthik Ranganathan
My 2 cents - whatever branch we decide to put out as 1.0, I think we should have a stability/testing phase without adding too many features, so that it is pretty stable to end users. - Karthik On 11/16/11 3:57 PM, "Andrew Purtell" wrote: >> From: Stack > >> To: d...@hbase.apache.org; Andrew

Re: Help with continuous loading configuration

2011-11-16 Thread Amit Jain
Hi Stack, Thanks for the feedback. Comments inline ... On Wed, Nov 16, 2011 at 3:35 PM, Stack wrote: > On Wed, Nov 16, 2011 at 3:26 PM, Amit Jain wrote: > > Hi Lars, > > > > The keys are arriving in random order. The HBase monitoring page shows > > evenly distributed load across all of the r

Re: What happens if Hbase is installed on machines with no HDFS?

2011-11-16 Thread edward choi
Thanks for the info. That webpage turns out to be very instructive :) Regards, Ed 2011/11/11 Suraj Varma > Yes - it could be separated at the cost of network io and data locality. > > See this: > http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html > --Suraj > > On Tue, Nov 8, 201

Re: Help with continuous loading configuration

2011-11-16 Thread Matt Corgan
You can set put.setWriteToWAL(false) to skip the write ahead logging which slows down puts significantly. But, you will lose data if a regionserver crashes with data in its memstore. On Wed, Nov 16, 2011 at 4:09 PM, Amit Jain wrote: > Hi Stack, > > Thanks for the feedback. Comments inline ...

Re: Help with continuous loading configuration

2011-11-16 Thread Amit Jain
We would prefer not to do this. It's important that we have all of the historical data without any loss. But thanks for the suggestion. - Amit On Wed, Nov 16, 2011 at 4:30 PM, Matt Corgan wrote: > You can set put.setWriteToWAL(false) to skip the write ahead logging which > slows down puts sig

Re: on HBase 1.0

2011-11-16 Thread lars hofhansl
Personally I think before we can label it 1.0 we have to solve the RPC versioning issue, so that clients and servers can be updated out of step. From: Andrew Purtell To: "d...@hbase.apache.org" Cc: "user@hbase.apache.org" Sent: Wednesday, November 16, 2011 3:50

Re: on HBase 1.0

2011-11-16 Thread Stack
On Wed, Nov 16, 2011 at 5:07 PM, lars hofhansl wrote: > Personally I think before we can label it 1.0 we have to solve the RPC > versioning issue, > so that clients and servers can be updated out of step. Lets make this a blocker for 1.0? St.Ack

Re: Help with continuous loading configuration

2011-11-16 Thread Stack
On Wed, Nov 16, 2011 at 4:09 PM, Amit Jain wrote: > On Wed, Nov 16, 2011 at 3:35 PM, Stack wrote: > >> On Wed, Nov 16, 2011 at 3:26 PM, Amit Jain wrote: >> > Hi Lars, >> > >> > The keys are arriving in random order.  The HBase monitoring page shows >> > evenly distributed load across all of the

Re: on HBase 1.0

2011-11-16 Thread lars hofhansl
If that is realistic. (?) Hadoop will (apparently) release without it. - Original Message - From: Stack To: d...@hbase.apache.org; lars hofhansl Cc: Andrew Purtell ; "user@hbase.apache.org" Sent: Wednesday, November 16, 2011 7:53 PM Subject: Re: on HBase 1.0 On Wed, Nov 16, 2011 at

RE: Facing Issues with RowCounter

2011-11-16 Thread Stuti Awasthi
Hi JD, Table 'Keyword' contains the 'Set' column family with 7 rows. Here is the output of scan: hbase(main):001:0> scan 'Keyword',{COLUMNS=>['Set']} ROW COLUMN+CELL Apache column=Set:Fuse, timestamp=1321506922206, value= Apache

RE: Query on analyze big data with Hbase

2011-11-16 Thread Stuti Awasthi
Hey Cosmin, Thanks for info. I will do more testing to know the factors which can cause issue. :) -Original Message- From: Cosmin Lehene [mailto:cleh...@adobe.com] Sent: Wednesday, November 16, 2011 3:52 PM To: user@hbase.apache.org; hbase-u...@hadoop.apache.org Subject: Re: Query on a

RE: Help with continuous loading configuration

2011-11-16 Thread Ramkrishna S Vasudevan
Hi Amit, As you said the regions may be distributed evenly across RSs, but check whether the puts are reaching only a particular RS at any point in time; if so, it will surely overload that RS. As Stack pointed out, what is your schema and how is your row key designed? Regards Ram -Original M

HRegionserver daemon is not running on region server node

2011-11-16 Thread Vamshi Krishna
Hi, I am working with a 2-node HBase cluster as shown below. On node1 (10.0.1.54): master node, region server, hadoop namenode, hadoop datanode. On node2 (10.0.1.55): region server, hadoop datanode. When I start Hadoop and then HBase, all daemons are running properly on the master node, i.e. node1, 2404 Nam