Re: how to use hbase with eclipse?

2009-10-21 Thread jdh
Thank you very much for the sample URL. It will help me a lot. Doss_IPH wrote: > > Hi, > If you are using hbase 0.20 version, you need to use hadoop 0.20.x. > > You can find sample sources on this URL: > > http://intellipowerhive.com/IPH-HBase.zip Download > > > > jdh wrote: >> >> Hi, >

RE: Sample application on HBase

2009-10-21 Thread Doss_IPH
I will update to hbase 0.20.x as soon as possible. This is my own interest: developing applications on this platform. But I need support to develop it further. There is no license for this yet. Patterson, Josh wrote: > > This seems to be like a web based "enterprise manager" for hb

Re: Table Upload Optimization

2009-10-21 Thread Jonathan Gray
That depends on how much memory you have for each node. I recommend setting heap to 1/2 of total memory. In general, I do not recommend running with VMs... Running two hbase nodes in VMs on a single host vs running one hbase node on the same host without VMs, I don't really see where you'd get any ben

Re: HBase 0.20.1 scanners not closing properly (memory leak)

2009-10-21 Thread stack
Thanks for digging in Erik. Nice one. Would you mind making an issue of your findings? File it against 0.20.2 so we can roll out the fix in next 0.20.x release. St.Ack On Wed, Oct 21, 2009 at 2:25 PM, Erik Rozendaal wrote: > I did some more investigation into this issue since after the origina

Re: Hadoop/HBase node shutdown

2009-10-21 Thread stack
What Vaibhav said; just take out the regionservers gently by shutting each down using hbase-daemon.sh stop -- preferably not the hosts carrying -ROOT- and .META. if you can avoid it (less churn if you don't have to take these down). Once down and their regions are deployed elsewhere, you'll want to
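The decommission procedure described here can be sketched as a couple of shell commands (the hostname and HBASE_HOME path are placeholders; adjust for your install):

```shell
# Stop one regionserver gently so its regions get reassigned
# (node13 is a placeholder for the node being decommissioned):
ssh node13 "${HBASE_HOME}/bin/hbase-daemon.sh stop regionserver"
# Watch the master web UI until its regions are redeployed elsewhere,
# then repeat for the next node; leave the hosts serving -ROOT- and
# .META. for last, or skip them entirely if possible.
```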

Re: HBase Exceptions on version 0.20.1

2009-10-21 Thread elsif
stack wrote: > On Wed, Oct 21, 2009 at 8:16 AM, elsif wrote: > > >> There are 239 "Block blk_-xxx is not valid errors", 522 "BlockInfo not >> found in volumeMap" errors, and 208 "BlockAlreadyExistsException" found >> in the hadoop logs over 12 hours of running the test. >> >> > > Above are

Re: HBase 0.20.1 scanners not closing properly (memory leak)

2009-10-21 Thread Erik Rozendaal
I did some more investigation into this issue, since after the original issue stopped occurring I noticed that MemStoreScanners were still being leaked when scanning a store with an empty MemStore. The cause looks to be the KeyValueHeap constructor. It drops scanners when the scanner's peek() m

Re: Hadoop/HBase node shutdown

2009-10-21 Thread Vaibhav Puranik
Rolling Restart would be the best option in my opinion. http://wiki.apache.org/hadoop/Hbase/RollingRestart Regards, Vaibhav Puranik GumGum On Wed, Oct 21, 2009 at 2:22 PM, Farshad Kazemzadeh wrote: > > Hello, > > I am trying to take down 9 nodes out of our 20 node Hadoop/HBase cluster. > What

Hadoop/HBase node shutdown

2009-10-21 Thread Farshad Kazemzadeh
Hello, I am trying to take down 9 nodes out of our 20 node Hadoop/HBase cluster. What is the best way to do this without having to shut down the entire cluster and have things continue to run without a hiccup? Thank you in advance, FarshadK

Re: HBase table design question

2009-10-21 Thread Something Something
Thanks, Jonathan, for the reply. One quick question... So in the User table when I perform the put operation: .put("visited", "pageId", 100); .put("visited", "pageId", 200); The 100 gets overwritten with 200. Correct? So should I use... something like this... .put("visited", "pageId100", 10
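The overwrite semantics being asked about can be sketched in the hbase shell (table, family, and qualifier names follow the thread; values are illustrative):

```shell
# A second put to the same row/family:qualifier wins on read:
put 'Users', 'user1', 'visited:pageId', '100'
put 'Users', 'user1', 'visited:pageId', '200'   # a get now returns '200'
# Folding the page id into the qualifier keeps one cell per page instead:
put 'Users', 'user1', 'visited:pageId100', '10'
put 'Users', 'user1', 'visited:pageId200', '20'
```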

Re: Waiting forever on scanner iterator

2009-10-21 Thread Ananth T. Sarathy
sorry... http://pastebin.com/m5aa78a09 Ananth T Sarathy On Wed, Oct 21, 2009 at 2:41 PM, Ananth T. Sarathy < ananth.t.sara...@gmail.com> wrote: > here's what I get > > Heap > par new generation total 19136K, used 13714K [0x2fc4, > 0x2aaab110, 0x2aaab4f7) > ed

Re: Waiting forever on scanner iterator

2009-10-21 Thread Ananth T. Sarathy
here's what I get Heap par new generation total 19136K, used 13714K [0x2fc4, 0x2aaab110, 0x2aaab4f7) eden space 17024K, 79% used [0x2fc4, 0x2aaab09816d8, 0x2aaab0ce) from space 2112K, 6% used [0x2aaab0ef, 0x2aaab0f13470, 0x0

RE: Table Upload Optimization

2009-10-21 Thread Mark Vigeant
Also, I updated the configuration and things seem to be working a bit better. What's a good heap size to set? -Original Message- From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of stack Sent: Wednesday, October 21, 2009 12:46 PM To: hbase-user@hadoop.apache.org Subject: R
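A minimal sketch of the heap setting, assuming a node with 4 GB of RAM and the 1/2-of-memory rule of thumb given elsewhere in this thread (the file to edit is conf/hbase-env.sh):

```shell
# conf/hbase-env.sh -- heap in megabytes; leave headroom for the OS
# and any DataNode/TaskTracker running on the same host:
export HBASE_HEAPSIZE=2000
```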

RE: Sample application on HBase

2009-10-21 Thread Patterson, Josh
This seems like a web based "enterprise manager" for hbase. I've played with the "out of the box" web console for hbase 0.20 and this is a nice complement, letting you actually play with columns and rows. This might be something, or at least the inspiration for something, to add into a stock hbase inst

RE: Table Upload Optimization

2009-10-21 Thread Mark Vigeant
No, they are all running on separate hosts. What I described is the specs for each node. I have 4 VMs total, 2 running per 4 core machine. The machines are doing nothing else. How do I check for swapping? -Original Message- From: Jonathan Gray [mailto:jl...@streamy.com] Sent: Wednesda

Re: HBase table design question

2009-10-21 Thread Barney Frank
I am no expert, but I am doing something very similar, i.e. tracking user sessions within HBase, using 2 tables. Table 1: 'Users'; Column Family 1: WebPages; Columns: page names; RowId = UserId. For a given userid, you could retrieve the pages visited, the number of times (watch out for versions), and the f
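The 'Users' table described above could be created from the hbase shell roughly like this (the second table's name is truncated in the post, so 'Pages' below is a hypothetical stand-in; VERSIONS is set explicitly because of the versioning caveat mentioned):

```shell
create 'Users', {NAME => 'WebPages', VERSIONS => 1}
# Hypothetical inverse table for per-page lookups:
create 'Pages', {NAME => 'Visitors', VERSIONS => 1}
# Row key of 'Users' is the user id; each visited page becomes a
# column qualifier under the 'WebPages' family.
```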

Re: unable to access META region after a region server FATAL crash

2009-10-21 Thread stack
Thanks for the detailed report Yannis. Your blow-by-blow makes sense for me (thanks for digging in). Can you make an issue and paste in the below? Your fix sounds fine too... Can you attach that and the logs? Mark it for fix in 0.20.2. St.Ack On Tue, Oct 20, 2009 at 5:00 PM, Yannis Pavlidis w

Re: Table Upload Optimization

2009-10-21 Thread Jonathan Gray
You are running all of these virtual machines on a single host node? And they are all sharing 4GB of memory? That is a major issue. First, GC pauses will start to lock things up and create timeouts. Then swapping will totally kill performance of everything. Is that happening on your cluste

Re: two times more regions after update

2009-10-21 Thread Jonathan Gray
While you set the max versions to 1, that is only enforced on major compactions. So re-inserting all the data will actually mean you have double the data for some period of time. After a certain amount of time, a major compaction will occur in the background, and at that point only 1 version
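Until that background major compaction runs, the extra versions can be pruned by forcing one from the hbase shell (the table name is a placeholder):

```shell
# Force a major compaction so only the single retained version remains
# on disk; run 'tools' in the shell for the full command description.
major_compact 'mytable'
```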

Re: Waiting forever on scanner iterator

2009-10-21 Thread stack
If that's all you get, a thread dump on 10.242.71.191:60020 would help ($ kill -QUIT RS_PID). The thread dump will be in the RS .out file. Do a few. Paste to pastebin. Thanks Ananth, St.Ack On Wed, Oct 21, 2009 at 8:05 AM, Ananth T. Sarathy < ananth.t.sara...@gmail.com> wrote: > yeah, > I just don'
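The thread-dump procedure can be sketched as below, assuming the default pid location (the path is a guess; check HBASE_PID_DIR in your hbase-env.sh):

```shell
# Pid file path assumes the stock default pid directory:
RS_PID=$(cat /tmp/hbase-${USER}-regionserver.pid)
kill -QUIT "${RS_PID}"   # the dump appears in the regionserver's .out file
```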

Re: HBase Exceptions on version 0.20.1

2009-10-21 Thread stack
On Wed, Oct 21, 2009 at 8:16 AM, elsif wrote: > > There are 239 "Block blk_-xxx is not valid errors", 522 "BlockInfo not > found in volumeMap" errors, and 208 "BlockAlreadyExistsException" found > in the hadoop logs over 12 hours of running the test. > Above are from application-level (hbase) or

Re: Sample application on HBase

2009-10-21 Thread stack
This looks interesting. Can you update it to run against 0.20.x? It does not have a license. You might want to add one. Thanks, St.Ack On Tue, Oct 20, 2009 at 9:29 PM, Doss_IPH wrote: > > Hi friends > I have developed sample small application using Hbase which can be run in > application ser

Re: HBase table design question

2009-10-21 Thread Jonathan Gray
You're generally on the right track. In many cases, rather than using secondary indexes in the relational world, you would have multiple tables in HBase with different keys. You may not need a table for each query, but that depends on your requirements of performance and the specific details

Re: two times more regions after update

2009-10-21 Thread stack
A better gauge would be comparing space occupied in the filesystem... do something like ./bin/hadoop fs -dus /HBASEDIR. Do it once at 44 regions. Then do it again, after major compacting, when you have 84 regions. To major compact manually, run 'tools' in the shell to learn more. St.Ack On Wed, Oct 2
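The comparison can be sketched as follows (assuming hbase.rootdir points at /hbase; substitute your actual directory):

```shell
./bin/hadoop fs -dus /hbase   # note the byte count at 44 regions
# ... trigger the major compaction, wait for it to finish ...
./bin/hadoop fs -dus /hbase   # compare the byte count at 84 regions
```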

Re: Table Upload Optimization

2009-10-21 Thread stack
On Wed, Oct 21, 2009 at 8:53 AM, Mark Vigeant wrote: > >I saw this in your first posting: 10/21/09 10:22:52 INFO mapred.JobClient: > >map 100% reduce 0%. > > >Is your job writing hbase in the map task or in reducer? Are you using > >TableOutputFormat? > > I am using table output format and only a

RE: Table Upload Optimization

2009-10-21 Thread Mark Vigeant
>I saw this in your first posting: 10/21/09 10:22:52 INFO mapred.JobClient: >map 100% reduce 0%. >Is your job writing hbase in the map task or in reducer? Are you using >TableOutputFormat? I am using table output format and only a mapper. There is no reducer. Would a reducer make things more ef

Re: Table Upload Optimization

2009-10-21 Thread stack
On Wed, Oct 21, 2009 at 8:22 AM, Mark Vigeant wrote: > Ok, so first in response to St. Ack, nothing fishy appears to be happening > in the logs: data is being written to all regionservers. > > And it's not hovering around 100% done, it just has sent about 118 map > jobs, or "Task attempts" > > I

RE: Table Upload Optimization

2009-10-21 Thread Mark Vigeant
Oh and I'm using 32 bit Ubuntu if that is of interest -Original Message- From: Mark Vigeant [mailto:mark.vige...@riskmetrics.com] Sent: Wednesday, October 21, 2009 11:22 AM To: hbase-user@hadoop.apache.org Subject: RE: Table Upload Optimization Ok, so first in response to St. Ack, nothin

RE: Table Upload Optimization

2009-10-21 Thread Mark Vigeant
Ok, so first in response to St. Ack, nothing fishy appears to be happening in the logs: data is being written to all regionservers. And it's not hovering around 100% done, it just has sent about 118 map jobs, or "Task attempts". I'm using Hadoop 0.20.1 and HBase 0.20.0. Each node is a virtual

Re: HBase Exceptions on version 0.20.1

2009-10-21 Thread elsif
While running the test on this cluster of 14 servers, the highest loads I see are 3.68 (0.0% wa) on the master node and 2.65 (3.4% wa) on the node serving the .META. region. All the machines are on a single gigabit switch dedicated to the cluster. The highest throughput between nodes has been 21

Re: Waiting forever on scanner iterator

2009-10-21 Thread Ananth T. Sarathy
yeah, I just don't understand why getScanner("Column:X") returns the iterator and processes the rows, yet getScanner("Column:Y") just spins and spins, even though Column:Y is a much denser result. When I load from shell *Version: 0.20.0, r810752, Thu Sep 3 00:06:18 PDT 2009 hbase(main):001:0> count 'GS_Appli

Re: Table Upload Optimization

2009-10-21 Thread Jean-Daniel Cryans
Well the XMLStreamingInputFormat lets you map XML files which is neat but it has a problem and always needs to be patched. I wondered if that was missing but in your case it's not the problem. Did you check the logs of the master and region servers? Also I'd like to know - Version of Hadoop and H

HBase table design question

2009-10-21 Thread Something Something
Hello, Trying to figure out what's the recommended way of designing tables under HBase. Let's say I need a table to gather statistics regarding user's visits to different web pages. In the relational database world, we could have a table with following columns: Primary Key (system generated)

Re: Table Upload Optimization

2009-10-21 Thread stack
You are hovering around 100% done? What task is outstanding in your MR job? Can you tell anything from examination of its logs? St.Ack On Wed, Oct 21, 2009 at 7:52 AM, Mark Vigeant wrote: > Hey > > So I want to upload a lot of XML data into an HTable. I have a class that > successfully maps up

Re: Waiting forever on scanner iterator

2009-10-21 Thread stack
In both cases you are doing a full table scan? Try from shell with DEBUG enable. You'll see the regions being loaded. May help you narrow in on problem region or at least on problem regionserver. St.Ack On Wed, Oct 21, 2009 at 7:19 AM, Ananth T. Sarathy < ananth.t.sara...@gmail.com> wrote: >
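One way to get that DEBUG output: raise the hbase logger level in conf/log4j.properties before starting the shell (the property name below matches the stock 0.20 conf file; verify against yours):

```properties
# conf/log4j.properties -- show region loading and client-side detail:
log4j.logger.org.apache.hadoop.hbase=DEBUG
```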

Re: Waiting forever on scanner iterator

2009-10-21 Thread Jean-Daniel Cryans
As I said yesterday, you can check the logs, top, etc. One particular thing of interest would be a jstack of your region server's process while it's scanning and not returning. J-D On Wed, Oct 21, 2009 at 7:19 AM, Ananth T. Sarathy wrote: > Anyone have any further thoughts on this? > Ananth T Sa

RE: Table Upload Optimization

2009-10-21 Thread Mark Vigeant
No. Should I? -Original Message- From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of Jean-Daniel Cryans Sent: Wednesday, October 21, 2009 10:55 AM To: hbase-user@hadoop.apache.org Subject: Re: Table Upload Optimization Are you using the Hadoop Streaming API? J-D On Wed, O

Re: Table Upload Optimization

2009-10-21 Thread Jean-Daniel Cryans
Are you using the Hadoop Streaming API? J-D On Wed, Oct 21, 2009 at 7:52 AM, Mark Vigeant wrote: > Hey > > So I want to upload a lot of XML data into an HTable. I have a class that > successfully maps up to about 500 MB of data or so (on one regionserver) into > a table, but if I go for much b

Table Upload Optimization

2009-10-21 Thread Mark Vigeant
Hey So I want to upload a lot of XML data into an HTable. I have a class that successfully maps up to about 500 MB of data or so (on one regionserver) into a table, but if I go for much bigger than that it takes forever and eventually just stops. I tried uploading a big XML file into my 4 regio

Re: Waiting forever on scanner iterator

2009-10-21 Thread Ananth T. Sarathy
Anyone have any further thoughts on this? Ananth T Sarathy On Tue, Oct 20, 2009 at 6:37 PM, Ananth T. Sarathy < ananth.t.sara...@gmail.com> wrote: > Well that's not the case. Every row has that column. In fact the second > snippet I sent is with a column with far fewer rows (1k vs 25k) but co

two times more regions after update

2009-10-21 Thread guillaume.viland
Hello, I would appreciate an explanation of the following point. I have an indexed table of 25M rows (44 regions after initial data insertion). The IndexedTable has been created with all the default attributes except that all columns are set with MaxVersions to 1. Only one column is indexed

Re: HBase 0.20.1 scanners not closing properly (memory leak)

2009-10-21 Thread Erik Rozendaal
Well, I can no longer reproduce this issue on my HBase install. One thing I did do was run major compaction on the .META. table, as it had 11 store files. I'll keep an eye on this to see if the problem occurs again. Thanks, Erik On 21 okt 2009, at 14:23, Guilherme Germoglio wrote: Hello

Re: HBase 0.20.1 scanners not closing properly (memory leak)

2009-10-21 Thread Erik Rozendaal
The pic showing the number of StoreScanner instances: http://i35.tinypic.com/2rr43dj.png The pic showing the size of the changedReaderObservers of one store: http://i36.tinypic.com/20573bl.png On 21 okt 2009, at 14:23, Guilherme Germoglio wrote: Hello Erik, I think your attachments were block

Re: HBase 0.20.1 scanners not closing properly (memory leak)

2009-10-21 Thread Guilherme Germoglio
Hello Erik, I think your attachments were blocked. Could you please upload them somewhere else? On Wed, Oct 21, 2009 at 8:44 AM, Erik Rozendaal wrote: > Hi all, > > After some performance testing on my HBase 0.20.1 development environment > (running in pseudo- and full-distributed mode on a singl

HBase 0.20.1 scanners not closing properly (memory leak)

2009-10-21 Thread Erik Rozendaal
Hi all, After some performance testing on my HBase 0.20.1 development environment (running in pseudo- and full-distributed mode on a single laptop) I noticed that scanners do not get closed properly on the region server. After creating a heap dump with Netbeans I can see the StoreScanner

Re: how to use hbase with eclipse?

2009-10-21 Thread Doss_IPH
Hi, If you are using hbase 0.20 version, you need to use hadoop 0.20.x. You can find sample sources on this URL: http://intellipowerhive.com/IPH-HBase.zip Download jdh wrote: > > Hi, > I just install hbase 0.20 with pseduo-distributed style according to the > hbase wiki. Then I wrote a jav