RE: Mass dumping of data has issues

2012-09-25 Thread Ramkrishna.S.Vasudevan
For the NPE that you got, is the same HTable instance shared by different threads? This is a common problem users encounter when using HTable across multiple threads. Please check and ensure that the HTable is not shared. Regards Ram > -Original Message- > From: Naveen [mailto:naveen.moorj
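
The advice above (one table object per thread, never a shared one) can be sketched as follows. This is a Python illustration, not the HBase Java client: `FakeTable`, `table_for_current_thread`, and `run_workers` are hypothetical names introduced here, and `FakeTable` only stands in for a client object that is not thread-safe.

```python
import threading


class FakeTable:
    """Hypothetical stand-in for a client object that is not thread-safe."""

    def __init__(self, name):
        self.name = name
        self.buffer = []  # per-instance state that threads must not share

    def put(self, row, value):
        self.buffer.append((row, value))


_local = threading.local()


def table_for_current_thread(name="demo"):
    """Return a table owned by the calling thread, creating it on first use."""
    if not hasattr(_local, "table"):
        _local.table = FakeTable(name)
    return _local.table


def run_workers(n_threads=4, puts_per_thread=100):
    """Run workers that each write through their own table instance.

    Returns {worker_index: (table, number_of_buffered_puts)}.
    """
    seen = {}
    lock = threading.Lock()

    def worker(i):
        table = table_for_current_thread()
        for j in range(puts_per_thread):
            table.put(f"row-{i}-{j}", j)
        with lock:  # only the shared result dict needs locking
            seen[i] = (table, len(table.buffer))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Because every worker obtains its table through `table_for_current_thread()`, no two threads ever touch the same buffer, which is the property the reply asks the poster to verify.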

RE: Mass dumping of data has issues

2012-09-25 Thread Naveen
Hi Dan, I'm actually trying to simulate the kind of load we're expecting on production servers(the intention is not to migrate data), hence a self-written program over sqoop. (PS: I've actually tried sqoop) Warm regards, Naveen -Original Message- From: Dan Han [mailto:dannahan2...@gmail.

RE: Mass dumping of data has issues

2012-09-25 Thread Naveen
Thank you for the quick response Paul. I've switched off the autoFlush (haven't increased the writebuffer though). And the splits are pretty effective, I think because of the similar number of requests each region gets before they split further. As per your suggestion, I tried the same task two mo
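
As a rough illustration of what switching off autoFlush changes, the sketch below buffers writes on the client and sends them in batches once a size threshold is reached. It is a Python model under stated assumptions, not HBase code: `BufferedWriter`, `send_batch`, and the default threshold are hypothetical and chosen only for illustration.

```python
class BufferedWriter:
    """Hypothetical client-side write buffer: collect puts, send them in batches."""

    def __init__(self, send_batch, buffer_bytes=2 * 1024 * 1024):
        self.send_batch = send_batch      # callable that receives a list of (row, value)
        self.buffer_bytes = buffer_bytes  # illustrative threshold, not a tuned value
        self.pending = []
        self.pending_size = 0

    def put(self, row, value):
        """Queue one write; flush automatically once the buffer is full."""
        self.pending.append((row, value))
        self.pending_size += len(row) + len(value)
        if self.pending_size >= self.buffer_bytes:
            self.flush()

    def flush(self):
        """Send everything queued so far as a single batch."""
        if self.pending:
            self.send_batch(self.pending)
            self.pending = []
            self.pending_size = 0
```

A caller still has to invoke `flush()` at the end of the run, otherwise the last partial batch is never sent.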

RE: Tuning HBase for random reads

2012-09-25 Thread Anoop Sam John
Can you try with bloom filters? This can help in get() -Anoop- From: Jonathan Bishop [jbishop@gmail.com] Sent: Wednesday, September 26, 2012 11:34 AM To: user@hbase.apache.org Subject: Tuning HBase for random reads Hi, I am running hbase-0.92.1 and hav

Re: question CDH license

2012-09-25 Thread Stack
On Tue, Sep 25, 2012 at 7:50 PM, Xiang Hua wrote: > Hi, >As we know CDH could be used freely. >question : can CDH be used for our custom? is it legal or not. > It's Apache licensed. This article is pretty good on what that means: http://en.wikipedia.org/wiki/Apache_license St.Ack

RE: Distribution of regions to servers

2012-09-25 Thread Anoop Sam John
Hi, Can you share more details, please? What work are you doing within the CPs? -Anoop- From: Dan Han [dannahan2...@gmail.com] Sent: Wednesday, September 26, 2012 5:55 AM To: user@hbase.apache.org Subject: Distribution of regions to servers Hi all, I am doing

Re: hbase cluster high loads

2012-09-25 Thread Stack
On Tue, Sep 25, 2012 at 9:02 PM, Yusup Ashrap wrote: > Hi Otis thanks for reply, > servers are identical in terms of hardware, jvm. > right now I cannot afford to restart my any machines, it's in the > production environment :D. > I will give a shot for some other clusters some time later. > Wha

Re: hbase cluster high loads

2012-09-25 Thread Yusup Ashrap
Hi Otis thanks for reply, servers are identical in terms of hardware, jvm. right now I cannot afford to restart my any machines, it's in the production environment :D. I will give a shot for some other clusters some time later. On Wed, Sep 26, 2012 at 11:50 AM, Otis Gospodnetic < otis.gospodne...

RE: Distribution of regions to servers

2012-09-25 Thread Ramkrishna.S.Vasudevan
Hi Dan, Generally, if the regions are not distributed to match the workload, a region server ends up overloaded because of region hotspotting, and write throughput can go down. It is not that the coprocessor performance alone is slow. Please check if the regions are properly ba

Re: Mass dumping of data has issues

2012-09-25 Thread Dan Han
Hi Naveen, There is a tool called Sqoop that supports importing data from a relational database into HBase. https://blogs.apache.org/sqoop/entry/apache_sqoop_graduates_from_incubator Maybe it can help you migrate the data easily. Best Wishes Dan Han On Mon, Sep 24, 2012 at 9:20 AM, Paul Mackl

Re: Does hbase 0.90 client work with 0.92 server?

2012-09-25 Thread Jean-Daniel Cryans
It's not compatible. Like the guide says[1]: "replace your hbase 0.90.x with hbase 0.92.0 binaries (be sure you clear out all 0.90.x instances) and restart (You cannot do a rolling restart from 0.90.x to 0.92.x -- you must restart)" This includes the client. J-D 1. http://hbase.apache.org/book.

Does hbase 0.90 client work with 0.92 server?

2012-09-25 Thread Agarwal, Saurabh
Hi, We recently upgraded from HBase 0.90.4 to HBase 0.92. Our HBase app worked fine on HBase 0.90.4. Our new setup has an HBase 0.92 server and an HBase 0.90.4 client, and the client throws the following exception when it tries to connect to the server. Is anyone running HBase 0.92 server and hbase 0.90.4 client

Re: Hregionserver instance runs endlessly

2012-09-25 Thread lars hofhansl
Before you do this (or delete the ZK data), can you take a jstack (just run jstack ) and send us the output? (Please use http://pastebin.com/ and only send a link to the mailing list). Thanks. -- Lars - Original Message - From: Dhaval Shah To: user@hbase.apache.org Cc: Sent: Tuesday

Re: Should rowkey be always the first column of the upload file?

2012-09-25 Thread anil gupta
Hi Ramasubramanian, By default, the rowkey is supposed to be the first column. However, you can write a custom mapper and use it with the bulk loader to use other columns as the rowkey. Have a look at this link for some more info: http://hbase.apache.org/book/ops_mgt.html#importtsv HTH, Anil Gupta On

Re: [Schema] Put or Increment ?

2012-09-25 Thread lars hofhansl
Increment is slightly more expensive, since the RegionServer executing the Increment needs to retrieve the old value(s) first (while holding the row lock). -- Lars - Original Message - From: Shrijeet Paliwal To: user@hbase.apache.org Cc: Sent: Tuesday, September 25, 2012 10:02 AM Sub

Re: [Schema] Put or Increment ?

2012-09-25 Thread Shrijeet Paliwal
On Tue, Sep 25, 2012 at 9:56 AM, Pamecha, Abhishek wrote: > Hi Shrijeet > > What's your usecase? That should drive your decision. Put will overwrite > in case your userid and ip address is same. Increment would just bump up > the counter. > #1 Keep a list of distinct IPs #2 Counts per IP (only i

RE: [Schema] Put or Increment ?

2012-09-25 Thread Pamecha, Abhishek
Hi Shrijeet What's your usecase? That should drive your decision. Put will overwrite in case your userid and ip address is same. Increment would just bump up the counter. -abhishek -Original Message- From: Shrijeet Paliwal [mailto:shrij...@rocketfuel.com] Sent: Tuesday, September
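
The difference described in this thread can be modeled in a few lines. The sketch below treats a row as a plain Python dict keyed by column qualifier; `put`, `increment`, and `track_hits` are hypothetical helpers that only mimic the semantics discussed here (overwrite versus read-then-add) and are not HBase client calls.

```python
def put(row, qualifier, value):
    """Overwrite semantics: the new value replaces whatever was stored."""
    row[qualifier] = value


def increment(row, qualifier, amount=1):
    """Read-modify-write: fetch the old value first, then store old + amount."""
    old = row.get(qualifier, 0)
    row[qualifier] = old + amount
    return row[qualifier]


def track_hits(ips, use_increment):
    """Record one user's hits; qualifiers are the IPs, as in the original question."""
    row = {}
    for ip in ips:
        if use_increment:
            increment(row, ip)
        else:
            put(row, ip, 1)
    return row
```

Both variants yield the same set of distinct IPs as qualifiers; only the increment variant also keeps a per-IP count, at the cost of the extra read that the earlier reply mentions.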

Should rowkey be always the first column of the upload file?

2012-09-25 Thread Ramasubramanian
Hi, Should rowkey be always the first column of the upload file? If not can someone pls let me know how to define it? Regards, Rams

[Schema] Put or Increment ?

2012-09-25 Thread Shrijeet Paliwal
Hi, Suppose I am tracking user activity by storing his IP each time he hits the web service. The row id will be uid of user and column qualifiers will be IPs themselves. I am contemplating whether to use a Put or Increment API. The must have requirement is distinct IPs associated with the user. It

Re: Hregionserver instance runs endlessly

2012-09-25 Thread Dhaval Shah
Try killing the old process manually ( ps -ef ) -- On Tue 25 Sep, 2012 11:28 AM IST iwannaplay games wrote: >Hi, > >My hbase was working properly.But now it shows two instances of >hregionserver , the starting time of one is of 4 days back.If i try >stopping hbase i

RE: HBase BatchMutations - HOT Region Problem

2012-09-25 Thread Pankaj Misra
Hi Anoop, Thanks Anoop. I am creating the splits using the hex split example in the HBase documentation. I am specifically passing the splits during table creation. The leading zeros were lost in pasting from some of the key ranges as the spreadsheet took them to be numbers while assumed the o

RE: HBase BatchMutations - HOT Region Problem

2012-09-25 Thread Anoop Sam John
Hi, There is a util class Bytes available in HBase, and its toBytes(int) converts an int to a byte[]. In the split keys, why are there leading zeros for some region keys? How did you make the splits? Did you pass the splits explicitly, or was the split key creation done by HBase code? How you hav
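
To see why the key encoding and the leading zeros matter, here is a small Python sketch. It assumes that a fixed-width, 4-byte big-endian encoding is what `Bytes.toBytes(int)` produces, and uses `struct.pack` as a stand-in; `int_to_bytes`, `sorted_as_strings`, and `sorted_as_fixed_width` are hypothetical helper names.

```python
import struct


def int_to_bytes(n):
    """Encode a signed 32-bit int as 4 big-endian bytes (stand-in for toBytes(int))."""
    return struct.pack(">i", n)


def sorted_as_strings(numbers):
    """Sort keys the way a byte-ordered store would if keys were decimal strings."""
    return sorted(str(n).encode() for n in numbers)


def sorted_as_fixed_width(numbers):
    """Sort keys encoded as fixed-width big-endian bytes."""
    return sorted(int_to_bytes(n) for n in numbers)
```

For non-negative values, the fixed-width encoding sorts in numeric order, while unpadded decimal strings place "10" before "2". Split keys have to be built with the same encoding as the row keys, or the regions will not line up with the data.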

Multiple ColumnPrefixFilter

2012-09-25 Thread Shagun Agarwal
Hi All, HBase has a filter called MultipleColumnPrefixFilter which behaves like ColumnPrefixFilter but allows specifying multiple prefixes. Example: Find all columns in a row and family that start with "abc" or "xyz". However i could not find any filter which can return all columns in a row that
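
The behaviour described for MultipleColumnPrefixFilter (keep a column if its qualifier starts with any of several prefixes) can be modeled as below. This is a Python sketch over a dict of qualifier to value, not the filter's implementation; `columns_with_prefixes` is a hypothetical helper.

```python
def columns_with_prefixes(row, prefixes):
    """Return the columns of `row` whose qualifier starts with any given prefix."""
    prefixes = tuple(prefixes)  # bytes.startswith accepts a tuple of candidates
    return {q: v for q, v in row.items() if q.startswith(prefixes)}
```

With the example from the message, prefixes `b"abc"` and `b"xyz"` select qualifiers such as `b"abc1"` and `b"xyz9"` and drop everything else.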

RE: HBase BatchMutations - HOT Region Problem

2012-09-25 Thread Ramkrishna.S.Vasudevan
Hi Pankaj If your threads are generating data (0..9, 10...19, 20...29, ...) of this format, your splits also should be like 0.1 1...2 2...3 and so on right? May be am missing something here. But the data generation that creates the rowkey and the pre s
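
The point about split keys having to match the generated row keys can be checked with a small simulation. The sketch below is Python with made-up keys, not the poster's data: it assigns each row key to a region by binary search over sorted split keys and counts rows per region. `region_index` and `distribution` are hypothetical helpers.

```python
from bisect import bisect_right
from collections import Counter


def region_index(split_keys, rowkey):
    """Region i holds keys in [split_keys[i-1], split_keys[i]); region 0 has no lower bound.

    `split_keys` must be sorted in the same byte/lexicographic order as the row keys.
    """
    return bisect_right(split_keys, rowkey)


def distribution(split_keys, rowkeys):
    """Count how many row keys land in each of the len(split_keys) + 1 regions."""
    counts = Counter(region_index(split_keys, k) for k in rowkeys)
    return [counts.get(i, 0) for i in range(len(split_keys) + 1)]
```

For example, with zero-padded keys `"0000"` to `"0099"`, the splits `["0025", "0050", "0075"]` spread the rows evenly, while the unpadded splits `["25", "50", "75"]` send every row to the first region, which is the hot-region symptom discussed in this thread.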

RE: Hregionserver instance runs endlessly

2012-09-25 Thread Ramkrishna.S.Vasudevan
The zookeeper installation has a dataDir configuration in zoo.cfg. By default it comes in the /tmp directory. Regards Ram > -Original Message- > From: iwannaplay games [mailto:funnlearnfork...@gmail.com] > Sent: Tuesday, September 25, 2012 11:45 AM > To: user@hbase.apache.org > Subject: R

Re: Simple way to unit test HBase Map reduce jobs?

2012-09-25 Thread Bertrand Dechoux
"Apache MRUnit ™ is a Java library that helps developers unit test Apache Hadoop map reduce jobs." That is, every kind of mapper and reducer can be tested with it, but there is no specific support for HBase. If you can manage by yourself to provide the key/value that you should received

RE: HBase BatchMutations - HOT Region Problem

2012-09-25 Thread Pankaj Misra
Please find attached the table split and the snapshot below. Start Key End Key 19 19 32 32 004b 004b64 64 007d 007fff