Re: completebulkload does 'copy' StoreFiles instead of 'cut'
Hi Rajesh, I use HBase 0.94.11 and Hadoop 1.2.1. The file system of the bulkload output directory and the HBase cluster are the same, too. I've also coded a MapReduce job using HFileOutputFormat; when I use LoadIncrementalHFiles to move the output of my job to an HBase table, it still copies instead of moving. Thanks

On Sat, Sep 14, 2013 at 2:50 PM, rajesh babu Chintaguntla chrajeshbab...@gmail.com wrote:
Hi BagherEsmaeily, which version of HBase are you using? Are the file system of the bulkload output directory and the HBase cluster the same? If you are using an HBase version older than 0.94.5, the StoreFiles generated by importtsv get copied instead of moved even if the file systems of the bulkload output directory and the HBase cluster are the same. It's a bug, fixed in 0.94.5 (HBASE-5498). Thanks. Rajeshbabu

On Sat, Sep 14, 2013 at 12:01 PM, M. BagherEsmaeily mbesmae...@gmail.com wrote:
Hello, I was using HBase complete bulk load to transfer the output of ImportTsv to a table in HBase, and I noticed that it copies the output instead of cutting it. This takes a long time for my gigabytes of data. In the HBase documentation (http://hbase.apache.org/book/ops_mgt.html#completebulkload) I read that the files would be moved, not copied. Can anyone help me with this? Kind Regards
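For reference, the completebulkload step under discussion is normally invoked against the HFileOutputFormat output directory like this (the path and table name are placeholders):

```
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /user/me/hfile-output mytable
```

When the output directory and the table live on the same filesystem, the tool is expected to rename (move) each StoreFile into the region directories rather than copy it, which is what HBASE-5498 addressed for the ImportTsv case.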
Re: High cpu usage on a region server
We patched HBase 0.94.6 with HBASE-9428, and now the difference is night and day. Read latency has been very consistent, and we haven't seen any CPU load issue in the last 24+ hours. Thank you all for helping us resolve this issue. Bikrant

On Thu, Sep 12, 2013 at 10:25 AM, lars hofhansl la...@apache.org wrote:
Not that I am aware of. Reducing the HFile block size will lessen this problem (but then cause other issues). It's just a fix to the RegexStringComparator. You can recompile that and deploy it to the RegionServers (you need to make sure it's in the classpath before the HBase jars). Probably easier to roll a new release. It's a shame we did not see this earlier. -- Lars

From: OpenSource Dev dev.opensou...@gmail.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Thursday, September 12, 2013 9:52 AM Subject: Re: High cpu usage on a region server
Thanks Lars. Are there any other workarounds for this issue until we get the fix? If not we might have to apply the patch and roll out a custom package.

On Thu, Sep 12, 2013 at 8:36 AM, lars hofhansl la...@apache.org wrote:
Yep... Very likely HBASE-9428:

8 threads: java.lang.Thread.State: RUNNABLE
  at java.util.Arrays.copyOf(Arrays.java:2786)
  at java.lang.StringCoding.decode(StringCoding.java:178)
  at java.lang.String.<init>(String.java:483)
  at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)
  ...

4 threads: java.lang.Thread.State: RUNNABLE
  at sun.nio.cs.ISO_8859_1$Decoder.decodeArrayLoop(ISO_8859_1.java:79)
  at sun.nio.cs.ISO_8859_1$Decoder.decodeLoop(ISO_8859_1.java:106)
  at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:544)
  at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:140)
  at java.lang.StringCoding.decode(StringCoding.java:179)
  at java.lang.String.<init>(String.java:483)
  at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)

It's also consistent with what you see: lots of garbage (hence tweaking your GC options had a significant effect). The fix is in 0.94.12, which is in RC right now, probably to be released early next week. -- Lars

From: OpenSource Dev dev.opensou...@gmail.com To: user@hbase.apache.org Sent: Thursday, September 12, 2013 8:15 AM Subject: Re: High cpu usage on a region server
A server started getting busy last night, but this time it took ~5 hours to go from 15% busy to 75% busy. It is not running 80% flat out yet, but this is still very high compared to other servers that are running under ~25% CPU usage. The only change I made yesterday was the addition of -XX:+UseParNewGC to the HBase startup command. http://pastebin.com/VRmujgyH

On Wed, Sep 11, 2013 at 2:28 PM, Stack st...@duboce.net wrote:
Can you thread dump the busy server and pastebin it? Thanks, St.Ack

On Wed, Sep 11, 2013 at 1:49 PM, OpenSource Dev dev.opensou...@gmail.com wrote:
Hi, I'm using HBase 0.94.6 (CDH 4.3) for OpenTSDB. So far I have had no issues with writes/puts. The system handles up to 800k puts per second without issue; on average we do 250k puts per second. I am having the problem with reads. I've isolated where the problem is but have not been able to find the root cause. I have 16 machines running the HBase region server, each with ~35 regions. Once in a while CPU goes flat out at 80% on one region server. These are the things I've noticed in Ganglia:
hbase.regionserver.request - evenly distributed; not seeing any spikes on the busy server
hbase.regionserver.blockCacheSize - between 500MB and 1000MB
hbase.regionserver.compactionQueueSize - avg 2 or less
hbase.regionserver.blockCacheHitRatio - 30% on busy node, 60% on other nodes
The JVM heap size is set to 16GB and I'm using -XX:+UseParNewGC -XX:+UseConcMarkSweepGC. I've noticed the system load moves to a different region server, sometimes within a minute, if the busy region server is restarted. Any suggestion what could be causing the load and/or what other metrics I should check? Thank you!
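For context, the GC options discussed in this thread are typically set in hbase-env.sh. A sketch, mirroring the heap size and flags the poster describes (not a general recommendation):

```
# hbase-env.sh (sketch)
export HBASE_HEAPSIZE=16384   # 16 GB heap, as described above
export HBASE_OPTS="$HBASE_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
```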
Re: High cpu usage on a region server
Thanks for reporting back Bikrant, glad that turned out to be the issue.

From: OpenSource Dev dev.opensou...@gmail.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Saturday, September 14, 2013 11:21 PM Subject: Re: High cpu usage on a region server
We patched HBase 0.94.6 with HBASE-9428, and now the difference is night and day. [...]
Re: completebulkload does 'copy' StoreFiles instead of 'cut'
File a JIRA for the issue ?

On Sep 14, 2013, at 11:10 PM, M. BagherEsmaeily mbesmae...@gmail.com wrote:
Hi rajesh, I use Hbase 0.94.11 and Hadoop 1.2.1. [...]
Re: completebulkload does 'copy' StoreFiles instead of 'cut'
HBASE-9537

On Sun, Sep 15, 2013 at 11:56 AM, Ted Yu yuzhih...@gmail.com wrote:
File a JIRA for the issue ? [...]
Connecting to a remote distributed HBASE env.
Hi, I defined a distributed HBASE environment on a cluster of three servers: server1, server2, server3. I would like to run a program using this HBASE from a fourth server: server4. I am connecting to HBASE using GORA. However, I have not succeeded in telling the program to connect remotely to the HBase quorum... Is it possible? Thanks Benjamin
Re: completebulkload does 'copy' StoreFiles instead of 'cut'
Looks like HBASE-9538 is for the same issue.

On Sep 15, 2013, at 1:09 AM, M. BagherEsmaeily mbesmae...@gmail.com wrote:
HBASE-9537 [...]
Re: Connecting to a remote distributed HBASE env.
Have you read http://hbase.apache.org/book.html#zookeeper ? What HBase version are you using? Can you pastebin the error(s) you encountered? Thanks

On Sep 15, 2013, at 1:26 AM, Sznajder ForMailingList bs4mailingl...@gmail.com wrote:
Hi I defined a distributed HBASE environment on a cluster of three servers [...]
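For what it's worth, a remote client usually only needs the cluster's ZooKeeper quorum in its own client-side hbase-site.xml (on server4 here); the hostnames and client port below are examples matching the setup described in this thread:

```xml
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>server1,server2,server3</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
```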
Re: completebulkload does 'copy' StoreFiles instead of 'cut'
Actually that was my fault, duplicating the issue. I had no permission to delete the second issue. Sorry!

On Sun, Sep 15, 2013 at 2:12 PM, Ted Yu yuzhih...@gmail.com wrote:
Looks like HBASE-9538 is for the same issue. [...]
Re: Connecting to a remote distributed HBASE env.
Also, I will add to Ted: what have you tried so far? JM

2013/9/15 Ted Yu yuzhih...@gmail.com
Have you read http://hbase.apache.org/book.html#zookeeper ? [...]
Re: Connecting to a remote distributed HBASE env.
I am using HBase 0.90.4. Benjamin

On Sun, Sep 15, 2013 at 4:31 PM, Sznajder ForMailingList bs4mailingl...@gmail.com wrote:
Hi When I am connecting from server4 via my java code [...]
Re: Connecting to a remote distributed HBASE env.
Hi, when I am connecting from server4 via my Java code (again, HBase runs on server1, server2 and server3), I am getting the following:

[java] Exception in thread main java.lang.RuntimeException: org.apache.gora.util.GoraException: java.lang.RuntimeException: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.
[java]   at com.ibm.hrl.crawldb.CrawlDataBase.init(CrawlDataBase.java:43)
[java]   at com.ibm.hrl.main.CrawlQueueMain.main(CrawlQueueMain.java:72)
[java] Caused by: org.apache.gora.util.GoraException: java.lang.RuntimeException: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.

On Sun, Sep 15, 2013 at 3:34 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Also, I will add to Ted: what have you tried so far? [...]
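The "30 is the default" in the error above refers to ZooKeeper's per-client connection cap (maxClientCnxns). Besides reusing a single HBaseConfiguration as the message itself suggests, the cap can be raised; with HBase-managed ZooKeeper that is typically done in hbase-site.xml (the value 300 is only an example):

```xml
<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <value>300</value>
</property>
```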
Re: Connecting to a remote distributed HBASE env.
Is your HBase server working well? You have access to the UI and everything is displayed correctly? Are you able to use the shell too?

2013/9/15 Sznajder ForMailingList bs4mailingl...@gmail.com
I am using Hbase 0.90.4 Benjamin [...]
Re: Connecting to a remote distributed HBASE env.
Yep, I am able to use the shell on server1, server2 and server3, but **not** on server4. Should I define something to be able to?

On Sun, Sep 15, 2013 at 4:36 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Is your HBase server working well? [...]
Re: completebulkload does 'copy' StoreFiles instead of 'cut'
In the region server log file, do you see the following?

LOG.info("File " + srcPath + " on different filesystem than " + "destination store - moving to this filesystem.");

On Sun, Sep 15, 2013 at 2:52 AM, M. BagherEsmaeily mbesmae...@gmail.com wrote:
Actually that was my fault duplicating the issue. [...]
Re: Connecting to a remote distributed HBASE env.
What is your HBase WebUI showing? Are you able to see all your servers, and is everything going well?

2013/9/15 Sznajder ForMailingList bs4mailingl...@gmail.com
yep, I am able to use the shell on server1 server2 and server3 but **not** on server 4. [...]
Row Filters using BitComparator
I have rows in my HBase table whose keys are made up of 5 components. I would like to search my HBase table using RowFilters, matching on only some of the components (maybe only the first 2). I don't want to use RegexStringComparator; I would like to use BitComparator because it is fast. How would I do that?
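One note on the approach: BitComparator applies a bitwise AND/OR/XOR against a fixed-length byte array, so it is an awkward fit for "match the first N components"; a byte-prefix comparison (what BinaryPrefixComparator performs inside a RowFilter) is the usual tool for that. The sketch below shows, in plain Java, the prefix test such a filter applies to each row key; the composite-key layout and the '|' delimiter are assumptions for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixScanSketch {

    // Hypothetical composite key layout: components joined with a '|' delimiter.
    static byte[] compositeKey(String... parts) {
        return String.join("|", parts).getBytes(StandardCharsets.UTF_8);
    }

    // The test a prefix-based RowFilter applies to each row key:
    // does the key start with the given prefix bytes?
    static boolean prefixMatches(byte[] rowKey, byte[] prefix) {
        if (rowKey.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOfRange(rowKey, 0, prefix.length), prefix);
    }

    public static void main(String[] args) {
        byte[] row = compositeKey("user42", "2013", "sensorA", "t0", "v1");
        // Match on only the first 2 of the 5 components.
        System.out.println(prefixMatches(row, compositeKey("user42", "2013")));  // true
        System.out.println(prefixMatches(row, compositeKey("user43", "2013")));  // false
    }
}
```

Byte-for-byte prefix matching like this is cheap per row; if only the first components are constrained, pairing it with a scan start/stop row on the same prefix avoids touching most rows at all.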
Re: GSSException: No valid credentials provided happens when user re-login has been done
Have you read http://hbase.apache.org/book.html#d0e5135 ? See the description w.r.t. the principal's maxrenewlife. On Sun, Sep 15, 2013 at 3:57 AM, Sujun Cheng chengsj@gmail.com wrote: Hi all, In our project we hit a problem: although we call checkTGTAndReloginFromKeytab() every time we access HBase, SASL authentication failures still occur after 1-2 days of running; the error log is shown below. If we restart the program, it runs normally, but the exception happens again after running for about the same amount of time. We also tried another approach that we hoped would solve it: catching exceptions such as SaslException and then logging the user in again, but the problem still exists. What is the reason for this, and how can we solve it? Many thanks! java.lang.RuntimeException: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'. at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$1.run(SecureClient.java:242) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212) at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37) at org.apache.hadoop.hbase.security.User.call(User.java:590) at org.apache.hadoop.hbase.security.User.access$700(User.java:51) at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:444) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.handleSaslConnectionFailure(SecureClient.java:203) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupIOstreams(SecureClient.java:291) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at 
org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:104) at $Proxy7.getClosestRowBefore(Unknown Source) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1016) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:882) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:984) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:886) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:843) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1533) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1418) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:918) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:774) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:749) at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:394) .. 
backtype.storm.daemon.executor$eval3836$fn__3837$tuple_action_fn__3839.invoke(executor.clj:566) at backtype.storm.daemon.executor$mk_task_receiver$fn__3760.invoke(executor.clj:345) at backtype.storm.disruptor$clojure_handler$reify__1583.onEvent(disruptor.clj:43) at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:84) at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:58) at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62) at backtype.storm.daemon.executor$eval3836$fn__3837$fn__3846$fn__3893.invoke(executor.clj:658) at backtype.storm.util$async_loop$fn__357.invoke(util.clj:377) at clojure.lang.AFn.run(AFn.java:24) at java.lang.Thread.run(Thread.java:722) Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212) at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:156) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupSaslConnection(SecureClient.java:177) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.access$500(SecureClient.java:85) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$2.run(SecureClient.java:284) at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$2.run(SecureClient.java:281) at java.security.AccessController.doPrivileged(Native Method) at
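A side note on the maxrenewlife pointer above: checkTGTAndReloginFromKeytab() only re-logs in when the ticket is close to expiry, and if the renewable lifetime is too short on the KDC side, renewals start failing after a day or two, which matches the symptom in this thread. A configuration sketch for inspecting and extending it with kadmin; the principal names and lifetime below are assumptions, not values from this thread:

```shell
# Inspect the current maximum renewable lifetime of the principals
# involved (client principal and the realm's krbtgt). Names are
# placeholders -- substitute your own principal and realm.
kadmin.local -q "getprinc hbase/myhost@EXAMPLE.COM"

# Extend the renewable lifetime; both the service principal's limit
# and the krbtgt principal's limit apply, so set both.
kadmin.local -q "modprinc -maxrenewlife 7days hbase/myhost@EXAMPLE.COM"
kadmin.local -q "modprinc -maxrenewlife 7days krbtgt/EXAMPLE.COM@EXAMPLE.COM"
```

After changing maxrenewlife, a fresh kinit (or keytab re-login) is needed before tickets pick up the new limit.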
Re: completebulkload does 'copy' StoreFiles instead of 'cut'
This also happens when the split points of the HFiles to be imported do not match those of the table. In that case the files are split, and the split files are written before they get moved into place. From: Ted Yu yuzhih...@gmail.com To: user@hbase.apache.org user@hbase.apache.org Sent: Sunday, September 15, 2013 7:04 AM Subject: Re: completebulkload does 'copy' StoreFiles instead of 'cut' In the region server log file, do you see the following? LOG.info("File " + srcPath + " on different filesystem than destination store - moving to this filesystem."); On Sun, Sep 15, 2013 at 2:52 AM, M. BagherEsmaeily mbesmae...@gmail.com wrote: Actually that was my fault duplicating the issue. I had no permission to delete the second issue. Sorry! On Sun, Sep 15, 2013 at 2:12 PM, Ted Yu yuzhih...@gmail.com wrote: Looks like HBASE-9538 is for the same issue. On Sep 15, 2013, at 1:09 AM, M. BagherEsmaeily mbesmae...@gmail.com wrote: HBASE-9537 On Sun, Sep 15, 2013 at 11:56 AM, Ted Yu yuzhih...@gmail.com wrote: File a JIRA for the issue? On Sep 14, 2013, at 11:10 PM, M. BagherEsmaeily mbesmae...@gmail.com wrote: Hi rajesh, I use HBase 0.94.11 and Hadoop 1.2.1. The file system of the bulkload output directory and the HBase cluster are the same, too. I've also coded a MapReduce job using HFileOutputFormat. When I use LoadIncrementalHFiles to move the output of my job to an HBase table, it still copies instead of cuts. Thanks On Sat, Sep 14, 2013 at 2:50 PM, rajesh babu Chintaguntla chrajeshbab...@gmail.com wrote: Hi BagherEsmaeily, which version of HBase are you using? Are the file system of the bulkload output directory and the HBase cluster the same? If you are using an HBase version older than 0.94.5, the StoreFiles generated by importtsv get copied instead of moved even if the file system of the bulkload output directory and the HBase cluster are the same. It's a bug, fixed in 0.94.5 (HBASE-5498). Thanks. Rajeshbabu On Sat, Sep 14, 2013 at 12:01 PM, M. 
BagherEsmaeily mbesmae...@gmail.com wrote: Hello, I was using HBase complete bulk load to transfer the output of ImportTsv to a table in HBase, and I noticed that it copies the output instead of cutting. This takes a long time for my gigabytes of data. In the HBase documentation ( http://hbase.apache.org/book/ops_mgt.html#completebulkload ) I read that the files would be moved, not copied. Can anyone help me with this? Kind Regards
Re: Row Filters using BitComparator
Inline. On Sun, Sep 15, 2013 at 12:04 AM, abhinavpundir abhinavmast...@gmail.com wrote: I have rows in my HBase whose keys are made up of 5 components. I would like to search my HBase using row filters by using only some (maybe only the first 2 components) of the components. I don't want to use RegexStringComparator. If you are using the first two components (i.e. a prefix of the rowkey) then you can use PrefixFilter ( http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/PrefixFilter.html ). Also, don't forget to set startRow and stopRow. I would like to use BitComparator because it is fast. How would I do that? -- Thanks Regards, Anil Gupta
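As a follow-up on the startRow/stopRow advice: for a key prefix, the scan bounds can be derived directly, which is usually faster than any comparator because regions outside the range are never touched. A minimal, self-contained sketch of the stopRow computation; the class and method names are illustrative, not an HBase API:

```java
import java.util.Arrays;

class PrefixScanBounds {
    // Returns the smallest row key strictly greater than every key that
    // starts with the given prefix, so [prefix, stopRow) covers exactly
    // the prefixed keys. Returns null when the prefix is all 0xFF bytes,
    // meaning "scan to the end of the table".
    static byte[] stopRowForPrefix(byte[] prefix) {
        byte[] stop = Arrays.copyOf(prefix, prefix.length);
        for (int i = stop.length - 1; i >= 0; i--) {
            if (stop[i] != (byte) 0xFF) {
                stop[i]++;                         // bump last non-0xFF byte
                return Arrays.copyOf(stop, i + 1); // drop trailing bytes
            }
        }
        return null; // prefix was all 0xFF bytes
    }
}
```

In the 0.94-era API, the result would be passed to Scan.setStartRow(prefix) and Scan.setStopRow(...), alongside or instead of the PrefixFilter.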
Re: completebulkload does 'copy' StoreFiles instead of 'cut'
I noticed that the following line is in the region server log, which causes a copy instead of a move: "on different filesystem than destination store - moving to this filesystem." This shows that the source and destination store files are considered to be on different file systems, but both of them are on the same HDFS. I just want to know the reason for this problem. Thanks On Sun, Sep 15, 2013 at 6:34 PM, Ted Yu yuzhih...@gmail.com wrote: In the region server log file, do you see the following? LOG.info("File " + srcPath + " on different filesystem than destination store - moving to this filesystem."); On Sun, Sep 15, 2013 at 2:52 AM, M. BagherEsmaeily mbesmae...@gmail.com wrote: Actually that was my fault duplicating the issue. I had no permission to delete the second issue. Sorry! On Sun, Sep 15, 2013 at 2:12 PM, Ted Yu yuzhih...@gmail.com wrote: Looks like HBASE-9538 is for the same issue. On Sep 15, 2013, at 1:09 AM, M. BagherEsmaeily mbesmae...@gmail.com wrote: HBASE-9537 On Sun, Sep 15, 2013 at 11:56 AM, Ted Yu yuzhih...@gmail.com wrote: File a JIRA for the issue? On Sep 14, 2013, at 11:10 PM, M. BagherEsmaeily mbesmae...@gmail.com wrote: Hi rajesh, I use HBase 0.94.11 and Hadoop 1.2.1. The file system of the bulkload output directory and the HBase cluster are the same, too. I've also coded a MapReduce job using HFileOutputFormat. When I use LoadIncrementalHFiles to move the output of my job to an HBase table, it still copies instead of cuts. Thanks On Sat, Sep 14, 2013 at 2:50 PM, rajesh babu Chintaguntla chrajeshbab...@gmail.com wrote: Hi BagherEsmaeily, which version of HBase are you using? Are the file system of the bulkload output directory and the HBase cluster the same? If you are using an HBase version older than 0.94.5, the StoreFiles generated by importtsv get copied instead of moved even if the file system of the bulkload output directory and the HBase cluster are the same. It's a bug, fixed in 0.94.5 (HBASE-5498). Thanks. Rajeshbabu On Sat, Sep 14, 2013 at 12:01 PM, M. 
BagherEsmaeily mbesmae...@gmail.com wrote: Hello, I was using HBase complete bulk load to transfer the output of ImportTsv to a table in HBase, and I noticed that it copies the output instead of cutting. This takes a long time for my gigabytes of data. In the HBase documentation ( http://hbase.apache.org/book/ops_mgt.html#completebulkload ) I read that the files would be moved, not copied. Can anyone help me with this? Kind Regards
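For anyone debugging the same symptom: the move-vs-copy decision hinges on whether the source and destination resolve to the same filesystem, which in practice comes down to the URI scheme and authority matching exactly. A self-contained sketch of that kind of comparison; this illustrates the check, it is not HBase's actual code (HBASE-9537 tracked the real issue):

```java
import java.net.URI;

class FsCompare {
    // Rough "same filesystem?" check: scheme and authority must match.
    // A mismatch like short hostname vs. FQDN, or a missing explicit
    // port on one side, is the typical reason a bulkload "move"
    // silently becomes a copy even though both paths are on one HDFS.
    static boolean sameFileSystem(String a, String b) {
        URI ua = URI.create(a);
        URI ub = URI.create(b);
        return eq(ua.getScheme(), ub.getScheme())
            && eq(ua.getAuthority(), ub.getAuthority());
    }

    private static boolean eq(String x, String y) {
        return x == null ? y == null : x.equalsIgnoreCase(y);
    }
}
```

Comparing fs.defaultFS on the cluster against the exact URI of the bulkload output directory is a quick way to spot such a mismatch.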