Re: No applicable class implementing Serialization in conf at io.serializations: class org.apache.hadoop.hbase.client.Put

2014-11-03 Thread Serega Sheypak
Ok, I got it. Thank you!

2014-11-03 2:20 GMT+03:00 Sean Busbey bus...@cloudera.com:

 On Sun, Nov 2, 2014 at 5:09 PM, Ted Yu yuzhih...@gmail.com wrote:

  bq. context.write(hbaseKey, put); //Exception here
 
  I am not an mrunit expert. But as long as you call the following method
  prior to the above method invocation, you should be able to proceed:
 
  conf.setStrings("io.serializations", conf.get("io.serializations"),
      MutationSerialization.class.getName(),
      ResultSerialization.class.getName(),
      KeyValueSerialization.class.getName());
 
 

 Those classes are not a part of the public HBase API, so directly
 referencing them is a bad idea. Doing so just sets them up to break on some
 future HBase upgrade.

 The OP needs a place in MRUnit to call one of
 HFileOutputFormat.configureIncrementalLoad,
 HFileOutputFormat2.configureIncrementalLoad, or
 TableMapReduceUtil.initTableReducerJob. Those are the only public API ways
 to configure the needed Serialization.

 --
 Sean
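A minimal sketch of that public-API route in MRUnit (hedged: reduceDriver, the
table name, and the reducer class are illustrative placeholders, not from this
thread):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

// Let the public API register the serializations on a throwaway Job, then
// copy the resulting "io.serializations" value into the MRUnit driver's conf.
Configuration conf = reduceDriver.getConfiguration();
Job job = Job.getInstance(conf);  // Job.getInstance copies the conf
TableMapReduceUtil.initTableReducerJob("my_table", MyTableReducer.class, job);
conf.setStrings("io.serializations",
    job.getConfiguration().getStrings("io.serializations"));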



Filter on Values of Cells

2014-11-03 Thread Sznajder ForMailingList
Hi

I would like to filter rows that contain a specific value at a specific
{family, qualifier}.

For example, if my table contains the following rows, the cells being of the
form {fam, qual, val}:

Row 1 : {fam-1, col0, val1}, {fam-1, col1, val11}
Row 2 : {fam-1, col0, val11}, {fam-1, col1, val21}

I would like a filter such that the constraint
{fam-1, col0, val1} returns only Row1,

and the constraint
{fam-1, col0, val11} returns Row1 and Row2.

I thought of using ValueFilter, but it does not give the ability to also add
a constraint on {fam, qual}.


Thanks!

Benjamin


Re: Filter on Values of Cells

2014-11-03 Thread Ted Yu
Would SingleColumnValueFilter serve your need ?

Cheers

On Mon, Nov 3, 2014 at 7:28 AM, Sznajder ForMailingList 
bs4mailingl...@gmail.com wrote:

 Hi

 I would like to filter rows that contain a specific value at a specific
 {family, qualifier}.

 For example, if my table contains the following rows, the cells being of the
 form {fam, qual, val}:

 Row 1 : {fam-1, col0, val1}, {fam-1, col1, val11}
 Row 2 : {fam-1, col0, val11}, {fam-1, col1, val21}

 I would like a filter such that the constraint
 {fam-1, col0, val1} returns only Row1,

 and the constraint
 {fam-1, col0, val11} returns Row1 and Row2.

 I thought of using ValueFilter, but it does not give the ability to also add
 a constraint on {fam, qual}.


 Thanks!

 Benjamin
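For reference, a minimal sketch of Ted's suggestion against the example above
(hedged: 0.98-era API; family, qualifier, and value bytes are taken from the
sample rows):

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Keep only rows whose fam-1:col0 cell equals val1.
SingleColumnValueFilter filter = new SingleColumnValueFilter(
    Bytes.toBytes("fam-1"), Bytes.toBytes("col0"),
    CompareFilter.CompareOp.EQUAL, Bytes.toBytes("val1"));
// Without this, rows lacking fam-1:col0 entirely would also be returned.
filter.setFilterIfMissing(true);

Scan scan = new Scan();
scan.setFilter(filter);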



Re: Filter on Values of Cells

2014-11-03 Thread Sznajder ForMailingList
!!

Thanks a lot!

Benjamin

On Mon, Nov 3, 2014 at 5:31 PM, Ted Yu yuzhih...@gmail.com wrote:

 Would SingleColumnValueFilter serve your need ?

 Cheers




How to find out whether an HBase PUT inserts a new row or updates an existing row

2014-11-03 Thread Bora, Venu
Hello,
We have a requirement to determine whether a PUT will create a new row or 
update an existing one. I looked at using preBatchMutate in a co-processor and 
have the code below.

A few things I need to ask:
1) Is there a more efficient way of doing this?
2) Will region.getClosestRowBefore() add additional IO to go to disk? Or will
the row be in memory, since the row lock was already acquired before
preBatchMutate is called?
3) Will region.getClosestRowBefore() always give the correct result? Or are 
there scenarios where the previous state will not be visible?


@Override
public void preBatchMutate(ObserverContext<RegionCoprocessorEnvironment> c,
        MiniBatchOperationInProgress<Mutation> miniBatchOp) throws IOException {
    for (int i = 0; i < miniBatchOp.size(); i++) {
        Mutation operation = miniBatchOp.getOperation(i);
        byte[] rowKey = operation.getRow();
        NavigableMap<byte[], List<Cell>> familyCellMap = operation.getFamilyCellMap();

        for (Entry<byte[], List<Cell>> entry : familyCellMap.entrySet()) {
            for (Iterator<Cell> iterator = entry.getValue().iterator(); iterator.hasNext();) {
                Cell cell = iterator.next();
                byte[] family = CellUtil.cloneFamily(cell);
                Result closestRowBefore =
                        c.getEnvironment().getRegion().getClosestRowBefore(rowKey, family);
                // closestRowBefore is null if there is no record for the rowKey and family
                if (closestRowBefore != null) {
                    // PUT is doing an update for the given rowKey, family
                } else {
                    // PUT is doing an insert for the given rowKey, family
                }
            }
        }
    }
    super.preBatchMutate(c, miniBatchOp);
}


Thanks
Venu Bora
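On question 1, one hedged alternative (method names follow the 0.98 API):
getClosestRowBefore() returns the closest row at or before the given key, so a
non-null result does not by itself prove the exact row exists. A Get scoped to
the row key and family asks exactly that question, and is served from the
memstore or block cache when the row is hot:

import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;

// Inside preBatchMutate, for a given rowKey and family:
Get probe = new Get(rowKey);
probe.addFamily(family);
Result existing = c.getEnvironment().getRegion().get(probe);
boolean isUpdate = !existing.isEmpty();  // true: the PUT updates an existing row/family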




This e-mail and files transmitted with it are confidential, and are intended 
solely for the use of the individual or entity to whom this e-mail is 
addressed. If you are not the intended recipient, or the employee or agent 
responsible to deliver it to the intended recipient, you are hereby notified 
that any dissemination, distribution or copying of this communication is 
strictly prohibited. If you are not one of the named recipient(s) or otherwise 
have reason to believe that you received this message in error, please 
immediately notify sender by e-mail, and destroy the original message. Thank 
You.


Fwd: error in starting hbase

2014-11-03 Thread beeshma r
Hi Ted

Any update on this error? I tried pseudo-distributed mode but I still have
the error:

hbase(main):001:0> create 't1','c1'

ERROR: org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative()V

Here is some help for this command:
Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily
including NAME attribute.
Examples:



-- Forwarded message --
From: beeshma r beeshm...@gmail.com
Date: Sun, Nov 2, 2014 at 7:22 AM
Subject: Re: error in starting hbase
To: user@hbase.apache.org


Hi Ted,

Thanks for your reply. Yes, I am running in standalone mode.
After changing my ZooKeeper property it was resolved. And now I have another
two issues.

2014-11-02 07:06:32,948 DEBUG [main] master.HMaster:
master/ubuntu.ubuntu-domain/127.0.1.1:0 HConnection server-to-server
retries=350
2014-11-02 07:06:33,458 INFO  [main] ipc.RpcServer:
master/ubuntu.ubuntu-domain/127.0.1.1:0: started 10 reader(s).
2014-11-02 07:06:33,670 INFO  [main] impl.MetricsConfig: loaded properties
from hadoop-metrics2-hbase.properties
2014-11-02 07:06:33,766 INFO  [main] impl.MetricsSystemImpl: Scheduled
snapshot period at 10 second(s).
2014-11-02 07:06:33,766 INFO  [main] impl.MetricsSystemImpl: HBase metrics
system started
2014-11-02 07:06:34,592 ERROR [main] master.HMasterCommandLine: Master
exiting
java.lang.RuntimeException: Failed construction of Master: class
org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMasternull
at
org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:140)
at
org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:202)
at
org.apache.hadoop.hbase.LocalHBaseCluster.init(LocalHBaseCluster.java:152)
at
org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:179)
at
org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:135)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2785)
Caused by: java.lang.RuntimeException:
java.lang.reflect.InvocationTargetException
at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
at org.apache.hadoop.security.Groups.init(Groups.java:55)
at
org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:182)
at
org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:235)
at
org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:214)
at
org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:275)

--

And when I create a table:

hbase(main):001:0> create 't1','e1'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/home/beeshma/hbase-0.98.6.1-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/home/beeshma/hadoop-1.2.1/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.

ERROR: org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative()V

-

 hbase(main):002:0> list
TABLE


ERROR: Could not initialize class
org.apache.hadoop.security.JniBasedUnixGroupsMapping

Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:

  hbase> list
  hbase> list 'abc.*'
  hbase> list 'ns:abc.*'
  hbase> list 'ns:.*'


hbase(main):003:0> beeshma@ubuntu:~/hbase-0.98.6.1-hadoop2/bin$



On Sun, Nov 2, 2014 at 7:01 AM, Ted Yu yuzhih...@gmail.com wrote:

 Are you running hbase in standalone mode ?

 See http://hbase.apache.org/book.html#zookeeper

 bq. To toggle HBase management of ZooKeeper, use the HBASE_MANAGES_ZK
 variable
 in conf/hbase-env.sh.

 Cheers

 On Sun, Nov 2, 2014 at 6:41 AM, beeshma r beeshm...@gmail.com wrote:

  HI
 
  When I start HBase the following error occurs. How do I solve this? I
  haven't added any ZooKeeper path anywhere.
 
  Please suggest this.
 
  2014-11-01 20:01:51,196 INFO  [main] server.ZooKeeperServer: Server
  environment:java.io.tmpdir=/tmp
  2014-11-01 20:01:51,196 INFO  [main] server.ZooKeeperServer: Server
  environment:java.compiler=NA
  2014-11-01 20:01:51,196 INFO  [main] server.ZooKeeperServer: Server
  environment:os.name=Linux
  2014-11-01 20:01:51,196 INFO  [main] server.ZooKeeperServer: Server
  environment:os.arch=amd64
  2014-11-01 20:01:51,196 INFO  [main] server.ZooKeeperServer: Server
  

Re: error in starting hbase

2014-11-03 Thread Ted Yu
Here is the method in JniBasedUnixGroupsMapping which appears in the stack
trace:

  native static void anchorNative();

It is a native method.

Which hadoop release are you using? How did you install it?


Cheers
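A hedged aside: the SLF4J warnings earlier in the thread show both
hbase-0.98.6.1-hadoop2 and hadoop-1.2.1 jars on the classpath, and a failure in
JniBasedUnixGroupsMapping.anchorNative() is typical of that kind of Hadoop
version mismatch. A quick check, assuming the intended install is Hadoop 2.x:

# Confirm which hadoop binary resolves and whether its native code loads.
which hadoop
hadoop version
hadoop checknative -a   # Hadoop 2.x: lists each native library and whether it loaded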

On Mon, Nov 3, 2014 at 9:25 AM, beeshma r beeshm...@gmail.com wrote:

 Hi Ted

 Any update on this error? I tried pseudo-distributed mode but I still have
 the error:

 hbase(main):001:0> create 't1','c1'

 ERROR: org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative()V

Hbase Dead region Server

2014-11-03 Thread Nishanth S
Hey folks,

How do I remove a dead region server? I manually failed over the HBase
master, but it still appears in the master UI and in the output of the
status command.

Thanks,
Nishan


Re: Hbase Dead region Server

2014-11-03 Thread Pere Kyle
Nishanth,

In my experience the only way I have been able to clear the dead region
servers is to restart the master daemon.

-Pere

On Mon, Nov 3, 2014 at 9:49 AM, Nishanth S nishanth.2...@gmail.com wrote:

 Hey folks,

 How do I remove a dead region server? I manually failed over the HBase
 master, but it still appears in the master UI and in the output of the
 status command.

 Thanks,
 Nishan



Re: Hbase Dead region Server

2014-11-03 Thread Nishanth S
Thanks Pere. I just did that and the dead region server still shows up
in the master UI as well as in the status command. I have replication turned on
in HBase and am seeing a few issues. Below is the stack trace I am seeing.

2014-11-03 18:31:00,215 WARN
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Can't
replicate because of a local or network error:
java.io.IOException: No replication sinks are available
at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSinkManager.getReplicationSink(ReplicationSinkManager.java:117)
at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:652)
at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:350)
2014-11-03 18:31:00,459 WARN
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Can't
replicate because of a local or network error:
java.io.IOException: No replication sinks are available
at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSinkManager.getReplicationSink(ReplicationSinkManager.java:117)
at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:652)
at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:350)

On Mon, Nov 3, 2014 at 11:18 AM, Pere Kyle p...@whisper.sh wrote:

 Nishanth,

 In my experience the only way I have been able to clear the dead region
 servers is to restart the master daemon.

 -Pere

 On Mon, Nov 3, 2014 at 9:49 AM, Nishanth S nishanth.2...@gmail.com
 wrote:

  Hey folks,
 
  How do I remove a dead region server? I manually failed over the HBase
  master, but it still appears in the master UI and in the output of the
  status command.
 
  Thanks,
  Nishan
 



Re: OOM when fetching all versions of single row

2014-11-03 Thread Michael Segel
St.Ack, 

I think you're sidestepping the issue concerning schema design.

Since HBase isn't my core focus, I also have to ask: since when have heap sizes
over 16GB been the norm?
(Really 8GB seems to be quite a large heap size... )


On Oct 31, 2014, at 11:15 AM, Stack st...@duboce.net wrote:

 On Thu, Oct 30, 2014 at 8:20 AM, Andrejs Dubovskis dubis...@gmail.com
 wrote:
 
 Hi!
 
 We have a bunch of rows on HBase which store varying sizes of data
 (1-50MB). We use HBase versioning and keep up to 1 column
 versions. Typically each column has only a few versions. But in rare
 cases it may have thousands of versions.

 The MapReduce algorithm uses a full scan, and our algorithm requires all
 versions to produce the result. So, we call scan.setMaxVersions().

 In the worst case the Region Server returns only one row, but a huge one. The
 size is unpredictable and cannot be controlled, because using
 parameters we can control row count only. And the MR task can throw an
 OOME even if it has a 50GB heap.

 Is it possible to handle this situation? For example, the RS should not
 send the row to the client if the latter has no memory to handle it.
 In this case the client can handle the error and fetch each row's version in a
 separate get request.
 
 
 See HBASE-11544 [Ergonomics] hbase.client.scanner.caching is dogged and
 will try to return batch even if it means OOME.
 St.Ack





Hbase Incremental Export/ImportTable

2014-11-03 Thread Pere Kyle
Hi,

I am implementing disaster recovery for our HBase cluster and had one quick
question about import/export on the s3n file system.

I know that ExportTable can be given a start time and end time enabling 
incremental backups. My question is how to properly store these incremental 
backups on s3.

My idea is to have them in nested folders in a bucket like so:
s3://backups/mytable/YEAR/MONTH/DAY/HR

So say the initial backup is now:
s3://backups/mytable/2014/11/2/11

Tomorrow at the same time would be:
s3://backups/mytable/2014/11/3/11

Would ImportTable be able to take just s3://backups/mytable and
recursively import all the sequence files from there?

Thanks,
Pere
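For reference, a hedged sketch of the incremental export invocation; the Export
MR tool accepts optional versions/start-time/end-time arguments (epoch
milliseconds), and the path and timestamps below are placeholders matching the
proposed layout:

# Export one hour of edits for mytable into a time-bucketed S3 prefix.
hbase org.apache.hadoop.hbase.mapreduce.Export \
  mytable s3n://backups/mytable/2014/11/3/11 \
  1 1415001600000 1415005200000   # <versions> <starttime> <endtime>

On the question itself: Import reads a single input directory, and Hadoop's
FileInputFormat does not descend into subdirectories by default, so each
incremental prefix would most likely need its own Import run (or, on Hadoop
2.x, setting mapreduce.input.fileinputformat.input.dir.recursive=true, assuming
a uniform sequence-file layout underneath).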

Re: OOM when fetching all versions of single row

2014-11-03 Thread Bryan Beaudreault
There are many blog posts and articles about people turning to >16GB
heaps since Java 7 and the G1 collector became mainstream. We run with a 25GB
heap ourselves with very short GC pauses using a mostly untuned G1
collector. Just one example is the excellent blog post by Intel:
https://software.intel.com/en-us/blogs/2014/06/18/part-1-tuning-java-garbage-collection-for-hbase

That said, two things:

1) St.Ack's reply is very relevant, because as HBase matures it needs to
make it harder for new people to shoot themselves in the foot.  I'd love to
see more tickets like HBASE-11544. This is something we run into often,
with 10s of developers writing queries against a few shared clusters.

2) Since none of these enhancements are available yet, I recommend
rethinking your schema if possible. You could change the cardinality
such that you end up with more rows with fewer versions each, instead of
these fat rows. While not exactly the same, you might be able to use TTL
or your own purge job to keep the number of rows limited.
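A minimal sketch of that cardinality change, under assumed names (the entity
id, family, and qualifier are illustrative): fold a reversed timestamp into the
row key so each former version becomes its own narrow row, and scans return
versions newest-first:

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

byte[] value = Bytes.toBytes("payload bytes");   // placeholder payload
long ts = System.currentTimeMillis();
byte[] rowKey = Bytes.add(
    Bytes.toBytes("entity-123"),                 // hypothetical entity id
    Bytes.toBytes(Long.MAX_VALUE - ts));         // reversed timestamp: newest sorts first
Put put = new Put(rowKey);
put.add(Bytes.toBytes("fam"), Bytes.toBytes("payload"), value);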

On Mon, Nov 3, 2014 at 2:02 PM, Michael Segel mse...@segel.com wrote:

 St.Ack,

 I think you're sidestepping the issue concerning schema design.

 Since HBase isn't my core focus, I also have to ask: since when have heap
 sizes over 16GB been the norm?
 (Really 8GB seems to be quite a large heap size... )







Re: OOM when fetching all versions of single row

2014-11-03 Thread Michael Segel
Bryan,

I wasn’t saying St.Ack’s post wasn’t relevant, but that it’s not addressing the
easiest thing to fix: schema design.
IMHO, that’s shooting oneself in the foot.

You shouldn’t be using versioning to capture temporal data.


On Nov 3, 2014, at 1:54 PM, Bryan Beaudreault bbeaudrea...@hubspot.com wrote:

 There are many blog posts and articles about people turning to >16GB
 heaps since Java 7 and the G1 collector became mainstream. We run with a 25GB
 heap ourselves with very short GC pauses using a mostly untuned G1
 collector. Just one example is the excellent blog post by Intel:
 https://software.intel.com/en-us/blogs/2014/06/18/part-1-tuning-java-garbage-collection-for-hbase
 
 That said, two things:
 
 1) St.Ack's reply is very relevant, because as HBase matures it needs to
 make it harder for new people to shoot themselves in the foot.  I'd love to
 see more tickets like HBASE-11544. This is something we run into often,
 with 10s of developers writing queries against a few shared clusters.
 
 2) Since none of these enhancements are available yet, I recommend
 rethinking your schema if possible. You could change the cardinality
 such that you end up with more rows with fewer versions each, instead of
 these fat rows. While not exactly the same, you might be able to use TTL
 or your own purge job to keep the number of rows limited.
 



MultiTableInputFormat to compare 2 tables taking about 80 mins

2014-11-03 Thread krish_571
Hi,

I am using HBase MultiTableInputFormat to compare 2 tables: Table1 (7
million rows) and Table2 (30 million rows).

In the driver, I am passing two scans (without any filters). In my mapper I
am doing a compare, and I write the summary in the reducer.

Are there any settings specific to this scenario that might speed up the
process? Thanks.
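A hedged sketch of the driver side; the mapper class and output types are
placeholder names, and the tuning values are illustrative (raising scan caching
and disabling block caching are the usual first knobs for full-scan MR jobs):

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

Job job = Job.getInstance(HBaseConfiguration.create(), "compare-tables");
List<Scan> scans = new ArrayList<Scan>();
for (String table : new String[] { "Table1", "Table2" }) {
  Scan scan = new Scan();
  scan.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, Bytes.toBytes(table));
  scan.setCaching(500);        // more rows per RPC round trip
  scan.setCacheBlocks(false);  // a full scan shouldn't churn the block cache
  scans.add(scan);
}
// CompareMapper and the output key/value types are hypothetical placeholders.
TableMapReduceUtil.initTableMapperJob(scans, CompareMapper.class,
    ImmutableBytesWritable.class, Text.class, job);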



--
View this message in context: 
http://apache-hbase.679495.n3.nabble.com/MultiTableInputFormat-to-compare-2-tables-taking-about-80-mins-tp4065638.html
Sent from the HBase User mailing list archive at Nabble.com.


Re: Authenticate from SQL

2014-11-03 Thread Bharath Vissapragada
Hi,

What do you mean by auth in SQL? It supports SPNEGO in case you are
interested.

On Mon, Nov 3, 2014 at 12:16 PM, Margusja mar...@roo.ee wrote:

 Hi

 I am looking for solutions where users, before using HBase REST, will be
 authenticated from SQL (for example, from Oracle).
 Are there any best practices or ready-made solutions for HBase?

 --
 Best regards, Margus Roo
 skype: margusja
 phone: +372 51 48 780
 web: http://margus.roo.ee




-- 
Bharath Vissapragada
http://www.cloudera.com


Re: Hbase Dead region Server

2014-11-03 Thread Talat Uyarer
Hi Pere and Nishanth,

In the master branch I developed a bash script for the same problem. Its name
is considerAsDead.sh [1]. It marks the server as dead and starts the recovery
process.

[1] https://github.com/apache/hbase/blob/master/bin/considerAsDead.sh

Talat
On Nov 3, 2014 8:32 PM, Nishanth S nishanth.2...@gmail.com wrote:

 Thanks Pere. I just did that and the dead region server still shows up
 in the master UI as well as in the status command. I have replication turned
 on in HBase and am seeing a few issues.




Re: Authenticate from SQL

2014-11-03 Thread Margusja

Hi

In one old project where usernames and passwords are in an RDB, we need to
authenticate users from the RDB before they can go via REST to HBase.

So the first thing was Knox.

Best regards, Margus Roo
skype: margusja
phone: +372 51 48 780
web: http://margus.roo.ee

On 04/11/14 04:02, Bharath Vissapragada wrote:

Hi,

What do you mean by auth in SQL? It supports SPNEGO in case you are
interested.

On Mon, Nov 3, 2014 at 12:16 PM, Margusja mar...@roo.ee wrote:


Hi

I am looking for solutions where users, before using HBase REST, will be
authenticated from SQL (for example, from Oracle).
Are there any best practices or ready-made solutions for HBase?

--
Best regards, Margus Roo
skype: margusja
phone: +372 51 48 780
web: http://margus.roo.ee