No applicable class implementing Serialization in conf at io.serializations: class org.apache.hadoop.hbase.client.Put

2014-11-02 Thread Serega Sheypak
Hi, I'm migrating from CDH4 to CDH5 (HBase 0.98.6-cdh5.2.0).
I had a unit test for the mapper used to create HFiles for a later bulk load.

I've bumped the Maven dependencies from CDH4 to CDH5 (0.98.6-cdh5.2.0)
and now I've started to get this exception:

java.lang.IllegalStateException: No applicable class implementing Serialization in conf at io.serializations: class org.apache.hadoop.hbase.client.Put
    at com.google.common.base.Preconditions.checkState(Preconditions.java:149)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:75)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:97)
    at org.apache.hadoop.mrunit.internal.output.MockOutputCollector.collect(MockOutputCollector.java:48)
    at org.apache.hadoop.mrunit.internal.mapreduce.AbstractMockContextWrapper$4.answer(AbstractMockContextWrapper.java:90)
    at org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:34)
    at org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:91)
    at org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29)
    at org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:38)
    at org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:51)
    at org.apache.hadoop.mapreduce.Mapper$Context$$EnhancerByMockitoWithCGLIB$$ba4633fb.write()


And here is the mapper code:



public class ItemRecommendationHBaseMapper
        extends Mapper<LongWritable, BytesWritable, ImmutableBytesWritable, Put> {

    private final ImmutableBytesWritable hbaseKey = new ImmutableBytesWritable();
    private final DynamicObjectSerDe<ItemRecommendation> serde =
            new DynamicObjectSerDe<ItemRecommendation>(ItemRecommendation.class);

    @Override
    protected void map(LongWritable key, BytesWritable value, Context context)
            throws IOException, InterruptedException {
        checkPreconditions(key, value);
        hbaseKey.set(Bytes.toBytes(key.get()));

        ItemRecommendation item = serde.deserialize(value.getBytes());
        checkPreconditions(item);
        Put put = PutFactory.createPut(serde, item, getColumnFamily());

        context.write(hbaseKey, put); // Exception here
    }

What can I do in order to make the unit test pass?


Re: No applicable class implementing Serialization in conf at io.serializations: class org.apache.hadoop.hbase.client.Put

2014-11-02 Thread Serega Sheypak
public static Put createPut(DynamicObjectSerDe<ItemRecommendation> serde,
                            ItemRecommendation item, String columnFamily) {
    Put put = new Put(Bytes.toBytes(Long.valueOf(item.getId())));
    put.add(Bytes.toBytes(columnFamily), Bytes.toBytes(item.getRank()),
            serde.serialize(item));
    return put;
}

2014-11-03 0:12 GMT+03:00 Ted Yu :

> bq. PutFactory.createPut(
>
> Can you reveal how PutFactory creates the Put ?
>
> Thanks
>
> On Sun, Nov 2, 2014 at 1:02 PM, Serega Sheypak 
> wrote:
>
> > Hi, I'm migrating from CDH4 to CDH5 (hbase 0.98.6-cdh5.2.0)
> > I had a unit test for mapper used to create HFile and bulk load later.
> >
> > I've bumped maven deps from cdh4 to cdh5 0.98.6-cdh5.2.0
> > Now I've started to get exception
> >
> > java.lang.IllegalStateException: No applicable class implementing
> > Serialization in conf at io.serializations: class
> > org.apache.hadoop.hbase.client.Put
> > [full stack trace and mapper code as quoted above]
> >
> > What can I do in order to make the unit test pass?
> >
>


Re: No applicable class implementing Serialization in conf at io.serializations: class org.apache.hadoop.hbase.client.Put

2014-11-02 Thread Serega Sheypak
I use it to prepare HFiles using my custom mapper emitting Puts and
  HFileOutputFormat.configureIncrementalLoad(job, createHTable()) // connection to the target table

and then bulk load the data into the table using LoadIncrementalHFiles.

P.S.
HFileOutputFormat is also deprecated... so many changes... (((
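
[Editor's note: for readers following along, here is a minimal driver sketch of that flow. It is only an illustration, not the actual job: the driver class, the table name, the input paths taken from args, and the assumption that the input is a SequenceFile of LongWritable/BytesWritable are all added here.]

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ItemRecommendationBulkLoadDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "prepare-hfiles");
        job.setJarByClass(ItemRecommendationHBaseMapper.class);
        job.setMapperClass(ItemRecommendationHBaseMapper.class);
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(Put.class);
        job.setInputFormatClass(SequenceFileInputFormat.class); // assumed LongWritable/BytesWritable input
        FileInputFormat.addInputPath(job, new Path(args[0]));
        Path hfiles = new Path(args[1]);
        FileOutputFormat.setOutputPath(job, hfiles);

        HTable table = new HTable(conf, "item_recommendation"); // hypothetical table name
        // Sets the reducer, partitioner, output format and the HBase serializations on the job.
        HFileOutputFormat.configureIncrementalLoad(job, table);

        if (job.waitForCompletion(true)) {
            // Bulk load the generated HFiles into the target table.
            new LoadIncrementalHFiles(conf).doBulkLoad(hfiles, table);
        }
    }
}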


2014-11-03 0:41 GMT+03:00 Sean Busbey :

> In the 0.94.x API, Put implemented Writable[1]. This meant that MR code,
> like yours, could use it as a Key or Value between Mapper and Reducer.
>
> In 0.96 and later APIs, Put no longer directly implements Writable[2].
> Instead, HBase now includes a Hadoop Serialization implementation.
> Normally, this would be configured via the TableMapReduceUtil class for
> either a TableMapper or TableReducer.
>
> Presuming that the intention of your MR job is to have all the Puts write
> to some HBase table, you should be able to follow the "write to HBase" part
> of the examples for reading and writing HBase via mapreduce in the
> reference guide[3].
>
> Specifically, you should have your Driver call one of the
> initTableReducerJob methods on TableMapReduceUtil, where it currently sets
> the Mapper class for your application[4].
>
> -Sean
>
> [1]:
>
> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/Put.html
> [2]:
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html
> [3]: http://hbase.apache.org/book/mapreduce.example.html
> [4]:
>
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.html
>
>
> On Sun, Nov 2, 2014 at 3:02 PM, Serega Sheypak 
> wrote:
>
> > Hi, I'm migrating from CDH4 to CDH5 (hbase 0.98.6-cdh5.2.0)
> > I had a unit test for mapper used to create HFile and bulk load later.
> >
> > I've bumped maven deps from cdh4 to cdh5 0.98.6-cdh5.2.0
> > Now I've started to get exception
> >
> > java.lang.IllegalStateException: No applicable class implementing
> > Serialization in conf at io.serializations: class
> > org.apache.hadoop.hbase.client.Put
> > [full stack trace and mapper code as quoted above]
> >
> > What can I do in order to make the unit test pass?
> >
>
>
>
> --
> Sean
>
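
[Editor's note: a minimal driver sketch of the path Sean describes above, for a job that writes the Puts directly to a table instead of producing HFiles. The table name is illustrative; TableMapReduceUtil and IdentityTableReducer come from org.apache.hadoop.hbase.mapreduce.]

Job job = Job.getInstance(HBaseConfiguration.create(), "item-recommendation-load");
job.setJarByClass(ItemRecommendationHBaseMapper.class);
job.setMapperClass(ItemRecommendationHBaseMapper.class);
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(Put.class);
// Registers the HBase serializations on the job configuration and wires the
// table output format; IdentityTableReducer simply forwards each Put.
TableMapReduceUtil.initTableReducerJob("item_recommendation", // hypothetical table name
        IdentityTableReducer.class, job);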


Re: No applicable class implementing Serialization in conf at io.serializations: class org.apache.hadoop.hbase.client.Put

2014-11-02 Thread Serega Sheypak
Sean, I've started to catch that serialization problem at the unit-test level
while using MRUnit.
I don't see any possibility to call HFileOutputFormat.configureIncrementalLoad
before MRUnit's mocking setup.
It was working without any problem in 0.94 :)


2014-11-03 1:08 GMT+03:00 Sean Busbey :

> If you're calling HFileOutputFormat.configureIncrementalLoad, that should
> be setting up the Serialization for you.
>
> Can you look at the job configuration and see what's present for the key
> "io.serializations"?
>
> -Sean
>
> On Sun, Nov 2, 2014 at 3:53 PM, Serega Sheypak 
> wrote:
>
> > I use it to prepare HFile using my custom mapper emitting Put and
> >   HFileOutputFormat.configureIncrementalLoad(job, createHTable())
> > //connection to target table
> >
> > and then bulk load data to table using LoadIncrementalHFiles
> >
> > P.S.
> > HFileOutputFormat is also deprecated... so many changes... (((
> >
> >
> > 2014-11-03 0:41 GMT+03:00 Sean Busbey :
> >
> > > In the 0.94.x API, Put implemented Writable[1]. This meant that MR
> code,
> > > like yours, could use it as a Key or Value between Mapper and Reducer.
> > >
> > > In 0.96 and later APIs, Put no longer directly implements Writable[2].
> > > Instead, HBase now includes a Hadoop Serialization implementation.
> > > Normally, this would be configured via the TableMapReduceUtil class for
> > > either a TableMapper or TableReducer.
> > >
> > > Presuming that the intention of your MR job is to have all the Puts
> write
> > > to some HBase table, you should be able to follow the "write to HBase"
> > part
> > > of the examples for reading and writing HBase via mapreduce in the
> > > reference guide[3].
> > >
> > > Specifically, you should have your Driver call one of the
> > > initTableReducerJob methods on TableMapReduceUtil, where it currently
> > sets
> > > the Mapper class for your application[4].
> > >
> > > -Sean
> > >
> > > [1]:
> > >
> > >
> >
> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/Put.html
> > > [2]:
> > >
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html
> > > [3]: http://hbase.apache.org/book/mapreduce.example.html
> > > [4]:
> > >
> > >
> >
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.html
> > >
> > >
> > > On Sun, Nov 2, 2014 at 3:02 PM, Serega Sheypak <
> serega.shey...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi, I'm migrating from CDH4 to CDH5 (hbase 0.98.6-cdh5.2.0)
> > > > I had a unit test for mapper used to create HFile and bulk load
> later.
> > > >
> > > > I've bumped maven deps from cdh4 to cdh5 0.98.6-cdh5.2.0
> > > > Now I've started to get exception
> > > >
> > > > java.lang.IllegalStateException: No applicable class implementing
> > > > Serialization in conf at io.serializations: class
> > > > org.apache.hadoop.hbase.client.Put
> > > > [stack trace as quoted above; truncated in the archive]

Re: No applicable class implementing Serialization in conf at io.serializations: class org.apache.hadoop.hbase.client.Put

2014-11-02 Thread Serega Sheypak
Cool, is it this stuff?
http://hbase.apache.org/book/hfilev2.html

2014-11-03 1:10 GMT+03:00 Sean Busbey :

> On Sun, Nov 2, 2014 at 3:53 PM, Serega Sheypak 
> wrote:
>
> > P.S.
> > HFileOutputFormat is also deprecated... so many changes... (((
> >
> >
>
> Incidentally, you should consider switching to HFileOutputFormat2. Since
> you rely on the version that has a Mapper outputting Put values instead of
> KeyValue, the impact on you should be negligible.
>
>
> --
> Sean
>
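
[Editor's note: in a driver like the sketch earlier in this thread, that switch would be a one-line change (a sketch; HFileOutputFormat2 lives in org.apache.hadoop.hbase.mapreduce).]

// Same (job, table) entry point, non-deprecated class; the rest of the driver is unchanged.
HFileOutputFormat2.configureIncrementalLoad(job, table);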


Re: No applicable class implementing Serialization in conf at io.serializations: class org.apache.hadoop.hbase.client.Put

2014-11-03 Thread Serega Sheypak
Ok, I got it. Thank you!

2014-11-03 2:20 GMT+03:00 Sean Busbey :

> On Sun, Nov 2, 2014 at 5:09 PM, Ted Yu  wrote:
>
> > bq. context.write(hbaseKey, put); //Exception here
> >
> > I am not an MRUnit expert. But as long as you call the following method
> > prior to the above method invocation, you should be able to proceed:
> >
> > conf.setStrings("io.serializations", conf.get("io.serializations"),
> >     MutationSerialization.class.getName(),
> >     ResultSerialization.class.getName(),
> >     KeyValueSerialization.class.getName());
> >
> >
>
> Those classes are not a part of the public HBase API, so directly
> referencing them is a bad idea. Doing so just sets them up to break on some
> future HBase upgrade.
>
> The OP needs a place in MRUnit to call one of
> HFileOutputFormat.configureIncrementalLoad,
> HFileOutputFormat2.configureIncrementalLoad, or
> TableMapReduceUtil.initTableReducerJob. Those are the only public API ways
> to configure the needed Serialization.
>
> --
> Sean
>
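
[Editor's note: for readers hitting this in MRUnit specifically, here is a sketch of the test-side workaround discussed above, registering the serializations on the MRUnit driver's Configuration before running it. The class names are passed as strings because, as Sean notes, they are HBase internals rather than public API; the input values are hypothetical.]

MapDriver<LongWritable, BytesWritable, ImmutableBytesWritable, Put> driver =
        MapDriver.newMapDriver(new ItemRecommendationHBaseMapper());

Configuration conf = driver.getConfiguration();
conf.setStrings("io.serializations", conf.get("io.serializations"),
        "org.apache.hadoop.hbase.mapreduce.MutationSerialization",
        "org.apache.hadoop.hbase.mapreduce.ResultSerialization",
        "org.apache.hadoop.hbase.mapreduce.KeyValueSerialization");

// serializedItem: a hypothetical serialized ItemRecommendation payload.
// With the serializations registered, MRUnit can copy the emitted Put values.
driver.withInput(new LongWritable(42L), new BytesWritable(serializedItem))
        .run();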


What can corrupt HBase table and what is Cannot find row in .META. for table?

2014-11-18 Thread Serega Sheypak
Hi, sometimes I get this in my web application log:
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META.
for table: my_favorite_table

I've found this
http://grokbase.com/t/hbase/user/143bn79wf2/cannot-find-row-in-meta-for-table

I did run hbase hbck
result:
0 inconsistencies detected.
Status: OK

What can I try next?
I'm using Cloudera CDH 5.2 HBase 0.98

Thanks!


Re: What can corrupt HBase table and what is Cannot find row in .META. for table?

2014-11-19 Thread Serega Sheypak
Hi, I'm using the Java API. I see the mentioned exception in the Java log.
I'll provide full stacktrace next time.

2014-11-19 1:01 GMT+03:00 Ted Yu :

> The thread you mentioned was more about thrift API rather than
> TableNotFoundException.
>
> Can you show us the stack trace of TableNotFoundException (vicinity of app
> log around the exception) ?
> Please also check master log / meta region server log.
>
> I assume you could access the table using hbase shell.
>
> Cheers
>
> On Tue, Nov 18, 2014 at 12:57 PM, Serega Sheypak  >
> wrote:
>
> > Hi, sometimes I get this in my web application log:
> > org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META.
> > for table: my_favorite_table
> >
> > I've found this
> >
> >
> http://grokbase.com/t/hbase/user/143bn79wf2/cannot-find-row-in-meta-for-table
> >
> > I did run hbase hbck
> > result:
> > 0 inconsistencies detected.
> > Status: OK
> >
> > What can I try next?
> > I'm using Cloudera CDH 5.2 HBase 0.98
> >
> > Thanks!
> >
>


HConnectionManager leaks with zookeeper conection oo many connections from /my.tomcat.server.com - max is 60

2014-12-12 Thread Serega Sheypak
Hi, I'm using HConnectionManager from a Java servlet.
Looks like it's leaking; all my ZooKeepers complain that there are too many
connections from the servlet hosts.
Typical line from the ZK log:

Too many connections from /my.tomcat.server.com - max is 60


Here is code sample

public class BaseConnection {

    private static final Logger LOG =
            LoggerFactory.getLogger(BaseConnection.class);

    protected void close(HConnection connection) {
        try {
            if (connection == null) {
                return;
            }
            connection.close();
        } catch (Exception e) {
            LOG.warn("Error while closing HConnection", e);
        }
    }

    protected void close(HTableInterface hTable) {
        try {
            if (hTable == null) {
                return;
            }
            hTable.close();
        } catch (Exception e) {
            LOG.warn("Error while closing HTable", e);
        }
    }
}

Sample put code from a subclass:

public void put(List<MyBean> entries) {
    HConnection hConnection = null;
    HTableInterface hTable = null;
    try {
        List<Put> puts = new ArrayList<Put>(entries.size());
        for (MyBean myBean : entries) {
            puts.add(new MyBeanSerDe().createPut(myBean));
        }
        // A new HConnection (and ZooKeeper session) is created and closed on every call.
        hConnection = HConnectionManager.createConnection(configuration);
        hTable = hConnection.getTable(NAME_B);
        hTable.put(puts);
    } catch (Exception e) {
        LOG.error("Error while doing bulk put", e);
    } finally {
        close(hTable);
        close(hConnection);
    }
}


Re: HConnectionManager leaks with zookeeper conection oo many connections from /my.tomcat.server.com - max is 60

2014-12-12 Thread Serega Sheypak
Hi, I'm using CDH 5.2, HBase 0.98.
I don't know how to use it correctly. I've just used this sample:
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HConnectionManager.html

 HConnection connection = HConnectionManager.createConnection(config);
 HTableInterface table = connection.getTable(TableName.valueOf("table1"));
 try {
   // Use the table as needed, for a single operation and a single thread
 } finally {
   table.close();
   connection.close();
 }


Functional testing is OK, performance fails.



2014-12-12 19:32 GMT+03:00 Stack :
>
> Does zk count go up on each request to the servlet? Which version of hbase
> so can try on this end?  Do you have some client-side log? Better if you
> cache the connection rather than make it each time since setup is costly
> but lets fix first problem first.
> St.Ack
>
> On Fri, Dec 12, 2014 at 2:47 AM, Serega Sheypak 
> wrote:
>
> > Hi, I'm using HConnectionManager from a Java servlet.
> > Looks like it's leaking; all my ZooKeepers complain that there are too
> many
> > connections from the servlet hosts.
> > Typical line from the ZK log:
> >
> > Too many connections from /my.tomcat.server.com - max is 60
> >
> >
> > Here is code sample
> >
> > public class BaseConnection {
> >
> > private static final Logger LOG =
> > LoggerFactory.getLogger(BaseConnection.class);
> >
> > protected void close(HConnection connection){
> > try{
> > if(connection == null){
> > return;
> > }
> > connection.close();
> > }
> > catch (Exception e){
> > LOG.warn("Error while closing HConnection", e);
> > }
> > }
> >
> > protected void close(HTableInterface hTable){
> > try{
> > if(hTable == null){
> > return;
> > }
> > hTable.close();
> > }
> > catch (Exception e){
> > LOG.warn("Error while closing HTable", e);
> > }
> > }
> > }
> >
> > sample PUT code from subclass:
> >
> > public void put(List entries){
> > HConnection hConnection = null;
> > HTableInterface hTable = null;
> > try {
> > List puts = new ArrayList(entries.size());
> > for(MyBean myBean : entries){
> > puts.add(new MyBeanSerDe().createPut(myBean));
> > }
> > hConnection = HConnectionManager.createConnection(configuration);
> > hTable = hConnection.getTable(NAME_B);
> > hTable.put(puts);
> >
> > }catch (Exception e){
> > LOG.error("Error while doing bulk put", e);
> > }finally{
> > close(hTable);
> > close(hConnection);
> > }
> > }
> >
>


Re: HConnectionManager leaks with zookeeper conection oo many connections from /my.tomcat.server.com - max is 60

2014-12-12 Thread Serega Sheypak
eper.ClientCnxn: Socket connection established to
> localhost/0:0:0:0:0:0:0:1:2181, initiating session
> 2014-12-12 11:26:10,515 INFO  [main-SendThread(localhost:2181)]
> zookeeper.ClientCnxn: Session establishment complete on server
> localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x14a3fe0664b083b, negotiated
> timeout = 4
> 2014-12-12 11:26:10,519 DEBUG [main] client.ClientSmallScanner: Finished
> with small scan at {ENCODED => 1588230740, NAME => 'hbase:meta,,1',
> STARTKEY => '', ENDKEY => ''}
> 2014-12-12 11:26:10,521 INFO  [main]
> client.HConnectionManager$HConnectionImplementation: Closing zookeeper
> sessionid=0x14a3fe0664b083b
> 2014-12-12 11:26:10,521 INFO  [main] zookeeper.ZooKeeper: Session:
> 0x14a3fe0664b083b closed
> 2014-12-12 11:26:10,521 INFO  [main-EventThread] zookeeper.ClientCnxn:
> EventThread shut down
> i=998
> 2014-12-12 11:26:10,627 INFO  [main] zookeeper.RecoverableZooKeeper:
> Process identifier=hconnection-0x3822f407 connecting to ZooKeeper
> ensemble=localhost:2181
> 2014-12-12 11:26:10,627 INFO  [main] zookeeper.ZooKeeper: Initiating client
> connection, connectString=localhost:2181 sessionTimeout=9
> watcher=hconnection-0x3822f407, quorum=localhost:2181, baseZNode=/hbase
> 2014-12-12 11:26:10,628 INFO  [main-SendThread(localhost:2181)]
> zookeeper.ClientCnxn: Opening socket connection to server localhost/
> 127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown
> error)
> 2014-12-12 11:26:10,629 INFO  [main-SendThread(localhost:2181)]
> zookeeper.ClientCnxn: Socket connection established to localhost/
> 127.0.0.1:2181, initiating session
> 2014-12-12 11:26:10,631 INFO  [main-SendThread(localhost:2181)]
> zookeeper.ClientCnxn: Session establishment complete on server localhost/
> 127.0.0.1:2181, sessionid = 0x14a3fe0664b083c, negotiated timeout = 4
> 2014-12-12 11:26:10,637 DEBUG [main] client.ClientSmallScanner: Finished
> with small scan at {ENCODED => 1588230740, NAME => 'hbase:meta,,1',
> STARTKEY => '', ENDKEY => ''}
> 2014-12-12 11:26:10,638 INFO  [main]
> client.HConnectionManager$HConnectionImplementation: Closing zookeeper
> sessionid=0x14a3fe0664b083c
> 2014-12-12 11:26:10,639 INFO  [main] zookeeper.ZooKeeper: Session:
> 0x14a3fe0664b083c closed
> 2014-12-12 11:26:10,639 INFO  [main-EventThread] zookeeper.ClientCnxn:
> EventThread shut down
> i=999
>
>
>
>
> On Fri, Dec 12, 2014 at 10:22 AM, Serega Sheypak  >
> wrote:
>
> > Hi, I'm using CDH 5.2, 0.98
> > I don't know how to use it correctly. I've just used this sample:
> >
> >
> https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HConnectionManager.html
> >
> >  HConnection connection = HConnectionManager.createConnection(config);
> >  HTableInterface table =
> connection.getTable(TableName.valueOf("table1"));
> >  try {
> >// Use the table as needed, for a single operation and a single thread
> >  } finally {
> >table.close();
> >connection.close();
> >  }
> >
> >
> > Functional testing OK, performance fails :
> >
> >
> >
> > 2014-12-12 19:32 GMT+03:00 Stack :
> > >
> > > Does zk count go up on each request to the servlet? Which version of
> > hbase
> > > so can try on this end?  Do you have some client-side log? Better if
> you
> > > cache the connection rather than make it each time since setup is
> costly
> > > but lets fix first problem first.
> > > St.Ack
> > >
> > > On Fri, Dec 12, 2014 at 2:47 AM, Serega Sheypak <
> > serega.shey...@gmail.com>
> > > wrote:
> > >
> > > > Hi, I'm using HConnectionManager from java servlet
> > > > Looks like it's leaking, all my zookeepers complains that there is
> too
> > > many
> > > > connections from servlet hosts.
> > > > Typical line from lZK log:
> > > >
> > > > oo many connections from /my.tomcat.server.com  - max is 60
> > > >
> > > >
> > > > Here is code sample
> > > >
> > > > public class BaseConnection {
> > > >
> > > > private static final Logger LOG =
> > > > LoggerFactory.getLogger(BaseConnection.class);
> > > >
> > > > protected void close(HConnection connection){
> > > > try{
> > > > if(connection == null){
> > > > return;
> > > > }
> > > > connection.close();
> > > > }
> 

Re: HConnectionManager leaks with zookeeper conection oo many connections from /my.tomcat.server.com - max is 60

2014-12-13 Thread Serega Sheypak
Hm... confusing,
So here is the code path:
1. Servlet has a doPost method.
2. A HistoryController is instantiated during each doPost request.
HistoryController is a logic wrapper around the low-level HBase persistence stuff.
Each doPost invokes:

final HistoryController getHistoryController() {
    return new HistoryController(getHBaseConfiguration());
}

getHistoryController().updateHistory(historyEntryVisitorId);

3. HistoryController has the method:

private HistoryEntryConnection getConnection() {
    return new HistoryEntryConnection(configuration);
}

So each updateHistory(historyEntryVisitorId) call gets a new low-level
HistoryEntryConnection instance.

HistoryEntryConnection is dead simple; I've shown it previously:

public void put(List<MyBean> entries) {
    HConnection hConnection = null;
    HTableInterface hTable = null;
    try {
        List<Put> puts = new ArrayList<Put>(entries.size());
        for (MyBean myBean : entries) {
            puts.add(new MyBeanSerDe().createPut(myBean));
        }
        // A new HConnection (and ZooKeeper session) is created and closed on every call.
        hConnection = HConnectionManager.createConnection(configuration);
        hTable = hConnection.getTable(NAME_B);
        hTable.put(puts);
    } catch (Exception e) {
        LOG.error("Error while doing bulk put", e);
    } finally {
        close(hTable);
        close(hConnection);
    }
}


Manager, Connection, and HTable usage is copy-pasted from the javadoc. Nothing new.
BTW, is there any possibility to limit the quantity of opened connections in
HConnectionManager internals?
I have several web apps using HBase as backends. Maybe their cumulative
pressure makes ZK go down...


2014-12-13 4:03 GMT+03:00 Stack :
>
> On Fri, Dec 12, 2014 at 11:45 AM, Serega Sheypak  >
> wrote:
> >
> > i have 10K doPost/doGet requests per second.
> >
>
> How many concurrent threads going on in your tomcat?  Is the Connection
> shared amongst all threads?  Perhaps you have > 60 concurrent connections
> running at a time and each Connection has its own zk instance.  Thread
> dump?
>
> Post all the code?
>
> I tried multithreaded with 100 threads but still can't repro (see code
> below).  Next would be to get zk to dump concurrent connection count but
> lets see what your answers above are first.
>
> St.Ack
>
>
> public class TestConnnection {
>
>   static class Putter extends Thread {
>     Configuration c;
>
>     Putter(final Configuration c, final String i) {
>       super(i);
>       this.c = c;
>     }
>
>     @Override
>     public void run() {
>       try {
>         for (int i = 0; i < 100; i++) {
>           HConnection connection = HConnectionManager.createConnection(this.c);
>           HTableInterface table = connection.getTable(TableName.valueOf("table1"));
>           byte [] cf = Bytes.toBytes("t");
>           try {
>             byte [] bytes = Bytes.toBytes(i);
>             Put p = new Put(bytes);
>             p.add(cf, cf, bytes);
>             table.put(p);
>           } finally {
>             table.close();
>             connection.close();
>           }
>           System.out.println(getName() + ", i=" + i);
>         }
>       } catch (Throwable t) {
>         throw new RuntimeException(t);
>       }
>     }
>   };
>
>   public static void main(String[] args) throws IOException {
>     Configuration config = HBaseConfiguration.create();
>     for (int i = 0; i < 100; i++) {
>       Thread t = new Putter(config, "" + i);
>       t.start();
>     }
>   }
> }
>
>
>
>
>
> > Servlet is NOT single-threaded. each doPost/doGet
> > invokes these lines (encapsulated in DAO):
> >
> >  16   HConnection connection =
> > HConnectionManager.createConnection(config);
> >  17   HTableInterface table =
> > connection.getTable(TableName.valueOf("table1"));
> >
> > and
> >
> > 24   } finally {
> >  25 table.close();
> >  26 connection.close();
> >  27   }
> >
> > I assumed that this static construction
> >
> > 16   HConnection connection =
> > HConnectionManager.createConnection(config);
> >
> > correctly handles multi-threaded access somewhere deep inside.
> > Right now I don't understand what do I do wrong.
> >
> > Try to wrap each your request in Runnable to emulate multi-threaded
> > pressure on ZK. Your code is linear and mine is not, it's concurrent.
> >
> > Thanks
> >
> >
> >
> >
> > 2014-12-12 22:28 GMT+03:00 Stack :
> > >
> > > I ca

Re: HConnectionManager leaks with zookeeper conection oo many connections from /my.tomcat.server.com - max is 60

2014-12-13 Thread Serega Sheypak
So the idea is:
1. Instantiate an HConnection using HConnectionManager once.
2. Create an HTable instance for each Servlet.doPost and close it after the
operation is done.
Is that correct?

Are region locations cached in this case?
Are connections to regions/ZK reused?
Can I "harm" put/get data because of servlet concurrency? Each connection
could be used by many threads for different HTable instances for the same
HBase table?
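
[Editor's note: a sketch of that pattern with a servlet lifecycle hook (all names here are illustrative): the HConnection is created once per web app and shared, and each request only creates and closes an HTable.]

public class HBaseConnectionHolder implements ServletContextListener {

    private static volatile HConnection connection;

    public static HConnection get() {
        return connection;
    }

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        try {
            // One connection (thread pool + ZK session) for the whole web app.
            connection = HConnectionManager.createConnection(HBaseConfiguration.create());
        } catch (IOException e) {
            throw new RuntimeException("Cannot create HBase connection", e);
        }
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        try {
            if (connection != null) {
                connection.close(); // closed once, on web app shutdown
            }
        } catch (IOException e) {
            // best effort on shutdown
        }
    }
}

// In the servlet: a lightweight HTable per request, closed in finally.
HTableInterface table = HBaseConnectionHolder.get().getTable(TableName.valueOf("table1"));
try {
    // put/get as needed
} finally {
    table.close();
}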


2014-12-13 22:21 GMT+03:00 lars hofhansl :
>
> Note also that the createConnection part is somewhat expensive (creates a
> new thread pool for use with Puts, also does a ZK lookup, etc.). If possible,
> create the connection ahead of time and only get/close an HTable per
> request/thread.
> -- Lars
>   From: Serega Sheypak 
>  To: user 
>  Sent: Friday, December 12, 2014 11:45 AM
>  Subject: Re: HConnectionManager leaks with zookeeper conection oo many
> connections from /my.tomcat.server.com - max is 60
>
> i have 10K doPost/doGet requests per second.
> Servlet is NOT single-threaded. each doPost/doGet
> invokes these lines (encapsulated in DAO):
>
>  16  HConnection connection =
> HConnectionManager.createConnection(config);
>  17  HTableInterface table =
> connection.getTable(TableName.valueOf("table1"));
>
> and
>
> 24  } finally {
>  25table.close();
>  26connection.close();
>  27  }
>
> I assumed that this static construction
>
> 16  HConnection connection =
> HConnectionManager.createConnection(config);
>
> correctly handles multi-threaded access somewhere deep inside.
> Right now I don't understand what do I do wrong.
>
> Try to wrap each your request in Runnable to emulate multi-threaded
> pressure on ZK. Your code is linear and mine is not, it's concurrent.
>
> Thanks
>
>
>
>
>
>
> 2014-12-12 22:28 GMT+03:00 Stack :
> >
> > I cannot reproduce. I stood up a cdh5.2 server and then copy/pasted your
> > code adding in a put for each cycle.  I ran loop 1000 times and no
> > complaint from zk.
> >
> > Tell me more (Is servlet doing single-threaded model?  A single
> > Configuration is being used or new ones are being created per servlet
> > invocation?
> >
> > Below is code and output.
> >
> > (For better perf, cache the connection)
> > St.Ack
> >
> >
> >  1 package org.apache.hadoop.hbase;
> >  2
> >  3 import java.io.IOException;
> >  4
> >  5 import org.apache.hadoop.conf.Configuration;
> >  6 import org.apache.hadoop.hbase.client.HConnection;
> >  7 import org.apache.hadoop.hbase.client.HConnectionManager;
> >  8 import org.apache.hadoop.hbase.client.HTableInterface;
> >  9 import org.apache.hadoop.hbase.client.Put;
> >  10 import org.apache.hadoop.hbase.util.Bytes;
> >  11
> >  12 public class TestConnnection {
> >  13  public static void main(String[] args) throws IOException {
> >  14Configuration config = HBaseConfiguration.create();
> >  15for (int i = 0; i < 1000; i++) {
> >  16  HConnection connection =
> > HConnectionManager.createConnection(config);
> >  17  HTableInterface table =
> > connection.getTable(TableName.valueOf("table1"));
> >  18  byte [] cf = Bytes.toBytes("t");
> >  19  try {
> >  20byte [] bytes = Bytes.toBytes(i);
> >  21Put p = new Put(bytes);
> >  22p.add(cf, cf, bytes);
> >  23table.put(p);
> >  24  } finally {
> >  25table.close();
> >  26connection.close();
> >  27  }
> >  28  System.out.println("i=" + i);
> >  29}
> >  30  }
> >  31 }
> >
> >
> > 
> > 2014-12-12 11:26:10,397 INFO  [main] zookeeper.RecoverableZooKeeper:
> > Process identifier=hconnection-0x70dfa475 connecting to ZooKeeper
> > ensemble=localhost:2181
> > 2014-12-12 11:26:10,397 INFO  [main] zookeeper.ZooKeeper: Initiating
> client
> > connection, connectString=localhost:2181 sessionTimeout=9
> > watcher=hconnection-0x70dfa475, quorum=localhost:2181, baseZNode=/hbase
> > 2014-12-12 11:26:10,398 INFO  [main-SendThread(localhost:2181)]
> > zookeeper.ClientCnxn: Opening socket connection to server localhost/
> > 127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown
> > error)
> > 2014-12-12 11:26:10,398 INFO  [main-SendThread(localhost:2181)]
> > zookeeper.ClientCnxn: Socket connection established to localhost/
> > 127.0.0.1:2181, initiating session
> > 2014-12-12 11:26:10,401 INFO  [main-SendThread(localhost:2181)]
> > zookeeper.ClientCnxn: Session est

Re: HConnectionManager leaks with zookeeper conection oo many connections from /my.tomcat.server.com - max is 60

2014-12-13 Thread Serega Sheypak
Great, I'll refactor the code and report back.

2014-12-13 22:36 GMT+03:00 Stack :
>
> On Sat, Dec 13, 2014 at 11:33 AM, Serega Sheypak  >
> wrote:
> >
> > So the idea is
> > 1. instantiate HConnection using HConnectionManager once
> > 2. Create HTable instance for each Servlet.doPost and close after
> operation
> > is done.
> > Is that correct?
> >
>
>
> Yes.
>
>
>
> >
> > Do region locations cached in this case?
> > Are ZK connections to region/ZK reused?
> >
>
> Yes
>
>
> > Can I "harm" put/get data because of Servlet concurrency? Each connection
> > could be used in many threads fro different HTable instances for the same
> > HBase table?
> >
> >
> Its a bug if the above has issues.
>
> St.Ack
>
>
>
> >
> > 2014-12-13 22:21 GMT+03:00 lars hofhansl :
> > >
> > > Note also that the createConnection part is somewhat expensive
> (creates a
> > > new thread pool for use with Puts, also does a ZK lookup, etc).If
> > possible
> > > create the connection ahead of time and only get/close an HTable per
> > > request/thread.
> > > -- Lars
> > >   From: Serega Sheypak 
> > >  To: user 
> > >  Sent: Friday, December 12, 2014 11:45 AM
> > >  Subject: Re: HConnectionManager leaks with zookeeper conection oo many
> > > connections from /my.tomcat.server.com - max is 60
> > >
> > > i have 10K doPost/doGet requests per second.
> > > Servlet is NOT single-threaded. each doPost/doGet
> > > invokes these lines (encapsulated in DAO):
> > >
> > >  16  HConnection connection =
> > > HConnectionManager.createConnection(config);
> > >  17  HTableInterface table =
> > > connection.getTable(TableName.valueOf("table1"));
> > >
> > > and
> > >
> > > 24  } finally {
> > >  25table.close();
> > >  26connection.close();
> > >  27  }
> > >
> > > I assumed that this static construction
> > >
> > > 16  HConnection connection =
> > > HConnectionManager.createConnection(config);
> > >
> > > correctly handles multi-threaded access somewhere deep inside.
> > > Right now I don't understand what do I do wrong.
> > >
> > > Try to wrap each your request in Runnable to emulate multi-threaded
> > > pressure on ZK. Your code is linear and mine is not, it's concurrent.
> > >
> > > Thanks
> > >
> > >
> > >
> > >
> > >
> > >
> > > 2014-12-12 22:28 GMT+03:00 Stack :
> > > >
> > > > I cannot reproduce. I stood up a cdh5.2 server and then copy/pasted
> > your
> > > > code adding in a put for each cycle.  I ran loop 1000 times and no
> > > > complaint from zk.
> > > >
> > > > Tell me more (Is servlet doing single-threaded model?  A single
> > > > Configuration is being used or new ones are being created per servlet
> > > > invocation?
> > > >
> > > > Below is code and output.
> > > >
> > > > (For better perf, cache the connection)
> > > > St.Ack
> > > >
> > > >
> > > >  1 package org.apache.hadoop.hbase;
> > > >  2
> > > >  3 import java.io.IOException;
> > > >  4
> > > >  5 import org.apache.hadoop.conf.Configuration;
> > > >  6 import org.apache.hadoop.hbase.client.HConnection;
> > > >  7 import org.apache.hadoop.hbase.client.HConnectionManager;
> > > >  8 import org.apache.hadoop.hbase.client.HTableInterface;
> > > >  9 import org.apache.hadoop.hbase.client.Put;
> > > >  10 import org.apache.hadoop.hbase.util.Bytes;
> > > >  11
> > > >  12 public class TestConnnection {
> > > >  13  public static void main(String[] args) throws IOException {
> > > >  14Configuration config = HBaseConfiguration.create();
> > > >  15for (int i = 0; i < 1000; i++) {
> > > >  16  HConnection connection =
> > > > HConnectionManager.createConnection(config);
> > > >  17  HTableInterface table =
> > > > connection.getTable(TableName.valueOf("table1"));
> > > >  18  byte [] cf = Bytes.toBytes("t");
> > > >  19  try {
> > > >  20byte [] bytes = Bytes.toBytes(i);
> > > >  21Put p = new Put(by

Re: HConnectionManager leaks with zookeeper conection oo many connections from /my.tomcat.server.com - max is 60

2014-12-15 Thread Serega Sheypak
Hi, the problem is gone.
I did what you said :)
Thanks!
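
[Editor's note: for reference, the shape of the refactored DAO, sketched from the put() code earlier in the thread. The shared HConnection is passed in once and reused, and only the HTable is closed per call; the constructor and field names are added here.]

public class HistoryEntryConnection extends BaseConnection {

    private final HConnection connection; // created once at application startup, never per request

    public HistoryEntryConnection(HConnection connection) {
        this.connection = connection;
    }

    public void put(List<MyBean> entries) {
        HTableInterface hTable = null;
        try {
            List<Put> puts = new ArrayList<Put>(entries.size());
            for (MyBean myBean : entries) {
                puts.add(new MyBeanSerDe().createPut(myBean));
            }
            hTable = connection.getTable(NAME_B); // cheap: reuses the shared connection's pool
            hTable.put(puts);
        } catch (Exception e) {
            LOG.error("Error while doing bulk put", e);
        } finally {
            close(hTable); // only the table is closed here
        }
    }
}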

2014-12-13 22:38 GMT+03:00 Serega Sheypak :

> Great, I'll refactor the code. and report back
>
> 2014-12-13 22:36 GMT+03:00 Stack :
>>
>> On Sat, Dec 13, 2014 at 11:33 AM, Serega Sheypak <
>> serega.shey...@gmail.com>
>> wrote:
>> >
>> > So the idea is
>> > 1. instantiate HConnection using HConnectionManager once
>> > 2. Create HTable instance for each Servlet.doPost and close after
>> operation
>> > is done.
>> > Is that correct?
>> >
>>
>>
>> Yes.
>>
>>
>>
>> >
>> > Do region locations cached in this case?
>> > Are ZK connections to region/ZK reused?
>> >
>>
>> Yes
>>
>>
>> > Can I "harm" put/get data because of Servlet concurrency? Each
>> connection
>> > could be used in many threads fro different HTable instances for the
>> same
>> > HBase table?
>> >
>> >
>> Its a bug if the above has issues.
>>
>> St.Ack
>>
>>
>>
>> >
>> > 2014-12-13 22:21 GMT+03:00 lars hofhansl :
>> > >
>> > > Note also that the createConnection part is somewhat expensive
>> (creates a
>> > > new thread pool for use with Puts, also does a ZK lookup, etc).If
>> > possible
>> > > create the connection ahead of time and only get/close an HTable per
>> > > request/thread.
>> > > -- Lars
>> > >   From: Serega Sheypak 
>> > >  To: user 
>> > >  Sent: Friday, December 12, 2014 11:45 AM
>> > >  Subject: Re: HConnectionManager leaks with zookeeper conection oo
>> many
>> > > connections from /my.tomcat.server.com - max is 60
>> > >
>> > > i have 10K doPost/doGet requests per second.
>> > > Servlet is NOT single-threaded. each doPost/doGet
>> > > invokes these lines (encapsulated in DAO):
>> > >
>> > >  16  HConnection connection =
>> > > HConnectionManager.createConnection(config);
>> > >  17  HTableInterface table =
>> > > connection.getTable(TableName.valueOf("table1"));
>> > >
>> > > and
>> > >
>> > > 24  } finally {
>> > >  25table.close();
>> > >  26connection.close();
>> > >  27  }
>> > >
>> > > I assumed that this static construction
>> > >
>> > > 16  HConnection connection =
>> > > HConnectionManager.createConnection(config);
>> > >
>> > > correctly handles multi-threaded access somewhere deep inside.
>> > > Right now I don't understand what do I do wrong.
>> > >
>> > > Try to wrap each your request in Runnable to emulate multi-threaded
>> > > pressure on ZK. Your code is linear and mine is not, it's concurrent.
>> > >
>> > > Thanks
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > 2014-12-12 22:28 GMT+03:00 Stack :
>> > > >
>> > > > I cannot reproduce. I stood up a cdh5.2 server and then copy/pasted
>> > your
>> > > > code adding in a put for each cycle.  I ran loop 1000 times and no
>> > > > complaint from zk.
>> > > >
>> > > > Tell me more (Is servlet doing single-threaded model?  A single
>> > > > Configuration is being used or new ones are being created per
>> servlet
>> > > > invocation?
>> > > >
>> > > > Below is code and output.
>> > > >
>> > > > (For better perf, cache the connection)
>> > > > St.Ack
>> > > >
>> > > >
>> > > >  1 package org.apache.hadoop.hbase;
>> > > >  2
>> > > >  3 import java.io.IOException;
>> > > >  4
>> > > >  5 import org.apache.hadoop.conf.Configuration;
>> > > >  6 import org.apache.hadoop.hbase.client.HConnection;
>> > > >  7 import org.apache.hadoop.hbase.client.HConnectionManager;
>> > > >  8 import org.apache.hadoop.hbase.client.HTableInterface;
>> > > >  9 import org.apache.hadoop.hbase.client.Put;
>> > > >  10 import org.apache.hadoop.hbase.util.Bytes;
>> > > >  11
>> > > >  12 public class TestConnnection {
>> > > >  13  public stat

Re: HConnectionManager leaks with zookeeper conection oo many connections from /my.tomcat.server.com - max is 60

2014-12-15 Thread Serega Sheypak
was: 200-300 ms per request
now: <80 ms per request

request = full round trip from the servlet to HBase and back to the response.

2014-12-15 22:40 GMT+03:00 lars hofhansl :
>
> Excellent! Should be quite a bit faster too.
> -- Lars
>   From: Serega Sheypak 
>  To: user 
> Cc: lars hofhansl 
>  Sent: Monday, December 15, 2014 5:57 AM
>  Subject: Re: HConnectionManager leaks with zookeeper conection oo many
> connections from /my.tomcat.server.com - max is 60
>
> Hi, the problem is gone.
> I did what you say :)
> Thanks!
>
>
>
> 2014-12-13 22:38 GMT+03:00 Serega Sheypak :
>
> > Great, I'll refactor the code. and report back
> >
> > 2014-12-13 22:36 GMT+03:00 Stack :
> >>
> >> On Sat, Dec 13, 2014 at 11:33 AM, Serega Sheypak <
> >> serega.shey...@gmail.com>
> >> wrote:
> >> >
> >> > So the idea is
> >> > 1. instantiate HConnection using HConnectionManager once
> >> > 2. Create HTable instance for each Servlet.doPost and close after
> >> operation
> >> > is done.
> >> > Is that correct?
> >> >
> >>
> >>
> >> Yes.
> >>
> >>
> >>
> >> >
> >> > Do region locations cached in this case?
> >> > Are ZK connections to region/ZK reused?
> >> >
> >>
> >> Yes
> >>
> >>
> >> > Can I "harm" put/get data because of Servlet concurrency? Each
> >> connection
> >> > could be used in many threads fro different HTable instances for the
> >> same
> >> > HBase table?
> >> >
> >> >
> >> Its a bug if the above has issues.
> >>
> >> St.Ack
> >>
> >>
> >>
> >> >
> >> > 2014-12-13 22:21 GMT+03:00 lars hofhansl :
> >> > >
> >> > > Note also that the createConnection part is somewhat expensive
> >> (creates a
> >> > > new thread pool for use with Puts, also does a ZK lookup, etc).If
> >> > possible
> >> > > create the connection ahead of time and only get/close an HTable per
> >> > > request/thread.
> >> > > -- Lars
> >> > >  From: Serega Sheypak 
> >> > >  To: user 
> >> > >  Sent: Friday, December 12, 2014 11:45 AM
> >> > >  Subject: Re: HConnectionManager leaks with zookeeper conection oo
> >> many
> >> > > connections from /my.tomcat.server.com - max is 60
> >> > >
> >> > > i have 10K doPost/doGet requests per second.
> >> > > Servlet is NOT single-threaded. each doPost/doGet
> >> > > invokes these lines (encapsulated in DAO):
> >> > >
> >> > >  16  HConnection connection =
> >> > > HConnectionManager.createConnection(config);
> >> > >  17  HTableInterface table =
> >> > > connection.getTable(TableName.valueOf("table1"));
> >> > >
> >> > > and
> >> > >
> >> > > 24  } finally {
> >> > >  25table.close();
> >> > >  26connection.close();
> >> > >  27  }
> >> > >
> >> > > I assumed that this static construction
> >> > >
> >> > > 16  HConnection connection =
> >> > > HConnectionManager.createConnection(config);
> >> > >
> >> > > correctly handles multi-threaded access somewhere deep inside.
> >> > > Right now I don't understand what do I do wrong.
> >> > >
> >> > > Try to wrap each your request in Runnable to emulate multi-threaded
> >> > > pressure on ZK. Your code is linear and mine is not, it's
> concurrent.
> >> > >
> >> > > Thanks
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > 2014-12-12 22:28 GMT+03:00 Stack :
> >> > > >
> >> > > > I cannot reproduce. I stood up a cdh5.2 server and then
> copy/pasted
> >> > your
> >> > > > code adding in a put for each cycle.  I ran loop 1000 times and no
> >> > > > complaint from zk.
> >> > > >
> >> > > > Tell me more (Is servlet doing single-threaded model?  A single
> >> > > > Configuration is being used or new ones are being created per
> >> servl

Threads leaking from Apache tomcat application

2015-01-05 Thread Serega Sheypak
Hi, I'm still trying to deal with an Apache Tomcat web app and HBase 0.98.6.
The root problem is that the number of user threads constantly grows. I get
thousands of live threads on a Tomcat instance. Then it dies, of course.

Please see the VisualVM thread-count dynamics:
[image: inline image 1]

Please see the selected thread. It should be related to ZooKeeper (because of
the thread-name suffix SendThread):

[image: inline image 2]

The threaddump for this thread is:

"visit-thread-27799752116280271-EventThread" - Thread t@75
   java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <34671cea> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)

   Locked ownable synchronizers:
- None

Why does it live "forever"? In the next 24 hours I would get ~1200 live threads.

"visit thread" does simple put/get by key, newrelic says it takes 30-40 ms
to respond.
I just set a name for the thread inside servlet method.

Here is the CPU profiling result:
[image: inline image 3]

Here are some ZooKeeper metrics:

[image: inline image 4]


Re: Threads leaking from Apache tomcat application

2015-01-05 Thread Serega Sheypak
Hi, which mail client do you use? I'm using Gmail from Chrome and see my
letter with four inlined images.
There are no links, there are 3 images. I'll reattach them. Maybe the
problem is in them.

2015-01-05 16:20 GMT+03:00 Ted Yu :

> There're several non-English phrases which seem to be links.
> But when I clicked on them, there was no response.
>
> Can you give the links in URL ?
>
> Cheers
>
>
>
> > On Jan 5, 2015, at 2:39 AM, Serega Sheypak 
> wrote:
> >
> > Hi, I'm still trying to deal with apache tomcat web-app and hbase HBase
> 0.98.6
> > The root problem is that user threads constantly grows. I do get
> thousands of live threads on tomcat instance. Then it dies of course.
> >
> > please see visualVM threads count dynamics
> >
> >
> > Please see selected thread. It should be related to zookeeper (because
> of thread-name suffix SendThread)
> >
> >
> >
> > The threaddump for this thread is:
> >
> > "visit-thread-27799752116280271-EventThread" - Thread t@75
> >java.lang.Thread.State: WAITING
> >   at sun.misc.Unsafe.park(Native Method)
> >   - parking to wait for <34671cea> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> >   at
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> >   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> >   at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> >   at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)
> >
> >Locked ownable synchronizers:
> >   - None
> >
> > Why does it live "forever"? I next 24 hours I would get ~1200 live
> theads.
> >
> > "visit thread" does simple put/get by key, newrelic says it takes 30-40
> ms to respond.
> > I just set a name for the thread inside servlet method.
> >
> > Here is CPU profiling result:
> >
> >
> > Here are some Zookeeper metrics.
> >
> >
>


Re: Threads leaking from Apache tomcat application

2015-01-05 Thread Serega Sheypak
Hi, here is a repost with image links.

Hi, I'm still trying to deal with an Apache Tomcat web app and HBase 0.98.6.
The root problem is that the number of user threads constantly grows. I get
thousands of live threads on a Tomcat instance. Then it dies, of course.

Please see the VisualVM thread-count dynamics:
http://bigdatapath.com/wp-content/uploads/2015/01/01_threads_count-grow.png


Please see the selected thread. It should be related to ZooKeeper (because of
the thread-name suffix SendThread):
http://bigdatapath.com/wp-content/uploads/2015/01/011_long_running_threads.png

The threaddump for this thread is:

"visit-thread-27799752116280271-EventThread" - Thread t@75
   java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <34671cea> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)

   Locked ownable synchronizers:
- None

Why does it live "forever"? In the next 24 hours I would get ~1200 live threads.

The "visit thread" does a simple put/get by key; New Relic says it takes 30-40 ms
to respond.
I just set a name for the thread inside the servlet method.

Here is CPU profiling result:
http://bigdatapath.com/wp-content/uploads/2015/01/03_cpu_prifling.png

Here is zookeeper status:
http://bigdatapath.com/wp-content/uploads/2015/01/022_zookeeper_metrics.png

How can I debug and find the root cause of the long-living threads? Looks like I
have threads leaking, but I have no idea why...




2015-01-05 17:57 GMT+03:00 Ted Yu :

> I used gmail.
>
> Please consider using third party site where you can upload images.
>
>


Re: Threads leaking from Apache tomcat application

2015-01-06 Thread Serega Sheypak
yes, one of them (random) gets more connections than others.

9.3.1.1 is OK.
I have one HConnection per logical module per application, and each
ServletRequest gets its own HTable. The HTable is closed each time after the
ServletRequest is done. The HConnection is never closed.

2015-01-05 21:22 GMT+03:00 Ted Yu :

> In 022_zookeeper_metrics.png, server names are anonymized. Looks like only
> one server got high number of connections.
>
> Have you seen 9.3.1.1 of http://hbase.apache.org/book.html#client ?
>
> Cheers
>
> On Mon, Jan 5, 2015 at 8:57 AM, Serega Sheypak 
> wrote:
>
> > Hi, here is repost with images link
> >
> > Hi, I'm still trying to deal with apache tomcat web-app and hbase HBase
> > 0.98.6
> > The root problem is that user threads constantly grows. I do get
> thousands
> > of live threads on tomcat instance. Then it dies of course.
> >
> > please see visualVM threads count dynamics
> >
> http://bigdatapath.com/wp-content/uploads/2015/01/01_threads_count-grow.png
> >
> >
> > Please see selected thread. It should be related to zookeeper (because of
> > thread-name suffix SendThread)
> >
> >
> http://bigdatapath.com/wp-content/uploads/2015/01/011_long_running_threads.png
> >
> > The threaddump for this thread is:
> >
> > "visit-thread-27799752116280271-EventThread" - Thread t@75
> >java.lang.Thread.State: WAITING
> > at sun.misc.Unsafe.park(Native Method)
> > - parking to wait for <34671cea> (a
> > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> > at
> >
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> > at
> >
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)
> >
> >Locked ownable synchronizers:
> > - None
> >
> > Why does it live "forever"? I next 24 hours I would get ~1200 live
> theads.
> >
> > "visit thread" does simple put/get by key, newrelic says it takes 30-40
> ms
> > to respond.
> > I just set a name for the thread inside servlet method.
> >
> > Here is CPU profiling result:
> > http://bigdatapath.com/wp-content/uploads/2015/01/03_cpu_prifling.png
> >
> > Here is zookeeper status:
> >
> http://bigdatapath.com/wp-content/uploads/2015/01/022_zookeeper_metrics.png
> >
> > How can I debug and get root cause for long living threads? Looks like I
> > got threads leaking, but have no Idea why...
> >
> >
> >
> >
> > 2015-01-05 17:57 GMT+03:00 Ted Yu :
> >
> > > I used gmail.
> > >
> > > Please consider using third party site where you can upload images.
> > >
> > >
> >
>


Re: Threads leaking from Apache tomcat application

2015-01-06 Thread Serega Sheypak
Hi, yes, it was me.
I've followed the advice; ZK connections on the server side are stable.
Here is the current state of Tomcat:
http://bigdatapath.com/wp-content/uploads/2015/01/002_jvisualvm_summary.png
There are more than 800 threads and daemon threads.

And the state of the three ZK servers:
http://bigdatapath.com/wp-content/uploads/2015/01/001_zk_server_state.png

Here is a pastebin:
http://pastebin.com/Cq8ppg08

P.S.
Looks like Tomcat is running on the OpenJDK 64-Bit Server VM.
I'll ask to fix it; it should be the Oracle JDK 7.

2015-01-06 20:43 GMT+03:00 Stack :

> On Tue, Jan 6, 2015 at 4:52 AM, Serega Sheypak 
> wrote:
>
> > yes, one of them (random) gets more connections than others.
> >
> > 9.3.1.1 Is OK.
> > I have 1 HConnection for logical module per application and each
> > ServletRequest gets it's own HTable. HTable closed each tme after
> > ServletRequest is done. HConnection is never closed.
> >
> >
> This is you, right: http://search-hadoop.com/m/DHED4lJSA32
>
> Then, we were leaking zk connections.  Is that fixed?
>
> Can you reproduce in the small?  By setting up your webapp deploy in test
> bed and watching it for leaking?
>
> For this issue, can you post a thread dump in postbin or gist so can see?
>
> Can you post code too?
>
> St.Ack
>
>
>
> > 2015-01-05 21:22 GMT+03:00 Ted Yu :
> >
> > > In 022_zookeeper_metrics.png, server names are anonymized. Looks like
> > only
> > > one server got high number of connections.
> > >
> > > Have you seen 9.3.1.1 of http://hbase.apache.org/book.html#client ?
> > >
> > > Cheers
> > >
> > > On Mon, Jan 5, 2015 at 8:57 AM, Serega Sheypak <
> serega.shey...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi, here is repost with images link
> > > >
> > > > Hi, I'm still trying to deal with apache tomcat web-app and hbase
> HBase
> > > > 0.98.6
> > > > The root problem is that user threads constantly grows. I do get
> > > thousands
> > > > of live threads on tomcat instance. Then it dies of course.
> > > >
> > > > please see visualVM threads count dynamics
> > > >
> > >
> >
> http://bigdatapath.com/wp-content/uploads/2015/01/01_threads_count-grow.png
> > > >
> > > >
> > > > Please see selected thread. It should be related to zookeeper
> (because
> > of
> > > > thread-name suffix SendThread)
> > > >
> > > >
> > >
> >
> http://bigdatapath.com/wp-content/uploads/2015/01/011_long_running_threads.png
> > > >
> > > > The threaddump for this thread is:
> > > >
> > > > "visit-thread-27799752116280271-EventThread" - Thread t@75
> > > >java.lang.Thread.State: WAITING
> > > > at sun.misc.Unsafe.park(Native Method)
> > > > - parking to wait for <34671cea> (a
> > > >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> > > > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> > > > at
> > > >
> > > >
> > >
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> > > > at
> > > >
> > >
> >
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> > > > at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)
> > > >
> > > >Locked ownable synchronizers:
> > > > - None
> > > >
> > > > Why does it live "forever"? I next 24 hours I would get ~1200 live
> > > theads.
> > > >
> > > > "visit thread" does simple put/get by key, newrelic says it takes
> 30-40
> > > ms
> > > > to respond.
> > > > I just set a name for the thread inside servlet method.
> > > >
> > > > Here is CPU profiling result:
> > > >
> http://bigdatapath.com/wp-content/uploads/2015/01/03_cpu_prifling.png
> > > >
> > > > Here is zookeeper status:
> > > >
> > >
> >
> http://bigdatapath.com/wp-content/uploads/2015/01/022_zookeeper_metrics.png
> > > >
> > > > How can I debug and get root cause for long living threads? Looks
> like
> > I
> > > > got threads leaking, but have no Idea why...
> > > >
> > > >
> > > >
> > > >
> > > > 2015-01-05 17:57 GMT+03:00 Ted Yu :
> > > >
> > > > > I used gmail.
> > > > >
> > > > > Please consider using third party site where you can upload images.
> > > > >
> > > > >
> > > >
> > >
> >
>


Re: Threads leaking from Apache tomcat application

2015-01-06 Thread Serega Sheypak
Hm, thanks, I'll check..

2015-01-06 23:31 GMT+03:00 Stack :

> The threads that are sticking around are tomcat threads out of a tomcat
> executor pool. IIRC, your server has high traffic.  The pool is running up
> to 800 connections on occasion and taking a while to die back down?
> Googling, seems like this issue comes up frequently enough. Try it
> yourself.  If you can't figure something like putting a bound on the
> executor, come back here and we'll try and help you out.
>
> St.Ack
>
> On Tue, Jan 6, 2015 at 12:10 PM, Serega Sheypak 
> wrote:
>
> > Hi, yes, it was me.
> > I've followed advices, ZK connections on server side are stable.
> > Here is current state of Tomcat:
> >
> http://bigdatapath.com/wp-content/uploads/2015/01/002_jvisualvm_summary.png
> > There are more than 800 threads and daemon threads.
> >
> > and the state of three ZK servers:
> >
> http://bigdatapath.com/wp-content/uploads/2015/01/001_zk_server_state.png
> >
> > here is pastebin:
> > http://pastebin.com/Cq8ppg08
> >
> > P.S.
> > Looks like tomcat is running on OpenJDK 64-Bit Server VM.
> > I'll ask to fix it, it should be Oracle 7 JDK
> >
> > 2015-01-06 20:43 GMT+03:00 Stack :
> >
> > > On Tue, Jan 6, 2015 at 4:52 AM, Serega Sheypak <
> serega.shey...@gmail.com
> > >
> > > wrote:
> > >
> > > > yes, one of them (random) gets more connections than others.
> > > >
> > > > 9.3.1.1 Is OK.
> > > > I have 1 HConnection for logical module per application and each
> > > > ServletRequest gets it's own HTable. HTable closed each tme after
> > > > ServletRequest is done. HConnection is never closed.
> > > >
> > > >
> > > This is you, right: http://search-hadoop.com/m/DHED4lJSA32
> > >
> > > Then, we were leaking zk connections.  Is that fixed?
> > >
> > > Can you reproduce in the small?  By setting up your webapp deploy in
> test
> > > bed and watching it for leaking?
> > >
> > > For this issue, can you post a thread dump in postbin or gist so can
> see?
> > >
> > > Can you post code too?
> > >
> > > St.Ack
> > >
> > >
> > >
> > > > 2015-01-05 21:22 GMT+03:00 Ted Yu :
> > > >
> > > > > In 022_zookeeper_metrics.png, server names are anonymized. Looks
> like
> > > > only
> > > > > one server got high number of connections.
> > > > >
> > > > > Have you seen 9.3.1.1 of http://hbase.apache.org/book.html#client
> ?
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Mon, Jan 5, 2015 at 8:57 AM, Serega Sheypak <
> > > serega.shey...@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi, here is repost with images link
> > > > > >
> > > > > > Hi, I'm still trying to deal with apache tomcat web-app and hbase
> > > HBase
> > > > > > 0.98.6
> > > > > > The root problem is that user threads constantly grows. I do get
> > > > > thousands
> > > > > > of live threads on tomcat instance. Then it dies of course.
> > > > > >
> > > > > > please see visualVM threads count dynamics
> > > > > >
> > > > >
> > > >
> > >
> >
> http://bigdatapath.com/wp-content/uploads/2015/01/01_threads_count-grow.png
> > > > > >
> > > > > >
> > > > > > Please see selected thread. It should be related to zookeeper
> > > (because
> > > > of
> > > > > > thread-name suffix SendThread)
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://bigdatapath.com/wp-content/uploads/2015/01/011_long_running_threads.png
> > > > > >
> > > > > > The threaddump for this thread is:
> > > > > >
> > > > > > "visit-thread-27799752116280271-EventThread" - Thread t@75
> > > > > >java.lang.Thread.State: WAITING
> > > > > > at sun.misc.Unsafe.park(Native Method)
> > > > > > - parking to wait for <34671cea> (a
> > > > > >
> > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> > > > > > at
> > java.

Strange HBase failure

2015-01-11 Thread Serega Sheypak
Hi, I have a PoC HBase cluster running on 3 VMs.
The deployment scheme is:
NODE01 NN, SN, HMaster (HM), RegionServer (RS), Zookeeper server (ZK), DN
NODE02 RegionServer, DN
NODE03 RegionServer, DN

Suddenly ONLY HBase went offline, all services: HM, RS.
HDFS was working, no alerts there.
The ZK server was working, no alerts there.
VMWare didn't publish any alerts.
Only a restart of the HBase service helped.

We are using this:
http://www.cloudera.com/content/cloudera/en/downloads/cdh/cdh-4-7-0.html
hbase-0.94.15+113

I made a deep dive into logs and found this stuff:
08:15:51.968 INFO org.apache.hadoop.hbase.regionserver.HRegionServer

regionserver60020.periodicFlusher requesting flush for region
epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
after a delay of 3026

08:15:55.011 INFO org.apache.hadoop.hbase.regionserver.StoreFile

Bloom filter type for
hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb:
ROW, CompoundBloomFilterWriter

08:15:55.012 INFO org.apache.hadoop.hbase.regionserver.StoreFile

Delete Family Bloom filter type for
hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb:
CompoundBloomFilterWriter

08:15:55.035 INFO org.apache.hadoop.hbase.regionserver.StoreFile

General Bloom and NO DeleteFamily was added to HFile
(hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb)

08:15:55.035 INFO org.apache.hadoop.hbase.regionserver.Store

Flushed , sequenceid=229362, memsize=7.7 K, into tmp file
hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb

08:15:55.053 INFO org.apache.hadoop.hbase.regionserver.StoreFile$Reader

Loaded ROW (CompoundBloomFilter) metadata for 8e68424066dc4c02a60ca57ec98128fb

08:15:55.072 INFO org.apache.hadoop.hbase.regionserver.StoreFile$Reader

Loaded ROW (CompoundBloomFilter) metadata for 8e68424066dc4c02a60ca57ec98128fb

08:15:55.073 INFO org.apache.hadoop.hbase.regionserver.Store

Added 
hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/CF/8e68424066dc4c02a60ca57ec98128fb,
entries=8, sequenceid=229362, filesize=2.7 K

08:15:55.076 INFO org.apache.hadoop.hbase.regionserver.HRegion

Finished memstore flush of ~7.7 K/7840, currentsize=0/0 for region
epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
in 80ms, sequenceid=229362, compaction requested=true

08:15:55.077 INFO org.apache.hadoop.hbase.regionserver.HRegion

Starting compaction on CF in region
epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.

08:15:55.077 INFO org.apache.hadoop.hbase.regionserver.Store

Starting compaction of 4 file(s) in CF of
epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
into 
tmpdir=hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp,
seqid=229362, totalSize=76.6 M

08:15:55.096 INFO org.apache.hadoop.hbase.regionserver.StoreFile

Bloom filter type for
hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8bf8e92031834676b5d40b352120c5f2:
ROW, CompoundBloomFilterWriter

08:15:55.097 INFO org.apache.hadoop.hbase.regionserver.StoreFile

Delete Family Bloom filter type for
hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8bf8e92031834676b5d40b352120c5f2:
CompoundBloomFilterWriter

08:15:59.245 INFO org.apache.hadoop.hbase.regionserver.StoreFile

General Bloom and NO DeleteFamily was added to HFile
(hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8bf8e92031834676b5d40b352120c5f2)

08:15:59.255 INFO org.apache.hadoop.hbase.regionserver.StoreFile$Reader

Loaded ROW (CompoundBloomFilter) metadata for 8bf8e92031834676b5d40b352120c5f2

08:15:59.255 INFO org.apache.hadoop.hbase.regionserver.Store

Renaming compacted file at
hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8bf8e92031834676b5d40b352120c5f2
to 
hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/CF/8bf8e92031834676b5d40b352120c5f2

08:15:59.266 INFO org.apache.hadoop.hbase.regionserver.StoreFile$Reader

Loaded ROW (CompoundBloomFilter) metadata for 8bf8e92031834676b5d40b352120c5f2

08:15:59.282 INFO org.apache.hadoop.hbase.regionserver.Store

Completed major compaction of 4 file(s) in CF of
epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
into 8bf8e92031834676b5d40b352120c5f2, size=76.6 M; total size for
store is 76.6 M

08:15:59.283 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequ

Re: Strange HBase failure

2015-01-11 Thread Serega Sheypak
Hi, HBase was down from 08:25 to 09:15.
I was looking into the logs and thinking; I've tried to find something more
clever than a dumb restart.
We are using the Cloudera distro; each daemon runs in its own JVM.
I'll try to find CPU load logs.
The load is really low:
Finished memstore flush of ~7.7 K/7840,

Flushed , sequenceid=229369, memsize=16.3 K


Completed major compaction of 4 file(s) in CF of
epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
into 8bf8e92031834676b5d40b352120c5f2, size=76.6 M; total size for
store is 76.6 M


As you can see, there is less than 100 MB of data across 3 VMs. It's nothing.



2015-01-12 6:38 GMT+03:00 Ted Yu :

> Serega:
> Was the snippet of log from NODE01 ? Looks like NODE01 may have been under
> heavy load - considering the number of daemons running on that node.
>
> Please check GC log.
>
> Cheers
>
> On Sun, Jan 11, 2015 at 6:57 PM, Shuai Lin  wrote:
>
> > From the log I see no log was produced during 08:25 to 09:15, why did
> this
> > happen?
> >
> > 08:25:06.274INFOorg.apache.
> > hadoop.hbase.regionserver.wal.HLog
> >
> > moving old hlog file
> >
> >
> /hbase/.logs/etp-hdfs-n1-sg.passport.local,60020,1414102905372/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.1420856706020
> > whose highest sequenceid is 229359 to
> >
> >
> /hbase/.oldlogs/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.1420856706020
> >
> > 09:15:52.020INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
> >
> > Regards,
> > Shuai
> >
> > On Mon, Jan 12, 2015 at 3:47 AM, Serega Sheypak <
> serega.shey...@gmail.com>
> > wrote:
> >
> > > Hi, I have PoC HBase cluster running on 3 VM
> > > deployment schema is:
> > > NODE01 NN, SN, HMaster (HM), RegionServer (RS), Zookeeper server (ZK),
> DN
> > > NODE02 RegionServer, DN
> > > NODE03 RegionServer, DN
> > >
> > > Suddenly ONLY HBase went offline, all services: HM RS
> > > HDFS was working, no alerts were there
> > > ZK server was working, no alerts there.
> > > VMWare didn't publish any alerts.
> > > Only restart of HBase service helped.
> > >
> > > We are using this:
> > >
> http://www.cloudera.com/content/cloudera/en/downloads/cdh/cdh-4-7-0.html
> > > hbase-0.94.15+113
> > >
> > > I made a deep dive into logs and found this stuff:
> > > 08:15:51.968INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
> > >
> > > regionserver60020.periodicFlusher requesting flush for region
> > >
> > >
> >
> epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
> > > after a delay of 3026
> > >
> > > 08:15:55.011INFOorg.apache.hadoop.hbase.regionserver.StoreFile
> > >
> > > Bloom filter type for
> > >
> > >
> >
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb:
> > > ROW, CompoundBloomFilterWriter
> > >
> > > 08:15:55.012INFOorg.apache.hadoop.hbase.regionserver.StoreFile
> > >
> > > Delete Family Bloom filter type for
> > >
> > >
> >
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb:
> > > CompoundBloomFilterWriter
> > >
> > > 08:15:55.035INFOorg.apache.hadoop.hbase.regionserver.StoreFile
> > >
> > > General Bloom and NO DeleteFamily was added to HFile
> > >
> > >
> >
> (hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb)
> > >
> > > 08:15:55.035INFOorg.apache.hadoop.hbase.regionserver.Store
> > >
> > > Flushed , sequenceid=229362, memsize=7.7 K, into tmp file
> > >
> > >
> >
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb
> > >
> > > 08:15:55.053INFOorg.apache.hadoop.hbase.regionserver.StoreFile$Reader
> > >
> > > Loaded ROW (CompoundBloomFilter) metadata for
> > > 8e68424066dc4c02a60ca57ec98128fb
> > >
> > > 08:15:55.072INFOorg.apache.hadoop.hbase.regionserver.StoreFile$Reader
> > >
> > > Loaded ROW (CompoundBloomFilter) metadata for
> > > 8e68424066dc4c02a60ca57ec98128fb
> > >
> > > 08:15:55.073INFOorg.apache.hadoop.hbase.regionserver.Stor

Re: Strange HBase failure

2015-01-12 Thread Serega Sheypak
Ok, thanks, we'll check it.

2015-01-12 11:28 GMT+03:00 Esteban Gutierrez :

> Hi Serega,
>
> Do you have enough resources allocated for each VM? Just some swapping on
> the VMs or the host can make things unstable. Also from the number of
> services on each VM sounds like your host should have at least 12GB of free
> RAM just for running things smoothly otherwise you might want to try with
> less VMs and with some RAM each.
>
> cheers,
> esteban.
>
>
>
> --
> Cloudera, Inc.
>
>
> On Sun, Jan 11, 2015 at 11:55 PM, Serega Sheypak  >
> wrote:
>
> > Hi, HBase was down during 08:25 to 09:15
> > I was looking into logs, and thinking. I've tried to find something more
> > clever. than dummy restart.
> > We are using Cloudera distro, each of daemons run in it's own JVM.
> > I'll try to find CPU load logs.
> > There is really low load,
> > Finished memstore flush of ~7.7 K/7840,
> >
> > Flushed , sequenceid=229369, memsize=16.3 K
> >
> >
> > Completed major compaction of 4 file(s) in CF of
> >
> >
> epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
> > into 8bf8e92031834676b5d40b352120c5f2, size=76.6 M; total size for
> > store is 76.6 M
> >
> >
> > See there are less than 100 MB of data for 3 VMs. It's nothing.
> >
> >
> >
> > 2015-01-12 6:38 GMT+03:00 Ted Yu :
> >
> > > Serega:
> > > Was the snippet of log from NODE01 ? Looks like NODE01 may have been
> > under
> > > heavy load - considering the number of daemons running on that node.
> > >
> > > Please check GC log.
> > >
> > > Cheers
> > >
> > > On Sun, Jan 11, 2015 at 6:57 PM, Shuai Lin 
> > wrote:
> > >
> > > > From the log I see no log was produced during 08:25 to 09:15, why did
> > > this
> > > > happen?
> > > >
> > > > 08:25:06.274INFOorg.apache.
> > > > hadoop.hbase.regionserver.wal.HLog
> > > >
> > > > moving old hlog file
> > > >
> > > >
> > >
> >
> /hbase/.logs/etp-hdfs-n1-sg.passport.local,60020,1414102905372/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.1420856706020
> > > > whose highest sequenceid is 229359 to
> > > >
> > > >
> > >
> >
> /hbase/.oldlogs/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.1420856706020
> > > >
> > > > 09:15:52.020INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
> > > >
> > > > Regards,
> > > > Shuai
> > > >
> > > > On Mon, Jan 12, 2015 at 3:47 AM, Serega Sheypak <
> > > serega.shey...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi, I have PoC HBase cluster running on 3 VM
> > > > > deployment schema is:
> > > > > NODE01 NN, SN, HMaster (HM), RegionServer (RS), Zookeeper server
> > (ZK),
> > > DN
> > > > > NODE02 RegionServer, DN
> > > > > NODE03 RegionServer, DN
> > > > >
> > > > > Suddenly ONLY HBase went offline, all services: HM RS
> > > > > HDFS was working, no alerts were there
> > > > > ZK server was working, no alerts there.
> > > > > VMWare didn't publish any alerts.
> > > > > Only restart of HBase service helped.
> > > > >
> > > > > We are using this:
> > > > >
> > >
> http://www.cloudera.com/content/cloudera/en/downloads/cdh/cdh-4-7-0.html
> > > > > hbase-0.94.15+113
> > > > >
> > > > > I made a deep dive into logs and found this stuff:
> > > > > 08:15:51.968INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
> > > > >
> > > > > regionserver60020.periodicFlusher requesting flush for region
> > > > >
> > > > >
> > > >
> > >
> >
> epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
> > > > > after a delay of 3026
> > > > >
> > > > > 08:15:55.011INFOorg.apache.hadoop.hbase.regionserver.StoreFile
> > > > >
> > > > > Bloom filter type for
> > > > >
> > > > >
> > > >
> > >
> >
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb:
> > > > > ROW, Co

404 for hbase book

2015-01-14 Thread Serega Sheypak
Hi, starting from Monday (12.01.2015) I'm getting a 404 on any page of
http://hbase.apache.org
What could it be?
Sorry for the stupid question. I've tried several providers, no luck.


Re: 404 for hbase book

2015-01-14 Thread Serega Sheypak
404: http://hbase.apache.org/book/datamodel.html

https://www.google.ru/#newwindow=1&q=Chapter+5.+Data+Model+-+HBase
first link: http://hbase.apache.org/book/datamodel.html

Looks like the links haven't been reindexed?
Ok, never mind, they can be accessed directly from the hbase site.

2015-01-14 13:01 GMT+03:00 Shuai Lin :

> Works fine for me. Maybe you can try force refresh the page?
>
> On Wed, Jan 14, 2015 at 5:47 PM, Serega Sheypak 
> wrote:
>
> > Hi, starting from monday (12.01.2015) I'm getting 404 on ony page of
> > http://hbase.apache.org
> > what it could be?
> > Sorry for stupid question. I've tried several providers, no luck
> >
>


Show last 10 (100/1000) events

2015-01-14 Thread Serega Sheypak
Hi, I have an event-processing system which uses hbase + a pack of tomcat
web-apps as a front-end.
The tomcat web-apps are similar and used for front-end load-balancing.
The tomcat apps write events to hbase.

What is a good pattern to show the last 10/100/1000 events?
The events table schema is:
row_key=user_id
Each user_id has 128 versions, so I keep history for the last 128 user
events.

There is no way to get the last events overall; I can get the last events only for a
concrete user.

I had an idea to create a separate table named 'last_events'
and force all tomcats to write a copy of each event there with the same key, with the
versions count set to 1000.
HBase would automatically delete old events.
Drawbacks are:
1. x2 traffic
2. x2 write ops on hbase to single region

This solution is bad.
Are there any good patterns to solve such a problem? The other option is to
use some kind of memcache or stuff like that.


Re: Show last 10 (100/1000) events

2015-01-14 Thread Serega Sheypak
Ok, I got it, thanks.

2015-01-14 19:22 GMT+03:00 Wilm Schumacher :

> Will the number of "last" will be much larger than 10 (100/1000)?
>
> If not, then I wouldn't bother with a real database after all and would
> hold the data in RAM.
>
> Either:
> * in an object in your "gateway" to hbase. E.g. simple java Array list
> in your java server which serves the api to the web servers. This would
> be super easy and super fast
>
> Or:
> * in-memory-db like redis if you haven't something like above
>
> Or you redisgn your datamodel to something timestamp based. Then it's a
> scan.
>
> Best wishes,
>
> Wilm
>
> Am 14.01.2015 um 16:51 schrieb Serega Sheypak:
> > Hi, I have event-processing system which uses hbase + a pack of tomcat
> > web-apps as front-end.
> > tomcat web-apps are similar and used for front-end load-balancing.
> > tomcat apps write events to hbase.
> >
> > What is good pattern to show last 10/100/1000 events?
> > events table schema is:
> > row_key=user_id
> > each user_id has 128 versions. So I keep history for the last 128 user
> > events.
> >
> > There is no way to get last events, I can get last event only for
> concrete
> > user.
> >
> > I had an idea to create separate table named 'last_events'
> > and force all tomcats write there copy of event with the same key and set
> > versions count to 1000.
> > HBase would automatically delete old events.
> > Drawbacks are:
> > 1. x2 traffic
> > 2. x2 write ops on hbase to single region
> >
> > Solution is bad.
> > Are they any good patterns to resolve such problem. The other option is
> to
> > use some kind of memcache or stuff like that.
> >
>
>


Re: Show last 10 (100/1000) events

2015-01-14 Thread Serega Sheypak
No, it would be 10/100/10000.
10000 is the absolute limit. I understand that a simple thread-safe java
collection can handle this.
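
Something like this rough sketch is what I have in mind (class and method names are made up, it is only an illustration):

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Keeps only the most recent events in memory; the oldest are evicted on insert.
public class LastEventsBuffer<E> {
    private final int capacity;
    private final Deque<E> events;

    public LastEventsBuffer(int capacity) {      // e.g. 10000
        this.capacity = capacity;
        this.events = new ArrayDeque<E>(capacity);
    }

    public synchronized void add(E event) {
        if (events.size() == capacity) {
            events.removeFirst();                // drop the oldest event
        }
        events.addLast(event);                   // newest event goes to the tail
    }

    public synchronized List<E> last(int n) {
        List<E> result = new ArrayList<E>(n);
        Iterator<E> it = events.descendingIterator();  // iterate newest first
        while (it.hasNext() && result.size() < n) {
            result.add(it.next());
        }
        return result;
    }
}

Each tomcat instance would keep its own buffer, so it only approximates the global "last N", which is probably good enough here.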

2015-01-14 22:17 GMT+03:00 Serega Sheypak :

> Ok, I got it, thanks.
>
> 2015-01-14 19:22 GMT+03:00 Wilm Schumacher :
>
>> Will the number of "last" will be much larger than 10 (100/1000)?
>>
>> If not, then I wouldn't bother with a real database after all and would
>> hold the data in RAM.
>>
>> Either:
>> * in an object in your "gateway" to hbase. E.g. simple java Array list
>> in your java server which serves the api to the web servers. This would
>> be super easy and super fast
>>
>> Or:
>> * in-memory-db like redis if you haven't something like above
>>
>> Or you redisgn your datamodel to something timestamp based. Then it's a
>> scan.
>>
>> Best wishes,
>>
>> Wilm
>>
>> Am 14.01.2015 um 16:51 schrieb Serega Sheypak:
>> > Hi, I have event-processing system which uses hbase + a pack of tomcat
>> > web-apps as front-end.
>> > tomcat web-apps are similar and used for front-end load-balancing.
>> > tomcat apps write events to hbase.
>> >
>> > What is good pattern to show last 10/100/1000 events?
>> > events table schema is:
>> > row_key=user_id
>> > each user_id has 128 versions. So I keep history for the last 128 user
>> > events.
>> >
>> > There is no way to get last events, I can get last event only for
>> concrete
>> > user.
>> >
>> > I had an idea to create separate table named 'last_events'
>> > and force all tomcats write there copy of event with the same key and
>> set
>> > versions count to 1000.
>> > HBase would automatically delete old events.
>> > Drawbacks are:
>> > 1. x2 traffic
>> > 2. x2 write ops on hbase to single region
>> >
>> > Solution is bad.
>> > Are they any good patterns to resolve such problem. The other option is
>> to
>> > use some kind of memcache or stuff like that.
>> >
>>
>>
>


Re: Threads leaking from Apache tomcat application

2015-01-15 Thread Serega Sheypak
Hi, as I mentioned before, devops had put the wrong java (OpenJDK 7) for tomcat.
HBase runs on oracle-jdk-7.
I've asked them to set oracle-java for Tomcat. The problem is gone.

2015-01-07 10:48 GMT+03:00 Serega Sheypak :

> Hm, thanks, I'll check..
>
> 2015-01-06 23:31 GMT+03:00 Stack :
>
>> The threads that are sticking around are tomcat threads out of a tomcat
>> executor pool. IIRC, your server has high traffic.  The pool is running up
>> to 800 connections on occasion and taking a while to die back down?
>> Googling, seems like this issue comes up frequently enough. Try it
>> yourself.  If you can't figure something like putting a bound on the
>> executor, come back here and we'll try and help you out.
>>
>> St.Ack
>>
>> On Tue, Jan 6, 2015 at 12:10 PM, Serega Sheypak > >
>> wrote:
>>
>> > Hi, yes, it was me.
>> > I've followed advices, ZK connections on server side are stable.
>> > Here is current state of Tomcat:
>> >
>> http://bigdatapath.com/wp-content/uploads/2015/01/002_jvisualvm_summary.png
>> > There are more than 800 threads and daemon threads.
>> >
>> > and the state of three ZK servers:
>> >
>> http://bigdatapath.com/wp-content/uploads/2015/01/001_zk_server_state.png
>> >
>> > here is pastebin:
>> > http://pastebin.com/Cq8ppg08
>> >
>> > P.S.
>> > Looks like tomcat is running on OpenJDK 64-Bit Server VM.
>> > I'll ask to fix it, it should be Oracle 7 JDK
>> >
>> > 2015-01-06 20:43 GMT+03:00 Stack :
>> >
>> > > On Tue, Jan 6, 2015 at 4:52 AM, Serega Sheypak <
>> serega.shey...@gmail.com
>> > >
>> > > wrote:
>> > >
>> > > > yes, one of them (random) gets more connections than others.
>> > > >
>> > > > 9.3.1.1 Is OK.
>> > > > I have 1 HConnection for logical module per application and each
>> > > > ServletRequest gets it's own HTable. HTable closed each tme after
>> > > > ServletRequest is done. HConnection is never closed.
>> > > >
>> > > >
>> > > This is you, right: http://search-hadoop.com/m/DHED4lJSA32
>> > >
>> > > Then, we were leaking zk connections.  Is that fixed?
>> > >
>> > > Can you reproduce in the small?  By setting up your webapp deploy in
>> test
>> > > bed and watching it for leaking?
>> > >
>> > > For this issue, can you post a thread dump in postbin or gist so can
>> see?
>> > >
>> > > Can you post code too?
>> > >
>> > > St.Ack
>> > >
>> > >
>> > >
>> > > > 2015-01-05 21:22 GMT+03:00 Ted Yu :
>> > > >
>> > > > > In 022_zookeeper_metrics.png, server names are anonymized. Looks
>> like
>> > > > only
>> > > > > one server got high number of connections.
>> > > > >
>> > > > > Have you seen 9.3.1.1 of http://hbase.apache.org/book.html#client
>> ?
>> > > > >
>> > > > > Cheers
>> > > > >
>> > > > > On Mon, Jan 5, 2015 at 8:57 AM, Serega Sheypak <
>> > > serega.shey...@gmail.com
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Hi, here is repost with images link
>> > > > > >
>> > > > > > Hi, I'm still trying to deal with apache tomcat web-app and
>> hbase
>> > > HBase
>> > > > > > 0.98.6
>> > > > > > The root problem is that user threads constantly grows. I do get
>> > > > > thousands
>> > > > > > of live threads on tomcat instance. Then it dies of course.
>> > > > > >
>> > > > > > please see visualVM threads count dynamics
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> http://bigdatapath.com/wp-content/uploads/2015/01/01_threads_count-grow.png
>> > > > > >
>> > > > > >
>> > > > > > Please see selected thread. It should be related to zookeeper
>> > > (because
>> > > > of
>> > > > > > thread-name suffix SendThread)
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >

Re: multiple data versions vs. multiple rows?

2015-01-19 Thread Serega Sheypak
Should performance differ significantly if the row value size is small and
we don't have too many versions?
Assume that the pack of versions for a key is smaller than the recommended HFile
block size (8KB to 1MB,
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/hfile/HFile.html),
which is the minimal "read unit"; should we see any difference at all?
Am I right?
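
Just to make the setup concrete, this is the kind of column family definition I am talking about (table/family names and the sizes are only an illustration):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateVersionedTable {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            HColumnDescriptor cf = new HColumnDescriptor("h");   // history column family
            cf.setMaxVersions(10);          // number of versions kept per cell
            cf.setBlocksize(64 * 1024);     // 64 KB HFile block, the minimal read unit
            HTableDescriptor table = new HTableDescriptor(TableName.valueOf("user_history"));
            table.addFamily(cf);
            admin.createTable(table);
        } finally {
            admin.close();
        }
    }
}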


2015-01-20 0:33 GMT+03:00 Jean-Marc Spaggiari :

> Hi Yong,
>
> If you want to compare the performances, you need to run way bigger and
> longer tests. Dont run them in parallete. Run them at least 10 time each to
> make sure you have a good trend. Is the difference between the 2
> significant? It should not.
>
> JM
>
> 2015-01-19 15:17 GMT-05:00 yonghu :
>
> > Hi,
> >
> > Thanks for your suggestion. I have already considered the first issue
> that
> > one row  is not allowed to be split between 2 regions.
> >
> > However, I have made a small scan-test with MapReduce. I first created a
> > table t1 with 1 million rows and allowed each column to store 10 data
> > versions. Then, I translated t1 into t2 in which multiple data versions
> in
> > t1 were transformed into multiple rows in t2. I wrote two MapReduce
> > programs to scan t1 and t2 individually. What I got is the table scanning
> > time of t1 is shorter than t2. So, I think for performance reason,
> multiple
> > data versions may be a better option than multiple rows.
> >
> > But just as you said, which approach to use depends on how many
> historical
> > events you want to keep.
> >
> > regards!
> >
> > Yong
> >
> >
> > On Mon, Jan 19, 2015 at 8:37 PM, Jean-Marc Spaggiari <
> > jean-m...@spaggiari.org> wrote:
> >
> > > Hi Yong,
> > >
> > > A row will not split between 2 regions. If you plan having thousands of
> > > versions, based on the size of your data, you might end up having a row
> > > bigger than your preferred region size.
> > >
> > > If you plan just keep few versions of the history to have a look at
> it, I
> > > will say go with it. If you plan to have one million version because
> you
> > > want to keep all the events history, go with the row approach.
> > >
> > > You can also consider going with the Column Qualifier approach. This
> has
> > > the same constraint as the versions regarding the split in 2 regions,
> but
> > > it might me easier to manage and still give you the consistency of
> being
> > > within a row.
> > >
> > > JM
> > >
> > > 2015-01-19 14:28 GMT-05:00 yonghu :
> > >
> > > > Dear all,
> > > >
> > > > I want to record the user history data. I know there exists two
> > options,
> > > > one is to store user events in a single row with multiple data
> versions
> > > and
> > > > the other one is to use multiple rows. I wonder which one is better
> for
> > > > performance?
> > > >
> > > > Thanks!
> > > >
> > > > Yong
> > > >
> > >
> >
>


Re: managing HConnection

2015-02-03 Thread Serega Sheypak
Hi, the guys from this group helped me a lot. I solved pretty much the same problem
(a CRUD web-app):

1. Use a single instance of HConnection per application.
2. Instantiate it once.
3. Create an HTable instance for each CRUD operation and safely close it
(try-catch-finally). Use the same HConnection to create every HTable for a CRUD
operation.
4. DO NOT close the HConnection after a CRUD operation.

I have logic controllers which get the HConnection injected in the
HttpServlet.init method.
So I have 5 HConnection instances per application, created during servlet
initialization.
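
A minimal sketch of the pattern (class, table and column family names are made up for illustration):

import java.io.IOException;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class UserEventsDao {                        // hypothetical controller
    private final HConnection connection;           // shared, long-lived, thread-safe

    public UserEventsDao(HConnection connection) {  // injected once, e.g. in HttpServlet.init
        this.connection = connection;
    }

    public void saveEvent(long userId, byte[] event) throws IOException {
        HTableInterface table = connection.getTable("user_events");  // cheap, per operation
        try {
            Put put = new Put(Bytes.toBytes(userId));
            put.add(Bytes.toBytes("e"), Bytes.toBytes("data"), event);
            table.put(put);
        } finally {
            table.close();                          // close the HTable, never the HConnection
        }
    }

    public Result readEvents(long userId) throws IOException {
        HTableInterface table = connection.getTable("user_events");
        try {
            return table.get(new Get(Bytes.toBytes(userId)));
        } finally {
            table.close();
        }
    }
}

The HConnection itself is created once with HConnectionManager.createConnection(conf) and closed only on application shutdown.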


2015-02-03 18:12 GMT+03:00 Ted Yu :

> Please see '61.1. Cluster Connections' under
> http://hbase.apache.org/book.html#architecture.client
>
> Cheers
>
> On Tue, Feb 3, 2015 at 6:47 AM, sleimanjneidi 
> wrote:
>
> > Hi all,
> > I am using hbase-0.98.1-cdh5.1.4 client and I am a bit confused by the
> > documentation of HConnection. The document says the following:
> >
> > HConnection instances can be shared. Sharing is usually what you want
> > because rather than each HConnection instance having to do its own
> > discovery of regions out on the cluster, instead, all clients get to
> share
> > the one cache of locations. HConnectionManager does the sharing for you
> if
> > you go by it getting connections. Sharing makes cleanup of HConnections
> > awkward. .
> >
> > So now I have a simple question: Can I share the same HConnection
> instance
> > in my entire application?
> > And write some magic code to know when to close or never close at all?
> > Or I have to create an instance and close it every time I do a CRUD
> > operation ?
> >
> > Many thanks
> >
> >
> >
>


Re: Re: managing HConnection

2015-02-13 Thread Serega Sheypak
Hi, really, I could share one HConnection for the whole application.
It's done this way by design. I have several servlets. Each servlet has 1-2
controllers working with hbase internally (put/get/etc.).
Right now I don't see any reason to refactor the code and share a single
HConnection across all controllers in all servlets.
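
If I ever do refactor it, the obvious direction would be a context listener that owns the single connection; just a sketch of the idea, not what we run today:

import java.io.IOException;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;

// One HConnection for the whole web application, stored in the ServletContext.
public class HBaseConnectionListener implements ServletContextListener {
    public static final String ATTR = "hbase.connection";   // attribute name is arbitrary

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        try {
            HConnection connection =
                HConnectionManager.createConnection(HBaseConfiguration.create());
            sce.getServletContext().setAttribute(ATTR, connection);
        } catch (IOException e) {
            throw new RuntimeException("Cannot create HBase connection", e);
        }
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        HConnection connection =
            (HConnection) sce.getServletContext().getAttribute(ATTR);
        if (connection != null) {
            try {
                connection.close();     // the only place the connection is closed
            } catch (IOException e) {
                // ignore on shutdown
            }
        }
    }
}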


2015-02-13 6:56 GMT+03:00 David chen :

> Hi Serega,
> I am very interesting in the reason why per application need to create 5
> instead of only one HConnection instances during servlet initialization?
>
>
>
>
>
>
>
>
> At 2015-02-04 01:01:38, "Serega Sheypak"  wrote:
> >Hi, guys from group helped me a lot. I did solve pretty the same problem
> >(CRUD web-app)
> >
> >1. Use single instance of HConnection per application.
> >2. Instantiate it once.
> >3. create HTable instance for each CRUD operation and safely close it
> >(try-catch-finally). Use the same HConnection to create any HTable for
> CRUD
> >operation.
> >4. DO NOT close HConnection after CRUD operation
> >
> >I have logic controllers which get HConnection injection in
> >HttpServlet.init method.
> >So I have 5 HConnection instances per application created during servlet
> >initialization
> >
> >
> >2015-02-03 18:12 GMT+03:00 Ted Yu :
> >
> >> Please see '61.1. Cluster Connections' under
> >> http://hbase.apache.org/book.html#architecture.client
> >>
> >> Cheers
> >>
> >> On Tue, Feb 3, 2015 at 6:47 AM, sleimanjneidi  >
> >> wrote:
> >>
> >> > Hi all,
> >> > I am using hbase-0.98.1-cdh5.1.4 client and I am a bit confused by the
> >> > documentation of HConnection. The document says the following:
> >> >
> >> > HConnection instances can be shared. Sharing is usually what you want
> >> > because rather than each HConnection instance having to do its own
> >> > discovery of regions out on the cluster, instead, all clients get to
> >> share
> >> > the one cache of locations. HConnectionManager does the sharing for
> you
> >> if
> >> > you go by it getting connections. Sharing makes cleanup of
> HConnections
> >> > awkward. .
> >> >
> >> > So now I have a simple question: Can I share the same HConnection
> >> instance
> >> > in my entire application?
> >> > And write some magic code to know when to close or never close at all?
> >> > Or I have to create an instance and close it every time I do a CRUD
> >> > operation ?
> >> >
> >> > Many thanks
> >> >
> >> >
> >> >
> >>
>


Re: Re: managing HConnection

2015-02-13 Thread Serega Sheypak
What are you trying to achieve?

2015-02-13 12:36 GMT+03:00 Sleiman Jneidi :

> To be honest guys I am still confused, especially that that HConnection
> implements Closeable  and hence everyone has the right to close the
> connection. I wrote this code to manage connections but I am not sure about
> its correctness.
>
>
> private static class HConnectionProvider {
>
>   private static HConnection hConnection;
>   private static final Lock LOCK = new ReentrantLock();
>
>   static {
>     hConnection = createNewConnection();
>     Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
>       @Override
>       public void run() {
>         if (hConnection != null && !hConnection.isClosed()) {
>           try {
>             hConnection.close();
>           } catch (IOException e) {
>             e.printStackTrace();
>           }
>         }
>       }
>     }));
>   }
>
>   public static HConnection connection() {
>     if (!hConnection.isClosed()) {
>       return hConnection;
>     }
>     boolean acquired = false;
>     try {
>       acquired = LOCK.tryLock(5, TimeUnit.SECONDS);
>       if (hConnection.isClosed()) {
>         hConnection = createNewConnection();
>       }
>       return hConnection;
>     } catch (InterruptedException e) {
>       throw new RuntimeException(e);
>     } finally {
>       if (acquired) {
>         LOCK.unlock();
>       }
>     }
>   }
>
>   private static HConnection createNewConnection() {
>     try {
>       HConnection connection = HConnectionManager.createConnection(config);
>       return connection;
>     } catch (IOException e) {
>       throw new RuntimeException(e);
>     }
>   }
> }
>
> On Fri, Feb 13, 2015 at 8:57 AM, Serega Sheypak 
> wrote:
>
> > Hi, really, I can share one Hconnection for the whole application.
> > It's done by design. I have several servlets. Each servlet has 1-2
> > controllers working with hbase internally (put/get/e.t.c)
> > Right now I don't see any reason to refactor code and share single
> > HConnection for all controllers in servlets.
> >
> >
> > 2015-02-13 6:56 GMT+03:00 David chen :
> >
> > > Hi Serega,
> > > I am very interesting in the reason why per application need to create
> 5
> > > instead of only one HConnection instances during servlet
> initialization?
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > At 2015-02-04 01:01:38, "Serega Sheypak" 
> > wrote:
> > > >Hi, guys from group helped me a lot. I did solve pretty the same
> problem
> > > >(CRUD web-app)
> > > >
> > > >1. Use single instance of HConnection per application.
> > > >2. Instantiate it once.
> > > >3. create HTable instance for each CRUD operation and safely close it
> > > >(try-catch-finally). Use the same HConnection to create any HTable for
> > > CRUD
> > > >operation.
> > > >4. DO NOT close HConnection after CRUD operation
> > > >
> > > >I have logic controllers which get HConnection injection in
> > > >HttpServlet.init method.
> > > >So I have 5 HConnection instances per application created during
> servlet
> > > >initialization
> > > >
> > > >
> > > >2015-02-03 18:12 GMT+03:00 Ted Yu :
> > > >
> > > >> Please see '61.1. Cluster Connections' under
> > > >> http://hbase.apache.org/book.html#architecture.client
> > > >>
> > > >> Cheers
> > > >>
> > > >> On Tue, Feb 3, 2015 at 6:47 AM, sleimanjneidi <
> > jneidi.slei...@gmail.com
> > > >
> > > >> wrote:
> > > >>
> > > >> > Hi all,
> > > >> > I am using hbase-0.98.1-cdh5.1.4 client and I am a bit confused by
> > the
> > > >> > documentation of HConnection. The document says the following:
> > > >> >
> > > >> > HConnection instances can be shared. Sharing is usually what you
> > want
> > > >> > because rather than each HConnection instance having to do its own
> > > >> > discovery of regions out on the cluster, instead, all clients get
> to
> > > >> share
> > > >> > the one cache of locations. HConnectionManager does the sharing
> for
> > > you
> > > >> if
> > > >> > you go by it getting connections. Sharing makes cleanup of
> > > HConnections
> > > >> > awkward. .
> > > >> >
> > > >> > So now I have a simple question: Can I share the same HConnection
> > > >> instance
> > > >> > in my entire application?
> > > >> > And write some magic code to know when to close or never close at
> > all?
> > > >> > Or I have to create an instance and close it every time I do a
> CRUD
> > > >> > operation ?
> > > >> >
> > > >> > Many thanks
> > > >> >
> > > >> >
> > > >> >
> > > >>
> > >
> >
>


Re: Re: managing HConnection

2015-02-13 Thread Serega Sheypak
What's the problem with calling HConnectionManager.getConnection in the Servlet.init
method and passing it to your class responsible for HBase interaction?


2015-02-13 14:49 GMT+03:00 Sleiman Jneidi :

> a single HConnection
>
> On Fri, Feb 13, 2015 at 11:12 AM, Serega Sheypak  >
> wrote:
>
> > What are you trying to achieve?
> >
> > 2015-02-13 12:36 GMT+03:00 Sleiman Jneidi :
> >
> > > To be honest guys I am still confused, especially that that HConnection
> > > implements Closeable  and hence everyone has the right to close the
> > > connection. I wrote this code to manage connections but I am not sure
> > about
> > > its correctness.
> > >
> > >
> > > private static class HConnectionProvider {
> > >
> > >   private static HConnection hConnection;
> > >
> > >  private static final Lock LOCK = new ReentrantLock();
> > >
> > >   static {
> > >
> > >  hConnection = createNewConnection();
> > >
> > >Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
> > >
> > > @Override
> > >
> > >   public void run() {
> > >
> > >   if(hConnection!=null && !hConnection.isClosed()){
> > >
> > >try {
> > >
> > >hConnection.close();
> > >
> > >} catch (IOException e) {
> > >
> > >e.printStackTrace();
> > >
> > >}
> > >
> > >   }
> > >
> > >   }
> > >
> > >  }));
> > >
> > >  }
> > >
> > >   public static HConnection connection(){
> > >
> > >  if(!hConnection.isClosed()){
> > >
> > >   return hConnection;
> > >
> > >  }
> > >
> > >  boolean acquired = false;
> > >
> > >  try{
> > >
> > >   acquired = LOCK.tryLock(5,TimeUnit.SECONDS);
> > >
> > >   if(hConnection.isClosed()){
> > >
> > >   hConnection = createNewConnection();
> > >
> > >   }
> > >
> > >   return hConnection;
> > >
> > > } catch (InterruptedException e) {
> > >
> > >   throw new RuntimeException(e);
> > >
> > >  }finally{
> > >
> > >   if(acquired){
> > >
> > >   LOCK.unlock();
> > >
> > >   }
> > >
> > >  }
> > >
> > >}
> > >
> > >   private static HConnection createNewConnection(){
> > >
> > >  try {
> > >
> > >   HConnection connection = HConnectionManager.createConnection(config);
> > >
> > >   return connection;
> > >
> > >  } catch (IOException e) {
> > >
> > >   throw new RuntimeException(e);
> > >
> > >  }
> > >
> > >  }
> > >
> > >   }
> > >
> > > On Fri, Feb 13, 2015 at 8:57 AM, Serega Sheypak <
> > serega.shey...@gmail.com>
> > > wrote:
> > >
> > > > Hi, really, I can share one Hconnection for the whole application.
> > > > It's done by design. I have several servlets. Each servlet has 1-2
> > > > controllers working with hbase internally (put/get/e.t.c)
> > > > Right now I don't see any reason to refactor code and share single
> > > > HConnection for all controllers in servlets.
> > > >
> > > >
> > > > 2015-02-13 6:56 GMT+03:00 David chen :
> > > >
> > > > > Hi Serega,
> > > > > I am very interesting in the reason why per application need to
> > create
> > > 5
> > > > > instead of only one HConnection instances during servlet
> > > initialization?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > At 2015-02-04 01:01:38, "Serega Sheypak"  >
> > > > wrote:
> > > > > >Hi, guys from group helped me a lot. I did solve pretty the same
> > > problem
> > > > > >(CRUD web-app)
> > > > > >
> > > > > >1. Use single instance of HConnection per application.
> > > > > >2. Instantiate it once.
> > > > > >3. create HTable instance for each CRUD operation and safely close
> > it
> > > > > >(try-catch-finally). Use the same HConnection to create a

Re: Re: Re: managing HConnection

2015-02-15 Thread Serega Sheypak
I don't understand you.
There is a single instance of a servlet per application.
The Servlet.init method is called once. There you can instantiate the HConnection and
avoid ANY concurrency problems. HConnection is thread-safe. Just don't close
it and reuse it.
Then just use the HConnection to get an HTable.

What problem are you trying to solve?
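
For illustration, the servlet lifecycle I mean looks roughly like this (a sketch with a made-up servlet name, not our production code):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;

public class VisitServlet extends HttpServlet {     // hypothetical servlet name
    private HConnection connection;                 // shared by all request threads

    @Override
    public void init() throws ServletException {
        try {
            Configuration conf = HBaseConfiguration.create();
            connection = HConnectionManager.createConnection(conf);  // created once
        } catch (IOException e) {
            throw new ServletException(e);
        }
    }

    @Override
    public void destroy() {
        try {
            if (connection != null) {
                connection.close();                 // closed only when the servlet is destroyed
            }
        } catch (IOException e) {
            // log and ignore on shutdown
        }
    }
}

Inside doGet/doPost you would take an HTable from this connection, use it and close it, as discussed earlier in the thread.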


2015-02-15 9:23 GMT+03:00 David chen :

> If sharing one HConnection for the whole application, when concurrency to
> access servlets increases to a threshold, whether or not it will influence
> the application performance?  But if increasing the sharing HConnection
> number, the problem will be relieved?


Re: managing HConnection

2015-02-15 Thread Serega Sheypak
It can. 50000 rpm, no problem.

On Monday, 16 February 2015, David chen wrote:

> Sorry for the unclear represent.
> My problem is that whether or not a sharing Honnection can bear too many
> query requests?


Re: Re: managing HConnection

2015-02-16 Thread Serega Sheypak
Newrelic shows 50K RPM.
Each request to the servlet == 1-3 put/get operations to HBase. I have a mixed workload.

Is it strange :) ?

2015-02-16 10:37 GMT+03:00 David chen :

>  5 rpm?  I am curious how the result is concluded?


Re: managing HConnection

2015-02-16 Thread Serega Sheypak
Hi, I'm closing it in servlet.destroy. I haven't seen any problems here for
months. I'm using the servlet lifecycle to deal with the HConnection.

On Tuesday, 17 February 2015, Liu, Ming (HPIT-GADSC) wrote:

> Hi,
>
> Thank you Serega for the helpful reply and thanks Jneidi for asking this.
> I have similar confusion.
> So Serega, when does your application finally close the HConnection? Or
> the connection is NEVER closed as long as your application is running? Is
> it OK to NOT close the HConnection and the application exit directly?
> My application is a long-running service, accept user request and do CRUD
> to hbase. So I would like to use your model here. But, is it reasonable to
> keep that HConnection open a very long time, for example months? Is there
> any potential problem I need to take care?
> Also as David Chen asked, if all threads share same HConnection, it may
> has limitation to support high throughput, so a pool of Connections maybe
> better?
>
> Thanks,
> Ming
>
> -Original Message-
> From: Serega Sheypak [mailto:serega.shey...@gmail.com ]
> Sent: Wednesday, February 04, 2015 1:02 AM
> To: user
> Subject: Re: managing HConnection
>
> Hi, guys from group helped me a lot. I did solve pretty the same problem
> (CRUD web-app)
>
> 1. Use single instance of HConnection per application.
> 2. Instantiate it once.
> 3. create HTable instance for each CRUD operation and safely close it
> (try-catch-finally). Use the same HConnection to create any HTable for CRUD
> operation.
> 4. DO NOT close HConnection after CRUD operation
>
> I have logic controllers which get HConnection injection in
> HttpServlet.init method.
> So I have 5 HConnection instances per application created during servlet
> initialization
>
>
> 2015-02-03 18:12 GMT+03:00 Ted Yu >:
>
> > Please see '61.1. Cluster Connections' under
> > http://hbase.apache.org/book.html#architecture.client
> >
> > Cheers
> >
> > On Tue, Feb 3, 2015 at 6:47 AM, sleimanjneidi
> > >
> > wrote:
> >
> > > Hi all,
> > > I am using hbase-0.98.1-cdh5.1.4 client and I am a bit confused by
> > > the documentation of HConnection. The document says the following:
> > >
> > > HConnection instances can be shared. Sharing is usually what you
> > > want because rather than each HConnection instance having to do its
> > > own discovery of regions out on the cluster, instead, all clients
> > > get to
> > share
> > > the one cache of locations. HConnectionManager does the sharing for
> > > you
> > if
> > > you go by it getting connections. Sharing makes cleanup of
> > > HConnections awkward. .
> > >
> > > So now I have a simple question: Can I share the same HConnection
> > instance
> > > in my entire application?
> > > And write some magic code to know when to close or never close at all?
> > > Or I have to create an instance and close it every time I do a CRUD
> > > operation ?
> > >
> > > Many thanks
> > >
> > >
> > >
> >
>


Re: Hbase not taking inserts from Remote Machine

2015-02-16 Thread Serega Sheypak
You need to open the region server ports. The client sends puts directly to the
appropriate region server.

On Tuesday, 17 February 2015, Vineet Mishra wrote:

> -- Forwarded message --
> From: Vineet Mishra >
> Date: Tue, Feb 17, 2015 at 12:32 PM
> Subject: Hbase not taking inserts from Remote Machine
> To: cdh-u...@cloudera.org 
>
>
>
> Hi All,
>
> I am trying to communicate and insert some data to my
> Hbase(0.98.6-cdh5.3.0) running on Hadoop 2.5 using Hbase Java API.
>
> Although if I run the code within the cluster it connects fine but If I am
> outside cluster, even though I have opened the port on external IPs for
> Zookeeper and HMaster, I am stuck without any error on the logs, so the
> code hangs up on Table.put() during insertion,
>
> Attached below is the Stack Trace for the Job,
>
> 15/02/17 12:05:08 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 15/02/17 12:05:08 INFO zookeeper.RecoverableZooKeeper: Process
> identifier=hconnection-0x77dacebf connecting to ZooKeeper ensemble=
> namenode.com:2181,cloud-manager.com:2181
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:zookeeper.version=3.4.5-cdh5.3.0--1, built on 12/17/2014 02:45
> GMT
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client environment:host.name
> =ip-20-0-0-75
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:java.version=1.7.0_75
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:java.vendor=Oracle Corporation
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:java.home=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75.x86_64/jre
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:java.class.path=.:hbase-connect-1.0.0.jar
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
>
> environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:java.io.tmpdir=/tmp
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:java.compiler=
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client environment:os.name
> =Linux
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:os.arch=amd64
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:os.version=3.14.23-22.44.amzn1.x86_64
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client environment:user.name
> =tom
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:user.home=/home/tom
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> environment:user.dir=/home/tom/jobs
> 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Initiating client connection,
> connectString=namenode.com:2181,cloud-manager.com:2181
> sessionTimeout=9
> watcher=hconnection-0x77dacebf, quorum=namenode.com:2181,
> cloud-manager.com:2181, baseZNode=/hbase
> 15/02/17 12:05:08 INFO zookeeper.ClientCnxn: Opening socket connection to
> server namenode.com/54.172.21.54:2181. Will not attempt to authenticate
> using SASL (unknown error)
> 15/02/17 12:05:09 INFO zookeeper.ClientCnxn: Socket connection established
> to namenode.com/54.172.21.54:2181, initiating session
> 15/02/17 12:05:09 INFO zookeeper.ClientCnxn: Session establishment complete
> on server namenode.com/54.172.21.54:2181, sessionid = 0x24b7c7ba3532b32,
> negotiated timeout = 6
> 15/02/17 12:05:11 INFO client.HConnectionManager$HConnectionImplementation:
> Closing master protocol: MasterService
> 15/02/17 12:05:11 INFO client.HConnectionManager$HConnectionImplementation:
> Closing zookeeper sessionid=0x24b7c7ba3532b32
> 15/02/17 12:05:11 INFO zookeeper.ZooKeeper: Session: 0x24b7c7ba3532b32
> closed
> 15/02/17 12:05:11 INFO zookeeper.ClientCnxn: EventThread shut down
> Hbase Running
> 15/02/17 12:05:11 INFO zookeeper.RecoverableZooKeeper: Process
> identifier=hconnection-0x1003cac6 connecting to ZooKeeper ensemble=
> namenode.com:2181,cloud-manager.com:2181
> 15/02/17 12:05:11 INFO zookeeper.ZooKeeper: Initiating client connection,
> connectString=namenode.com:2181,cloud-manager.com:2181
> sessionTimeout=9
> watcher=hconnection-0x1003cac6, quorum=namenode.com:2181,
> cloud-manager.com:2181, baseZNode=/hbase
> 15/02/17 12:05:11 INFO zookeeper.ClientCnxn: Opening socket connection to
> server namenode.com/54.172.21.54:2181. Will not attempt to authenticate
> using SASL (unknown error)
> 15/02/17 12:05:12 INFO zookeeper.ClientCnxn: Socket connection established
> to namenode.com/54.172.21.54:2181, initiating session
> 15/02/17 12:05:12 INFO zookeeper.ClientCnxn: Session establishment complete
> on server namenode.com/54.172.21.54:2181, sessionid = 0x24b7c7ba3532b34,
> negotiated timeout = 6
>
> Looking at the revert urgently!
>
> Thanks!
>


Re: Hbase not taking inserts from Remote Machine

2015-02-17 Thread Serega Sheypak
You are welcome!

2015-02-17 12:07 GMT+03:00 Vineet Mishra :

> Thanks Serega!
>
> Don't know how could I miss that, Its working good now!
>
> On Tue, Feb 17, 2015 at 12:41 PM, Serega Sheypak  >
> wrote:
>
> > You need to open region server ports. Client directly sends put to
> > appropriate region server.
> >
> > On Tuesday, 17 February 2015, Vineet Mishra wrote:
> >
> > > -- Forwarded message --
> > > From: Vineet Mishra >
> > > Date: Tue, Feb 17, 2015 at 12:32 PM
> > > Subject: Hbase not taking inserts from Remote Machine
> > > To: cdh-u...@cloudera.org 
> > >
> > >
> > >
> > > Hi All,
> > >
> > > I am trying to communicate and insert some data to my
> > > Hbase(0.98.6-cdh5.3.0) running on Hadoop 2.5 using Hbase Java API.
> > >
> > > Although if I run the code within the cluster it connects fine but If I
> > am
> > > outside cluster, even though I have opened the port on external IPs for
> > > Zookeeper and HMaster, I am stuck without any error on the logs, so the
> > > code hangs up on Table.put() during insertion,
> > >
> > > Attached below is the Stack Trace for the Job,
> > >
> > > 15/02/17 12:05:08 WARN util.NativeCodeLoader: Unable to load
> > native-hadoop
> > > library for your platform... using builtin-java classes where
> applicable
> > > 15/02/17 12:05:08 INFO zookeeper.RecoverableZooKeeper: Process
> > > identifier=hconnection-0x77dacebf connecting to ZooKeeper ensemble=
> > > namenode.com:2181,cloud-manager.com:2181
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > > environment:zookeeper.version=3.4.5-cdh5.3.0--1, built on 12/17/2014
> > 02:45
> > > GMT
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client environment:
> host.name
> > > =ip-20-0-0-75
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > > environment:java.version=1.7.0_75
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > > environment:java.vendor=Oracle Corporation
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > >
> environment:java.home=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75.x86_64/jre
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > > environment:java.class.path=.:hbase-connect-1.0.0.jar
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > >
> > >
> >
> environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > > environment:java.io.tmpdir=/tmp
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > > environment:java.compiler=
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client environment:os.name
> > > =Linux
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > > environment:os.arch=amd64
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > > environment:os.version=3.14.23-22.44.amzn1.x86_64
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client environment:
> user.name
> > > =tom
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > > environment:user.home=/home/tom
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Client
> > > environment:user.dir=/home/tom/jobs
> > > 15/02/17 12:05:08 INFO zookeeper.ZooKeeper: Initiating client
> connection,
> > > connectString=namenode.com:2181,cloud-manager.com:2181
> > > sessionTimeout=9
> > > watcher=hconnection-0x77dacebf, quorum=namenode.com:2181,
> > > cloud-manager.com:2181, baseZNode=/hbase
> > > 15/02/17 12:05:08 INFO zookeeper.ClientCnxn: Opening socket connection
> to
> > > server namenode.com/54.172.21.54:2181. Will not attempt to
> authenticate
> > > using SASL (unknown error)
> > > 15/02/17 12:05:09 INFO zookeeper.ClientCnxn: Socket connection
> > established
> > > to namenode.com/54.172.21.54:2181, initiating session
> > > 15/02/17 12:05:09 INFO zookeeper.ClientCnxn: Session establishment
> > complete
> > > on server namenode.com/54.172.21.54:2181, sessionid =
> 0x24b7c7ba3532b32,
> > > negotiated timeout = 6
> > > 15/02/17 12:05:11 INFO
> > client.HConnectionManager$HConnectionImplementation:
> > > Closing master protocol: MasterService
> > > 15/02/17 12:05:11 INFO
> > client.HConnectionManager$HConnectionImplementation:
> > > Closing zookeeper sess

Re: HTable or HConnectionManager, how a client connect to HBase?

2015-02-17 Thread Serega Sheypak
Hi Enis Söztutar,
You wrote:
>>You are right that the constructor new HTable(Configuration, ..) will
share the underlying connection if same configuration object is used.

What does "the same" mean here? Is equality checked by reference (java ==)
or using the equals(Object other) method?
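
To make the question concrete, this is the situation I mean (the table name is just an example):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.HTableInterface;

public class ConfSharingQuestion {
    public static void main(String[] args) throws Exception {
        // Two Configuration objects built from the same hbase-site.xml:
        Configuration conf1 = HBaseConfiguration.create();
        Configuration conf2 = HBaseConfiguration.create();

        // conf1 == conf2 is false, but their contents are equal.
        // Do t1 and t2 share one underlying connection, or only t1 and t3?
        HTableInterface t1 = new HTable(conf1, "some_table");
        HTableInterface t2 = new HTable(conf2, "some_table");
        HTableInterface t3 = new HTable(conf1, "some_table");

        t1.close();
        t2.close();
        t3.close();
    }
}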


2015-02-18 7:34 GMT+03:00 Enis Söztutar :

> Hi,
>
> You are right that the constructor new HTable(Configuration, ..) will share
> the underlying connection if same configuration object is used. Connection
> is a heavy weight object, that holds the zookeeper connection, rpc client,
> socket connections to multiple region servers, master, and the thread pool,
> etc. You definitely do not want to create multiple connections per process
> unless you know what you are doing.
>
> The model is changed, and the old way of HTable(Configuration, ..) is
> deprecated because, we want to make the Connection lifecycle management
> explicit. In the new model, an opened Connection is closed by the user
> again, and light weight Table instances are obtained from the Connection.
> Having HTable's share their connections implicitly makes reasoning about it
> too hard. The new model should be pretty easy to follow.
>
> Enis
>
> On Sat, Feb 14, 2015 at 6:45 AM, Liu, Ming (HPIT-GADSC) 
> wrote:
>
> > Hi,
> >
> > I am using HBase 0.98.6.
> >
> > I learned from this maillist before, that the recommended method to
> > 'connect' to HBase from client is to use HConnectionManager like this:
> > HConnection
> > con=HConnectionManager.createConnection(configuration);
> > HTableInterfacetable =
> > con.getTable("hbase_table1");
> > Instead of
> > HTableInterface table = new
> > HTable(configuration, "hbase_table1");
> >
> > I don't quite understand the reason. I was thinking that each time I
> > initialize a HTable instance, it needs to create a new HConnection. And
> > that is expensive. But using the first method, multiple HTable instances
> > can share the same HConnection. That is quite reasonable to me.
> > However, I was reading from some articles on internet that , even if I
> use
> > the 'new HTable(conf, tbl)' method, if the 'conf' object is the same one,
> > all the HTable instances will still share the same HConnection. I was
> > recently read yet another article and said when using 'new HTable(conf,
> > tbl)', one don't need to use the exactly same 'conf' object (same one in
> > memory). if two 'conf' objects, two different objects are all the same, I
> > mean all attributes of these two are same (for example, created from the
> > same hbase-site.xml and never change) then HTable objects can still share
> > the same HConnection.  I also try to read the HTable src code, it is very
> > hard, but it seems to me the last statement is correct: 'HTable will
> share
> > HConnection, if configuration is all the same'.
> >
> > Sorry for so verbose. My question:
> > If two 'configuration' objects are same, then two HTable object
> > instantiated with them respectively can still share the same HConnection
> or
> > not? Directly using the 'new HTable()' method.
> > If the answer is 'yes', then why I still need the HConnectionManager to
> > create a shared connection?
> > I am talking about 0.98.6.
> > I googled for days, and even try to read HBase src code, but still get
> > really confused. I try to do some tests also, but since I am too newbie,
> I
> > don't know how to verify the difference, I really don't know what a
> > HConnection do under the hood. I counted the ZooKeeper client requests,
> and
> > I found some difference. If this ZooKeeper requests difference is a
> correct
> > metrics, it means to me that two HTable do not share HConnetion even
> using
> > same 'configuration' in the constructor. So it confused me more and
> more
> >
> > Please someone kindly help me for this newbie question and thanks in
> > advance.
> >
> > Thanks,
> > Ming
> >
> >
> >
>


HBase mttr

2015-02-19 Thread Serega Sheypak
Hi, we are running HBase on super-low-cost HW :)
Sometimes a random node goes down, and HBase needs time to move regions
off the failed RS.

What are the practices to:
1. minimize MTTR?
2. is there any way to gracefully handle the situation when a region is
not accessible for r/w?
I can just drop the data, but I can't wait seconds or minutes for a response.
I would like to break the request after 100ms and return an empty result.
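
For (2), there is no built-in "return empty after 100ms" mode that I know
of; a common approach is to tighten the client-side timeouts and retries so
that a call against an unreachable region fails fast, and let the
application catch the exception and return an empty result itself. A rough
sketch (the values are illustrative, and exactly which of these settings
are honored varies between client versions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class FastFailClientConfig {
    public static Configuration create() {
        Configuration conf = HBaseConfiguration.create();
        // Fail a single RPC quickly instead of waiting on a stuck region server.
        conf.setInt("hbase.rpc.timeout", 100);
        // Cap the whole client operation, retries included.
        conf.setInt("hbase.client.operation.timeout", 300);
        // The default retry/backoff schedule can block for minutes; retry once.
        conf.setInt("hbase.client.retries.number", 1);
        conf.setInt("hbase.client.pause", 50);
        return conf;
    }
}

MTTR itself is mostly governed by failure detection (zookeeper.session.timeout)
and the log splitting/replay that follows, so the settings above only bound
how long callers wait, not how fast the region comes back online.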


Re: HBase connection pool

2015-02-27 Thread Serega Sheypak
Did you check how many open connections each ZK server has?
My hypothesis is that you have a ZK connection leak, and the ZK server
starts to drop connections to prevent a DDoS attack once you hit the limit
for open connections.

2015-02-26 22:15 GMT+03:00 Nick Dimiduk :

> Can you tell when these WARN messages are produced? Is it related to the
> creation of the connection object or one of the HTable instances?
>
> On Thu, Feb 26, 2015 at 7:27 AM, Marcelo Valle (BLOOMBERG/ LONDON) <
> mvallemil...@bloomberg.net> wrote:
>
> > Nick,
> >
> > I tried what you suggested, 1 HConnection and 1 Configuration for the
> > entire app:
> >
> > this.config = HBaseConfiguration.create();
> > this.connection = HConnectionManager.createConnection(config);
> >
> > And Threaded pooled HTableInterfaces:
> >
> > final HConnection lconnection = this.connection;
> > this.tlTable = new ThreadLocal() {
> > @Override
> > protected HTableInterface initialValue() {
> > try {
> > return lconnection.getTable("HBaseSerialWritesPOC");
> > // return new HTable(tlConfig.get(),
> > // "HBaseSerialWritesPOC");
> > } catch (IOException e) {
> > throw new RuntimeException(e);
> > }
> > }
> > };
> >
> > I started getting this error in my application:
> >
> > 2015-02-26 10:23:17,833 INFO [main-SendThread(xxx)] zookeeper.ClientCnxn
> > (ClientCnxn.java:logStartConnect(966)) - Opening socket connection to
> > server xxx. Will not attempt to authenticate using SASL (unknown error)
> > 2015-02-26 10:23:17,834 INFO [main-SendThread(xxx)] zookeeper.ClientCnxn
> > (ClientCnxn.java:primeConnection(849)) - Socket connection established to
> > xxx, initiating session
> > 2015-02-26 10:23:17,836 WARN [main-SendThread(xxx)] zookeeper.ClientCnxn
> > (ClientCnxn.java:run(1089)) - Session 0x0 for server xxx, unexpected
> error,
> > closing socket connection and attempting reconnect
> > java.io.IOException: Connection reset by peer
> > at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> > at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
> > at
> >
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
> > at
> >
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
> > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> >
> >
> > -Marcelo
> >
> > From: ndimi...@gmail.com
> > Subject: Re: HBase connection pool
> >
> > Okay, looks like you're using a implicitly managed connection. It should
> > be fine to share a single config instance across all threads. The
> advantage
> > of HTablePool over this approach is that the number of HTables would be
> > managed independently from the number of Threads. This may or not be a
> > concern for you, based on your memory requirements, &c. In your case,
> > you're not specifying an ExecutorService per HTable, so the HTable
> > instances will be relatively light weight. Each table will manage it's
> own
> > write buffer, which can be shared by multiple threads when autoFlush is
> > disabled and HTablePool is used. This may or may not be desirable,
> > depending on your use-case.
> >
> > For what it's worth, HTablePool is marked deprecated in 1.0, will likely
> > be removed in 2.0. To "future proof" this code, I would move to a single
> > shared HConnection for the whole application, and a thread-local HTable
> > created from/with that connection.
> >
> > -n
> >
> > On Wed, Feb 25, 2015 at 10:53 AM, Marcelo Valle (BLOOMBERG/ LONDON) <
> > mvallemil...@bloomberg.net> wrote:
> >
> >> Hi Nick,
> >>
> >> I am using HBase version 0.96, I sent the link from version 0.94 because
> >> I haven't found the java API docs for 0.96, sorry about that.
> >> I have created the HTable directly from the config object, as follows:
> >>
> >> this.tlConfig = new ThreadLocal() {
> >>
> >> @Override
> >> protected Configuration initialValue() {
> >> return HBaseConfiguration.create();
> >> }
> >> };
> >> this.tlTable = new ThreadLocal() {
> >> @Override
> >> protected HTable initialValue() {
> >> try {
> >> return new HTable(tlConfig.get(), "HBaseSerialWritesPOC");
> >> } catch (IOException e) {
> >> throw new RuntimeException(e);
> >> }
> >> }
> >> };
> >>
> >> I am now sure if the Configuration object should be 1 per thread as
> well,
> >> maybe I could share this one?
> >>
> >> So, just to clarify, would I get any advantage using HTablePool object
> >> instead of ThreadLocal as I did?
> >>
> >> -Marcelo
> >>
> >> From: ndimi...@gmail.com
> >> Subject: Re: HBase connection pool
> >>
> >> Hi Marcelo,
> >>
> >> First thing, to be clear, you're working with a 0.94 release? The reason
> >> I ask is we've been doing some work in this area to improve things, so
> >> semantics may be slightly different between 0.94, 0.98, and 1.0.
> >>
> >> How are you managing the HConnection object (or are yo

Re: HBase connection pool

2015-02-27 Thread Serega Sheypak
Create one HConnection for all threads and then share it.
Create an HTable in each thread using the HConnection.
Do your stuff.
Close the HTable, but DO NOT close the HConnection.
It works, 100%. I had pretty much the same problem, and the group helped me
resolve it the way I'm suggesting to you.
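
A minimal sketch of that pattern against the 0.96/0.98 client API (using
the table name from the thread; the row key is a placeholder):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.util.Bytes;

public class SharedConnectionDao {
    private final HConnection connection; // one per application

    public SharedConnectionDao(Configuration conf) throws IOException {
        this.connection = HConnectionManager.createConnection(conf);
    }

    // Called concurrently from many threads.
    public boolean exists(String row) throws IOException {
        // HTableInterface is cheap: create it per call (or per thread) and
        // close it, but never close the shared HConnection here.
        HTableInterface table = connection.getTable("HBaseSerialWritesPOC");
        try {
            return table.exists(new Get(Bytes.toBytes(row)));
        } finally {
            table.close();
        }
    }

    // Close the shared connection exactly once, on application shutdown.
    public void shutdown() throws IOException {
        connection.close();
    }
}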

2015-02-27 13:23 GMT+03:00 Marcelo Valle (BLOOMBERG/ LONDON) <
mvallemil...@bloomberg.net>:

> But then wouldn't it happen when I had 1 Configuration per thread? I had
> more connections before start using 1 HConnection for the whole app, and it
> use to work fine.
>
> From: user@hbase.apache.org
> Subject: Re: HBase connection pool
>
> Did you check how many open connections each ZK server has?
> I my hypothesis is that you have ZK connection leaking and ZK server starts
> to drop connection to prevent DDoS attack since you hit limit for opened
> connections.
>
> 2015-02-26 22:15 GMT+03:00 Nick Dimiduk :
>
> > Can you tell when these WARN messages are produced? Is it related to the
> > creation of the connection object or one of the HTable instances?
> >
> > On Thu, Feb 26, 2015 at 7:27 AM, Marcelo Valle (BLOOMBERG/ LONDON) <
> > mvallemil...@bloomberg.net> wrote:
> >
> > > Nick,
> > >
> > > I tried what you suggested, 1 HConnection and 1 Configuration for the
> > > entire app:
> > >
> > > this.config = HBaseConfiguration.create();
> > > this.connection = HConnectionManager.createConnection(config);
> > >
> > > And Threaded pooled HTableInterfaces:
> > >
> > > final HConnection lconnection = this.connection;
> > > this.tlTable = new ThreadLocal() {
> > > @Override
> > > protected HTableInterface initialValue() {
> > > try {
> > > return lconnection.getTable("HBaseSerialWritesPOC");
> > > // return new HTable(tlConfig.get(),
> > > // "HBaseSerialWritesPOC");
> > > } catch (IOException e) {
> > > throw new RuntimeException(e);
> > > }
> > > }
> > > };
> > >
> > > I started getting this error in my application:
> > >
> > > 2015-02-26 10:23:17,833 INFO [main-SendThread(xxx)]
> zookeeper.ClientCnxn
> > > (ClientCnxn.java:logStartConnect(966)) - Opening socket connection to
> > > server xxx. Will not attempt to authenticate using SASL (unknown error)
> > > 2015-02-26 10:23:17,834 INFO [main-SendThread(xxx)]
> zookeeper.ClientCnxn
> > > (ClientCnxn.java:primeConnection(849)) - Socket connection established
> to
> > > xxx, initiating session
> > > 2015-02-26 10:23:17,836 WARN [main-SendThread(xxx)]
> zookeeper.ClientCnxn
> > > (ClientCnxn.java:run(1089)) - Session 0x0 for server xxx, unexpected
> > error,
> > > closing socket connection and attempting reconnect
> > > java.io.IOException: Connection reset by peer
> > > at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> > > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> > > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> > > at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> > > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
> > > at
> > >
> >
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
> > > at
> > >
> >
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
> > > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> > >
> > >
> > > -Marcelo
> > >
> > > From: ndimi...@gmail.com
> > > Subject: Re: HBase connection pool
> > >
> > > Okay, looks like you're using a implicitly managed connection. It
> should
> > > be fine to share a single config instance across all threads. The
> > advantage
> > > of HTablePool over this approach is that the number of HTables would be
> > > managed independently from the number of Threads. This may or not be a
> > > concern for you, based on your memory requirements, &c. In your case,
> > > you're not specifying an ExecutorService per HTable, so the HTable
> > > instances will be relatively light weight. Each table will manage it's
> > own
> > > write buffer, which can be shared by multiple threads when autoFlush is
> > > disabled and HTablePool is used. This may or may not be desirable,
> > > depending on your use-case.
> > >
> > > For what it's worth, HTablePool is marked deprecated in 1.0, will
> likely
> > > be removed in 2.0. To "future proof" this code, I would move to a
> single
> > > shared HConnection for the whole application, and a thread-local HTable
> > > created from/with that connection.
> > >
> > > -n
> > >
> > > On Wed, Feb 25, 2015 at 10:53 AM, Marcelo Valle (BLOOMBERG/ LONDON) <
> > > mvallemil...@bloomberg.net> wrote:
> > >
> > >> Hi Nick,
> > >>
> > >> I am using HBase version 0.96, I sent the link from version 0.94
> because
> > >> I haven't found the java API docs for 0.96, sorry about that.
> > >> I have created the HTable directly from the config object, as follows:
> > >>
> > >> this.tlConfig = new ThreadLocal() {
> > >>
> > >> @Override
> > >> protected Configuration initialValue() {
> > >> return HBaseConfiguration.create();
> > >> }
> > >> };
> > >> this.tlTable = new ThreadLocal() {
> > >> @Override
> > >> protected HTable 

HBase 0.98 CDH 5.2, significant Read degradation

2015-03-13 Thread Serega Sheypak
Hi, I've finally met HBase performance problems :(
I see this message:
org.apache.hadoop.hbase.regionserver.wal.FSHLog

Slow sync cost: 345 ms, current pipeline: []

in log file
sometimes [] contains actual addresses of my datanodes.
What are the steps to understand why HBase is so slow?
I have 7 RS, 3K reads per second and 500 writes per second. Requests are
evenly distributed across the cluster since I use hashed keys.


Re: HBase 0.98 CDH 5.2, significant Read degradation

2015-03-14 Thread Serega Sheypak
We are using cheap HW to run our HBase. The problem turned out to be the
Toshiba disks. Thanks!


2015-03-13 20:44 GMT+03:00 Nick Dimiduk :

> HBase is telling you that writes to those datanodes are slow. Is it the
> same host names over and over? Probably they have high system load, a bad
> or dying disk, bad or dying network adapter, &c. Basically HBase is giving
> you a hint to go diagnose your cluster.
>
> -n
>
> On Fri, Mar 13, 2015 at 2:44 AM, Serega Sheypak 
> wrote:
>
> > Hi, finally I met HBase perfomance problems :(
> > I see this message:
> > org.apache.hadoop.hbase.regionserver.wal.FSHLog
> >
> > Slow sync cost: 345 ms, current pipeline: []
> >
> > in log file
> > sometimes [] contains actual addresses of my datanodes.
> > What are the steps to undestand why HBase is so slow.
> > I have 7RS and 3K reads per second and 500 writes per second. Requests
> are
> > evenly distributed across cluster since I use hash for keys.
> >
>


ipv6? org.apache.pig.backend.executionengine.ExecException: ERROR 2118: For input string: "4f8:0:a0a1::add:1010"

2015-03-18 Thread Serega Sheypak
Hi, I'm trying to use HBaseStorage to read data from HBase.
1. I persist something to HBase each day using the hbase-client Java API
2. using HBaseStorage via Oozie
Now I fail to read the persisted data with a Pig script, via HUE or plain pig.
I don't have any problem reading the data using the Java client API.
What am I doing wrong?

Caused by: java.lang.NumberFormatException: For input string:
"4f8:0:a0a1::add:1010"
at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.parseInt(Integer.java:527)
at com.sun.jndi.dns.DnsClient.(DnsClient.java:125)
at com.sun.jndi.dns.Resolver.(Resolver.java:61)
at com.sun.jndi.dns.DnsContext.getResolver(DnsContext.java:570)
at com.sun.jndi.dns.DnsContext.c_getAttributes(DnsContext.java:430)
at
com.sun.jndi.toolkit.ctx.ComponentDirContext.p_getAttributes(ComponentDirContext.java:231)
at
com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.getAttributes(PartialCompositeDirContext.java:139)
at
com.sun.jndi.toolkit.url.GenericURLDirContext.getAttributes(GenericURLDirContext.java:103)
at
javax.naming.directory.InitialDirContext.getAttributes(InitialDirContext.java:142)
at org.apache.hadoop.net.DNS.reverseDns(DNS.java:84)
at
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.reverseDNS(TableInputFormatBase.java:228)
at
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:191)
at
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat.getSplits(HBaseTableInputFormat.java:87)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
... 18 more


Re: ipv6? org.apache.pig.backend.executionengine.ExecException: ERROR 2118: For input string: "4f8:0:a0a1::add:1010"

2015-03-19 Thread Serega Sheypak
Hm... So is the client or the server running OpenJDK?

On Thursday, March 19, 2015, Alok Singh wrote:

> Looks like ipv6 address is not being parsed correctly. Maybe related
> to : https://bugs.openjdk.java.net/browse/JDK-6991580
>
> Alok
>
> On Wed, Mar 18, 2015 at 3:13 PM, Serega Sheypak
> > wrote:
> > Hi, I'm trying to use HBaseStorage to read data from HBase
> > 1. I do persist smth to hbase each day using hbase-client java api
> > 2. using HBaseStorage vis oozie
> > Now I failed to read persisted data using pig script via HUE or plain
> pig.
> > I don't have any problem reading data using java client api.
> > What do I do wrong?
> >
> > Caused by: java.lang.NumberFormatException: For input string:
> > "4f8:0:a0a1::add:1010"
> > at
> >
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> > at java.lang.Integer.parseInt(Integer.java:492)
> > at java.lang.Integer.parseInt(Integer.java:527)
> > at com.sun.jndi.dns.DnsClient.(DnsClient.java:125)
> > at com.sun.jndi.dns.Resolver.(Resolver.java:61)
> > at com.sun.jndi.dns.DnsContext.getResolver(DnsContext.java:570)
> > at com.sun.jndi.dns.DnsContext.c_getAttributes(DnsContext.java:430)
> > at
> >
> com.sun.jndi.toolkit.ctx.ComponentDirContext.p_getAttributes(ComponentDirContext.java:231)
> > at
> >
> com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.getAttributes(PartialCompositeDirContext.java:139)
> > at
> >
> com.sun.jndi.toolkit.url.GenericURLDirContext.getAttributes(GenericURLDirContext.java:103)
> > at
> >
> javax.naming.directory.InitialDirContext.getAttributes(InitialDirContext.java:142)
> > at org.apache.hadoop.net.DNS.reverseDns(DNS.java:84)
> > at
> >
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.reverseDNS(TableInputFormatBase.java:228)
> > at
> >
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:191)
> > at
> >
> org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat.getSplits(HBaseTableInputFormat.java:87)
> > at
> >
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
> > ... 18 more
>


Re: ipv6? org.apache.pig.backend.executionengine.ExecException: ERROR 2118: For input string: "4f8:0:a0a1::add:1010"

2015-03-19 Thread Serega Sheypak
I can say with 100% certainty that we are not using OpenJDK or IPv6.
I see that people have felt the same pain before:
http://mail-archives.apache.org/mod_mbox/hbase-user/201305.mbox/%3ca7f70ff1-b24f-403c-b3e8-7ae18e72d...@gmail.com%3E

Does anybody know how to overcome this problem?

2015-03-19 9:59 GMT+03:00 Serega Sheypak :

> Hm... So client or sever running openjdk?
>
> четверг, 19 марта 2015 г. пользователь Alok Singh написал:
>
> Looks like ipv6 address is not being parsed correctly. Maybe related
>> to : https://bugs.openjdk.java.net/browse/JDK-6991580
>>
>> Alok
>>
>> On Wed, Mar 18, 2015 at 3:13 PM, Serega Sheypak
>>  wrote:
>> > Hi, I'm trying to use HBaseStorage to read data from HBase
>> > 1. I do persist smth to hbase each day using hbase-client java api
>> > 2. using HBaseStorage vis oozie
>> > Now I failed to read persisted data using pig script via HUE or plain
>> pig.
>> > I don't have any problem reading data using java client api.
>> > What do I do wrong?
>> >
>> > Caused by: java.lang.NumberFormatException: For input string:
>> > "4f8:0:a0a1::add:1010"
>> > at
>> >
>> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>> > at java.lang.Integer.parseInt(Integer.java:492)
>> > at java.lang.Integer.parseInt(Integer.java:527)
>> > at com.sun.jndi.dns.DnsClient.(DnsClient.java:125)
>> > at com.sun.jndi.dns.Resolver.(Resolver.java:61)
>> > at com.sun.jndi.dns.DnsContext.getResolver(DnsContext.java:570)
>> > at com.sun.jndi.dns.DnsContext.c_getAttributes(DnsContext.java:430)
>> > at
>> >
>> com.sun.jndi.toolkit.ctx.ComponentDirContext.p_getAttributes(ComponentDirContext.java:231)
>> > at
>> >
>> com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.getAttributes(PartialCompositeDirContext.java:139)
>> > at
>> >
>> com.sun.jndi.toolkit.url.GenericURLDirContext.getAttributes(GenericURLDirContext.java:103)
>> > at
>> >
>> javax.naming.directory.InitialDirContext.getAttributes(InitialDirContext.java:142)
>> > at org.apache.hadoop.net.DNS.reverseDns(DNS.java:84)
>> > at
>> >
>> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.reverseDNS(TableInputFormatBase.java:228)
>> > at
>> >
>> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:191)
>> > at
>> >
>> org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat.getSplits(HBaseTableInputFormat.java:87)
>> > at
>> >
>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
>> > ... 18 more
>>
>


Re: ipv6? org.apache.pig.backend.executionengine.ExecException: ERROR 2118: For input string: "4f8:0:a0a1::add:1010"

2015-03-19 Thread Serega Sheypak
Hi, the root cause was an IPv6 DNS setup on the datacenter side. We just
deleted ALL packages having ipv6 in their name.
It helped.
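
The fix here was removing the IPv6 packages. Another workaround that is
often suggested for JDK-6991580 (an alternative, not what was done in this
thread) is to force the JVM onto IPv4 before any networking class loads,
e.g. -Djava.net.preferIPv4Stack=true in HBASE_OPTS / PIG_OPTS. A small
sketch that sets the flag and reproduces the reverse-DNS call from the
stack trace, for checking a single region server address:

import java.net.InetAddress;

import org.apache.hadoop.net.DNS;

public class ReverseDnsCheck {
    public static void main(String[] args) throws Exception {
        // Must run before any networking/JNDI class is initialized;
        // equivalent to passing -Djava.net.preferIPv4Stack=true on the
        // command line.
        System.setProperty("java.net.preferIPv4Stack", "true");

        // Same call TableInputFormatBase makes while computing splits:
        // reverse-resolve a region server address (passed as args[0]).
        InetAddress regionServer = InetAddress.getByName(args[0]);
        System.out.println(DNS.reverseDns(regionServer, null));
    }
}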


2015-03-19 22:31 GMT+03:00 Esteban Gutierrez :

> Hi Serega,
>
> Sounds like one of your DNS servers has an IPv6 address. If you look
> into JDK-6991580 it comes with a very simple test and the problem is very
> easy to reproduce with jdk8 or jdk7
>
> cheers,
> esteban.
>
>
> --
> Cloudera, Inc.
>
>
> On Thu, Mar 19, 2015 at 4:56 AM, Serega Sheypak 
> wrote:
>
> > I can give 100% we are not using openjdk or ipv6
> > I see that people felt the same pain before:
> >
> >
> http://mail-archives.apache.org/mod_mbox/hbase-user/201305.mbox/%3ca7f70ff1-b24f-403c-b3e8-7ae18e72d...@gmail.com%3E
> >
> > Does anybody knows how to overcome such problem?
> >
> > 2015-03-19 9:59 GMT+03:00 Serega Sheypak :
> >
> > > Hm... So client or sever running openjdk?
> > >
> > > четверг, 19 марта 2015 г. пользователь Alok Singh написал:
> > >
> > > Looks like ipv6 address is not being parsed correctly. Maybe related
> > >> to : https://bugs.openjdk.java.net/browse/JDK-6991580
> > >>
> > >> Alok
> > >>
> > >> On Wed, Mar 18, 2015 at 3:13 PM, Serega Sheypak
> > >>  wrote:
> > >> > Hi, I'm trying to use HBaseStorage to read data from HBase
> > >> > 1. I do persist smth to hbase each day using hbase-client java api
> > >> > 2. using HBaseStorage vis oozie
> > >> > Now I failed to read persisted data using pig script via HUE or
> plain
> > >> pig.
> > >> > I don't have any problem reading data using java client api.
> > >> > What do I do wrong?
> > >> >
> > >> > Caused by: java.lang.NumberFormatException: For input string:
> > >> > "4f8:0:a0a1::add:1010"
> > >> > at
> > >> >
> > >>
> >
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> > >> > at java.lang.Integer.parseInt(Integer.java:492)
> > >> > at java.lang.Integer.parseInt(Integer.java:527)
> > >> > at com.sun.jndi.dns.DnsClient.(DnsClient.java:125)
> > >> > at com.sun.jndi.dns.Resolver.(Resolver.java:61)
> > >> > at com.sun.jndi.dns.DnsContext.getResolver(DnsContext.java:570)
> > >> > at com.sun.jndi.dns.DnsContext.c_getAttributes(DnsContext.java:430)
> > >> > at
> > >> >
> > >>
> >
> com.sun.jndi.toolkit.ctx.ComponentDirContext.p_getAttributes(ComponentDirContext.java:231)
> > >> > at
> > >> >
> > >>
> >
> com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.getAttributes(PartialCompositeDirContext.java:139)
> > >> > at
> > >> >
> > >>
> >
> com.sun.jndi.toolkit.url.GenericURLDirContext.getAttributes(GenericURLDirContext.java:103)
> > >> > at
> > >> >
> > >>
> >
> javax.naming.directory.InitialDirContext.getAttributes(InitialDirContext.java:142)
> > >> > at org.apache.hadoop.net.DNS.reverseDns(DNS.java:84)
> > >> > at
> > >> >
> > >>
> >
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.reverseDNS(TableInputFormatBase.java:228)
> > >> > at
> > >> >
> > >>
> >
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:191)
> > >> > at
> > >> >
> > >>
> >
> org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat.getSplits(HBaseTableInputFormat.java:87)
> > >> > at
> > >> >
> > >>
> >
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
> > >> > ... 18 more
> > >>
> > >
> >
>


performance problems during bulk load because of triggered compaction?

2015-03-24 Thread Serega Sheypak
Hi, I have low-cost hardware: 2 HDDs, 10 nodes with HBase 0.98 CDH 5.2.1.
I have several apps that read/write to HBase using the Java API.
Sometimes I see that response time rises from the normal 30-40 ms to
1000-2000 ms or even more.
There are no MapReduce jobs running at that time, but there is a bulk load
each hour.
I see that the response degradation and the bulk load sometimes coincide.

The table is 17GB on HDFS and has 84 regions. Most regions are 150-200MB
in size.
It has a single column family:
{NAME => 'd', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROWCOL',
REPLICATION_SCOPE => '0', COMPRESSION => 'SNAPPY', VERSIONS => '1', TTL =>
'691200 SECONDS (8 DAYS)', MIN_VERSIONS => '0', KEEP_DELETED_CELLS =>
'false', BLOCKSIZE => '65536', IN_MEMORY => 'true', BLOCKCACHE => 'true'}

When the bulk load happens, it just updates existing cell values; it brings
only 0.01% new rows.
I keep serialized objects in d:q, where d is the column family and q is the
column qualifier.

How can I get the root cause of performance degradation and minimize it?
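
One knob that may help here (an assumption on my part, not something
confirmed for this cluster) is the off-peak compaction window, so that the
more aggressive compaction selection happens away from the hourly bulk load
and the read/write peak. A sketch of the relevant region-server
hbase-site.xml settings (the hours are illustrative):

<property>
  <name>hbase.offpeak.start.hour</name>
  <value>2</value>
</property>
<property>
  <name>hbase.offpeak.end.hour</name>
  <value>6</value>
</property>
<property>
  <!-- compact more aggressively only inside the off-peak window -->
  <name>hbase.hstore.compaction.ratio.offpeak</name>
  <value>5.0</value>
</property>

Note this only changes which files get selected for compaction and when; it
does not throttle compaction IO, so slow disks still need to be diagnosed
separately.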


Re: [ANNOUNCE] Sean Busbey joins the Apache HBase PMC

2015-03-26 Thread Serega Sheypak
Sean, great to see you here.

2015-03-26 21:06 GMT+03:00 anil gupta :

> Congrats, Sean.
>
> On Thu, Mar 26, 2015 at 10:27 AM, Nick Dimiduk  wrote:
>
> > Congratulations Sean! Nice work.
> >
> > On Thu, Mar 26, 2015 at 10:26 AM, Andrew Purtell 
> > wrote:
> >
> > > On behalf of the Apache HBase PMC I"m pleased to announce that Sean
> > Busbey
> > > has accepted our invitation to become a PMC member on the Apache HBase
> > > project. Sean has been an active and positive contributor in many
> areas,
> > > including on project meta-concerns such as versioning, build
> > > infrastructure, code reviews, etc. He's a natural and we're looking
> > forward
> > > to many more future contributions.
> > >
> > > Welcome to the PMC, Sean!
> > >
> > > --
> > > Best regards,
> > >
> > >- Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet
> Hein
> > > (via Tom White)
> > >
> >
>
>
>
> --
> Thanks & Regards,
> Anil Gupta
>


Strange PrefixFilter behaviour on HBase 0.98.6-cdh5.2.0 OutOfOrderScannerNextException

2015-04-06 Thread Serega Sheypak
Hi, I'm trying to use PrefixFilter for the RowKey.
My rowkey is composite; it consists of 3 parts.
I provide the first part of the key to scan all rows starting with that
prefix. There should be fewer than 10 rowkeys per prefix, since the prefix
is an md5 hash.
I have integration tests for this part of the code and they run without any
problems; the failure happens on real data, of course. I can't figure out
what I'm doing wrong.

private Scan createCrossIdRowKeyPrefixFilterScanner(byte[] prefix, int limit){
Scan scan = new Scan();
scan.addColumn(CF_B, CQ_B);
scan.setMaxResultSize(limit);
scan.setBatch(BATCH);
scan.setMaxVersions(SINGLE_VERSION);
scan.setCaching(CACHING);

PrefixFilter prefixFilter = new PrefixFilter(prefix);
scan.setFilter(prefixFilter);

return scan;
}


and invocation:

stopWatch.start();
resultScanner = hTable.getScanner(rowKeyPrefixScanner);
Result[] results = resultScanner.next(limit);
stopWatch.stop();
LOG.debug("Took ["+(stopWatch.getTime()/1000L)+"]sec to scan for key prefix");
return parseUsers(results);

And I get:

used by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException:
org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected
nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id:
225177 number_of_rows: 10 close_scanner: false next_call_seq: 0

at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3193)

at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)

at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)

at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)

at java.lang.Thread.run(Thread.java:745)


 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)

at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:526)

at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)

at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)

at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:304)

at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)

at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)

at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)

at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)

at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:355)

... 14 more

Caused by:
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException):
org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected
nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id:
225177 number_of_rows: 10 close_scanner: false next_call_seq: 0

at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3193)

at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)

at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)

at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)

at java.lang.Thread.run(Thread.java:745)


 at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)

at
org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)

at
org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)

at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990)

at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)


Re: Strange PrefixFilter behaviour on HBase 0.98.6-cdh5.2.0 OutOfOrderScannerNextException

2015-04-06 Thread Serega Sheypak
Looks like I didn't set startRow for the scanner...

2015-04-06 17:04 GMT+02:00 Serega Sheypak :

> Hi, I'm trying to use PrefixFilter for the RowKey.
> My rowKey consists of 3 parts, actually it's composite.
> I do provide first part of key to scan all rows starting from prefix.
> There should be less than 10 rowkeys for each prefix, since prefix is md5
> hash.
> I have itests for this part of code, it runs without any problems, failure
> happens on real data of course. Can't get what Im doing wrong.
>
> private Scan createCrossIdRowKeyPrefixFilterScanner(byte[] prefix, int limit){
> Scan scan = new Scan();
> scan.addColumn(CF_B, CQ_B);
> scan.setMaxResultSize(limit);
> scan.setBatch(BATCH);
> scan.setMaxVersions(SINGLE_VERSION);
> scan.setCaching(CACHING);
>
> PrefixFilter prefixFilter = new PrefixFilter(prefix);
> scan.setFilter(prefixFilter);
>
> return scan;
> }
>
>
> and invocation:
>
> stopWatch.start();
> resultScanner = hTable.getScanner(rowKeyPrefixScanner);
> Result[] results = resultScanner.next(limit);
> stopWatch.stop();
> LOG.debug("Took ["+(stopWatch.getTime()/1000L)+"]sec to scan for key prefix");
> return parseUsers(results);
>
> And I get:
>
> used by:
> org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException:
> org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected
> nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id:
> 225177 number_of_rows: 10 close_scanner: false next_call_seq: 0
>
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3193)
>
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
>
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
>
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>
> at java.lang.Thread.run(Thread.java:745)
>
>
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>
> at
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>
> at
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>
> at
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:304)
>
> at
> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
>
> at
> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
>
> at
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
>
> at
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
>
> at
> org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:355)
>
> ... 14 more
>
> Caused by:
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException):
> org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected
> nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id:
> 225177 number_of_rows: 10 close_scanner: false next_call_seq: 0
>
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3193)
>
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
>
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
>
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>
> at java.lang.Thread.run(Thread.java:745)
>
>
>  at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
>
> at
> org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
>
> at
> org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
>
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990)
>
> at
> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
>


Re: Strange PrefixFilter behaviour on HBase 0.98.6-cdh5.2.0 OutOfOrderScannerNextException

2015-04-06 Thread Serega Sheypak
I forgot to set the start row on the Scan. Looks like HBase tried to scan
the whole table; the value from the PrefixFilter wasn't used as a starting
point. I assumed the prefix value would be pushed down to the scanner as a
start row, but it isn't.

2015-04-06 18:45 GMT+02:00 Imants Cekusins :

> may this be related:
>
> https://issues.apache.org/jira/browse/HBASE-11295
>
> ?
>


Re: Strange PrefixFilter behaviour on HBase 0.98.6-cdh5.2.0 OutOfOrderScannerNextException

2015-04-06 Thread Serega Sheypak
>Yes, scan goes through entire table unless start row is set.

> does this explain the error though?

> Prefix filter should work even with scan beginning from 1st record, no?
It would only take longer.


Yes, that explains it. My table has 70M rows; with the start row set, the
prefix filter only has to scan about 10 rows from the exact place, and it
takes milliseconds to get a response.
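
A sketch of the corrected scan, assuming the same 0.98 client API (the
column family/qualifier constants are placeholders): with the start row set
to the prefix, the region server seeks straight to the right place, and
PrefixFilter ends the scan once rows stop matching.

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class PrefixScans {
    private static final byte[] CF_B = Bytes.toBytes("b");
    private static final byte[] CQ_B = Bytes.toBytes("q");

    public static Scan prefixScan(byte[] prefix, int caching) {
        Scan scan = new Scan();
        scan.addColumn(CF_B, CQ_B);
        scan.setCaching(caching);
        // Seek straight to the first possible matching row instead of
        // scanning the table from its beginning...
        scan.setStartRow(prefix);
        // ...and let PrefixFilter stop the scan once the prefix no longer
        // matches.
        scan.setFilter(new PrefixFilter(prefix));
        return scan;
    }
}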

2015-04-06 22:54 GMT+02:00 Serega Sheypak :

> I forgot to set firstRow for Scanner. Looks like HBase tried to scan the
> whole table. Value from FilterPrefix wasn't used. I supposed that  prefix
> value could be pushed to scanner as a starting point, but not.
>
> 2015-04-06 18:45 GMT+02:00 Imants Cekusins :
>
>> may this be related:
>>
>> https://issues.apache.org/jira/browse/HBASE-11295
>>
>> ?
>>
>
>
>


Re: write availability

2015-04-07 Thread Serega Sheypak
>If I have an application that writes to a HBase cluster, can I count that
the cluster will always available to receive writes?
No, it's a CP system, not an AP one.
> so everything get in sync when the other nodes get up again
There is no hinted handoff; it's not Cassandra.



2015-04-07 14:48 GMT+02:00 Marcelo Valle (BLOOMBERG/ LONDON) <
mvallemil...@bloomberg.net>:

> If I have an application that writes to a HBase cluster, can I count that
> the cluster will always available to receive writes?
> I might not be able to read if a region server which handles a range of
> keys is down, but will I be able to keep writing to other nodes, so
> everything get in sync when the other nodes get up again?
> Or I might get no write availability for a while?


Re: Hbase 0.98 Distributed Mode with hadoop 2.6 HA:Issues of Hbase

2015-04-07 Thread Serega Sheypak

<property><name>hbase.master</name><value>hdfs://cluster1:6</value></property>

what is it?

2015-04-07 16:34 GMT+02:00 sridhararao mutluri :

> Hi,
> This is my hbase-site.xml:
> <property><name>hbase.master</name><value>hdfs://cluster1:6</value></property>
> <property><name>hbase.rootdir</name><value>hdfs://mycluster/hbase</value></property>
> <property><name>hbase.cluster.distributed</name><value>true</value></property>
> <property><name>hbase.zookeeper.property.clientPort</name><value>2181</value></property>
> <property><name>hbase.zookeeper.quorum</name><value>cluster1,cluster2,cluster3</value></property>
> <property><name>hbase.zookeeper.property.dataDir</name><value>/hadoop/hdfs/zookeeper/data/zk1</value></property>
> 
> Thanks,Sridhar
> > Date: Tue, 7 Apr 2015 07:09:44 -0700
> > Subject: Re: Hbase 0.98 Distributed Mode with hadoop 2.6 HA:Issues of
> Hbase
> > From: yuzhih...@gmail.com
> > To: user@hbase.apache.org
> > CC: bus...@cloudera.com
> >
> > bq. hbase.rootdir
> > hdfs://mycluster/hbase  
> >
> > Looks like there is a property missing at the end of the line.
> >
> > You showed snippet from shell output. Have you checked master log ?
> >
> > Cheers
> >
> > On Tue, Apr 7, 2015 at 5:16 AM, sridhararao mutluri 
> > wrote:
> >
> > > Hi Team,
> > > I am trying to use hbase 0.98 distributed mode with zk 3.4.6 & hadoop
> ha
> > > 2.6.(JDK 1.8)
> > > I am having following issue and little help in google pages also
> > > I tried to start zk first after clearing zk data dir and tried to start
> > > master first and rs later and no luck
> > > I used mycluster/hbase in hbase-site.xml and no luck to me.tried to put
> > > hdfs-site.xml/core-site.xml in $Hbase_home/conf also.
> > > I noticed all hadoop.*jars in $HBASE_HOME/lib are 2.2 of hadoop where
> as
> > > we are using 2.6 and tried to copy those hadoop jars ..but no luck.
> > > A New new error is coming:
> > >
> > > hbase(main):002:0> create 'cars', 'vi'
> > > ERROR: java.io.IOException: Table Namespace Manager not ready yet, try
> > > again laterat
> > >
> org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3179)
> > >   at
> > > org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1735)
> > >   at
> org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1774)
> > >   at
> > >
> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:40470)
> > >   at
> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
> > > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
> > >   at
> > >
> org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74)
> > >   at
> > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> > > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> > >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > >   at
> > >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > >   at java.lang.Thread.run(Thread.java:745)ZK are running on 3
> servers.I
> > > tired to stop hbase and then stopped zk in 3 clusters and cleared zk
> data
> > > and started fresh.
> > > My bash_profile on classpath:export
> > > CLASSPATH=$CLASSPATH:/home/hadoop/hadoop-2.6.0/lib/*:.export
> > > CLASSPATH=$CLASSPATH:/home/hadoop/hbase-0.98.4/lib/*:.
> > > my hbase_site.xml
> > > hbase.rootdir
> > > hdfs://mycluster/hbase   
>  
> > >  and hadoop core-site is same.
> > > Any incompatibility between JDK1.8 with hbase or hadoop 2.6?
> > > Please suggest any solution.
> > > Thanks,Sridhar
> > >
> > >
>
>


Re: write availability

2015-04-07 Thread Serega Sheypak
Marcelo, if you are comparing with Cassandra:
1. Don't think about data replication/redundancy. It's out of HBase's
scope: C* handles it, HBase doesn't, because HBase uses HDFS. So assume you
can never lose the data, as long as HDFS is configured properly.

2. HBase doesn't think in terms of replicas either; that is also out of
HBase's scope. It just assumes there are several files backing a table, and
that's all.
Each RS serves its own range of keys from a table. There is no keyspace in
HBase, there are tables. Each table has its own key range and its own
splits. There are no tokens/vnodes. You can split keys in HBase manually,
or HBase will do it for you.
Assume we have a table with 1000 rowkeys, from 0 to 999.
We decided to split it manually into two parts (regions, in HBase terms):
0...900 and 901...999.
We need at least one region server to serve these regions. Assume we have
two region servers: one takes region 0...900, the second takes 901...999.
If the second RS, which serves region 901...999, goes down, you can't
access that data through HBase, but the data still exists. We have to wait
until another region server takes over region 901...999.
C* tries to solve many problems at once; HBase doesn't.
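
To make the region example concrete, a sketch of creating such a pre-split
table with the 0.98 admin API (the table name, family and split point are
illustrative, and the rowkeys are assumed to be zero-padded strings
"000".."999"):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("demo"));
            desc.addFamily(new HColumnDescriptor("d"));
            // Two regions: rowkeys below "901" and rowkeys from "901" on.
            byte[][] splitKeys = new byte[][] { Bytes.toBytes("901") };
            admin.createTable(desc, splitKeys);
        } finally {
            admin.close();
        }
    }
}

HDFS then replicates the files of both regions independently of which RS is
serving them, which is exactly the split of responsibilities described
above.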

2015-04-07 18:04 GMT+02:00 Marcelo Valle (BLOOMBERG/ LONDON) <
mvallemil...@bloomberg.net>:

> Wellington,
>
> I might be misinterpreting this:
> http://stackoverflow.com/questions/13741946/role-of-datanode-regionserver-in-hbase-hadoop-integration
>
> But aren't HBase region servers and HDFS datanodes always in the same
> server? With a replication factor of 3, what happens if all 3 datanodes
> hosting that information go down and one of them come back, but with the
> disk intact? Considering from the time they went down to the time it went
> back HBase received new writes that would go to the same data node...
>
>
> From: user@hbase.apache.org
> Subject: Re: write availability
>
> The data is stored on files on hdfs. If a RS goes down, the master knows
> which regions were on that RS and which hdfs files contain data for these
> regions, so it will just assign the regions to others RS, and these others
> RS will have access to the regions data because it's stored on HDFS. The RS
> does not "own" the disk, this is HDFS job, so the recovery on this case is
> transparent.
>
>
> On 7 Apr 2015, at 16:51, Marcelo Valle (BLOOMBERG/ LONDON) <
> mvallemil...@bloomberg.net> wrote:
>
> > So if a RS goes down, it's assumed you lost the data on it, right?
> > HBase has replications on HDFS, so if a RS goes down it doesn't mean I
> lost all the data, as I could have the replicas yet... But what happens if
> all RS hosting a specific region goes down?
> > What if one RS from this one comes back again, but with the disk intact,
> with all the data it had before crashing?
> >
> >
> > From: user@hbase.apache.org
> > Subject: Re: write availability
> >
> > When a RS goes down, the Master will try to assign the regions on the
> remaining RSes. When the RS comes back, after a while, the Master balancer
> process will re-distribute regions between RS, so the given RS will be
> hosting regions, but not necessarily the one it used to host before it went
> down.
> >
> >
> > On 7 Apr 2015, at 16:31, Marcelo Valle (BLOOMBERG/ LONDON) <
> mvallemil...@bloomberg.net> wrote:
> >
> >>> So if the cluster is up, then you can insert records in to HBase even
> though you lost a RS that was handing a specific region.
> >>
> >> What happens when the RS goes down? Writes to that region will be
> written to another region server? Another RS assumes the region "range"
> while the RS is down?
> >>
> >> What happens when the RS that was down goes up again?
> >>
> >>
> >> From: user@hbase.apache.org
> >> Subject: Re: write availability
> >>
> >> I don’t know if I would say that…
> >>
> >> I read Marcelo’s question of “if the cluster is up, even though a RS
> may be down, can I still insert records in to HBase?”
> >>
> >> So if the cluster is up, then you can insert records in to HBase even
> though you lost a RS that was handing a specific region.
> >>
> >> But because he talked about syncing nodes… I could be misreading his
> initial question…
> >>
> >>> On Apr 7, 2015, at 9:02 AM, Serega Sheypak 
> wrote:
> >>>
> >>>> If I have an application that writes to a HBase cluster, can I count
> that
> >>> the cluster will always available to receive writes?
> >>> No, it's CP, not AP system.
> >>>> so ev

Re: write availability

2015-04-07 Thread Serega Sheypak
>But aren't HBase region servers and HDFS datanodes always in the same
server?
It's a good point, but it's not mandatory.

>With a replication factor of 3, what happens if all 3 datanodes hosting
that information go down and one of them come back, but with the disk
intact?
It should be OK: you have 3 copies of each byte, and 1 copy has been resurrected.

>Considering from the time they went down to the time it went back HBase
received new writes that would go to the same data node...
It "should work fine" (I need to check what HDFS does if it can't reach the
replication level for the block), but you'll start to get under-replicated
data. HDFS takes care of data replication.

2015-04-07 18:38 GMT+02:00 Serega Sheypak :

> Marcelo, if you are comparing with Cassandra:
> 1. don't think about data replication/redundancy. It's out of HBase scope.
> C* thinks about it, HBase doesn't HBase uses HDFS. So assume you never-ever
> can lost the data. Assume, that HDFS configured properly.
>
> 2. HBase doesn't think in terms of replica. It's out if HBase scope.
> It supposes that you have several files for table. And that's all.
> Each RS takes it's own range of keys from table/ There is no keyspace in
> HBase. There are tables. Each table has it's own key range. Each table has
> it's own splits. There are no tokes/vnodes. You can split keys in HBase
> manually of HBase will do it for you.
> Assume we have a table. It has 1000 rowkeys starting from 0 to 999.
> We decided to split it manually on two parts (regions in HBase terms):
> 0...900 and 901...999.
> We need at least 1 region server to serve these regions. Assume we have 2
> region servers and one takes region 0...900, the second takes 901...999.
> If second RS which serves region 901...999 goes down you can't access this
> data using HBase, but data still exists. We need to wait until other Region
> server would take care of  region 901 ... 999.
> C* solves too many problems at one time, HBase doesn't.
>
> 2015-04-07 18:04 GMT+02:00 Marcelo Valle (BLOOMBERG/ LONDON) <
> mvallemil...@bloomberg.net>:
>
>> Wellington,
>>
>> I might be misinterpreting this:
>> http://stackoverflow.com/questions/13741946/role-of-datanode-regionserver-in-hbase-hadoop-integration
>>
>> But aren't HBase region servers and HDFS datanodes always in the same
>> server? With a replication factor of 3, what happens if all 3 datanodes
>> hosting that information go down and one of them come back, but with the
>> disk intact? Considering from the time they went down to the time it went
>> back HBase received new writes that would go to the same data node...
>>
>>
>> From: user@hbase.apache.org
>> Subject: Re: write availability
>>
>> The data is stored on files on hdfs. If a RS goes down, the master knows
>> which regions were on that RS and which hdfs files contain data for these
>> regions, so it will just assign the regions to others RS, and these others
>> RS will have access to the regions data because it's stored on HDFS. The RS
>> does not "own" the disk, this is HDFS job, so the recovery on this case is
>> transparent.
>>
>>
>> On 7 Apr 2015, at 16:51, Marcelo Valle (BLOOMBERG/ LONDON) <
>> mvallemil...@bloomberg.net> wrote:
>>
>> > So if a RS goes down, it's assumed you lost the data on it, right?
>> > HBase has replications on HDFS, so if a RS goes down it doesn't mean I
>> lost all the data, as I could have the replicas yet... But what happens if
>> all RS hosting a specific region goes down?
>> > What if one RS from this one comes back again, but with the disk
>> intact, with all the data it had before crashing?
>> >
>> >
>> > From: user@hbase.apache.org
>> > Subject: Re: write availability
>> >
>> > When a RS goes down, the Master will try to assign the regions on the
>> remaining RSes. When the RS comes back, after a while, the Master balancer
>> process will re-distribute regions between RS, so the given RS will be
>> hosting regions, but not necessarily the one it used to host before it went
>> down.
>> >
>> >
>> > On 7 Apr 2015, at 16:31, Marcelo Valle (BLOOMBERG/ LONDON) <
>> mvallemil...@bloomberg.net> wrote:
>> >
>> >>> So if the cluster is up, then you can insert records in to HBase even
>> though you lost a RS that was handing a specific region.
>> >>
>> >> What happens when the RS goes down? Writes to that region will be
>> written to another region server? Another RS assumes the region "range"

Re: Export Hbase Snapshot

2015-04-09 Thread Serega Sheypak
Hi,
what is the reason to back up HDFS? It's distributed, reliable,
fault-tolerant, etc.
NFS should be expensive for keeping TBs of data.


What problem are you trying to solve?


2015-04-09 20:35 GMT+02:00 Afroz Ahmad :

> We are planning to use the snapshot feature that takes a backup of a table
> with 1.2 TB of data. We are planning to export the data using
> ExportSnapshot and copy the resulting files to a NFS mount periodically.
>
> Out infrastructure team is very concerned about the amount of data that
> will be going over the wire and how long it will take
>
> This is just one table. There may be other tables in the future that we
> want to back up.
>
> So I wanted to get a sense of what others are doing with ExportSnapshot.
> What is the size of the tables that are backed up and whether the concerns
> raised by our infra team are valid?
>
>
> Thanks
>
> Afroz
>


HBase stucks from time to time

2015-04-22 Thread Serega Sheypak
Hi, we have a 10-node cluster running HBase 0.98 CDH 5.2.1.
Sometimes HBase gets stuck.
We have several apps constantly writing/reading data to/from it. Sometimes
we see that the apps' response time dramatically increases: an app spends
seconds reading from/writing to HBase, while 99% of the time it takes 20ms.

I suppose that compactions/major compactions could be the root cause. I see
that compactions start at the same time that we have problems with the apps.
Could that be it?
That is, HBase can't write to the WAL because compactions consume all the
IO, and the apps stop being able to write data?


Re: HBase stucks from time to time

2015-04-22 Thread Serega Sheypak
pipeline: [5.9.41.237:50010, 5.9.77.105:50010, 5.9.73.19:50010]
2015-04-22 12:53:13,132 INFO
org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 1193 ms,
current pipeline: [5.9.41.237:50010, 5.9.77.105:50010, 5.9.73.19:50010]
2015-04-22 12:53:13,132 INFO
org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 1193 ms,
current pipeline: [5.9.41.237:50010, 5.9.77.105:50010, 5.9.73.19:50010]
2015-04-22 12:53:13,132 INFO
org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 1193 ms,
current pipeline: [5.9.41.237:50010, 5.9.77.105:50010, 5.9.73.19:50010]
2015-04-22 12:53:13,247 INFO
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed,
sequenceid=41975630, memsize=3.6 M, hasBloomFilter=true, into tmp file
hdfs://nameservice1/hbase/data/default/rtb_user_to_cross_id/c54d51dd7f20377ea34f05824abbaf76/.tmp/3495a2b487fa448bb9e0b50050b11c3a
2015-04-22 12:53:13,301 INFO org.apache.hadoop.hbase.regionserver.HStore:
Added
hdfs://nameservice1/hbase/data/default/rtb_user_to_cross_id/c54d51dd7f20377ea34f05824abbaf76/c/3495a2b487fa448bb9e0b50050b11c3a,
entries=886, sequenceid=41975630, filesize=272.9 K

2015-04-22 19:54 GMT+02:00 Ted Yu :

> Serega:
> How often is major compaction run in your cluster ?
>
> Have you configured offpeak compaction ?
> See related parameters in:
> http://hbase.apache.org/book.html#compaction.parameters
>
> Cheers
>
> On Wed, Apr 22, 2015 at 10:39 AM, Serega Sheypak  >
> wrote:
>
> > Hi, we have 10 nodes cluster running HBase 0.98 CDH 5.2.1
> > Sometimes HBase stucks.
> > We have several apps constantly writing/reading data from it. Sometimes
> we
> > see that apps response time dramatically increases. It means that app
> > spends seconds to read/write from/to HBase. in 99% of time it takes 20ms.
> >
> > I suppose that compactions/major compactions could be the root cause. I
> see
> > that compactions start at the same time when we have problems with app.
> > Could it be so?
> > So HBase can't write to WAL because compactions consumes all IO and apps
> > stops to write data?
> >
>


Re: HBase stucks from time to time

2015-04-22 Thread Serega Sheypak
java:246)
at
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:716)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:486)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:111)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:69)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:225)
at java.lang.Thread.run(Thread.java:745)
2015-04-22 12:46:20,677 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
5.9.41.237:50010, dest: /5.9.65.143:36286, bytes: 5120, op: HDFS_READ,
cliID: DFSClient_hb_rs_domain06.myhost.ru,60020,1426776843636_-1084357126_33,
offset: 86653440, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
BP-927943268-5.9.77.105-1414682145673:blk_1078128115_4387882, duration:
17623602
2015-04-22 12:46:20,690 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
5.9.41.237:50010, dest: /78.46.48.38:57022, bytes: 5120, op: HDFS_READ,
cliID: DFSClient_hb_rs_domain14.myhost.ru,60020,1429640630771_156324173_33,
offset: 103135232, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
BP-927943268-5.9.77.105-1414682145673:blk_1078120513_4380280, duration:
24431169
2015-04-22 12:46:20,747 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
5.9.41.237:50010, dest: /5.9.40.136:34501, bytes: 5120, op: HDFS_READ,
cliID: DFSClient_hb_rs_domain04.myhost.ru,60020,1426774142736_-1119788258_33,
offset: 78469632, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
BP-927943268-5.9.77.105-1414682145673:blk_1078108223_4367990, duration:
76055489
2015-04-22 12:46:20,758 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
BP-927943268-5.9.77.105-1414682145673:blk_1078160469_4420237 src: /
5.9.74.13:35440 dest: /5.9.41.237:50010


2015-04-22 20:30 GMT+02:00 Jean-Marc Spaggiari :

> Did you send the logs from one of those 3 servers? [5.9.41.237:50010,
> 5.9.77.105:50010, 5.9.73.19:50010]
>
> Sound like something is slowing done everything. Can you extract DN logs
> for the same time?
>
> Do you have any tool monitoring the disks and network latency over time?
>
> If not, can you run iostat and try to reproduce the issue?
>
> JM
>
> 2015-04-22 14:23 GMT-04:00 Serega Sheypak :
>
> > > major compaction runs daily.
> >
> > >What else do you see in the RS logs?
> > no error, only *Slow sync cost *
> >
> > >How iostat looks like?
> > please see image. 12.00 - 12.30 is a time when reading/writing stopped
> > [image: inline image 1]
> >
> >
> > >Can you share the logs around the time this occurs?
> > 2015-04-22 12:53:09,996 WARN org.apache.hadoop.ipc.RpcServer:
> > (responseTooSlow):
> >
> {"processingtimems":20128,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"
> > 5.9.75.155:58344
> >
> ","starttimems":1429692769868,"queuetimems":28034,"class":"HRegionServer","responsesize":8,"method":"Multi"}
> > 2015-04-22 12:53:09,996 WARN org.apache.hadoop.ipc.RpcServer:
> > (responseTooSlow):
> >
> {"processingtimems":21434,"call":"Mutate(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutateRequest)","client":"
> > 46.4.0.110:60149
> >
> ","starttimems":1429692768562,"queuetimems":44263,"class":"HRegionServer","responsesize":2,"method":"Mutate"}
> > 2015-04-22 12:53:10,093 WARN org.apache.hadoop.ipc.RpcServer:
> > (responseTooSlow):
> >
> {"processingtimems":17997,"call":"Mutate(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutateRequest)","client":"46.4.0.
> >
> > 015-04-22 12:53:10,270 WARN org.apache.hadoop.ipc.RpcServer:
> > (responseTooSlow):
> >
> {"processingtimems":18175,"call":"Mutate(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutateRequest)","client":"
> > 144.76.218.107:48620
> >
> ","starttimems":1429692772095,"queuetimems":49253,"class":"HRegionServer","responsesize":2,"method":"Mutate"}
> > 2015-04-22 12:53:10,315 INFO
> > org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 319 ms,
> > current pipeline: [5.9.41.

Re: HBase stucks from time to time

2015-04-22 Thread Serega Sheypak
Here is an image

2015-04-22 20:40 GMT+02:00 Serega Sheypak :

> Here are datanode logs from 5.9.41.237 <http://5.9.41.237:50010/>,
> regionserver logs were from 5.9.41.237 <http://5.9.41.237:50010/> also
>
> EQUEST_SHORT_CIRCUIT_FDS, blockid: 1078130838, srvID:
> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
> 2015-04-22 12:46:17,154 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
> bd37825f7a445e2da6796940ebb754d6, slotIdx: 24, srvID:
> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
> 2015-04-22 12:46:17,204 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
> aea53fb897c383f3dec304ed618db0df, slotIdx: 4, srvID:
> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
> 2015-04-22 12:46:17,219 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
> 0178cc0245fcdc9e1dd75c5f8c6da1eb, slotIdx: 0, srvID:
> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
> 2015-04-22 12:46:17,236 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
> aea53fb897c383f3dec304ed618db0df, slotIdx: 102, srvID:
> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
> 2015-04-22 12:46:17,573 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 5.9.41.237:50010, dest: /5.9.65.143:36281, bytes: 4608, op: HDFS_READ,
> cliID: DFSClient_hb_rs_domain06.myhost.ru,60020,1426776843636_-1084357126_33,
> offset: 28435456, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
> BP-927943268-5.9.77.105-1414682145673:blk_1078147715_4407483, duration:
> 12396486
> 2015-04-22 12:46:17,596 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 5.9.41.237:50010, dest: /78.46.48.37:41539, bytes: 4608, op: HDFS_READ,
> cliID: DFSClient_hb_rs_domain13.myhost.ru,60020,1429640630559_-531755738_33,
> offset: 56052736, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
> BP-927943268-5.9.77.105-1414682145673:blk_1078106901_438, duration:
> 37455427
> 2015-04-22 12:46:17,821 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 5.9.41.237:50010, dest: /5.9.67.100:58718, bytes: 5120, op: HDFS_READ,
> cliID: DFSClient_hb_rs_domain03.myhost.ru,60020,1426877064271_-1246826933_33,
> offset: 77630464, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
> BP-927943268-5.9.77.105-1414682145673:blk_1078119757_4379524, duration:
> 16386940
> 2015-04-22 12:46:18,769 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to send data:
> java.net.SocketTimeoutException: 48 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/5.9.41.237:50010 remote=/
> 78.46.48.37:40503]
> 2015-04-22 12:46:18,769 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 5.9.41.237:50010, dest: /78.46.48.37:40503, bytes: 393216, op: HDFS_READ,
> cliID: DFSClient_hb_rs_domain13.myhost.ru,60020,1429640630559_-531755738_33,
> offset: 3584, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
> BP-927943268-5.9.77.105-1414682145673:blk_1078155386_4415154, duration:
> 480345113893
> 2015-04-22 12:46:18,769 WARN
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(5.9.41.237,
> datanodeUuid=659e6be2-8d98-458b-94bc-3bcdbb517508, infoPort=50075,
> ipcPort=50020, storageInfo=lv=-56;cid=cluster11;nsid=527111981;c=0):Got
> exception while serving
> BP-927943268-5.9.77.105-1414682145673:blk_1078155386_4415154 to /
> 78.46.48.37:40503
> java.net.SocketTimeoutException: 48 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/5.9.41.237:50010 remote=/
> 78.46.48.37:40503]
> at
> org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
> at
> org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
> at
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:716)
> at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:486)
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:111)
> at
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:69)
> at
> org.apac

Re: HBase stucks from time to time

2015-04-22 Thread Serega Sheypak
Here are the disk stats. The slowdown appeared at 12.30 - 13.00.
https://www.dropbox.com/s/lj4r8o10buv1n2o/Screenshot%202015-04-22%2020.48.18.png?dl=0

2015-04-22 20:41 GMT+02:00 Serega Sheypak :

> Here is an image
>
> 2015-04-22 20:40 GMT+02:00 Serega Sheypak :
>
>> Here are datanode logs from 5.9.41.237 <http://5.9.41.237:50010/>,
>> regionserver logs were from 5.9.41.237 <http://5.9.41.237:50010/> also
>>
>> EQUEST_SHORT_CIRCUIT_FDS, blockid: 1078130838, srvID:
>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
>> 2015-04-22 12:46:17,154 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
>> bd37825f7a445e2da6796940ebb754d6, slotIdx: 24, srvID:
>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
>> 2015-04-22 12:46:17,204 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
>> aea53fb897c383f3dec304ed618db0df, slotIdx: 4, srvID:
>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
>> 2015-04-22 12:46:17,219 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
>> 0178cc0245fcdc9e1dd75c5f8c6da1eb, slotIdx: 0, srvID:
>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
>> 2015-04-22 12:46:17,236 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
>> aea53fb897c383f3dec304ed618db0df, slotIdx: 102, srvID:
>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
>> 2015-04-22 12:46:17,573 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 5.9.41.237:50010, dest: /5.9.65.143:36281, bytes: 4608, op: HDFS_READ,
>> cliID: DFSClient_hb_rs_domain06.myhost.ru,60020,1426776843636_-1084357126_33,
>> offset: 28435456, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
>> BP-927943268-5.9.77.105-1414682145673:blk_1078147715_4407483, duration:
>> 12396486
>> 2015-04-22 12:46:17,596 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 5.9.41.237:50010, dest: /78.46.48.37:41539, bytes: 4608, op: HDFS_READ,
>> cliID: DFSClient_hb_rs_domain13.myhost.ru,60020,1429640630559_-531755738_33,
>> offset: 56052736, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
>> BP-927943268-5.9.77.105-1414682145673:blk_1078106901_438, duration:
>> 37455427
>> 2015-04-22 12:46:17,821 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 5.9.41.237:50010, dest: /5.9.67.100:58718, bytes: 5120, op: HDFS_READ,
>> cliID: DFSClient_hb_rs_domain03.myhost.ru,60020,1426877064271_-1246826933_33,
>> offset: 77630464, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
>> BP-927943268-5.9.77.105-1414682145673:blk_1078119757_4379524, duration:
>> 16386940
>> 2015-04-22 12:46:18,769 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to send data:
>> java.net.SocketTimeoutException: 48 millis timeout while waiting for
>> channel to be ready for write. ch :
>> java.nio.channels.SocketChannel[connected local=/5.9.41.237:50010
>> remote=/78.46.48.37:40503]
>> 2015-04-22 12:46:18,769 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 5.9.41.237:50010, dest: /78.46.48.37:40503, bytes: 393216, op:
>> HDFS_READ, cliID: 
>> DFSClient_hb_rs_domain13.myhost.ru,60020,1429640630559_-531755738_33,
>> offset: 3584, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
>> BP-927943268-5.9.77.105-1414682145673:blk_1078155386_4415154, duration:
>> 480345113893
>> 2015-04-22 12:46:18,769 WARN
>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>> DatanodeRegistration(5.9.41.237,
>> datanodeUuid=659e6be2-8d98-458b-94bc-3bcdbb517508, infoPort=50075,
>> ipcPort=50020, storageInfo=lv=-56;cid=cluster11;nsid=527111981;c=0):Got
>> exception while serving
>> BP-927943268-5.9.77.105-1414682145673:blk_1078155386_4415154 to /
>> 78.46.48.37:40503
>> java.net.SocketTimeoutException: 48 millis timeout while waiting for
>> channel to be ready for write. ch :
>> java.nio.channels.SocketChannel[connected local=/5.9.41.237:50010
>> remote=/78.46.48.37:40503]
>> at
>> org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>> at
>> org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
>> at
>> org.apache.hadoop.net.SocketOutputStream.transferToFully(Sock

Re: HBase stucks from time to time

2015-04-23 Thread Serega Sheypak
Hi, is there any input here? What should we monitor?

2015-04-22 20:55 GMT+02:00 Serega Sheypak :

> Here is disk stats. Sadness appeared ad 12.30 - 13.00
>
> https://www.dropbox.com/s/lj4r8o10buv1n2o/Screenshot%202015-04-22%2020.48.18.png?dl=0
>
> 2015-04-22 20:41 GMT+02:00 Serega Sheypak :
>
>> Here is an image
>>
>> 2015-04-22 20:40 GMT+02:00 Serega Sheypak :
>>
>>> Here are datanode logs from 5.9.41.237 <http://5.9.41.237:50010/>,
>>> regionserver logs were from 5.9.41.237 <http://5.9.41.237:50010/> also
>>>
>>> EQUEST_SHORT_CIRCUIT_FDS, blockid: 1078130838, srvID:
>>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
>>> 2015-04-22 12:46:17,154 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
>>> bd37825f7a445e2da6796940ebb754d6, slotIdx: 24, srvID:
>>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
>>> 2015-04-22 12:46:17,204 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
>>> aea53fb897c383f3dec304ed618db0df, slotIdx: 4, srvID:
>>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
>>> 2015-04-22 12:46:17,219 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
>>> 0178cc0245fcdc9e1dd75c5f8c6da1eb, slotIdx: 0, srvID:
>>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
>>> 2015-04-22 12:46:17,236 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
>>> aea53fb897c383f3dec304ed618db0df, slotIdx: 102, srvID:
>>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
>>> 2015-04-22 12:46:17,573 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>> 5.9.41.237:50010, dest: /5.9.65.143:36281, bytes: 4608, op: HDFS_READ,
>>> cliID: 
>>> DFSClient_hb_rs_domain06.myhost.ru,60020,1426776843636_-1084357126_33,
>>> offset: 28435456, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
>>> BP-927943268-5.9.77.105-1414682145673:blk_1078147715_4407483, duration:
>>> 12396486
>>> 2015-04-22 12:46:17,596 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>> 5.9.41.237:50010, dest: /78.46.48.37:41539, bytes: 4608, op: HDFS_READ,
>>> cliID: DFSClient_hb_rs_domain13.myhost.ru,60020,1429640630559_-531755738_33,
>>> offset: 56052736, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
>>> BP-927943268-5.9.77.105-1414682145673:blk_1078106901_438, duration:
>>> 37455427
>>> 2015-04-22 12:46:17,821 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>> 5.9.41.237:50010, dest: /5.9.67.100:58718, bytes: 5120, op: HDFS_READ,
>>> cliID: 
>>> DFSClient_hb_rs_domain03.myhost.ru,60020,1426877064271_-1246826933_33,
>>> offset: 77630464, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
>>> BP-927943268-5.9.77.105-1414682145673:blk_1078119757_4379524, duration:
>>> 16386940
>>> 2015-04-22 12:46:18,769 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to send data:
>>> java.net.SocketTimeoutException: 48 millis timeout while waiting for
>>> channel to be ready for write. ch :
>>> java.nio.channels.SocketChannel[connected local=/5.9.41.237:50010
>>> remote=/78.46.48.37:40503]
>>> 2015-04-22 12:46:18,769 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>> 5.9.41.237:50010, dest: /78.46.48.37:40503, bytes: 393216, op:
>>> HDFS_READ, cliID: 
>>> DFSClient_hb_rs_domain13.myhost.ru,60020,1429640630559_-531755738_33,
>>> offset: 3584, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508, blockid:
>>> BP-927943268-5.9.77.105-1414682145673:blk_1078155386_4415154, duration:
>>> 480345113893
>>> 2015-04-22 12:46:18,769 WARN
>>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> DatanodeRegistration(5.9.41.237,
>>> datanodeUuid=659e6be2-8d98-458b-94bc-3bcdbb517508, infoPort=50075,
>>> ipcPort=50020, storageInfo=lv=-56;cid=cluster11;nsid=527111981;c=0):Got
>>> exception while serving
>>> BP-927943268-5.9.77.105-1414682145673:blk_1078155386_4415154 to /
>>> 78.46.48.37:40503
>>> java.net.SocketTimeoutException: 48 millis timeout while waiting

Re: HBase stucks from time to time

2015-04-25 Thread Serega Sheypak
Hi, thanks for the input. We use
https://www.hetzner.de/gb/hosting/produkte_rootserver/ex40
Each node has 2 HDD.

2015-04-24 19:07 GMT+02:00 Esteban Gutierrez :

>  Hi Serega,
>
>
> The iostat data shows a very sharp spike in await time (I know it is outside
> of the logs' time range) and utilization is high, but it is hard to tell if the
> drives in the DNs are getting saturated continuously since it looks like an
> averaged metric. Is this some kind of virtualized environment? Are you
> using a NAS for the data volumes? If you look at the logs in context, it
> seems that there is a bad data node causing issues in the HDFS pipeline.
>
> cheers,
> esteban.
>
>
>
> --
> Cloudera, Inc.
>
>
> On Thu, Apr 23, 2015 at 1:37 PM, Serega Sheypak 
> wrote:
>
> > Hi, is there any input here? What we should monitor?
> >
> > 2015-04-22 20:55 GMT+02:00 Serega Sheypak :
> >
> > > Here is disk stats. Sadness appeared ad 12.30 - 13.00
> > >
> > >
> >
> https://www.dropbox.com/s/lj4r8o10buv1n2o/Screenshot%202015-04-22%2020.48.18.png?dl=0
> > >
> > > 2015-04-22 20:41 GMT+02:00 Serega Sheypak :
> > >
> > >> Here is an image
> > >>
> > >> 2015-04-22 20:40 GMT+02:00 Serega Sheypak :
> > >>
> > >>> Here are datanode logs from 5.9.41.237 <http://5.9.41.237:50010/>,
> > >>> regionserver logs were from 5.9.41.237 <http://5.9.41.237:50010/>
> also
> > >>>
> > >>> EQUEST_SHORT_CIRCUIT_FDS, blockid: 1078130838, srvID:
> > >>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
> > >>> 2015-04-22 12:46:17,154 INFO
> > >>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> > >>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
> > >>> bd37825f7a445e2da6796940ebb754d6, slotIdx: 24, srvID:
> > >>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
> > >>> 2015-04-22 12:46:17,204 INFO
> > >>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> > >>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
> > >>> aea53fb897c383f3dec304ed618db0df, slotIdx: 4, srvID:
> > >>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
> > >>> 2015-04-22 12:46:17,219 INFO
> > >>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> > >>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
> > >>> 0178cc0245fcdc9e1dd75c5f8c6da1eb, slotIdx: 0, srvID:
> > >>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
> > >>> 2015-04-22 12:46:17,236 INFO
> > >>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
> > >>> 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId:
> > >>> aea53fb897c383f3dec304ed618db0df, slotIdx: 102, srvID:
> > >>> 659e6be2-8d98-458b-94bc-3bcdbb517508, success: true
> > >>> 2015-04-22 12:46:17,573 INFO
> > >>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> > >>> 5.9.41.237:50010, dest: /5.9.65.143:36281, bytes: 4608, op:
> HDFS_READ,
> > >>> cliID: DFSClient_hb_rs_domain06.myhost.ru
> > ,60020,1426776843636_-1084357126_33,
> > >>> offset: 28435456, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508,
> blockid:
> > >>> BP-927943268-5.9.77.105-1414682145673:blk_1078147715_4407483,
> duration:
> > >>> 12396486
> > >>> 2015-04-22 12:46:17,596 INFO
> > >>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> > >>> 5.9.41.237:50010, dest: /78.46.48.37:41539, bytes: 4608, op:
> > HDFS_READ,
> > >>> cliID: DFSClient_hb_rs_domain13.myhost.ru
> > ,60020,1429640630559_-531755738_33,
> > >>> offset: 56052736, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508,
> blockid:
> > >>> BP-927943268-5.9.77.105-1414682145673:blk_1078106901_438,
> duration:
> > >>> 37455427
> > >>> 2015-04-22 12:46:17,821 INFO
> > >>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> > >>> 5.9.41.237:50010, dest: /5.9.67.100:58718, bytes: 5120, op:
> HDFS_READ,
> > >>> cliID: DFSClient_hb_rs_domain03.myhost.ru
> > ,60020,1426877064271_-1246826933_33,
> > >>> offset: 77630464, srvID: 659e6be2-8d98-458b-94bc-3bcdbb517508,
> blockid:
> > >>> BP-92

counter performance

2015-04-26 Thread Serega Sheypak
Hi, I'm going to count activity.
Is it a good idea to use HBase counters for this?
Could it cause hotspots?
What are the optimisations for HBase to increase counter throughput?

I've found old letter here:
http://search-hadoop.com/m/xfABgvwuvQ1/counter+increment&subj=Re+on+the+impact+of+incremental+counters

Does it make sense to increase the WAL flush interval so it is not too frequent? I
understand that I can lose data if the WAL is not flushed and the RS fails.
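
For reference, a minimal sketch of the counter API being asked about, assuming
the 0.98 client; the connection setup, table, family, qualifier and row names
are illustrative. Whether increments hotspot depends on the row-key design, not
on the API itself.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Increment;
import org.apache.hadoop.hbase.util.Bytes;

Configuration conf = HBaseConfiguration.create();
HConnection connection = HConnectionManager.createConnection(conf);
HTableInterface table = connection.getTable("activity_counters");

// Single atomic counter bump; returns the value after the increment.
long clicks = table.incrementColumnValue(
    Bytes.toBytes("user-42"), Bytes.toBytes("d"), Bytes.toBytes("clicks"), 1L);

// Several counters for the same row in one RPC.
Increment inc = new Increment(Bytes.toBytes("user-42"));
inc.addColumn(Bytes.toBytes("d"), Bytes.toBytes("clicks"), 1L);
inc.addColumn(Bytes.toBytes("d"), Bytes.toBytes("views"), 1L);
table.increment(inc);

table.close();
connection.close();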


client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
Hi, in 0.94 we could use the autoFlush method on HTable.
Now HTable shouldn't be used, so we are refactoring the code to use Table.

Here is a note:
http://hbase.apache.org/book.html#perf.hbase.client.autoflush
>When performing a lot of Puts, make sure that setAutoFlush is set to false
on your Table instance

What is the right way to set autoFlush for a Table instance? I can't find a
method/example for this.
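
For reference, the book note above refers to the pre-1.0 HTable API; a minimal
sketch assuming 0.94/0.98, where conf is an HBaseConfiguration and the table and
column names are illustrative:

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

HTable table = new HTable(conf, "mytable");
table.setAutoFlush(false);                  // buffer Puts on the client side
table.setWriteBufferSize(2 * 1024 * 1024);  // optional: 2 MB write buffer

Put put = new Put(Bytes.toBytes("row-1"));
put.add(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("value"));
table.put(put);         // queued in the client-side write buffer, not sent yet

table.flushCommits();   // sends the buffered Puts
table.close();          // also flushes anything still buffered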


Re: client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
We are using CDH 5.4; it's on the 0.98 version.

2015-05-13 16:49 GMT+03:00 Solomon Duskis :

> BufferedMutator is the preferred alternative for autoflush starting in
> HBase 1.0.  Get a connection via ConnectionFactory, then
> connection.getBufferedMutator(tableName).  It's the same functionality as
> autoflush under the covers.
>
> On Wed, May 13, 2015 at 9:41 AM, Ted Yu  wrote:
>
> > Please take a look at https://issues.apache.org/jira/browse/HBASE-12728
> >
> > Cheers
> >
> > On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak <
> serega.shey...@gmail.com>
> > wrote:
> >
> > > Hi, in 0.94 we could use autoFlush method for HTable.
> > > Now HTable shouldn't be used, we refactoring code for Table
> > >
> > > Here is a note:
> > > http://hbase.apache.org/book.html#perf.hbase.client.autoflush
> > > >When performing a lot of Puts, make sure that setAutoFlush is set to
> > false
> > > on your Table
> > > <
> >
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html
> > > >
> > >  instance
> > >
> > > What is the right way to set autoFlush for Table instance? Can't find
> > > method/example to do this?
> > >
> >
>
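
For reference, a minimal sketch of the BufferedMutator approach described above,
assuming an HBase 1.0+ client; conf and the table/column names are illustrative:

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

Connection connection = ConnectionFactory.createConnection(conf);
BufferedMutator mutator =
    connection.getBufferedMutator(TableName.valueOf("mytable"));

Put put = new Put(Bytes.toBytes("row-1"));
put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("value"));
mutator.mutate(put);   // buffered client-side, like autoFlush=false
mutator.flush();       // explicit flush, analogous to the old flushCommits()

mutator.close();
connection.close();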


Re: client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
But HTable is deprecated in 0.98 ...?

2015-05-13 17:35 GMT+03:00 Solomon Duskis :

> The docs you referenced are for 1.0.  Table and BufferedMutator were
> introduced in 1.0.  In 0.98, you should continue using HTable and
> autoflush.
>
> On Wed, May 13, 2015 at 9:57 AM, Serega Sheypak 
> wrote:
>
> > We are using CDH 5.4, it's on .0.98 version
> >
> > 2015-05-13 16:49 GMT+03:00 Solomon Duskis :
> >
> > > BufferedMutator is the preferred alternative for autoflush starting in
> > > HBase 1.0.  Get a connection via ConnectionFactory, then
> > > connection.getBufferedMutator(tableName).  It's the same functionality
> as
> > > autoflush under the covers.
> > >
> > > On Wed, May 13, 2015 at 9:41 AM, Ted Yu  wrote:
> > >
> > > > Please take a look at
> > https://issues.apache.org/jira/browse/HBASE-12728
> > > >
> > > > Cheers
> > > >
> > > > On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak <
> > > serega.shey...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi, in 0.94 we could use autoFlush method for HTable.
> > > > > Now HTable shouldn't be used, we refactoring code for Table
> > > > >
> > > > > Here is a note:
> > > > > http://hbase.apache.org/book.html#perf.hbase.client.autoflush
> > > > > >When performing a lot of Puts, make sure that setAutoFlush is set
> to
> > > > false
> > > > > on your Table
> > > > > <
> > > >
> > >
> >
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html
> > > > > >
> > > > >  instance
> > > > >
> > > > > What is the right way to set autoFlush for Table instance? Can't
> find
> > > > > method/example to do this?
> > > > >
> > > >
> > >
> >
>


Re: client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
Ok, thanks!

2015-05-13 18:14 GMT+03:00 Shahab Yunus :

> Until you move to HBase 1.*, you should use HTableInterface. And the
> autoFlush methods and semantics, as far as I understand are, same so you
> should not have problem.
>
> Regards,
> Shahab
>
> On Wed, May 13, 2015 at 11:09 AM, Serega Sheypak  >
> wrote:
>
> > But HTable is deprecated in 0.98 ...?
> >
> > 2015-05-13 17:35 GMT+03:00 Solomon Duskis :
> >
> > > The docs you referenced are for 1.0.  Table and BufferedMutator were
> > > introduced in 1.0.  In 0.98, you should continue using HTable and
> > > autoflush.
> > >
> > > On Wed, May 13, 2015 at 9:57 AM, Serega Sheypak <
> > serega.shey...@gmail.com>
> > > wrote:
> > >
> > > > We are using CDH 5.4, it's on .0.98 version
> > > >
> > > > 2015-05-13 16:49 GMT+03:00 Solomon Duskis :
> > > >
> > > > > BufferedMutator is the preferred alternative for autoflush starting
> > in
> > > > > HBase 1.0.  Get a connection via ConnectionFactory, then
> > > > > connection.getBufferedMutator(tableName).  It's the same
> > functionality
> > > as
> > > > > autoflush under the covers.
> > > > >
> > > > > On Wed, May 13, 2015 at 9:41 AM, Ted Yu 
> wrote:
> > > > >
> > > > > > Please take a look at
> > > > https://issues.apache.org/jira/browse/HBASE-12728
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak <
> > > > > serega.shey...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi, in 0.94 we could use autoFlush method for HTable.
> > > > > > > Now HTable shouldn't be used, we refactoring code for Table
> > > > > > >
> > > > > > > Here is a note:
> > > > > > > http://hbase.apache.org/book.html#perf.hbase.client.autoflush
> > > > > > > >When performing a lot of Puts, make sure that setAutoFlush is
> > set
> > > to
> > > > > > false
> > > > > > > on your Table
> > > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html
> > > > > > > >
> > > > > > >  instance
> > > > > > >
> > > > > > > What is the right way to set autoFlush for Table instance?
> Can't
> > > find
> > > > > > > method/example to do this?
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Optimizing compactions on super-low-cost HW

2015-05-18 Thread Serega Sheypak
Hi, we are using extremely cheap HW:
2 HDD 7200
4*2 core (Hyperthreading)
32GB RAM

We are hitting serious IO performance issues.
We have a more or less even distribution of read/write requests, and the same for
data size.

ServerName Request Per Second Read Request Count Write Request Count
node01.domain.com,60020,1430172017193 195 171871826 16761699
node02.domain.com,60020,1426925053570 24 34314930 16006603
node03.domain.com,60020,1430860939797 22 32054801 16913299
node04.domain.com,60020,1431975656065 33 1765121 253405
node05.domain.com,60020,1430484646409 27 42248883 16406280
node07.domain.com,60020,1426776403757 27 36324492 16299432
node08.domain.com,60020,1426775898757 26 38507165 13582109
node09.domain.com,60020,1430440612531 27 34360873 15080194
node11.domain.com,60020,1431989669340 28 44307 13466
node12.domain.com,60020,1431927604238 30 5318096 2020855
node13.domain.com,60020,1431372874221 29 31764957 15843688
node14.domain.com,60020,1429640630771 41 36300097 13049801

ServerName Num. Stores Num. Storefiles Storefile Size Uncompressed Storefile
Size Index Size Bloom Size
node01.domain.com,60020,1430172017193 82 186 1052080m 76496mb 641849k
310111k
node02.domain.com,60020,1426925053570 82 179 1062730m 79713mb 649610k
318854k
node03.domain.com,60020,1430860939797 82 179 1036597m 76199mb 627346k
307136k
node04.domain.com,60020,1431975656065 82 400 1034624m 76405mb 655954k
289316k
node05.domain.com,60020,1430484646409 82 185 807m 81474mb 688136k
334127k
node07.domain.com,60020,1426776403757 82 164 1023217m 74830mb 631774k
296169k
node08.domain.com,60020,1426775898757 81 171 1086446m 79933mb 681486k
312325k
node09.domain.com,60020,1430440612531 81 160 1073852m 77874mb 658924k
309734k
node11.domain.com,60020,1431989669340 81 166 1006322m 75652mb 664753k
264081k
node12.domain.com,60020,1431927604238 82 188 1050229m 75140mb 652970k
304137k
node13.domain.com,60020,1431372874221 82 178 937557m 70042mb 601684k 257607k
node14.domain.com,60020,1429640630771 82 145 949090m 69749mb 592812k 266677k


When compaction starts, a random node hits 100% I/O utilization and iowait lasts
for seconds, even tens of seconds.

What are the approaches to optimize minor and major compactions when you
are I/O bound..?


Re: Optimizing compactions on super-low-cost HW

2015-05-20 Thread Serega Sheypak
Hi, any input here?

2015-05-19 2:26 GMT+03:00 Serega Sheypak :

> Hi, we are using extremely cheap HW:
> 2 HHD 7200
> 4*2 core (Hyperthreading)
> 32GB RAM
>
> We met serious IO performance issues.
> We have more or less even distribution of read/write requests. The same
> for datasize.
>
> ServerName Request Per Second Read Request Count Write Request Count
> node01.domain.com,60020,1430172017193 195 171871826 16761699
> node02.domain.com,60020,1426925053570 24 34314930 16006603
> node03.domain.com,60020,1430860939797 22 32054801 16913299
> node04.domain.com,60020,1431975656065 33 1765121 253405
> node05.domain.com,60020,1430484646409 27 42248883 16406280
> node07.domain.com,60020,1426776403757 27 36324492 16299432
> node08.domain.com,60020,1426775898757 26 38507165 13582109
> node09.domain.com,60020,1430440612531 27 34360873 15080194
> node11.domain.com,60020,1431989669340 28 44307 13466
> node12.domain.com,60020,1431927604238 30 5318096 2020855
> node13.domain.com,60020,1431372874221 29 31764957 15843688
> node14.domain.com,60020,1429640630771 41 36300097 13049801
>
> ServerName Num. Stores Num. Storefiles Storefile Size Uncompressed Storefile
> Size Index Size Bloom Size
> node01.domain.com,60020,1430172017193 82 186 1052080m 76496mb 641849k
> 310111k
> node02.domain.com,60020,1426925053570 82 179 1062730m 79713mb 649610k
> 318854k
> node03.domain.com,60020,1430860939797 82 179 1036597m 76199mb 627346k
> 307136k
> node04.domain.com,60020,1431975656065 82 400 1034624m 76405mb 655954k
> 289316k
> node05.domain.com,60020,1430484646409 82 185 807m 81474mb 688136k
> 334127k
> node07.domain.com,60020,1426776403757 82 164 1023217m 74830mb 631774k
> 296169k
> node08.domain.com,60020,1426775898757 81 171 1086446m 79933mb 681486k
> 312325k
> node09.domain.com,60020,1430440612531 81 160 1073852m 77874mb 658924k
> 309734k
> node11.domain.com,60020,1431989669340 81 166 1006322m 75652mb 664753k
> 264081k
> node12.domain.com,60020,1431927604238 82 188 1050229m 75140mb 652970k
> 304137k
> node13.domain.com,60020,1431372874221 82 178 937557m 70042mb 601684k
> 257607k
> node14.domain.com,60020,1429640630771 82 145 949090m 69749mb 592812k
> 266677k
>
>
> When compaction starts  random node gets I/O 100%, io wait for seconds,
> even tenth of seconds.
>
> What are the approaches to optimize minor and major compactions when you
> are I/O bound..?
>


Re: Optimizing compactions on super-low-cost HW

2015-05-21 Thread Serega Sheypak
Hi! Thank you for trying to help.
Here are the settings. Do you need to know anything else?
> memstore
hbase.hregion.memstore.flush.size=128MB

> compaction
hbase.extendedperiod=1hour
hbase.hstore.compactionThreshold=3
hbase.hstore.blockingStoreFiles=10
hbase.hstore.compaction.max=_
hbase.hregion.majorcompaction=1day

hbase.offpeak.start.hour=1
hbase.offpeak.end.hour=5

2015-05-20 18:01 GMT+03:00 ramkrishna vasudevan <
ramkrishna.s.vasude...@gmail.com>:

> Can you specify what are the other details like the memstore size, the
> compaction related configurations?
>
> On Wed, May 20, 2015 at 8:11 PM, Serega Sheypak 
> wrote:
>
> > Hi, any input here?
> >
> > 2015-05-19 2:26 GMT+03:00 Serega Sheypak :
> >
> > > Hi, we are using extremely cheap HW:
> > > 2 HHD 7200
> > > 4*2 core (Hyperthreading)
> > > 32GB RAM
> > >
> > > We met serious IO performance issues.
> > > We have more or less even distribution of read/write requests. The same
> > > for datasize.
> > >
> > > ServerName Request Per Second Read Request Count Write Request Count
> > > node01.domain.com,60020,1430172017193 195 171871826 16761699
> > > node02.domain.com,60020,1426925053570 24 34314930 16006603
> > > node03.domain.com,60020,1430860939797 22 32054801 16913299
> > > node04.domain.com,60020,1431975656065 33 1765121 253405
> > > node05.domain.com,60020,1430484646409 27 42248883 16406280
> > > node07.domain.com,60020,1426776403757 27 36324492 16299432
> > > node08.domain.com,60020,1426775898757 26 38507165 13582109
> > > node09.domain.com,60020,1430440612531 27 34360873 15080194
> > > node11.domain.com,60020,1431989669340 28 44307 13466
> > > node12.domain.com,60020,1431927604238 30 5318096 2020855
> > > node13.domain.com,60020,1431372874221 29 31764957 15843688
> > > node14.domain.com,60020,1429640630771 41 36300097 13049801
> > >
> > > ServerName Num. Stores Num. Storefiles Storefile Size Uncompressed
> > Storefile
> > > Size Index Size Bloom Size
> > > node01.domain.com,60020,1430172017193 82 186 1052080m 76496mb 641849k
> > > 310111k
> > > node02.domain.com,60020,1426925053570 82 179 1062730m 79713mb 649610k
> > > 318854k
> > > node03.domain.com,60020,1430860939797 82 179 1036597m 76199mb 627346k
> > > 307136k
> > > node04.domain.com,60020,1431975656065 82 400 1034624m 76405mb 655954k
> > > 289316k
> > > node05.domain.com,60020,1430484646409 82 185 807m 81474mb 688136k
> > > 334127k
> > > node07.domain.com,60020,1426776403757 82 164 1023217m 74830mb 631774k
> > > 296169k
> > > node08.domain.com,60020,1426775898757 81 171 1086446m 79933mb 681486k
> > > 312325k
> > > node09.domain.com,60020,1430440612531 81 160 1073852m 77874mb 658924k
> > > 309734k
> > > node11.domain.com,60020,1431989669340 81 166 1006322m 75652mb 664753k
> > > 264081k
> > > node12.domain.com,60020,1431927604238 82 188 1050229m 75140mb 652970k
> > > 304137k
> > > node13.domain.com,60020,1431372874221 82 178 937557m 70042mb 601684k
> > > 257607k
> > > node14.domain.com,60020,1429640630771 82 145 949090m 69749mb 592812k
> > > 266677k
> > >
> > >
> > > When compaction starts  random node gets I/O 100%, io wait for seconds,
> > > even tenth of seconds.
> > >
> > > What are the approaches to optimize minor and major compactions when
> you
> > > are I/O bound..?
> > >
> >
>


Re: Optimizing compactions on super-low-cost HW

2015-05-21 Thread Serega Sheypak
> Do you have the system sharing
There are 2 HDD 7200, 2TB each. There is a 300GB OS partition on each drive
with mirroring enabled. I can't persuade devops that mirroring could cause
IO issues. What arguments can I bring? They use OS partition mirroring so that
when a disk fails, we can use the other partition to boot the OS and keep working...

>Do you have to compact? In other words, do you have read SLAs?
Unfortunately, I have mixed workload from web applications. I need to write
and read and SLA is < 50ms.

>How are your read times currently?
Cloudera manager says it's 4K reads per second and 500 writes per second

>Does your working dataset fit in RAM or do
reads have to go to disk?
I have several tables of ~500GB each and many small tables of 10-20 GB. Small
tables are loaded hourly/daily using bulk load (we prepare HFiles with MR and move
them into HBase with the bulk-load utility). Big tables are used by webapps, which
read and write them.

>It looks like you are running at about three storefiles per column family
is it hbase.hstore.compactionThreshold=3?

>What if you upped the threshold at which minors run?
you mean bump  hbase.hstore.compactionThreshold to 8 or 10?

>Do you have a downtime during which you could schedule compactions?
Unfortunately no. It should work 24/7 and sometimes it doesn't do it.

>Are you managing the major compactions yourself or are you having hbase do
it for you?
HBase, once a day hbase.hregion.majorcompaction=1day

I can disable the WAL. It's OK to lose some data in case of RS failure; I'm
not doing banking transactions.
If I disable the WAL, could it help?
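
For reference, a gentler option than disabling the WAL entirely is deferred WAL
sync (ASYNC_WAL): edits still go to the WAL but are synced asynchronously, so a
region server crash can lose only the last few edits. A minimal sketch, assuming
the 0.98 client; 'table' is an existing HTableInterface and the names are
illustrative:

import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

Put put = new Put(Bytes.toBytes("row-key"));
put.add(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("value"));
put.setDurability(Durability.ASYNC_WAL);  // per-mutation deferred sync
table.put(put);

// The same can also be set once per table, e.g. from the shell
// (syntax assumed): alter 'mytable', DURABILITY => 'ASYNC_WAL'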

2015-05-20 18:04 GMT+03:00 Stack :

> On Mon, May 18, 2015 at 4:26 PM, Serega Sheypak 
> wrote:
>
> > Hi, we are using extremely cheap HW:
> > 2 HHD 7200
> > 4*2 core (Hyperthreading)
> > 32GB RAM
> >
> > We met serious IO performance issues.
> > We have more or less even distribution of read/write requests. The same
> for
> > datasize.
> >
> > ServerName Request Per Second Read Request Count Write Request Count
> > node01.domain.com,60020,1430172017193 195 171871826 16761699
> > node02.domain.com,60020,1426925053570 24 34314930 16006603
> > node03.domain.com,60020,1430860939797 22 32054801 16913299
> > node04.domain.com,60020,1431975656065 33 1765121 253405
> > node05.domain.com,60020,1430484646409 27 42248883 16406280
> > node07.domain.com,60020,1426776403757 27 36324492 16299432
> > node08.domain.com,60020,1426775898757 26 38507165 13582109
> > node09.domain.com,60020,1430440612531 27 34360873 15080194
> > node11.domain.com,60020,1431989669340 28 44307 13466
> > node12.domain.com,60020,1431927604238 30 5318096 2020855
> > node13.domain.com,60020,1431372874221 29 31764957 15843688
> > node14.domain.com,60020,1429640630771 41 36300097 13049801
> >
> > ServerName Num. Stores Num. Storefiles Storefile Size Uncompressed
> > Storefile
> > Size Index Size Bloom Size
> > node01.domain.com,60020,1430172017193 82 186 1052080m 76496mb 641849k
> > 310111k
> > node02.domain.com,60020,1426925053570 82 179 1062730m 79713mb 649610k
> > 318854k
> > node03.domain.com,60020,1430860939797 82 179 1036597m 76199mb 627346k
> > 307136k
> > node04.domain.com,60020,1431975656065 82 400 1034624m 76405mb 655954k
> > 289316k
> > node05.domain.com,60020,1430484646409 82 185 807m 81474mb 688136k
> > 334127k
> > node07.domain.com,60020,1426776403757 82 164 1023217m 74830mb 631774k
> > 296169k
> > node08.domain.com,60020,1426775898757 81 171 1086446m 79933mb 681486k
> > 312325k
> > node09.domain.com,60020,1430440612531 81 160 1073852m 77874mb 658924k
> > 309734k
> > node11.domain.com,60020,1431989669340 81 166 1006322m 75652mb 664753k
> > 264081k
> > node12.domain.com,60020,1431927604238 82 188 1050229m 75140mb 652970k
> > 304137k
> > node13.domain.com,60020,1431372874221 82 178 937557m 70042mb 601684k
> > 257607k
> > node14.domain.com,60020,1429640630771 82 145 949090m 69749mb 592812k
> > 266677k
> >
> >
> > When compaction starts  random node gets I/O 100%, io wait for seconds,
> > even tenth of seconds.
> >
> > What are the approaches to optimize minor and major compactions when you
> > are I/O bound..?
> >
>
> Yeah, with two disks, you will be crimped. Do you have the system sharing
> with hbase/hdfs or is hdfs running on one disk only?
>
> Do you have to compact? In other words, do you have read SLAs?  How are
> your read times currently?  Does your working dataset fit in RAM or do
> reads have to go to disk?  It looks like you are running at about three
> storefiles per column family.  What if you upped the threshold at which
> minors run? Do you have a downtime during which you could schedule
> compactions? Are you managing the major compactions yourself or are you
> having hbase do it for you?
>
> St.Ack
>


Re: Getting intermittent errors while insertind data into HBase

2015-05-21 Thread Serega Sheypak
Maybe you share an HTable instance across several threads? HTable is not
thread-safe; see the sketch below.
Can you share the code:
1. HTable initialization
2. HTable.put(...)
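
A minimal sketch of the per-thread pattern, assuming the 0.98 client; the table
name is illustrative. One HConnection is shared per process, and each
thread/request gets its own short-lived table handle:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;

// Created once per process; HConnection is thread-safe and heavyweight.
Configuration conf = HBaseConfiguration.create();
HConnection connection = HConnectionManager.createConnection(conf);

// Created per request/thread; HTable instances are NOT thread-safe.
HTableInterface table = connection.getTable("sensor_data");
try {
    table.put(put);   // 'put' is built by the caller
} finally {
    table.close();    // cheap: releases the handle, keeps the connection open
}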

2015-05-21 11:57 GMT+03:00 Jithender Boreddy :

> Hi,
>
> I am inserting data from my java application into two HBase tables
> back to back. And I am running my application sequentially as part of
> stress testing. I am getting a strange error intermittently. It
> passes many times but fails with the error below a few times.
>
> Can someone point me in the right direction here by letting me know
> what is going wrong?
>
> Pasted below partial stack trace:
> Stack Trace:
> "java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:953)
> java.util.LinkedList$ListItr.remove(LinkedList.java:919)
>
> org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:319)
>
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:965)
>
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1281)
> org.apache.hadoop.hbase.client.HTable.put(HTable.java:925)
> com.autodesk.dao.SensorDataDAO.insertRecords(Unknown
> Source)
> com.autodesk.dao.SensorDataDAO.insertRecords(Unknown
> Source)
>
>
> com.autodesk.dao.SensorDataDAO$$FastClassByCGLIB$$36f4c9d9.invoke()
> net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:191)
>
>
> org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:688)
>
>
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>
>
> org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:80)
> com.autodesk.utils.aspects.TimerAspect.log(Unknown Source)
> sun.reflect.GeneratedMethodAccessor38.invoke(Unknown
> Source)
>
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> java.lang.reflect.Method.invoke(Method.java:606)
>
>
> org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621)
>
>
> org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610)
>
>
> org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:65)
>
>
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>
>
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:89)
>
>
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>
>
> org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:621)
>
>
> com.autodesk.dao.ReadingDAO$$EnhancerByCGLIB$$fa7dd7e1.insertRecords()
>
> com.autodesk.business.ReadingProcessor.createReadings(Unknown Source)
>


Re: Optimizing compactions on super-low-cost HW

2015-05-21 Thread Serega Sheypak
Hi!
Please help ^)

2015-05-21 11:04 GMT+03:00 Serega Sheypak :

> > Do you have the system sharing
> There are 2 HDD 7200 2TB each. There is 300GB OS partition on each drive
> with mirroring enabled. I can't persuade devops that mirroring could cause
> IO issues. What arguments can I bring? They use OS partition mirroring when
> disck fails, we can use other partition to boot OS and continue to work...
>
> >Do you have to compact? In other words, do you have read SLAs?
> Unfortunately, I have mixed workload from web applications. I need to
> write and read and SLA is < 50ms.
>
> >How are your read times currently?
> Cloudera manager says it's 4K reads per second and 500 writes per second
>
> >Does your working dataset fit in RAM or do
> reads have to go to disk?
> I have several tables for 500GB each and many small tables 10-20 GB. Small
> tables loaded hourly/daily using bulkload (prepare HFiles using MR and move
> them to HBase using utility). Big tables are used by webapps, they read and
> write them.
>
> >It looks like you are running at about three storefiles per column family
> is it hbase.hstore.compactionThreshold=3?
>
> >What if you upped the threshold at which minors run?
> you mean bump  hbase.hstore.compactionThreshold to 8 or 10?
>
> >Do you have a downtime during which you could schedule compactions?
> Unfortunately no. It should work 24/7 and sometimes it doesn't do it.
>
> >Are you managing the major compactions yourself or are you having hbase
> do it for you?
> HBase, once a day hbase.hregion.majorcompaction=1day
>
> I can disable WAL. It's ok to loose some data in case of RS failure. I'm
> not doing banking transactions.
> If I disable WAL, could it help?
>
> 2015-05-20 18:04 GMT+03:00 Stack :
>
>> On Mon, May 18, 2015 at 4:26 PM, Serega Sheypak > >
>> wrote:
>>
>> > Hi, we are using extremely cheap HW:
>> > 2 HHD 7200
>> > 4*2 core (Hyperthreading)
>> > 32GB RAM
>> >
>> > We met serious IO performance issues.
>> > We have more or less even distribution of read/write requests. The same
>> for
>> > datasize.
>> >
>> > ServerName Request Per Second Read Request Count Write Request Count
>> > node01.domain.com,60020,1430172017193 195 171871826 16761699
>> > node02.domain.com,60020,1426925053570 24 34314930 16006603
>> > node03.domain.com,60020,1430860939797 22 32054801 16913299
>> > node04.domain.com,60020,1431975656065 33 1765121 253405
>> > node05.domain.com,60020,1430484646409 27 42248883 16406280
>> > node07.domain.com,60020,1426776403757 27 36324492 16299432
>> > node08.domain.com,60020,1426775898757 26 38507165 13582109
>> > node09.domain.com,60020,1430440612531 27 34360873 15080194
>> > node11.domain.com,60020,1431989669340 28 44307 13466
>> > node12.domain.com,60020,1431927604238 30 5318096 2020855
>> > node13.domain.com,60020,1431372874221 29 31764957 15843688
>> > node14.domain.com,60020,1429640630771 41 36300097 13049801
>> >
>> > ServerName Num. Stores Num. Storefiles Storefile Size Uncompressed
>> > Storefile
>> > Size Index Size Bloom Size
>> > node01.domain.com,60020,1430172017193 82 186 1052080m 76496mb 641849k
>> > 310111k
>> > node02.domain.com,60020,1426925053570 82 179 1062730m 79713mb 649610k
>> > 318854k
>> > node03.domain.com,60020,1430860939797 82 179 1036597m 76199mb 627346k
>> > 307136k
>> > node04.domain.com,60020,1431975656065 82 400 1034624m 76405mb 655954k
>> > 289316k
>> > node05.domain.com,60020,1430484646409 82 185 807m 81474mb 688136k
>> > 334127k
>> > node07.domain.com,60020,1426776403757 82 164 1023217m 74830mb 631774k
>> > 296169k
>> > node08.domain.com,60020,1426775898757 81 171 1086446m 79933mb 681486k
>> > 312325k
>> > node09.domain.com,60020,1430440612531 81 160 1073852m 77874mb 658924k
>> > 309734k
>> > node11.domain.com,60020,1431989669340 81 166 1006322m 75652mb 664753k
>> > 264081k
>> > node12.domain.com,60020,1431927604238 82 188 1050229m 75140mb 652970k
>> > 304137k
>> > node13.domain.com,60020,1431372874221 82 178 937557m 70042mb 601684k
>> > 257607k
>> > node14.domain.com,60020,1429640630771 82 145 949090m 69749mb 592812k
>> > 266677k
>> >
>> >
>> > When compaction starts  random node gets I/O 100%, io wait for seconds,
>> > even tenth of seconds.
>> >
>> > What are the approaches to optimize minor and major compactions when you
>> > are I/O bound..?
>> >
>>
>> Yeah, with two disks, you will be crimped. Do you have the system sharing
>> with hbase/hdfs or is hdfs running on one disk only?
>>
>> Do you have to compact? In other words, do you have read SLAs?  How are
>> your read times currently?  Does your working dataset fit in RAM or do
>> reads have to go to disk?  It looks like you are running at about three
>> storefiles per column family.  What if you upped the threshold at which
>> minors run? Do you have a downtime during which you could schedule
>> compactions? Are you managing the major compactions yourself or are you
>> having hbase do it for you?
>>
>> St.Ack
>>
>
>


Re: Optimizing compactions on super-low-cost HW

2015-05-22 Thread Serega Sheypak
>What version of hbase are you on?
We are on CDH 5.2.1 HBase 0.98


>These hfiles are created on same cluster with MR? (i.e. they are using up
i/os)
The same cluster :) They are created during the night, and we get IO degradation
when no MR is running. I understand that MR also adds significant IO pressure.

>Can you cache more?
I don't understand, can you explain? The row cache is enabled for all tables the
apps read.

>Can you make it so files are bigger before you flush?
How can I achieve that? Increase the memstore size?

>the traffic is not so heavy?
During the night it is 3-4 times lower. I run major compactions at night.

>You realize that a major compaction will do full rewrite of your dataset?
I do

> When they run, how many storefiles are there?
How can I measure that? Go to HDFS and count the files in the table directory?

Do you have to run once a day?  Can you not run once a week?
Maybe if there is no significant read performance penalty

> Enable deferring sync'ing first
Will try...

2015-05-21 23:04 GMT+03:00 Stack :

> On Thu, May 21, 2015 at 1:04 AM, Serega Sheypak 
> wrote:
>
> > > Do you have the system sharing
> > There are 2 HDD 7200 2TB each. There is 300GB OS partition on each drive
> > with mirroring enabled. I can't persuade devops that mirroring could
> cause
> > IO issues. What arguments can I bring? They use OS partition mirroring
> when
> > disck fails, we can use other partition to boot OS and continue to
> work...
> >
> >
> You are already compromised i/o-wise having two disks only. I have not the
> experience to say for sure but basic physics would seem to dictate that
> having your two disks (partially) mirrored compromises your i/o even more.
>
> You are in a bit of a hard place. Your operators want the machine to boot
> even after it loses 50% of its disk.
>
>
> > >Do you have to compact? In other words, do you have read SLAs?
> > Unfortunately, I have mixed workload from web applications. I need to
> write
> > and read and SLA is < 50ms.
> >
> >
> Ok. You get the bit that seeks are about 10ms or each so with two disks you
> can do 2x100 seeks a second presuming no one else is using disk.
>
>
> > >How are your read times currently?
> > Cloudera manager says it's 4K reads per second and 500 writes per second
> >
> > >Does your working dataset fit in RAM or do
> > reads have to go to disk?
> > I have several tables for 500GB each and many small tables 10-20 GB.
> Small
> > tables loaded hourly/daily using bulkload (prepare HFiles using MR and
> move
> > them to HBase using utility). Big tables are used by webapps, they read
> and
> > write them.
> >
> >
> These hfiles are created on same cluster with MR? (i.e. they are using up
> i/os)
>
>
> > >It looks like you are running at about three storefiles per column
> family
> > is it hbase.hstore.compactionThreshold=3?
> >
>
>
> > >What if you upped the threshold at which minors run?
> > you mean bump  hbase.hstore.compactionThreshold to 8 or 10?
> >
> >
> Yes.
>
> Downside is that your reads may require more seeks to find a keyvalue.
>
> Can you cache more?
>
> Can you make it so files are bigger before you flush?
>
>
>
> > >Do you have a downtime during which you could schedule compactions?
> > Unfortunately no. It should work 24/7 and sometimes it doesn't do it.
> >
> >
> So, it is running at full bore 24/7?  There is no 'downtime'... a time when
> the traffic is not so heavy?
>
>
>
> > >Are you managing the major compactions yourself or are you having hbase
> do
> > it for you?
> > HBase, once a day hbase.hregion.majorcompaction=1day
> >
> >
> Have you studied your compactions?  You realize that a major compaction
> will do full rewrite of your dataset?  When they run, how many storefiles
> are there?
>
> Do you have to run once a day?  Can you not run once a week?  Can you
> manage the compactions yourself... and run them a region at a time in a
> rolling manner across the cluster rather than have them just run whenever
> it suits them once a day?
>
>
>
> > I can disable WAL. It's ok to loose some data in case of RS failure. I'm
> > not doing banking transactions.
> > If I disable WAL, could it help?
> >
> >
> It could but don't. Enable deferring sync'ing first if you can 'lose' some
> data.
>
> Work on your flushing and compactions before you mess w/ WAL.
>
> What version of hbase are you on? You say CDH but the newer your hbase, the
> better it does generally.
>
> St.Ack
>
>
>

Re: Optimizing compactions on super-low-cost HW

2015-05-22 Thread Serega Sheypak
We don't have the money; these nodes are the cheapest. I totally agree that we
need 4-6 HDDs, but unfortunately there is no chance to get them.
Okay, I'll try to apply Stack's suggestions.

2015-05-22 13:00 GMT+03:00 Michael Segel :

> Look, to be blunt, you’re screwed.
>
> If I read your cluster spec.. it sounds like you have a single i7 (quad
> core) cpu. That’s 4 cores or 8 threads.
>
> Mirroring the OS is common practice.
> Using the same drives for Hadoop… not so good, but once the sever boots
> up… not so much I/O.
> Its not good, but you could live with it….
>
> Your best bet is to add a couple of more spindles. Ideally you’d want to
> have 6 drives. the 2 OS drives mirrored and separate. (Use the extra space
> to stash / write logs.) Then have 4 drives / spindles in JBOD for Hadoop.
> This brings you to a 1:1 on physical cores.  If your box can handle more
> spindles, then going to a total of 10 drives would improve performance
> further.
>
> However, you need to level set your expectations… you can only go so far.
> If you have 4 drives spinning,  you could start to saturate a 1GbE network
> so that will hurt performance.
>
> That’s pretty much your only option in terms of fixing the hardware and
> then you have to start tuning.
>
> > On May 21, 2015, at 4:04 PM, Stack  wrote:
> >
> > On Thu, May 21, 2015 at 1:04 AM, Serega Sheypak <
> serega.shey...@gmail.com>
> > wrote:
> >
> >>> Do you have the system sharing
> >> There are 2 HDD 7200 2TB each. There is 300GB OS partition on each drive
> >> with mirroring enabled. I can't persuade devops that mirroring could
> cause
> >> IO issues. What arguments can I bring? They use OS partition mirroring
> when
> >> disck fails, we can use other partition to boot OS and continue to
> work...
> >>
> >>
> > You are already compromised i/o-wise having two disks only. I have not
> the
> > experience to say for sure but basic physics would seem to dictate that
> > having your two disks (partially) mirrored compromises your i/o even
> more.
> >
> > You are in a bit of a hard place. Your operators want the machine to boot
> > even after it loses 50% of its disk.
> >
> >
> >>> Do you have to compact? In other words, do you have read SLAs?
> >> Unfortunately, I have mixed workload from web applications. I need to
> write
> >> and read and SLA is < 50ms.
> >>
> >>
> > Ok. You get the bit that seeks are about 10ms or each so with two disks
> you
> > can do 2x100 seeks a second presuming no one else is using disk.
> >
> >
> >>> How are your read times currently?
> >> Cloudera manager says it's 4K reads per second and 500 writes per second
> >>
> >>> Does your working dataset fit in RAM or do
> >> reads have to go to disk?
> >> I have several tables for 500GB each and many small tables 10-20 GB.
> Small
> >> tables loaded hourly/daily using bulkload (prepare HFiles using MR and
> move
> >> them to HBase using utility). Big tables are used by webapps, they read
> and
> >> write them.
> >>
> >>
> > These hfiles are created on same cluster with MR? (i.e. they are using up
> > i/os)
> >
> >
> >>> It looks like you are running at about three storefiles per column
> family
> >> is it hbase.hstore.compactionThreshold=3?
> >>
> >
> >
> >>> What if you upped the threshold at which minors run?
> >> you mean bump  hbase.hstore.compactionThreshold to 8 or 10?
> >>
> >>
> > Yes.
> >
> > Downside is that your reads may require more seeks to find a keyvalue.
> >
> > Can you cache more?
> >
> > Can you make it so files are bigger before you flush?
> >
> >
> >
> >>> Do you have a downtime during which you could schedule compactions?
> >> Unfortunately no. It should work 24/7 and sometimes it doesn't do it.
> >>
> >>
> > So, it is running at full bore 24/7?  There is no 'downtime'... a time
> when
> > the traffic is not so heavy?
> >
> >
> >
> >>> Are you managing the major compactions yourself or are you having
> hbase do
> >> it for you?
> >> HBase, once a day hbase.hregion.majorcompaction=1day
> >>
> >>
> > Have you studied your compactions?  You realize that a major compaction
> > will do full rewrite of your dataset?  When they run, how many storefiles
> > are there?
> >
> > Do you have to run once a day?  Can you 

Re: Optimizing compactions on super-low-cost HW

2015-05-24 Thread Serega Sheypak
Hi, thanks!
> hbase.hstore.blockingStoreFiles
I don't understand the idea of this setting; where can I find an explanation for
"dummies"?

>hbase.hregion.majorcompaction
done already

>DATA_BLOCK_ENCODING, SNAPPY
I always use it by default, CPU OK

> memstore flush size
done


>I assume only the 300g partitions are mirrored, right? (not the entire 2t
drive)
Aha

>Can you add more machines?
Will do it when we earn the money.
Thank you :)

2015-05-24 21:42 GMT+03:00 lars hofhansl :

> Yeah, all you can do is drive your write amplification down.
>
>
> As Stack said:
> - Increase hbase.hstore.compactionThreshold, and
> hbase.hstore.blockingStoreFiles. It'll hurt read, but in your case read is
> already significantly hurt when compactions happen.
>
>
> - Absolutely set hbase.hregion.majorcompaction to 1 week (with a jitter of
> 1/2 week, that's the default in 0.98 and later). Minor compaction will
> still happen, based on the compactionThreshold setting. Right now you're
> rewriting _all_ you data _every_ day.
>
>
> - Turning off WAL writing will save you IO, but I doubt it'll help much. I
> do not expect async WAL helps a lot as the aggregate IO is still the same.
>
> - See if you can enable DATA_BLOCK_ENCODING on your column families
> (FAST_DIFF, or PREFIX are good). You can also try SNAPPY compression. That
> would reduce you overall IO (Since your CPUs are also weak you'd have to
> test the CPU/IO tradeoff)
>
>
> - If you have RAM to spare, increase the memstore flush size (will lead to
> initially larger and fewer files).
>
>
> - Or (again if you have spare RAM) make your regions smaller, to curb
> write amplification.
>
>
> - I assume only the 300g partitions are mirrored, right? (not the entire
> 2t drive)
>
>
> I have some suggestions compiled here (if you don't mind the plug):
> http://hadoop-hbase.blogspot.com/2015/05/my-hbasecon-talk-about-hbase.html
>
> Other than that, I'll repeat what others said, you have 14 extremely weak
> machines, you can't expect the world from this.
> Your aggregate IOPS are less than 3000, your aggregate IO bandwidth is
> ~3GB/s. Can you add more machines?
>
>
> -- Lars
>
> 
> From: Serega Sheypak 
> To: user 
> Sent: Friday, May 22, 2015 3:45 AM
> Subject: Re: Optimizing compactions on super-low-cost HW
>
>
> We don't have money, these nodes are the cheapest. I totally agree that we
> need 4-6 HDD, but there is no chance to get it unfortunately.
> Okay, I'll try to apply Stack's suggestions.
>
>
>
>
> 2015-05-22 13:00 GMT+03:00 Michael Segel :
>
> > Look, to be blunt, you’re screwed.
> >
> > If I read your cluster spec.. it sounds like you have a single i7 (quad
> > core) cpu. That’s 4 cores or 8 threads.
> >
> > Mirroring the OS is common practice.
> > Using the same drives for Hadoop… not so good, but once the server boots
> > up… not so much I/O.
> > It's not good, but you could live with it….
> >
> > Your best bet is to add a couple of more spindles. Ideally you’d want to
> > have 6 drives: the 2 OS drives mirrored and separate. (Use the extra
> space
> > to stash / write logs.) Then have 4 drives / spindles in JBOD for Hadoop.
> > This brings you to a 1:1 on physical cores.  If your box can handle more
> > spindles, then going to a total of 10 drives would improve performance
> > further.
> >
> > However, you need to level set your expectations… you can only go so far.
> > If you have 4 drives spinning,  you could start to saturate a 1GbE
> network
> > so that will hurt performance.
> >
> > That’s pretty much your only option in terms of fixing the hardware and
> > then you have to start tuning.
> >
> > > On May 21, 2015, at 4:04 PM, Stack  wrote:
> > >
> > > On Thu, May 21, 2015 at 1:04 AM, Serega Sheypak <
> > serega.shey...@gmail.com>
> > > wrote:
> > >
> > >>> Do you have the system sharing
> > >> There are 2 HDD 7200 2TB each. There is 300GB OS partition on each
> drive
> > >> with mirroring enabled. I can't persuade devops that mirroring could
> > >> cause IO issues. What arguments can I bring? They use OS partition
> > >> mirroring so that when a disk fails, we can use the other partition to
> > >> boot the OS and continue to work...
> > >>
> > >>
> > > You are already compromised i/o-wise having two disks only. I have not
> > the
> > > experience to say for sure but basic physics would seem to dictate that
> > > having your two disks (par

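To make the knobs discussed in this thread concrete, here is a minimal sketch of how they
could be set programmatically (the numeric values are purely illustrative, not
recommendations; on a real cluster these settings normally go into hbase-site.xml and need
a region server restart):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CompactionTuningSketch {
    public static Configuration tuned() {
        Configuration conf = HBaseConfiguration.create();
        // Let more flushed files accumulate before a minor compaction kicks in.
        conf.setInt("hbase.hstore.compactionThreshold", 8);
        // Raise the point at which writes are blocked because of too many store files.
        conf.setInt("hbase.hstore.blockingStoreFiles", 20);
        // Major compaction once a week instead of once a day (value is in milliseconds),
        // spread out with the default 0.5 jitter.
        conf.setLong("hbase.hregion.majorcompaction", 7L * 24 * 60 * 60 * 1000);
        conf.setFloat("hbase.hregion.majorcompaction.jitter", 0.5f);
        // Larger memstore flushes -> fewer, bigger HFiles (value is in bytes).
        conf.setLong("hbase.hregion.memstore.flush.size", 256L * 1024 * 1024);
        return conf;
    }
}
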
Re: Optimizing compactions on super-low-cost HW

2015-05-25 Thread Serega Sheypak
Ok, got it. Thank you.

2015-05-25 7:58 GMT+03:00 lars hofhansl :

> Re: blockingStoreFiles
> With LSM stores you do not get smooth behavior when you continuously try
> to pump more data into the cluster than the system can absorb.
> For a while the memstores can absorb the writes in RAM, then they need to
> flush. If compactions cannot keep up with the influx of new HFiles, you
> have two choices: (1) you allow the number of HFiles to grow at the
> expense of read performance, or (2) you tell the clients to slow down
> (there are various levels of sophistication about how you do that, but
> that's beside the point).
> blockingStoreFiles is the maximum number of files (per store, i.e. per
> column family) that HBase will allow to accumulate before it stops
> accepting writes from the clients. In 0.94 it would simply block for a
> while. In 0.98 it throws an exception back to the client to tell it to back
> off.
> -- Lars
>
>  From: Serega Sheypak 
>  To: user ; lars hofhansl 
>  Sent: Sunday, May 24, 2015 12:59 PM
>  Subject: Re: Optimizing compactions on super-low-cost HW
>
> Hi, thanks!
> > hbase.hstore.blockingStoreFiles
> I don't understand the idea of this setting; where can I find an explanation
> for "dummies"?
>
> >hbase.hregion.majorcompaction
> done already
>
> >DATA_BLOCK_ENCODING, SNAPPY
> I always use it by default, CPU OK
>
> > memstore flush size
> done
>
>
> >I assume only the 300g partitions are mirrored, right? (not the entire 2t
> drive)
> Aha
>
> >Can you add more machines?
> Will do it when we earn money.
> Thank you :)
>
>
>
> 2015-05-24 21:42 GMT+03:00 lars hofhansl :
>
> > Yeah, all you can do is drive your write amplification down.
> >
> >
> > As Stack said:
> > - Increase hbase.hstore.compactionThreshold, and
> > hbase.hstore.blockingStoreFiles. It'll hurt read, but in your case read
> is
> > already significantly hurt when compactions happen.
> >
> >
> > - Absolutely set hbase.hregion.majorcompaction to 1 week (with a jitter
> > of 1/2 week; that's the default in 0.98 and later). Minor compaction will
> > still happen, based on the compactionThreshold setting. Right now you're
> > rewriting _all_ your data _every_ day.
> >
> >
> > - Turning off WAL writing will save you IO, but I doubt it'll help much.
> I
> > do not expect async WAL helps a lot as the aggregate IO is still the
> same.
> >
> > - See if you can enable DATA_BLOCK_ENCODING on your column families
> > (FAST_DIFF, or PREFIX are good). You can also try SNAPPY compression.
> That
> > would reduce your overall IO (since your CPUs are also weak you'd have to
> > test the CPU/IO tradeoff)
> >
> >
> > - If you have RAM to spare, increase the memstore flush size (will lead
> to
> > initially larger and fewer files).
> >
> >
> > - Or (again if you have spare RAM) make your regions smaller, to curb
> > write amplification.
> >
> >
> > - I assume only the 300g partitions are mirrored, right? (not the entire
> > 2t drive)
> >
> >
> > I have some suggestions compiled here (if you don't mind the plug):
> >
> http://hadoop-hbase.blogspot.com/2015/05/my-hbasecon-talk-about-hbase.html
> >
> > Other than that, I'll repeat what others said, you have 14 extremely weak
> > machines, you can't expect the world from this.
> > Your aggregate IOPS are less than 3000, your aggregate IO bandwidth is
> > ~3GB/s. Can you add more machines?
> >
> >
> > -- Lars
> >
> > 
> > From: Serega Sheypak 
> > To: user 
> > Sent: Friday, May 22, 2015 3:45 AM
> > Subject: Re: Optimizing compactions on super-low-cost HW
> >
> >
> > We don't have money, these nodes are the cheapest. I totally agree that
> we
> > need 4-6 HDD, but there is no chance to get it unfortunately.
> > Okay, I'll try to apply Stack's suggestions.
> >
> >
> >
> >
> > 2015-05-22 13:00 GMT+03:00 Michael Segel :
> >
> > > Look, to be blunt, you’re screwed.
> > >
> > > If I read your cluster spec.. it sounds like you have a single i7 (quad
> > > core) cpu. That’s 4 cores or 8 threads.
> > >
> > > Mirroring the OS is common practice.
> > > Using the same drives for Hadoop… not so good, but once the server boots
> > > up… not so much I/O.
> > > It's not good, but you could live with it….
> > >
> > > Your best bet is to add a couple of more spindles

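DATA_BLOCK_ENCODING and compression from lars' list are per-column-family settings. A
minimal sketch of applying FAST_DIFF plus SNAPPY with the admin API, assuming a
hypothetical table and family name and that the SNAPPY codec is available on the region
servers; existing HFiles only pick the change up once they are rewritten by compaction:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.io.compress.Compression;
import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;

public class FamilyEncodingSketch {
    public static void main(String[] args) throws Exception {
        TableName table = TableName.valueOf("big_table");      // hypothetical
        HColumnDescriptor family = new HColumnDescriptor("d"); // hypothetical
        family.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
        family.setCompressionType(Compression.Algorithm.SNAPPY);
        try (Connection connection =
                 ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = connection.getAdmin()) {
            // For an existing table: modify the family, then let (or trigger a)
            // major compaction rewrite the HFiles with the new encoding/compression.
            admin.modifyColumn(table, family);
        }
    }
}
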
Re: Hbase vs Cassandra

2015-05-30 Thread Serega Sheypak
http://blog.parsely.com/post/1928/cass/
Here is a cool blog post. I've used HBase for years and once had a project
with Cassandra. An overcomplicated system with bugs declared as features.
Really, there is no reason to use Cassandra.
Describe your task and I can tell you how to solve it using HBase.

On Friday, May 29, 2015, Ajay wrote:

> Hi,
>
> I need some info on Hbase vs Cassandra as a data store (in general plus
> specific to time series data).
>
> The comparison in the following helps:
> 1: features
> 2: deployment and monitoring
> 3: performance
> 4: anything else
>
> Thanks
> Ajay
>


Re: Hbase vs Cassandra

2015-05-30 Thread Serega Sheypak
1. No killer features compared to HBase.
2. Terrible!!! Ambari/Cloudera Manager rule. Netflix has its own tool for
Cassandra, but it doesn't support vnodes.
3. Rumors say it's fast when it works ;) the reason: it can silently drop data
you try to write.
4. Time series is a nightmare. The easiest approach is to just replicate data
to HDFS, partition it by hour/day and run Spark/Scalding/Pig/Hive/Impala.

On Friday, May 29, 2015, Ajay wrote:

> Hi,
>
> I need some info on Hbase vs Cassandra as a data store (in general plus
> specific to time series data).
>
> The comparison in the following helps:
> 1: features
> 2: deployment and monitoring
> 3: performance
> 4: anything else
>
> Thanks
> Ajay
>


Re: Hbase vs Cassandra

2015-05-30 Thread Serega Sheypak
You can use Cassandra without the DataStax distro. Apache Cassandra is open source.

On Saturday, May 30, 2015, jongchul seon wrote:

> I have not tried Cassandra, because it is not fully open source... I
> personally prefer HBase, which always shows the expected results for my code.
>
>
> 2015-05-30 11:40 GMT+00:00 Serega Sheypak  >:
>
> > 1. No killer features compared to HBase.
> > 2. Terrible!!! Ambari/Cloudera Manager rule. Netflix has its own tool for
> > Cassandra, but it doesn't support vnodes.
> > 3. Rumors say it's fast when it works ;) the reason: it can silently drop
> > data you try to write.
> > 4. Time series is a nightmare. The easiest approach is to just replicate data
> > to HDFS, partition it by hour/day and run Spark/Scalding/Pig/Hive/Impala.
> >
> > On Friday, May 29, 2015, Ajay wrote:
> >
> > > Hi,
> > >
> > > I need some info on Hbase vs Cassandra as a data store (in general plus
> > > specific to time series data).
> > >
> > > The comparison in the following helps:
> > > 1: features
> > > 2: deployment and monitoring
> > > 3: performance
> > > 4: anything else
> > >
> > > Thanks
> > > Ajay
> > >
> >
>


Can't run hbase testing util cluster using CDH 5.4.4

2015-07-19 Thread Serega Sheypak
Hi, I bumped my testing stuff to CDH 5.4.4 and got a failure while running
tests. Here is a log:

2015-07-19 23:40:18,607 INFO  [SERGEYs-MBP:51977.activeMasterManager]
master.TableNamespaceManager (TableNamespaceManager.java:start(85)) -
Namespace table not found. Creating...

2015-07-19 23:40:18,620 INFO  [ProcessThread(sid:0 cport:-1):]
server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest(645)) - Got
user-level KeeperException when processing sessionid:0x14ea8429017
type:create cxid:0x1ac zxid:0x35 txntype:-1 reqpath:n/a Error
Path:/hbase/table-lock/hbase:namespace Error:KeeperErrorCode = NoNode for
/hbase/table-lock/hbase:namespace

2015-07-19 23:40:18,630 INFO  [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:51977-0]
handler.CreateTableHandler (CreateTableHandler.java:process(189)) - Create
table hbase:namespace

2015-07-19 23:40:18,644 INFO  [RegionOpenAndInitThread-hbase:namespace-1]
regionserver.HRegion (HRegion.java:createHRegion(5598)) - creating HRegion
hbase:namespace HTD == 'hbase:namespace', {NAME => 'info', BLOOMFILTER =>
'ROW', VERSIONS => '10', IN_MEMORY => 'true', KEEP_DELETED_CELLS =>
'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION =>
'NONE', CACHE_DATA_IN_L1 => 'true', MIN_VERSIONS => '0', BLOCKCACHE =>
'true', BLOCKSIZE => '8192', REPLICATION_SCOPE => '0'} RootDir =
file:/Users/ssa/devel/myown/hadoop/mini-hdfs-cluster-maven-plugin/target/hbase-root/.tmp
Table name == hbase:namespace

2015-07-19 23:40:18,655 INFO  [RegionOpenAndInitThread-hbase:namespace-1]
regionserver.HRegion (HRegion.java:doClose(1425)) - Closed
hbase:namespace,,1437342018607.9c2e23572f970747f86ec499b89c281b.

2015-07-19 23:40:18,702 INFO  [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:51977-0]
hbase.MetaTableAccessor (MetaTableAccessor.java:addRegionsToMeta(1169)) -
Added 2

2015-07-19 23:40:18,704 WARN  [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:51977-0]
zookeeper.ZKTableStateManager (ZKTableStateManager.java:setTableState(100))
- Moving table hbase:namespace state from ENABLING to ENABLED

2015-07-19 23:40:18,706 INFO  [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:51977-0]
handler.CreateTableHandler (CreateTableHandler.java:completed(219)) -
Table, hbase:namespace, creation successful

Process Thread Dump: Thread dump because: Master not initialized after
20ms seconds

186 active threads

Thread 255 (MASTER_TABLE_OPERATIONS-SERGEYs-MBP:51977-0):

  State: WAITING

  Blocked count: 11

  Waited count: 11

  Waiting on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@20b5f2ac

  Stack:

sun.misc.Unsafe.park(Native Method)

java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)


java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)


java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)


java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)


java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)


java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

java.lang.Thread.run(Thread.java:745)

Thread 249 (CatalogJanitor-SERGEYs-MBP:51977):





2015-07-19 23:43:33,188 ERROR [main] hbase.MiniHBaseCluster
(MiniHBaseCluster.java:init(229)) - Error starting cluster

java.lang.RuntimeException: Master not initialized after 20ms seconds

at
org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:225)

at
org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:436)

at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:224)

at org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:93)

at org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:80)

 at
org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:67)


fails here: MINI_HBASE_CLUSTER = new MiniHBaseCluster(configuration, 1);


What could it be? I have no idea.

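For comparison, the usual way to stand up the same thing is through HBaseTestingUtility,
which also manages the mini DFS and ZooKeeper clusters. A minimal sketch (this is not the
plugin code from this thread, just the standard test-utility path):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HBaseTestingUtility;

public class MiniClusterSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseTestingUtility util = new HBaseTestingUtility(conf);
        // Starts a mini ZooKeeper, a mini DFS and one master + region server.
        util.startMiniCluster(1);
        try {
            // ... create tables, run the code under test ...
        } finally {
            util.shutdownMiniCluster();
        }
    }
}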

Re: Can't run hbase testing util cluster using CDH 5.4.4

2015-07-20 Thread Serega Sheypak
@Sean, thanks. I've seen Cloudera guys help here sometimes. I also used the
Cloudera community forum.

@Jean-Marc, nothing special, just a maven-plugin wrapper around
MiniHBaseCluster.
Here is the code where the failure happens: new MiniHBaseCluster(configuration,
1);

There is nothing special on my side. I'm surprised; it has always worked
since CDH 4.4: just bump dependency versions, fix the code to follow API
changes, and that's all.

2015-07-20 4:08 GMT+02:00 Jean-Marc Spaggiari :

> Hi Serega,
>
> What kind of tests are you trying to run? The HBase test suite? Or
> something you developed yourself?
>
> JM
>
> 2015-07-19 17:47 GMT-04:00 Serega Sheypak :
>
> > Hi, bumped my testing stuff to CDH 5.4.4 and got failure while running
> > tests. Here is a log
> >
> > 2015-07-19 23:40:18,607 INFO  [SERGEYs-MBP:51977.activeMasterManager]
> > master.TableNamespaceManager (TableNamespaceManager.java:start(85)) -
> > Namespace table not found. Creating...
> >
> > 2015-07-19 23:40:18,620 INFO  [ProcessThread(sid:0 cport:-1):]
> > server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest(645)) -
> Got
> > user-level KeeperException when processing sessionid:0x14ea8429017
> > type:create cxid:0x1ac zxid:0x35 txntype:-1 reqpath:n/a Error
> > Path:/hbase/table-lock/hbase:namespace Error:KeeperErrorCode = NoNode for
> > /hbase/table-lock/hbase:namespace
> >
> > 2015-07-19 23:40:18,630 INFO
> [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:51977-0]
> > handler.CreateTableHandler (CreateTableHandler.java:process(189)) -
> Create
> > table hbase:namespace
> >
> > 2015-07-19 23:40:18,644 INFO  [RegionOpenAndInitThread-hbase:namespace-1]
> > regionserver.HRegion (HRegion.java:createHRegion(5598)) - creating
> HRegion
> > hbase:namespace HTD == 'hbase:namespace', {NAME => 'info', BLOOMFILTER =>
> > 'ROW', VERSIONS => '10', IN_MEMORY => 'true', KEEP_DELETED_CELLS =>
> > 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION =>
> > 'NONE', CACHE_DATA_IN_L1 => 'true', MIN_VERSIONS => '0', BLOCKCACHE =>
> > 'true', BLOCKSIZE => '8192', REPLICATION_SCOPE => '0'} RootDir =
> >
> >
> file:/Users/ssa/devel/myown/hadoop/mini-hdfs-cluster-maven-plugin/target/hbase-root/.tmp
> > Table name == hbase:namespace
> >
> > 2015-07-19 23:40:18,655 INFO  [RegionOpenAndInitThread-hbase:namespace-1]
> > regionserver.HRegion (HRegion.java:doClose(1425)) - Closed
> > hbase:namespace,,1437342018607.9c2e23572f970747f86ec499b89c281b.
> >
> > 2015-07-19 23:40:18,702 INFO
> [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:51977-0]
> > hbase.MetaTableAccessor (MetaTableAccessor.java:addRegionsToMeta(1169)) -
> > Added 2
> >
> > 2015-07-19 23:40:18,704 WARN
> [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:51977-0]
> > zookeeper.ZKTableStateManager
> (ZKTableStateManager.java:setTableState(100))
> > - Moving table hbase:namespace state from ENABLING to ENABLED
> >
> > 2015-07-19 23:40:18,706 INFO
> [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:51977-0]
> > handler.CreateTableHandler (CreateTableHandler.java:completed(219)) -
> > Table, hbase:namespace, creation successful
> >
> > Process Thread Dump: Thread dump because: Master not initialized after
> > 20ms seconds
> >
> > 186 active threads
> >
> > Thread 255 (MASTER_TABLE_OPERATIONS-SERGEYs-MBP:51977-0):
> >
> >   State: WAITING
> >
> >   Blocked count: 11
> >
> >   Waited count: 11
> >
> >   Waiting on
> >
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@20b5f2ac
> >
> >   Stack:
> >
> > sun.misc.Unsafe.park(Native Method)
> >
> > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> >
> >
> >
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> >
> >
> >
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> >
> >
> >
> >
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
> >
> >
> >
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
> >
> >
> >
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >
> > java.lang.Thread.run(Thread.java:745)
> >
> > Thread 249 (CatalogJanitor-SERGEYs-MBP:51977):
> >

Re: Can't run hbase testing util cluster using CDH 5.4.4

2015-07-20 Thread Serega Sheypak
I see these lines:

2015-07-20 21:27:21,791 INFO  [RegionOpenAndInitThread-hbase:namespace-1]
regionserver.HRegion (HRegion.java:createHRegion(5598)) - creating HRegion
hbase:namespace HTD == 'hbase:namespace', {NAME => 'info', BLOOMFILTER =>
'ROW', VERSIONS => '10', IN_MEMORY => 'true', KEEP_DELETED_CELLS =>
'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION =>
'NONE', CACHE_DATA_IN_L1 => 'true', MIN_VERSIONS => '0', BLOCKCACHE =>
'true', BLOCKSIZE => '8192', REPLICATION_SCOPE => '0'} RootDir =
file:/Users/ssa/devel/myown/hadoop/mini-hdfs-cluster-maven-plugin/target/hbase-root/.tmp
Table name == hbase:namespace

2015-07-20 21:27:21,802 INFO  [RegionOpenAndInitThread-hbase:namespace-1]
regionserver.HRegion (HRegion.java:doClose(1425)) - Closed
hbase:namespace,,1437420441756.a69bbbcaf4a82786964da8d9cc62bea3.

2015-07-20 21:27:21,861 INFO  [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:61781-0]
hbase.MetaTableAccessor (MetaTableAccessor.java:addRegionsToMeta(1169)) -
Added 2

2015-07-20 21:27:21,864 WARN  [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:61781-0]
zookeeper.ZKTableStateManager (ZKTableStateManager.java:setTableState(100))
- Moving table hbase:namespace state from ENABLING to ENABLED

2015-07-20 21:27:21,866 INFO  [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:61781-0]
handler.CreateTableHandler (CreateTableHandler.java:completed(219)) -
Table, hbase:namespace, creation successful


Then it hangs and prints a thread dump a few minutes later.


2015-07-20 21:25 GMT+02:00 Esteban Gutierrez :

>
> But do you see that thread printing anything in the logs?
>
> --
> Cloudera, Inc.
>
>
> On Mon, Jul 20, 2015 at 12:07 PM, Serega Sheypak  > wrote:
>
>> This is the testing utility; it has a few bytes of data to load. Running on
>> Oracle JDK 8.
>>
>> java.lang.Thread.run(Thread.java:745)
>> Thread 178 (JvmPauseMonitor):
>>   State: TIMED_WAITING
>>   Blocked count: 3
>>   Waited count: 398
>>   Stack:
>> java.lang.Thread.sleep(Native Method)
>>
>> org.apache.hadoop.hbase.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:159)
>> java.lang.Thread.run(Thread.java:745)
>>
>> Thread 177 (JvmPauseMonitor):
>>   State: TIMED_WAITING
>>   Blocked count: 1
>>   Waited count: 398
>>   Stack:
>> java.lang.Thread.sleep(Native Method)
>>
>> org.apache.hadoop.hbase.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:159)
>> java.lang.Thread.run(Thread.java:745)
>>
>> 2015-07-20 19:30 GMT+02:00 Esteban Gutierrez :
>>
>>> -user@hbase
>>>
>>> Hi Serega,
>>>
>>> The RuntimeException is pointing to a timeout of nearly 3 min. Have you tried
>>> to find in the master log what is causing that 3-min pause? Do you see
>>> any log line related to the JvmPauseMonitor? (perhaps some GC going on)
>>>
>>> thanks,
>>> esteban.
>>>
>>> --
>>> Cloudera, Inc.
>>>
>>>
>>> On Mon, Jul 20, 2015 at 12:28 AM, Serega Sheypak <
>>> serega.shey...@gmail.com> wrote:
>>>
>>>> @Sean, thanks. I saw sometimes Cloudera guys help here. I also used
>>>> Cloudera community forum.
>>>>
>>>> @Jean-Marc, nothing special, just maven-plugin wrapper around
>>>> miniHbaseCluster
>>>> Here is the code where failure happen: new
>>>> MiniHBaseCluster(configuration,
>>>> 1);
>>>>
>>>> There are nothing special from my side. I'm surprised, it always worked
>>>> since CDH 4.4, just bump dependency versions, fix code to follow API
>>>> changes and that' all.
>>>>
>>>> 2015-07-20 4:08 GMT+02:00 Jean-Marc Spaggiari >>> >:
>>>>
>>>> > Hi Serega,
>>>> >
> > What kind of tests are you trying to run? The HBase test suite? Or
>>>> > something you developed yourself?
>>>> >
>>>> > JM
>>>> >
>>>> > 2015-07-19 17:47 GMT-04:00 Serega Sheypak :
>>>> >
>>>> > > Hi, bumped my testing stuff to CDH 5.4.4 and got failure while
>>>> running
>>>> > > tests. Here is a log
>>>> > >
>>>> > > 2015-07-19 23:40:18,607 INFO
>>>> [SERGEYs-MBP:51977.activeMasterManager]
>>>> > > master.TableNamespaceManager (TableNamespaceManager.java:start(85))
>>>> -
>>>> > > Namespace table not found. Creating...

Re: Can't run hbase testing util cluster using CDH 5.4.4

2015-07-20 Thread Serega Sheypak
I downgraded the project with the hbase testing utility to these versions:
2.5.0-cdh5.2.0
2.5.0-mr1-cdh5.2.0
0.98.6-cdh5.2.0

It works.

I upgrade to these:

2.6.0-cdh5.4.4
2.6.0-mr1-cdh5.4.4
1.0.0-cdh5.4.4

it hangs...

2015-07-20 23:27 GMT+02:00 Serega Sheypak :

> I see these lines:
>
> 2015-07-20 21:27:21,791 INFO  [RegionOpenAndInitThread-hbase:namespace-1]
> regionserver.HRegion (HRegion.java:createHRegion(5598)) - creating HRegion
> hbase:namespace HTD == 'hbase:namespace', {NAME => 'info', BLOOMFILTER =>
> 'ROW', VERSIONS => '10', IN_MEMORY => 'true', KEEP_DELETED_CELLS =>
> 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION =>
> 'NONE', CACHE_DATA_IN_L1 => 'true', MIN_VERSIONS => '0', BLOCKCACHE =>
> 'true', BLOCKSIZE => '8192', REPLICATION_SCOPE => '0'} RootDir =
> file:/Users/ssa/devel/myown/hadoop/mini-hdfs-cluster-maven-plugin/target/hbase-root/.tmp
> Table name == hbase:namespace
>
> 2015-07-20 21:27:21,802 INFO  [RegionOpenAndInitThread-hbase:namespace-1]
> regionserver.HRegion (HRegion.java:doClose(1425)) - Closed
> hbase:namespace,,1437420441756.a69bbbcaf4a82786964da8d9cc62bea3.
>
> 2015-07-20 21:27:21,861 INFO
> [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:61781-0] hbase.MetaTableAccessor
> (MetaTableAccessor.java:addRegionsToMeta(1169)) - Added 2
>
> 2015-07-20 21:27:21,864 WARN
> [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:61781-0] zookeeper.ZKTableStateManager
> (ZKTableStateManager.java:setTableState(100)) - Moving table
> hbase:namespace state from ENABLING to ENABLED
>
> 2015-07-20 21:27:21,866 INFO
> [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:61781-0] handler.CreateTableHandler
> (CreateTableHandler.java:completed(219)) - Table, hbase:namespace, creation
> successful
>
>
> Then it hangs and prints a thread dump a few minutes later.
>
>
> 2015-07-20 21:25 GMT+02:00 Esteban Gutierrez :
>
>>
>> But do you see that thread printing anything in the logs?
>>
>> --
>> Cloudera, Inc.
>>
>>
>> On Mon, Jul 20, 2015 at 12:07 PM, Serega Sheypak <
>> serega.shey...@gmail.com> wrote:
>>
>>> This is testing utiliy, it has few bytes of data to load. Running on
>>> oracle-jdk8
>>>
>>> java.lang.Thread.run(Thread.java:745)
>>> Thread 178 (JvmPauseMonitor):
>>>   State: TIMED_WAITING
>>>   Blocked count: 3
>>>   Waited count: 398
>>>   Stack:
>>> java.lang.Thread.sleep(Native Method)
>>>
>>> org.apache.hadoop.hbase.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:159)
>>> java.lang.Thread.run(Thread.java:745)
>>>
>>> Thread 177 (JvmPauseMonitor):
>>>   State: TIMED_WAITING
>>>   Blocked count: 1
>>>   Waited count: 398
>>>   Stack:
>>> java.lang.Thread.sleep(Native Method)
>>>
>>> org.apache.hadoop.hbase.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:159)
>>> java.lang.Thread.run(Thread.java:745)
>>>
>>> 2015-07-20 19:30 GMT+02:00 Esteban Gutierrez :
>>>
>>>> -user@hbase
>>>>
>>>> Hi Serega,
>>>>
>>>> The RunTimeException is pointing to a timeout of nearly 3min. Have
>>>> tried to find in the master log lines what is causing that 3min pause? do
>>>> you see any log line related to the JvmPauseMonitor? (perhaps some GC going
>>>> on)
>>>>
>>>> thanks,
>>>> esteban.
>>>>
>>>> --
>>>> Cloudera, Inc.
>>>>
>>>>
>>>> On Mon, Jul 20, 2015 at 12:28 AM, Serega Sheypak <
>>>> serega.shey...@gmail.com> wrote:
>>>>
>>>>> @Sean, thanks. I saw sometimes Cloudera guys help here. I also used
>>>>> Cloudera community forum.
>>>>>
>>>>> @Jean-Marc, nothing special, just maven-plugin wrapper around
>>>>> miniHbaseCluster
>>>>> Here is the code where failure happen: new
>>>>> MiniHBaseCluster(configuration,
>>>>> 1);
>>>>>
>>>>> There are nothing special from my side. I'm surprised, it always worked
>>>>> since CDH 4.4, just bump dependency versions, fix code to follow API
>>>>> changes and that' all.
>>>>>
>>>>> 2015-07-20 4:08 GMT+02:00 Jean-Marc Spaggiari >>>> >:
>>>>>
>>>>> > Hi Serega,
>>>>> >
>>>>> > W

Re: Can't run hbase testing util cluster using CDH 5.4.4

2015-07-20 Thread Serega Sheypak
OMG, looks like -Djava.net.preferIPv4Stack=true helped...
Probably some weird macOS update...

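java.net.preferIPv4Stack is a standard JVM networking property, so the only subtlety is
that it must take effect before any networking classes are initialized. In practice it is
usually passed as a -D argument to the test JVM (for example through the surefire argLine);
a minimal sketch of doing it in code, under the assumption that nothing network-related has
run yet:

public class Ipv4StackGuard {
    public static void main(String[] args) throws Exception {
        // Must happen before the first use of java.net / InetAddress,
        // otherwise the JVM may already have chosen the IPv6 stack.
        System.setProperty("java.net.preferIPv4Stack", "true");
        // ... start the mini cluster / run the tests here ...
    }
}
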
2015-07-20 23:56 GMT+02:00 Serega Sheypak :

> I degrade project with hbase testing utility to these versions:
> 2.5.0-cdh5.2.0
> 2.5.0-mr1-cdh5.2.0
> 0.98.6-cdh5.2.0
>
> It works.
>
> I upgrade to these:
>
> 2.6.0-cdh5.4.4
> 2.6.0-mr1-cdh5.4.4
> 1.0.0-cdh5.4.4
>
> it hangs...
>
> 2015-07-20 23:27 GMT+02:00 Serega Sheypak :
>
>> I see these lines:
>>
>> 2015-07-20 21:27:21,791 INFO  [RegionOpenAndInitThread-hbase:namespace-1]
>> regionserver.HRegion (HRegion.java:createHRegion(5598)) - creating HRegion
>> hbase:namespace HTD == 'hbase:namespace', {NAME => 'info', BLOOMFILTER =>
>> 'ROW', VERSIONS => '10', IN_MEMORY => 'true', KEEP_DELETED_CELLS =>
>> 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION =>
>> 'NONE', CACHE_DATA_IN_L1 => 'true', MIN_VERSIONS => '0', BLOCKCACHE =>
>> 'true', BLOCKSIZE => '8192', REPLICATION_SCOPE => '0'} RootDir =
>> file:/Users/ssa/devel/myown/hadoop/mini-hdfs-cluster-maven-plugin/target/hbase-root/.tmp
>> Table name == hbase:namespace
>>
>> 2015-07-20 21:27:21,802 INFO  [RegionOpenAndInitThread-hbase:namespace-1]
>> regionserver.HRegion (HRegion.java:doClose(1425)) - Closed
>> hbase:namespace,,1437420441756.a69bbbcaf4a82786964da8d9cc62bea3.
>>
>> 2015-07-20 21:27:21,861 INFO
>> [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:61781-0] hbase.MetaTableAccessor
>> (MetaTableAccessor.java:addRegionsToMeta(1169)) - Added 2
>>
>> 2015-07-20 21:27:21,864 WARN
>> [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:61781-0] zookeeper.ZKTableStateManager
>> (ZKTableStateManager.java:setTableState(100)) - Moving table
>> hbase:namespace state from ENABLING to ENABLED
>>
>> 2015-07-20 21:27:21,866 INFO
>> [MASTER_TABLE_OPERATIONS-SERGEYs-MBP:61781-0] handler.CreateTableHandler
>> (CreateTableHandler.java:completed(219)) - Table, hbase:namespace, creation
>> successful
>>
>>
>> then it hungs and prints thread dump in few minutes.
>>
>>
>> 2015-07-20 21:25 GMT+02:00 Esteban Gutierrez :
>>
>>>
>>> But do you see that thread printing anything in the logs?
>>>
>>> --
>>> Cloudera, Inc.
>>>
>>>
>>> On Mon, Jul 20, 2015 at 12:07 PM, Serega Sheypak <
>>> serega.shey...@gmail.com> wrote:
>>>
>>>> This is testing utiliy, it has few bytes of data to load. Running on
>>>> oracle-jdk8
>>>>
>>>> java.lang.Thread.run(Thread.java:745)
>>>> Thread 178 (JvmPauseMonitor):
>>>>   State: TIMED_WAITING
>>>>   Blocked count: 3
>>>>   Waited count: 398
>>>>   Stack:
>>>> java.lang.Thread.sleep(Native Method)
>>>>
>>>> org.apache.hadoop.hbase.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:159)
>>>> java.lang.Thread.run(Thread.java:745)
>>>>
>>>> Thread 177 (JvmPauseMonitor):
>>>>   State: TIMED_WAITING
>>>>   Blocked count: 1
>>>>   Waited count: 398
>>>>   Stack:
>>>> java.lang.Thread.sleep(Native Method)
>>>>
>>>> org.apache.hadoop.hbase.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:159)
>>>> java.lang.Thread.run(Thread.java:745)
>>>>
>>>> 2015-07-20 19:30 GMT+02:00 Esteban Gutierrez :
>>>>
>>>>> -user@hbase
>>>>>
>>>>> Hi Serega,
>>>>>
>>>>> The RunTimeException is pointing to a timeout of nearly 3min. Have
>>>>> tried to find in the master log lines what is causing that 3min pause? do
>>>>> you see any log line related to the JvmPauseMonitor? (perhaps some GC 
>>>>> going
>>>>> on)
>>>>>
>>>>> thanks,
>>>>> esteban.
>>>>>
>>>>> --
>>>>> Cloudera, Inc.
>>>>>
>>>>>
>>>>> On Mon, Jul 20, 2015 at 12:28 AM, Serega Sheypak <
>>>>> serega.shey...@gmail.com> wrote:
>>>>>
>>>>>> @Sean, thanks. I saw sometimes Cloudera guys help here. I also used
>>>>>> Cloudera community forum.
>>>>>>
>>>>>> @Jean-Marc, nothing special, just maven-plugin wrapper around
>>>>>> miniHbaseClust

Re: Re[2]: region servers stuck

2015-07-24 Thread Serega Sheypak
Probably the block was being replicated because of a DN failure, and HBase
was trying to access that replica and got stuck?
I can see that the DN answers that some blocks are missing.
Or maybe you ran the HDFS balancer?

The other thing is that by design you should always get read access in HDFS;
you are not allowed to modify a file concurrently. The first writer gets a
lease on the file, and the NN doesn't allow concurrent leases, if I remember
correctly...

See what happens with block 1099777976128

RS:
015-07-19 07:25:08,533 INFO org.apache.hadoop.hbase.regionserver.HStore:
Starting compaction of 2 file(s) in i of
table7,\x8C\xA0,1435936455217.12a2d1e37fd8f0f9870fc1b5afd6046d. into
tmpdir=hdfs://server1/hbase/data/default/table7/12a2d1e37fd8f0f9870fc1b5afd6046d/.tmp,
totalSize=416.0 M
2015-07-19 07:25:08,556 WARN org.apache.hadoop.hdfs.BlockReaderFactory:
BlockReaderFactory(fileName=/hbase/data/default/table7/12a2d1e37fd8f0f9870fc1b5afd6046d/i/983cf03fddfa480f92346f25a61b0b9e,
block=BP-1892992341-10.10.122.111-1352825964285:blk_1195579097_1099777976128):
unknown response code ERROR while attempting to set up short-circuit
access. Block
BP-1892992341-10.10.122.111-1352825964285:blk_1195579097_1099777976128 is
not valid
2015-07-19 07:25:08,556 WARN
org.apache.hadoop.hdfs.client.ShortCircuitCache:
ShortCircuitCache(0x6b1f04e2): failed to load
1195579097_BP-1892992341-10.10.122.111-1352825964285
2015-07-19 07:25:08,557 WARN org.apache.hadoop.hdfs.BlockReaderFactory: I/O
error constructing remote block reader.
java.io.IOException: Got error for OP_READ_BLOCK, self=/10.0.241.39:53420,
remote=/10.0.241.39:50010, for file
/hbase/data/default/table7/12a2d1e37fd8f0f9870fc1b5afd6046d/i/983cf03fddfa480f92346f25a61b0b9e,
for pool BP-1892992341-10.10.122.111-1352825964285 block
1195579097_1099777976128
at
org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:432)
at
org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:397)
at
org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:786)
at
org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:665)
at
org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:325)
at
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:566)
at
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:789)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:836)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1210)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1483)
at
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1314)
at
org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:355)
at
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.seekTo(HFileReaderV2.java:1052)
at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:244)
at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:152)
at
org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:317)
at
org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:240)
at
org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:202)
at
org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:257)
at
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
at
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:109)
at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1080)
at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1482)
at
org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:475)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-07-19 07:25:08,558 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
connect to /10.0.241.39:50010 for block, add to deadNodes and continue.
java.io.IOException: Got error for OP_READ_BLOCK, self=/10.0.241.39:53420,
remote=/10.0.241.39:50010, for file
/hbase/data/default/table7/12a2d1e37fd8f0f9870fc1b5afd6046d/i/983cf03fddfa480f92346f25a61b0b9e,
for pool BP-1892992341-10.10.122.111-1352825964285 block
1195579097_1099777976128
java.io.IOException: Got error for OP_READ_BLOCK, self=/10.0.241.39:53420,
remote=/10.0.241.39:50010, for file
/hbase/data/default/table7/12a2d1e37fd8f0f9870fc1b5afd6046d/i/983cf03fddfa480f92346f25a61b0b9e,
for pool BP-1892992341-10.10.122.111-1352825964285 block
11955

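For context on the "failed to load ... ShortCircuitCache" warnings above: they come from
the HDFS short-circuit read path, and when it fails the client falls back to the remote
block reader seen in the stack trace. A minimal sketch of the two client-side properties
involved (the socket path below is site-specific and shown only as an illustration):

import org.apache.hadoop.conf.Configuration;

public class ShortCircuitReadSketch {
    public static Configuration withShortCircuitReads(Configuration conf) {
        // Read local replicas directly instead of streaming them through the DataNode.
        conf.setBoolean("dfs.client.read.shortcircuit", true);
        // Domain socket shared between the DataNode and HDFS clients (illustrative path).
        conf.set("dfs.domain.socket.path", "/var/run/hdfs-sockets/dn");
        return conf;
    }
}
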
Spikes when writing data to HBase

2015-08-11 Thread Serega Sheypak
Hi, we are using version 1.0.0+cdh5.4.4+160.
We have a heavy write load, ~10K writes per second.
We have 10 nodes with 7 disks each. I read some perf notes; they state that
HBase can handle 1K writes per second per node without any problems.


I see some spikes on "writers". Write operation timing "jumps" from 40-50ms
to 200-500ms. Probably I hit the memstore limit: the RegionServer starts to
flush the memstore and stops accepting updates.

I have several questions:
1. Can a node with a 4-core (8 with hyperthreading) CPU + 7 HDDs absorb 1K
writes per second?
2. What is the right way to fight with blocked writes?
2.1. What I did:
hbase.hregion.memstore.flush.size to 256M to produce larger HFiles when
flushing the memstore
hbase.hregion.memstore.block.multiplier to 4, since I have only one
write-intensive table. Let it grow
hbase.regionserver.optionallogflushinterval to 10s; I CAN lose some data,
NP here. The idea is to reduce I/O pressure on the disks.
===
Not sure if I can correctly play with these parameters.
hbase.hstore.blockingStoreFiles=10
hbase.hstore.compactionThreshold=3

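Since losing a little data is explicitly acceptable here, WAL pressure can also be relaxed
per mutation rather than only through the global flush interval. A minimal sketch, with a
hypothetical table/row/column layout:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class AsyncWalPutSketch {
    public static void main(String[] args) throws Exception {
        try (Connection connection =
                 ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("events"))) {
            Put put = new Put(Bytes.toBytes("row-1"));
            put.add(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("value"));
            // WAL edits are written but synced in the background; a crash can lose
            // the last few edits, which is the trade-off accepted in this thread.
            put.setDurability(Durability.ASYNC_WAL);
            table.put(put);
        }
    }
}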

Re: Spikes when writing data to HBase

2015-08-11 Thread Serega Sheypak
Hi Vladimir!

Here are graphs. Servlet (3 tomcats on 3 different hosts write to HBase)
http://www.bigdatapath.com/wp-content/uploads/2015/08/01_apps1.png
See how the response time jumps. I can't explain it. The write load is
really, really low.

All RSs have an even load. I see request metrics in the HBase master web UI.
Tables are pre-split. I have 10 RS and pre-split tables with 50 regions.

  >1. How large is your single write?
1-2KB

   >2. Do you see any RegionTooBusyException in a client log files
no HBase related exceptions. Response

 >  3. How large is your table ( # of regions, # of column families)
1 column family; table_01 is 150GB, table_02 is 130GB

I have two "major tables", here are stats for them:
http://www.bigdatapath.com/wp-content/uploads/2015/08/table_02.png
http://www.bigdatapath.com/wp-content/uploads/2015/08/table_01.png
   >4. RS memory related config: Max heap

   5. memstore size (if not default - 0.4)
hbase.regionserver.global.memstore.upperLimit=0.4
hbase.regionserver.global.memstore.lowerLimit=0.38
RS heapsize=8GB

>*Do you see any region splits?  *
No, it has never happened since the tables are pre-split.

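For reference, pre-splitting at create time looks roughly like this with the admin API. The
split points below are hypothetical single-byte prefixes chosen only to show the shape of
the call (49 split points give 50 regions); real split points have to match the real
row-key distribution:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class PresplitTableSketch {
    public static void main(String[] args) throws Exception {
        HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("table_01"));
        desc.addFamily(new HColumnDescriptor("d")); // hypothetical family name
        // 49 single-byte split points 0x05, 0x0A, ..., 0xF5 -> 50 regions.
        byte[][] splits = new byte[49][];
        for (int i = 0; i < splits.length; i++) {
            splits[i] = new byte[] { (byte) ((i + 1) * 5) };
        }
        try (Connection connection =
                 ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = connection.getAdmin()) {
            admin.createTable(desc, splits);
        }
    }
}
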
2015-08-11 18:54 GMT+02:00 Vladimir Rodionov :

> *Common questions:*
>
>
>1. How large is your single write?
>2. Do you see any RegionTooBusyException in a client log files
>3. How large is your table ( # of regions, # of column families)
>4. RS memory related config: Max heap
>5. memstore size (if not default - 0.4)
>
>
> Memstore flush
>
> hbase.hregion.memstore.flush.size = 256M
> hbase.hregion.memstore.block.multiplier = N (do not block writes) N * 256M
> MUST be greater than overall memstore size (HBASE_HEAPSIZE *
> hbase.regionserver.global.memstore.size)
>
> WAL files.
>
> Set HDFS block size to 256MB. hbase.regionserver.hlog.blocksize = 0.95 HDFS
> block size (256MB * 0.95). Keep hbase.regionserver.hlog.blocksize *
> hbase.regionserver.maxlogs just a bit above
> hbase.regionserver.global.memstore.lowerLimit
> (0.35-0.45) * HBASE_HEAPSIZE to avoid premature memstore flushing.
>
> *Do you see any region splits?  *
>
> Region split blocks writes. Try to presplit table and avoid splitting after
> that. Disable splitting completely
>
> hbase.regionserver.region.split.policy
> =org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy
>
> -Vlad
>
>
>
>
> On Tue, Aug 11, 2015 at 3:22 AM, Serega Sheypak 
> wrote:
>
> > Hi, we are using version 1.0.0+cdh5.4.4+160
> > We have heavy write load, ~ 10K per econd
> > We have 10 nodes 7 disks each. I read some perf notes, they state that
> > HBase can handle 1K per second writes per node without any problems.
> >
> >
> > I see some spikes on "writers". Write operation timing "jumps" from
> 40-50ms
> > to 200-500ms Probably I hit memstore limit. RegionServer starts to flush
> > memstore and stop to accept updates.
> >
> > I have several questions:
> > 1. Does 4/(8 in hyperthreading) CPU + 7HDD node could absorb 1K writes
> per
> > second?
> > 2. What is the right way to fight with blocked writes?
> > 2.1. What I did:
> > hbase.hregion.memstore.flush.size to 256M to produce larger HFiles when
> > flushing memstore
> > base.hregion.memstore.block.multiplier to 4, since I have only one
> > intensive-write table. Let it grow
> > hbase.regionserver.optionallogflushinterval to 10s, i CAN loose some
> data,
> > NP here. The idea that I reduce I/O pressure on disks.
> > ===
> > Not sure if I can correctly play with these parameters.
> > hbase.hstore.blockingStoreFiles=10
> > hbase.hstore.compactionThreshold=3
> >
>

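Plugging the numbers from this thread into Vladimir's formulas gives a rough worked
example; the maxlogs value below is an assumption chosen to make the arithmetic land just
above the lower memstore limit, not a verified setting on this cluster:

public class MemstoreWalSizingSketch {
    public static void main(String[] args) {
        long heap = 8L * 1024 * 1024 * 1024;  // RS heap from this thread: 8 GB
        double upperLimit = 0.4;              // hbase.regionserver.global.memstore.upperLimit
        long flushSize = 256L * 1024 * 1024;  // hbase.hregion.memstore.flush.size
        int blockMultiplier = 4;              // hbase.hregion.memstore.block.multiplier

        long globalMemstore = (long) (heap * upperLimit);    // ~3.2 GB across all regions
        long perRegionBlockAt = flushSize * blockMultiplier; // one region blocks at ~1 GB

        // WAL side: hlog blocksize ~= 0.95 * 256 MB, and blocksize * maxlogs should sit
        // just above lowerLimit * heap (0.38 * 8 GB ~= 3.04 GB) to avoid premature flushes.
        long hlogBlockSize = (long) (0.95 * 256 * 1024 * 1024); // ~243 MB
        int maxlogs = 13;                                       // assumption: 13 * 243 MB ~= 3.1 GB

        System.out.printf("global memstore ~%d MB, region blocks at ~%d MB, WAL budget ~%d MB%n",
                globalMemstore >> 20, perRegionBlockAt >> 20, (hlogBlockSize * maxlogs) >> 20);
    }
}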

Re: Spikes when writing data to HBase

2015-08-11 Thread Serega Sheypak
>How about GC activity? ApplicationStopTime? Do you track that?
Yes, jvisualvm says it's OK, and New Relic also doesn't show anything strange.
HBase also says it's OK.

The profiler says most of the time the thread is waiting for a response from
the HBase side. My assumptions are:
1. I have a weird bug in my HBase configuration
2. I have undiscovered networking problems (BUT the same tomcats write
data to Flume at a higher rate, with no data loss at all)
3. I have a weird problem with HConnection/HConnectionManager in a
multithreaded env, where the same servlet instance is shared across many threads
4. some mystic process somewhere in the cluster

>Is the issue reproducible? or you got it first time?
Always. The spikes disappear during the night, but RPM doesn't change too much.

I will run my controller code out of tomcat and see how it goes. I'm going
to isolate components...


2015-08-11 23:36 GMT+02:00 Vladimir Rodionov :

> How about GC activity? ApplicationStopTime? Do you track that?
>
> Is the issue reproducible? or you got it first time?
>
> Start with RS logs and try to find anything suspicious in a period of a
> very high latency. 1.5 sec HBase write latency does not look right.
>
> -Vlad
>
> On Tue, Aug 11, 2015 at 2:08 PM, Serega Sheypak 
> wrote:
>
> > Hi Vladimir!
> >
> > Here are graphs. Servlet (3 tomcats on 3 different hosts write to HBase)
> > http://www.bigdatapath.com/wp-content/uploads/2015/08/01_apps1.png
> > See how response time jump. I can't explain it. Write load is
> really-really
> > low.
> >
> > all RS have even load. I see request-metrics in HBase master web UI.
> > Tables are pre-splitted. I have 10 RS and pre-splitted tables on 50
> > regions.
> >
> >   >1. How large is your single write?
> > 1-2KB
> >
> >>2. Do you see any RegionTooBusyException in a client log files
> > no HBase related exceptions. Response
> >
> >  >  3. How large is your table ( # of regions, # of column families)
> > 1 column familiy, table_01 150GB, table_02 130 GB
> >
> > I have two "major tables", here are stats for them:
> > http://www.bigdatapath.com/wp-content/uploads/2015/08/table_02.png
> > http://www.bigdatapath.com/wp-content/uploads/2015/08/table_01.png
> >>4. RS memory related config: Max heap
> >
> >5. memstore size (if not default - 0.4)
> > hbase.regionserver.global.memstore.upperLimit=0.4
> > hbase.regionserver.global.memstore.lowerLimit=0.38
> > RS heapsize=8GB
> >
> > >*Do you see any region splits?  *
> > no, never happened since tables are pre-splitted
> >
> > 2015-08-11 18:54 GMT+02:00 Vladimir Rodionov :
> >
> > > *Common questions:*
> > >
> > >
> > >1. How large is your single write?
> > >2. Do you see any RegionTooBusyException in a client log files
> > >3. How large is your table ( # of regions, # of column families)
> > >4. RS memory related config: Max heap
> > >5. memstore size (if not default - 0.4)
> > >
> > >
> > > Memstore flush
> > >
> > > hbase.hregion.memstore.flush.size = 256M
> > > hbase.hregion.memstore.block.multiplier = N (do not block writes) N *
> > 256M
> > > MUST be greater than overall memstore size (HBASE_HEAPSIZE *
> > > hbase.regionserver.global.memstore.size)
> > >
> > > WAL files.
> > >
> > > Set HDFS block size to 256MB. hbase.regionserver.hlog.blocksize = 0.95
> > HDFS
> > > block size (256MB * 0.95). Keep hbase.regionserver.hlog.blocksize *
> > > hbase.regionserver.maxlogs just a bit above
> > > hbase.regionserver.global.memstore.lowerLimit
> > > (0.35-0.45) * HBASE_HEAPSIZE to avoid premature memstore flushing.
> > >
> > > *Do you see any region splits?  *
> > >
> > > Region split blocks writes. Try to presplit table and avoid splitting
> > after
> > > that. Disable splitting completely
> > >
> > > hbase.regionserver.region.split.policy
> > > =org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy
> > >
> > > -Vlad
> > >
> > >
> > >
> > >
> > > On Tue, Aug 11, 2015 at 3:22 AM, Serega Sheypak <
> > serega.shey...@gmail.com>
> > > wrote:
> > >
> > > > Hi, we are using version 1.0.0+cdh5.4.4+160
> > > > We have heavy write load, ~ 10K per econd
> > > > We have 10 nodes 7 disks each. I read some perf notes, they state
> that
> > > > HBase can handle 1K per second writes per node without any problems.
> > > 

Re: Spikes when writing data to HBase

2015-08-11 Thread Serega Sheypak
Probably I found something. Response time degrades as parallelism grows.
What I did:

1. wrap the business logic controller in a java main class. My controller does
some logic and puts/gets to HBase, with checkAndPut sometimes
2. create an HConnection
3. pass the HConnection to the controller
4. wrap the controller execution in codahale metrics
5. execute the controller in several threads simultaneously. The same happens
in the servlet environment

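A minimal sketch of that kind of harness, with one shared connection and a Codahale timer
around each call; the class, table and column names are hypothetical stand-ins for the
controller in the gist:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PutBenchmarkSketch {
    public static void main(String[] args) throws Exception {
        MetricRegistry metrics = new MetricRegistry();
        Timer putTimer = metrics.timer("putTimer");
        ConsoleReporter reporter = ConsoleReporter.forRegistry(metrics).build();
        reporter.start(30, TimeUnit.SECONDS);

        int threads = 100; // 10 in the first run, 100 in the second
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        // One Connection shared by all threads; Table instances are created per thread.
        try (Connection connection =
                 ConnectionFactory.createConnection(HBaseConfiguration.create())) {
            for (int t = 0; t < threads; t++) {
                pool.submit(() -> {
                    try (Table table = connection.getTable(TableName.valueOf("table_01"))) {
                        for (int i = 0; i < 10_000; i++) {
                            Put put = new Put(Bytes.toBytes(
                                    Thread.currentThread().getId() + "-" + i));
                            put.add(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("v"));
                            try (Timer.Context ignored = putTimer.time()) {
                                table.put(put);
                            }
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
        }
        reporter.report();
    }
}
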
I can't explain the result.
1. I used 10 threads and 10 iterations in each.

*RESULT:  99% <= 28.81 milliseconds which sounds GOOD!*
-- Meters
--
putMeter
 count = 414914
 mean rate = 885.58 events/second
 1-minute rate = 911.56 events/second
 5-minute rate = 778.16 events/second
15-minute rate = 549.72 events/second

-- Timers
--
putTimer
 count = 414914
 mean rate = 884.66 calls/second
 1-minute rate = 911.53 calls/second
 5-minute rate = 765.60 calls/second
15-minute rate = 515.06 calls/second
   min = 4.87 milliseconds
   max = 211.77 milliseconds
  mean = 10.81 milliseconds
stddev = 5.43 milliseconds
median = 10.34 milliseconds
  75% <= 11.59 milliseconds
  95% <= 14.41 milliseconds
  98% <= 19.59 milliseconds
  99% <= 28.81 milliseconds
99.9% <= 60.67 milliseconds

I've increased the thread count to 100:
*RESULT: 99% <= 112.09 milliseconds*
-- Meters
--
putMeter
 count = 1433056
 mean rate = 2476.46 events/second
 1-minute rate = 2471.18 events/second
 5-minute rate = 2483.28 events/second
15-minute rate = 2512.52 events/second

-- Timers
--
putTimer
 count = 1433058
 mean rate = 2474.61 calls/second
 1-minute rate = 2468.45 calls/second
 5-minute rate = 2446.45 calls/second
15-minute rate = 2383.23 calls/second
   min = 10.03 milliseconds
   max = 853.05 milliseconds
  mean = 40.71 milliseconds
stddev = 39.04 milliseconds
median = 35.60 milliseconds
  75% <= 47.69 milliseconds
  95% <= 71.79 milliseconds
  98% <= 85.83 milliseconds
  99% <= 112.09 milliseconds
99.9% <= 853.05 milliseconds

Is it possible to explain it? Could it be a problem in some
pooling/threading inside HConnection?

Please see what happened to compactions during the test:
http://www.bigdatapath.com/wp-content/uploads/2015/08/compations.png

get/put ops
http://www.bigdatapath.com/wp-content/uploads/2015/08/get_ops.png

slow ops:
http://www.bigdatapath.com/wp-content/uploads/2015/08/slow_ops.png

2015-08-11 23:43 GMT+02:00 Serega Sheypak :

> >How about GC activity? ApplicationStopTime? Do you track that?
> yes, jviusalm says it's ok, newrelic also doesn't show something strange.
> HBase also says it's OK.
>
> Profiler says most time thread is waiting for response from hbase side. My
> assumption is:
> 1. I have weird bug in HBase configuration
> 2. I have undiscovered problems with networking (BUT the same tomcats
> write data to flume with higher rate, no data loss at all)
> 3. I have weird problem with HConnection HConnectionManager is
> multithreaded env, when same servlet instance shared across many threads
> 4. some mystic process somewhere in the cluster
>
> >Is the issue reproducible? or you got it first time?
> always. Spikes disappear during night, but RPM doesn't change too much.
>
> I will run my controller code out of tomcat and see how it goes. I'm going
> to isolate components...
>
>
> 2015-08-11 23:36 GMT+02:00 Vladimir Rodionov :
>
>> How about GC activity? ApplicationStopTime? Do you track that?
>>
>> Is the issue reproducible? or you got it first time?
>>
>> Start with RS logs and try to find anything suspicious in a period of a
>> very high latency. 1.5 sec HBase write latency does not look right.
>>
>> -Vlad
>>
>> On Tue, Aug 11, 2015 at 2:08 PM, Serega Sheypak > >
>> wrote:
>>
>> > Hi Vladimir!
>> >
>> > Here are graphs. Servlet (3 tomcats on 3 different hosts write to HBase)
>> > http://www.bigdatapath.com/wp-content/uploads/2015/08/01_apps1.png
>> > See how response time jump. I can't explain it. Write load is
>> really-really
>> > low.
>> >
>> > all RS have even load. I see request-metrics in HBase master web UI.
>> > Tables are pre-splitted. 

Re: Spikes when writing data to HBase

2015-08-11 Thread Serega Sheypak
Hi, here it is:
https://gist.github.com/seregasheypak/00ef1a44e6293d13e56e

2015-08-12 4:25 GMT+02:00 Vladimir Rodionov :

> Can you post a code snippet? A Pastebin link is fine.
>
> -Vlad
>
> On Tue, Aug 11, 2015 at 4:03 PM, Serega Sheypak 
> wrote:
>
> > Probably I found something. Response time decreases when parallelism
> grows.
> > What I did:
> >
> > 1. wrap business logic controller into java main class. My controller
> does
> > some logic and puts/gets to hbase with checkAndPut (sometimes)
> > 2. create HConnection
> > 3. pass HConnection to controller
> > 4. wrap controller execution into codahale metrics
> > 5. execute controller in several threads simultaneously. The same happens
> > in servlet environment
> >
> > I can't explain result.
> > 1. I used 10 threads and 10 iterations in each.
> >
> > *RESULT:  99% <= 28.81 milliseconds which sounds GOOD!*
> > -- Meters
> > --
> > putMeter
> >  count = 414914
> >  mean rate = 885.58 events/second
> >  1-minute rate = 911.56 events/second
> >  5-minute rate = 778.16 events/second
> > 15-minute rate = 549.72 events/second
> >
> > -- Timers
> > --
> > putTimer
> >  count = 414914
> >  mean rate = 884.66 calls/second
> >  1-minute rate = 911.53 calls/second
> >  5-minute rate = 765.60 calls/second
> > 15-minute rate = 515.06 calls/second
> >min = 4.87 milliseconds
> >max = 211.77 milliseconds
> >   mean = 10.81 milliseconds
> > stddev = 5.43 milliseconds
> > median = 10.34 milliseconds
> >   75% <= 11.59 milliseconds
> >   95% <= 14.41 milliseconds
> >   98% <= 19.59 milliseconds
> >   99% <= 28.81 milliseconds
> > 99.9% <= 60.67 milliseconds
> >
> > I've increased count of threads to 100:
> > *RESULT: 99% <= 112.09 milliseconds*
> > -- Meters
> > --
> > putMeter
> >  count = 1433056
> >  mean rate = 2476.46 events/second
> >  1-minute rate = 2471.18 events/second
> >  5-minute rate = 2483.28 events/second
> > 15-minute rate = 2512.52 events/second
> >
> > -- Timers
> > --
> > putTimer
> >  count = 1433058
> >  mean rate = 2474.61 calls/second
> >  1-minute rate = 2468.45 calls/second
> >  5-minute rate = 2446.45 calls/second
> > 15-minute rate = 2383.23 calls/second
> >min = 10.03 milliseconds
> >max = 853.05 milliseconds
> >   mean = 40.71 milliseconds
> > stddev = 39.04 milliseconds
> > median = 35.60 milliseconds
> >   75% <= 47.69 milliseconds
> >   95% <= 71.79 milliseconds
> >   98% <= 85.83 milliseconds
> >   99% <= 112.09 milliseconds
> >     99.9% <= 853.05 milliseconds
> >
> > Is it possible to explain it? Could it be a problem in some
> > pooling/threading inside HConnection?
> >
> > please see what happened to compactions during test:
> > http://www.bigdatapath.com/wp-content/uploads/2015/08/compations.png
> >
> > get/put ops
> > http://www.bigdatapath.com/wp-content/uploads/2015/08/get_ops.png
> >
> > slow ops:
> > http://www.bigdatapath.com/wp-content/uploads/2015/08/slow_ops.png
> >
> > 2015-08-11 23:43 GMT+02:00 Serega Sheypak :
> >
> > > >How about GC activity? ApplicationStopTime? Do you track that?
> > > yes, jviusalm says it's ok, newrelic also doesn't show something
> strange.
> > > HBase also says it's OK.
> > >
> > > Profiler says most time thread is waiting for response from hbase side.
> > My
> > > assumption is:
> > > 1. I have weird bug in HBase configuration
> > > 2. I have undiscovered problems with networking (BUT the same tomcats
> > > write data to flume with higher rate, no data loss at all)
> > > 3. I have weird problem with HConnection HConnectionManager is
> > > multithreaded env, when same servlet instance shared across many
> threads
> > > 4. some mystic pro

Re: Spikes when writing data to HBase

2015-08-12 Thread Serega Sheypak
I agree.
  99% <= 112.09 milliseconds
I could make 3 gets in 112 ms.

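For readers skimming the thread: the operation being timed is checkAndPut, which, as
Vladimir notes below, is a server-side read-modify-write, so its latency includes a get. A
minimal sketch of the call shape, with purely illustrative table/row/column names (null as
the expected value means "apply only if the cell does not exist"):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CheckAndPutSketch {
    public static void main(String[] args) throws Exception {
        byte[] row = Bytes.toBytes("user-42");
        byte[] family = Bytes.toBytes("d");
        byte[] qualifier = Bytes.toBytes("state");

        Put put = new Put(row);
        put.add(family, qualifier, Bytes.toBytes("active"));

        try (Connection connection =
                 ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("table_01"))) {
            // Atomically: read the cell, compare it to the expected value, and apply the
            // Put only if the check passes. The embedded read is why observed latency
            // tracks get latency rather than pure write latency.
            boolean applied = table.checkAndPut(row, family, qualifier, null, put);
            System.out.println("applied = " + applied);
        }
    }
}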

2015-08-12 9:24 GMT+02:00 Vladimir Rodionov :

> OK, this is actually checkAndPut -> get -> check -> put. Latency is dominated
> by the get operation. Unless you have SSDs, 10-40 ms mean read latency is
> normal.
>
> -Vlad
>
> On Tue, Aug 11, 2015 at 11:24 PM, Serega Sheypak  >
> wrote:
>
> > Hi, here is it:
> > https://gist.github.com/seregasheypak/00ef1a44e6293d13e56e
> >
> > 2015-08-12 4:25 GMT+02:00 Vladimir Rodionov :
> >
> > > Can you post code snippet? Pastbin link is fine.
> > >
> > > -Vlad
> > >
> > > On Tue, Aug 11, 2015 at 4:03 PM, Serega Sheypak <
> > serega.shey...@gmail.com>
> > > wrote:
> > >
> > > > Probably I found something. Response time decreases when parallelism
> > > grows.
> > > > What I did:
> > > >
> > > > 1. wrap business logic controller into java main class. My controller
> > > does
> > > > some logic and puts/gets to hbase with checkAndPut (sometimes)
> > > > 2. create HConnection
> > > > 3. pass HConnection to controller
> > > > 4. wrap controller execution into codahale metrics
> > > > 5. execute controller in several threads simultaneously. The same
> > happens
> > > > in servlet environment
> > > >
> > > > I can't explain result.
> > > > 1. I used 10 threads and 10 iterations in each.
> > > >
> > > > *RESULT:  99% <= 28.81 milliseconds which sounds GOOD!*
> > > > -- Meters
> > > >
> --
> > > > putMeter
> > > >  count = 414914
> > > >  mean rate = 885.58 events/second
> > > >  1-minute rate = 911.56 events/second
> > > >  5-minute rate = 778.16 events/second
> > > > 15-minute rate = 549.72 events/second
> > > >
> > > > -- Timers
> > > >
> --
> > > > putTimer
> > > >  count = 414914
> > > >  mean rate = 884.66 calls/second
> > > >  1-minute rate = 911.53 calls/second
> > > >  5-minute rate = 765.60 calls/second
> > > > 15-minute rate = 515.06 calls/second
> > > >min = 4.87 milliseconds
> > > >max = 211.77 milliseconds
> > > >   mean = 10.81 milliseconds
> > > > stddev = 5.43 milliseconds
> > > > median = 10.34 milliseconds
> > > >   75% <= 11.59 milliseconds
> > > >   95% <= 14.41 milliseconds
> > > >   98% <= 19.59 milliseconds
> > > >   99% <= 28.81 milliseconds
> > > > 99.9% <= 60.67 milliseconds
> > > >
> > > > I've increased count of threads to 100:
> > > > *RESULT: 99% <= 112.09 milliseconds*
> > > > -- Meters
> > > >
> --
> > > > putMeter
> > > >  count = 1433056
> > > >  mean rate = 2476.46 events/second
> > > >  1-minute rate = 2471.18 events/second
> > > >  5-minute rate = 2483.28 events/second
> > > > 15-minute rate = 2512.52 events/second
> > > >
> > > > -- Timers
> > > >
> --
> > > > putTimer
> > > >  count = 1433058
> > > >  mean rate = 2474.61 calls/second
> > > >  1-minute rate = 2468.45 calls/second
> > > >  5-minute rate = 2446.45 calls/second
> > > > 15-minute rate = 2383.23 calls/second
> > > >min = 10.03 milliseconds
> > > >max = 853.05 milliseconds
> > > >   mean = 40.71 milliseconds
> > > > stddev = 39.04 milliseconds
> > > > median = 35.60 milliseconds
> > > >   75% <= 47.69 milliseconds
> > > >   95% <= 71.79 milliseconds
> > > >   98% <= 85.83 milliseconds
> > > >   99% <= 112.09 milliseconds
> > > > 99.9% <= 853.05 milliseconds
> > > >
> > > >
