Re: ZooKeeper available but no active master location found

2014-04-10 Thread Margusja

Yes there is:
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.92.1</version>

Best regards, Margus (Margusja) Roo
+372 51 48 780
http://margus.roo.ee
http://ee.linkedin.com/in/margusroo
skype: margusja
ldapsearch -x -h ldap.sk.ee -b c=EE (serialNumber=37303140314)

On 10/04/14 00:57, Ted Yu wrote:

Have you modified pom.xml of twitbase ?
If not, this is the dependency you get:
 <dependency>
   <groupId>org.apache.hbase</groupId>
   <artifactId>hbase</artifactId>
   <version>0.92.1</version>
 </dependency>

0.92.1 and 0.96.0 are not compatible.
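For reference, a sketch of what the dependency would need to look like for the 0.96 sandbox: 0.96 was modularized, so client code pulls in the hbase-client artifact, and 0.96.0-hadoop2 is the Hadoop-2 build published to Maven Central. The TwitBase code itself would still need API updates on top of this.

 <dependency>
   <groupId>org.apache.hbase</groupId>
   <artifactId>hbase-client</artifactId>
   <version>0.96.0-hadoop2</version>
 </dependency>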

Cheers


On Wed, Apr 9, 2014 at 10:58 AM, Margusja mar...@roo.ee wrote:


Hi

I downloaded and installed hortonworks sandbox 2.0 for virtualbox.
HBase version is: 0.96.0.2.0.6.0-76-hadoop2, re6d7a56f72914d01e55c0478d74e5cfd3778f231
[hbase@sandbox twitbase-master]$ cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1   localhost.localdomain localhost
10.0.2.15   sandbox.hortonworks.com sandbox

[hbase@sandbox twitbase-master]$ hostname
sandbox.hortonworks.com

[root@sandbox ~]# netstat -lnp | grep 2181
tcp0  0 0.0.0.0:2181 0.0.0.0:*   LISTEN
  19359/java

[root@sandbox ~]# netstat -lnp | grep 6
tcp0  0 10.0.2.15:6 0.0.0.0:*   LISTEN
28549/java

[hbase@sandbox twitbase-master]$ hbase shell
14/04/05 05:56:44 INFO Configuration.deprecation: hadoop.native.lib is
deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.0.2.0.6.0-76-hadoop2, re6d7a56f72914d01e55c0478d74e5cfd3778f231,
Thu Oct 17 18:15:20 PDT 2013

hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/
lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/
slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
ambarismoketest
mytable
simple_hcat_load_table
users
weblogs
5 row(s) in 4.6040 seconds

= [ambarismoketest, mytable, simple_hcat_load_table, users,
weblogs]
hbase(main):002:0>

So far so good.

I'd like to play with a code: https://github.com/hbaseinaction/twitbase

I downloaded it and ran mvn package, which produced twitbase-1.0.0.jar.

When I try to execute the code I get:
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:host.name=
sandbox.hortonworks.com
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.version=1.6.0_30
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.vendor=Sun Microsystems Inc.
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.home=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.class.path=target/twitbase-1.0.0.jar
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.library.path=/usr/lib/jvm/java-1.6.0-
openjdk-1.6.0.0.x86_64/jre/lib/amd64/server:/usr/lib/jvm/
java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64:/usr/lib/
jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/../lib/amd64:/
usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.io.tmpdir=/tmp
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.compiler=NA
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:os.name
=Linux
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:os.arch=amd64
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:os.version=2.6.32-431.11.2.el6.x86_64
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:user.name
=hbase
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:user.home=/home/hbase
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:user.dir=/home/hbase/twitbase-master
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Initiating client connection,
connectString=10.0.2.15:2181 sessionTimeout=18 watcher=hconnection
14/04/05 05:59:50 INFO zookeeper.ClientCnxn: Opening socket connection to
server /10.0.2.15:2181
14/04/05 05:59:50 INFO zookeeper.RecoverableZooKeeper: The identifier of
this process is 30...@sandbox.hortonworks.com
14/04/05 05:59:50 INFO client.ZooKeeperSaslClient: Client will not
SASL-authenticate because the default JAAS configuration section 'Client'
could not be found. If you are not using SASL, you may ignore this. On the
other hand, if you expected SASL to work, please fix your JAAS
configuration.
14/04/05 05:59:51 INFO zookeeper.ClientCnxn: Socket connection established
to sandbox.hortonworks.com/10.0.2.15:2181, initiating session
14/04/05 

Re: BlockCache for large scans.

2014-04-10 Thread gortiz
But I think there's a direct relation between improving performance in 
large scans and memory for the memstore. As far as I understand, the 
memstore just works as a cache for write operations.


On 09/04/14 23:44, Ted Yu wrote:

Didn't quite get what you mean, Asaf.

If you're talking about HBASE-5349, please read release note of HBASE-5349.

By default, memstore min/max range is initialized to memstore percent:

globalMemStorePercentMinRange = conf.getFloat(MEMSTORE_SIZE_MIN_RANGE_KEY,
    globalMemStorePercent);

globalMemStorePercentMaxRange = conf.getFloat(MEMSTORE_SIZE_MAX_RANGE_KEY,
    globalMemStorePercent);

Cheers


On Wed, Apr 9, 2014 at 3:17 PM, Asaf Mesika asaf.mes...@gmail.com wrote:


The Jira says it's enabled automatically. Is there an official document
explaining this feature?

On Wednesday, April 9, 2014, Ted Yu yuzhih...@gmail.com wrote:


Please take a look at http://www.n10k.com/blog/blockcache-101/

For D, hbase.regionserver.global.memstore.size is specified in terms of
percentage of heap. Unless you enable HBASE-5349 'Automagically tweak
global memstore and block cache sizes based on workload'


On Wed, Apr 9, 2014 at 12:24 AM, gortiz gor...@pragsis.com wrote:


I've been reading the Definitive Guide book and HBase in Action a little.
I found this question from Cloudera that I'm not sure about after looking
at some benchmarks and documentation for HBase. Could someone explain it
a little? I think that when you do a large scan you should disable the
blockcache because the blocks are going to be swapped a lot, so you don't
get anything from the cache; I guess you would be penalized since you're
spending memory, GC and CPU on this task.

*You want to do a full table scan on your data. You decide to disable
block caching to see if this improves scan performance. Will disabling
block caching improve scan performance?*

A.
No. Disabling block caching does not improve scan performance.

B.
Yes. When you disable block caching, you free up that memory for other
operations. With a full table scan, you cannot take advantage of block
caching anyway because your entire table won't fit into cache.

C.
No. If you disable block caching, HBase must read each block index from
disk for each scan, thereby decreasing scan performance.

D.
Yes. When you disable block caching, you free up memory for MemStore,
which improves scan performance.





--
*Guillermo Ortiz*
/Big Data Developer/

Telf.: +34 917 680 490
Fax: +34 913 833 301
C/ Manuel Tovar, 49-53 - 28034 Madrid - Spain

_http://www.bidoop.es_



This server is in the failed servers list

2014-04-10 Thread Margusja

Hi
I have java code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class Hbase_connect {

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "sandbox.hortonworks.com");
        conf.set("hbase.zookeeper.property.clientPort", "2181");
        conf.set("hbase.rootdir",
                "hdfs://sandbox.hortonworks.com:8020/apps/hbase/data");
        conf.set("zookeeper.znode.parent", "/hbase-unsecure");
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTableDescriptor[] tabdesc = admin.listTables();
        for (int i = 0; i < tabdesc.length; i++) {
            System.out.println("Table = " + new String(tabdesc[i].getName()));
        }
    }
}

^C[hbase@sandbox hbase_connect]$ ls -lah libs/
total 80M
drwxr-xr-x 3 hbase hadoop 4.0K Apr  5 10:42 .
drwxr-xr-x 3 hbase hadoop 4.0K Apr  5 11:02 ..
-rw-r--r-- 1 hbase hadoop 2.5K Oct  6 23:39 hadoop-client-2.2.0.jar
-rw-r--r-- 1 hbase hadoop 4.1M Jul 24  2013 hadoop-core-1.2.1.jar
drwxr-xr-x 4 hbase hadoop 4.0K Apr  5 09:40 hbase-0.96.2-hadoop2
-rw-r--r-- 1 hbase hadoop  76M Apr  3 16:18 hbase-0.96.2-hadoop2-bin.tar.gz

[hbase@sandbox hbase_connect]$ java -cp 
./:./libs/*:./libs/hbase-0.96.2-hadoop2/lib/* Hbase_connect
14/04/05 11:03:03 INFO zookeeper.ZooKeeper: Client 
environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
14/04/05 11:03:03 INFO zookeeper.ZooKeeper: Client 
environment:host.name=sandbox.hortonworks.com
14/04/05 11:03:03 INFO zookeeper.ZooKeeper: Client 
environment:java.version=1.6.0_30
14/04/05 11:03:03 INFO zookeeper.ZooKeeper: Client 
environment:java.vendor=Sun Microsystems Inc.
14/04/05 11:03:03 INFO zookeeper.ZooKeeper: Client 
environment:java.home=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
14/04/05 11:03:03 INFO zookeeper.ZooKeeper: Client 

Re: This server is in the failed servers list

2014-04-10 Thread Margusja

Hi

Found the solution. I had used a non-Hortonworks hadoop lib, 
hadoop-core-1.2.1.jar. I removed hadoop-core-1.2.1.jar and copied:
cp /usr/lib/hadoop/hadoop-common-2.2.0.2.0.6.0-76.jar ./libs/
[hbase@sandbox hbase_connect]$ javac -cp 
./libs/*:./libs/hbase-0.96.2-hadoop2/lib/* Hbase_connect.java
[hbase@sandbox hbase_connect]$ java -cp 
./:./libs/*:./libs/hbase-0.96.2-hadoop2/lib/* Hbase_connect
2014-04-05 12:09:32,795 WARN  [main] util.NativeCodeLoader 
(NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
2014-04-05 12:09:34,378 INFO  [main] zookeeper.ZooKeeper 
(Environment.java:logEnv(100)) - Client 
environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2014-04-05 12:09:34,381 INFO  [main] zookeeper.ZooKeeper 
(Environment.java:logEnv(100)) - Client 
environment:host.name=sandbox.hortonworks.com
2014-04-05 12:09:34,381 INFO  [main] zookeeper.ZooKeeper 
(Environment.java:logEnv(100)) - Client environment:java.version=1.6.0_30
2014-04-05 12:09:34,382 INFO  [main] zookeeper.ZooKeeper 
(Environment.java:logEnv(100)) - Client environment:java.vendor=Sun 
Microsystems Inc.
2014-04-05 12:09:34,382 INFO  [main] zookeeper.ZooKeeper 
(Environment.java:logEnv(100)) - Client 
environment:java.home=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
2014-04-05 12:09:34,382 INFO  [main] zookeeper.ZooKeeper 
(Environment.java:logEnv(100)) - Client 

Lease exception when I execute large scan with filters.

2014-04-10 Thread gortiz

I get this error when I execute a full scan with filters on a table.

Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hbase.regionserver.LeaseException: 
org.apache.hadoop.hbase.regionserver.LeaseException: lease 
'-4165751462641113359' does not exist
at 
org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:231)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2482)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
at 
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)


I have read about increasing the lease time and the RPC time, but it's not 
working. What else could I try? The table isn't too big. I have been 
checking the logs from GC, the HMaster and some RegionServers and I didn't 
see anything weird. I also tried a couple of caching values.
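For reference, a sketch of the two settings usually meant by lease time and RPC time (0.94-era property names; the values shown are just the three-minute example). They go into hbase-site.xml on the region servers, which then need a restart, and hbase.rpc.timeout should be at least as large as the lease period:

<property>
  <name>hbase.regionserver.lease.period</name>
  <value>180000</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>180000</value>
</property>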


endpoint coprocessor

2014-04-10 Thread Bogala, Chandra Reddy
Hi,
I am planning to write an endpoint coprocessor to calculate top-N results for my 
use case. I got confused between the old APIs and the new APIs.
I followed the links below and tried to implement it, but it looks like the APIs 
have changed a lot; I don't see many of these classes in the HBase jars. We are 
using HBase 0.96.
Can anyone point me to the latest documentation/APIs? And, if possible, sample 
code to calculate top N.

https://blogs.apache.org/hbase/entry/coprocessor_introduction
https://www.youtube.com/watch?v=xHvJhuGGOKc

Thanks,
Chandra




Re: endpoint coprocessor

2014-04-10 Thread Ted Yu
Here is a reference implementation for aggregation:
http://search-hadoop.com/c/HBase:hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java||Hbase+aggregation+endpoint

You can find it in the HBase source code.
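As a client-side sketch of how a 0.96 endpoint is invoked, here is the aggregation client that fronts AggregateImplementation. This assumes the coprocessor is loaded on the table's region servers and that the scanned column holds 8-byte longs; mytable, cf and q are placeholder names.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter;
import org.apache.hadoop.hbase.util.Bytes;

public class AggDemo {
    public static void main(String[] args) throws Throwable {
        Configuration conf = HBaseConfiguration.create();
        AggregationClient aggClient = new AggregationClient(conf);
        Scan scan = new Scan();
        // restrict the scan to the single column the interpreter decodes
        scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"));
        // min/sum/rowCount/avg follow the same pattern as max
        Long max = aggClient.max(TableName.valueOf("mytable"),
                new LongColumnInterpreter(), scan);
        System.out.println("max = " + max);
    }
}

A true top N still needs a custom endpoint (each region returns its local top N and the client merges the partial results), but the calling convention is the same.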
Cheers

On Apr 10, 2014, at 4:29 AM, Bogala, Chandra Reddy chandra.bog...@gs.com 
wrote:

 Hi,
 I am planning to write an endpoint coprocessor to calculate top-N results for my 
 use case. I got confused between the old APIs and the new APIs.
 I followed the links below and tried to implement it, but it looks like the APIs 
 have changed a lot; I don't see many of these classes in the HBase jars. We are 
 using HBase 0.96.
 Can anyone point me to the latest documentation/APIs? And, if possible, sample 
 code to calculate top N.
 
 https://blogs.apache.org/hbase/entry/coprocessor_introduction
 https://www.youtube.com/watch?v=xHvJhuGGOKc
 
 Thanks,
 Chandra
 
 


Re: Lease exception when I execute large scan with filters.

2014-04-10 Thread Ted Yu
Can you give us a bit more information:

HBase release you're running
What filters are used for the scan

Thanks

On Apr 10, 2014, at 2:36 AM, gortiz gor...@pragsis.com wrote:

 I get this error when I execute a full scan with filters on a table.
 
 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hbase.regionserver.LeaseException: 
 org.apache.hadoop.hbase.regionserver.LeaseException: lease 
 '-4165751462641113359' does not exist
at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:231)
at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2482)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
 
 I have read about increasing the lease time and the RPC time, but it's not 
 working. What else could I try? The table isn't too big. I have been 
 checking the logs from GC, the HMaster and some RegionServers and I didn't see 
 anything weird. I also tried a couple of caching values.


Re: Lease exception when I execute large scan with filters.

2014-04-10 Thread gortiz
I was trying to check the behaviour of HBase. The cluster is a group of 
old computers: one master and five slaves, each one with 2 GB, so 12 GB in 
total.
The table has a column family with 1000 columns and each column with 100 
versions.
There's another column family with four columns and one image of 100 KB. 
(I've tried without this column family as well.)
The table is partitioned manually across all the slaves, so data are 
balanced in the cluster.


I'm executing this sentence in HBase 0.94.6: *scan 'table1', {FILTER => 
"ValueFilter(=, 'binary:5')"}*

My time for lease and RPC is three minutes.
Since it's a full scan of the table, I have been playing with the 
BLOCKCACHE as well (just disabling and enabling it, not touching its 
size). I thought that it was going to cause too many GC calls. I'm not 
sure about this point.


I know that it's not the best way to use HBase; it's just a test. I 
think that it's not working because the hardware isn't enough, although 
I would like to try some kind of tuning to improve it.

On 10/04/14 14:21, Ted Yu wrote:

Can you give us a bit more information:

HBase release you're running
What filters are used for the scan

Thanks

On Apr 10, 2014, at 2:36 AM, gortiz gor...@pragsis.com wrote:


I get this error when I execute a full scan with filters on a table.

Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hbase.regionserver.LeaseException: 
org.apache.hadoop.hbase.regionserver.LeaseException: lease 
'-4165751462641113359' does not exist
at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:231)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2482)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
at 
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)

I have read about increasing the lease time and the RPC time, but it's not working. 
What else could I try? The table isn't too big. I have been checking the logs 
from GC, the HMaster and some RegionServers and I didn't see anything weird. I 
also tried a couple of caching values.



--
*Guillermo Ortiz*
/Big Data Developer/

Telf.: +34 917 680 490
Fax: +34 913 833 301
C/ Manuel Tovar, 49-53 - 28034 Madrid - Spain

_http://www.bidoop.es_



Re: Lease exception when I execute large scan with filters.

2014-04-10 Thread gortiz
Another little question: with the filter I'm using, do I check all 
the versions or just the newest? Because I'm wondering whether, when I do a 
scan over the whole table, I look for the value 5 in the whole dataset or 
just in the newest version of each value.


On 10/04/14 16:52, gortiz wrote:
I was trying to check the behaviour of HBase. The cluster is a group 
of old computers: one master and five slaves, each one with 2 GB, so 12 GB 
in total.
The table has a column family with 1000 columns and each column with 
100 versions.
There's another column family with four columns and one image of 
100 KB. (I've tried without this column family as well.)
The table is partitioned manually across all the slaves, so data are 
balanced in the cluster.


I'm executing this sentence in HBase 0.94.6: *scan 'table1', {FILTER => 
"ValueFilter(=, 'binary:5')"}*

My time for lease and RPC is three minutes.
Since it's a full scan of the table, I have been playing with the 
BLOCKCACHE as well (just disabling and enabling it, not touching its 
size). I thought that it was going to cause too many GC calls. I'm 
not sure about this point.


I know that it's not the best way to use HBase; it's just a test. I 
think that it's not working because the hardware isn't enough, 
although I would like to try some kind of tuning to improve it.

On 10/04/14 14:21, Ted Yu wrote:

Can you give us a bit more information:

HBase release you're running
What filters are used for the scan

Thanks

On Apr 10, 2014, at 2:36 AM, gortiz gor...@pragsis.com wrote:


I get this error when I execute a full scan with filters on a table.

Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hbase.regionserver.LeaseException: 
org.apache.hadoop.hbase.regionserver.LeaseException: lease 
'-4165751462641113359' does not exist
at 
org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:231) 

at 
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2482)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
at 
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)


I have read about increasing the lease time and the RPC time, but it's not 
working. What else could I try? The table isn't too big. I have 
been checking the logs from GC, the HMaster and some RegionServers and I 
didn't see anything weird. I also tried a couple of 
caching values.






--
*Guillermo Ortiz*
/Big Data Developer/

Telf.: +34 917 680 490
Fax: +34 913 833 301
C/ Manuel Tovar, 49-53 - 28034 Madrid - Spain

_http://www.bidoop.es_



Re: Lease exception when I execute large scan with filters.

2014-04-10 Thread Ted Yu
It should be the newest version of each value.
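(If the intent is to have the filter checked against every stored version rather than only the newest, the scan has to request the versions explicitly; a sketch in shell terms, assuming the column family really keeps 100 versions:

scan 'table1', {FILTER => "ValueFilter(=, 'binary:5')", VERSIONS => 100}

Without the VERSIONS attribute the scanner returns, and therefore filters, only the newest version of each cell.)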

Cheers


On Thu, Apr 10, 2014 at 9:55 AM, gortiz gor...@pragsis.com wrote:

 Another little question: with the filter I'm using, do I check all the
 versions or just the newest? Because I'm wondering whether, when I do a scan
 over the whole table, I look for the value 5 in the whole dataset or just
 in the newest version of each value.


 On 10/04/14 16:52, gortiz wrote:

 I was trying to check the behaviour of HBase. The cluster is a group of
 old computers: one master and five slaves, each one with 2 GB, so 12 GB in
 total.
 The table has a column family with 1000 columns and each column with 100
 versions.
 There's another column family with four columns and one image of 100 KB.
 (I've tried without this column family as well.)
 The table is partitioned manually across all the slaves, so data are balanced
 in the cluster.

 I'm executing this sentence in HBase 0.94.6: *scan 'table1', {FILTER =>
 "ValueFilter(=, 'binary:5')"}*
 My time for lease and RPC is three minutes.
 Since it's a full scan of the table, I have been playing with the
 BLOCKCACHE as well (just disabling and enabling it, not touching its size). I
 thought that it was going to cause too many GC calls. I'm not sure
 about this point.

 I know that it's not the best way to use HBase; it's just a test. I think
 that it's not working because the hardware isn't enough, although I would
 like to try some kind of tuning to improve it.

 On 10/04/14 14:21, Ted Yu wrote:

 Can you give us a bit more information:

 HBase release you're running
 What filters are used for the scan

 Thanks

 On Apr 10, 2014, at 2:36 AM, gortiz gor...@pragsis.com wrote:

  I get this error when I execute a full scan with filters on a table.

 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hbase.regionserver.LeaseException:
 org.apache.hadoop.hbase.regionserver.LeaseException: lease
 '-4165751462641113359' does not exist
 at 
 org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:231)

 at org.apache.hadoop.hbase.regionserver.HRegionServer.
 next(HRegionServer.java:2482)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(
 NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(
 DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(
 WritableRpcEngine.java:320)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(
 HBaseServer.java:1428)

 I have read about increasing the lease time and the RPC time, but it's not
 working. What else could I try? The table isn't too big. I have been
 checking the logs from GC, the HMaster and some RegionServers and I didn't see
 anything weird. I also tried a couple of caching values.





 --
 *Guillermo Ortiz*
 /Big Data Developer/

 Telf.: +34 917 680 490
 Fax: +34 913 833 301
 C/ Manuel Tovar, 49-53 - 28034 Madrid - Spain

 _http://www.bidoop.es_




Re: BlockCache for large scans.

2014-04-10 Thread lars hofhansl
Generally (and this is database lore, not just HBase): if you use an LRU-type 
cache, your working set does not fit into the cache, and you repeatedly scan 
this working set, you have created the worst-case scenario. The database does 
all the work of caching the blocks, and subsequent scans will need blocks that 
were just evicted towards the end of the previous scan.

For large scans where it is likely that the entire scan does not fit into the 
block cache, you should absolutely disable caching of the blocks traversed for 
this scan (i.e. scan.setCacheBlocks(false)). Index blocks are not affected; 
they are cached regardless.
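In client code that looks roughly like this (a minimal sketch against the 0.94/0.96-era API; table1 is a placeholder name):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class LargeScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "table1");
        Scan scan = new Scan();
        scan.setCacheBlocks(false); // don't churn the block cache on a one-off full scan
        scan.setCaching(100);       // rows fetched per RPC; tune to row size
        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result r : scanner) {
                // process each row here
            }
        } finally {
            scanner.close();
            table.close();
        }
    }
}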

-- Lars




 From: gortiz gor...@pragsis.com
To: user@hbase.apache.org 
Sent: Wednesday, April 9, 2014 11:37 PM
Subject: Re: BlockCache for large scans.
 

But I think there's a direct relation between improving performance in 
large scans and memory for the memstore. As far as I understand, the 
memstore just works as a cache for write operations.


On 09/04/14 23:44, Ted Yu wrote:
 Didn't quite get what you mean, Asaf.

 If you're talking about HBASE-5349, please read release note of HBASE-5349.

 By default, memstore min/max range is initialized to memstore percent:

      globalMemStorePercentMinRange = conf.getFloat(MEMSTORE_SIZE_MIN_RANGE_KEY,
          globalMemStorePercent);

      globalMemStorePercentMaxRange = conf.getFloat(MEMSTORE_SIZE_MAX_RANGE_KEY,
          globalMemStorePercent);

 Cheers


 On Wed, Apr 9, 2014 at 3:17 PM, Asaf Mesika asaf.mes...@gmail.com wrote:

 The Jira says it's enabled automatically. Is there an official document
 explaining this feature?

 On Wednesday, April 9, 2014, Ted Yu yuzhih...@gmail.com wrote:

 Please take a look at http://www.n10k.com/blog/blockcache-101/

 For D, hbase.regionserver.global.memstore.size is specified in terms of
 percentage of heap. Unless you enable HBASE-5349 'Automagically tweak
 global memstore and block cache sizes based on workload'


 On Wed, Apr 9, 2014 at 12:24 AM, gortiz gor...@pragsis.com wrote:

 I've been reading the Definitive Guide book and HBase in Action a little.
 I found this question from Cloudera that I'm not sure about after looking
 at some benchmarks and documentation for HBase. Could someone explain it
 a little? I think that when you do a large scan you should disable the
 blockcache because the blocks are going to be swapped a lot, so you don't
 get anything from the cache; I guess you would be penalized since you're
 spending memory, GC and CPU on this task.

 *You want to do a full table scan on your data. You decide to disable
 block caching to see if this improves scan performance. Will disabling
 block caching improve scan performance?*

 A.
 No. Disabling block caching does not improve scan performance.

 B.
 Yes. When you disable block caching, you free up that memory for other
 operations. With a full table scan, you cannot take advantage of block
 caching anyway because your entire table won't fit into cache.

 C.
 No. If you disable block caching, HBase must read each block index from
 disk for each scan, thereby decreasing scan performance.

 D.
 Yes. When you disable block caching, you free up memory for MemStore,
 which improves scan performance.




-- 
*Guillermo Ortiz*
/Big Data Developer/

Telf.: +34 917 680 490
Fax: +34 913 833 301
C/ Manuel Tovar, 49-53 - 28034 Madrid - Spain

_http://www.bidoop.es_

Re: ZooKeeper available but no active master location found

2014-04-10 Thread Ted Yu
Here was the change I made to pom.xml in order to build against
0.98.1-hadoop1:
http://pastebin.com/JEX3A0kR

I still got some compilation errors, such as:

[ERROR] /Users/tyu/twitbase/src/main/java/HBaseIA/TwitBase/hbase/RelationsDAO.java:[156,14] cannot find symbol
[ERROR] symbol  : method coprocessorExec(java.lang.Class<HBaseIA.TwitBase.coprocessors.RelationCountProtocol>,byte[],byte[],org.apache.hadoop.hbase.client.coprocessor.Batch.Call<HBaseIA.TwitBase.coprocessors.RelationCountProtocol,java.lang.Long>)
[ERROR] location: interface org.apache.hadoop.hbase.client.HTableInterface

Will revisit this when I have time.

FYI


On Thu, Apr 10, 2014 at 12:11 AM, Margusja mar...@roo.ee wrote:

 Yes there is:
   <groupId>org.apache.hbase</groupId>
   <artifactId>hbase</artifactId>
   <version>0.92.1</version>

 Best regards, Margus (Margusja) Roo
 +372 51 48 780
 http://margus.roo.ee
 http://ee.linkedin.com/in/margusroo
 skype: margusja
 ldapsearch -x -h ldap.sk.ee -b c=EE (serialNumber=37303140314)

 On 10/04/14 00:57, Ted Yu wrote:

 Have you modified pom.xml of twitbase ?
 If not, this is the dependency you get:
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.92.1</version>
  </dependency>

 0.92.1 and 0.96.0 are not compatible.

 Cheers


 On Wed, Apr 9, 2014 at 10:58 AM, Margusja mar...@roo.ee wrote:

  Hi

 I downloaded and installed hortonworks sandbox 2.0 for virtualbox.
 HBase version is: 0.96.0.2.0.6.0-76-hadoop2, re6d7a56f72914d01e55c0478d74e5cfd3778f231
 [hbase@sandbox twitbase-master]$ cat /etc/hosts
 # Do not remove the following line, or various programs
 # that require network functionality will fail.
 127.0.0.1   localhost.localdomain localhost
 10.0.2.15   sandbox.hortonworks.com sandbox

 [hbase@sandbox twitbase-master]$ hostname
 sandbox.hortonworks.com

 [root@sandbox ~]# netstat -lnp | grep 2181
 tcp0  0 0.0.0.0:2181 0.0.0.0:*   LISTEN
   19359/java

 [root@sandbox ~]# netstat -lnp | grep 6
 tcp0  0 10.0.2.15:6 0.0.0.0:*   LISTEN
 28549/java

 [hbase@sandbox twitbase-master]$ hbase shell
 14/04/05 05:56:44 INFO Configuration.deprecation: hadoop.native.lib is
 deprecated. Instead, use io.native.lib.available
 HBase Shell; enter 'help<RETURN>' for list of supported commands.
 Type "exit<RETURN>" to leave the HBase Shell
 Version 0.96.0.2.0.6.0-76-hadoop2, re6d7a56f72914d01e55c0478d74e5cfd3778f231,
 Thu Oct 17 18:15:20 PDT 2013

 hbase(main):001:0> list
 TABLE
 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/
 lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/
 slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
 explanation.
 ambarismoketest
 mytable
 simple_hcat_load_table
 users
 weblogs
 5 row(s) in 4.6040 seconds

 = [ambarismoketest, mytable, simple_hcat_load_table, users,
 weblogs]
 hbase(main):002:0>

 So far so good.

 I'd like to play with a code: https://github.com/hbaseinaction/twitbase

 I downloaded it and ran mvn package, which produced twitbase-1.0.0.jar.

 When I try to execute the code I get:
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48
 GMT
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:host.name
 =
 sandbox.hortonworks.com
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:java.version=1.6.0_30
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:java.vendor=Sun Microsystems Inc.
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:java.home=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:java.class.path=target/twitbase-1.0.0.jar
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:java.library.path=/usr/lib/jvm/java-1.6.0-
 openjdk-1.6.0.0.x86_64/jre/lib/amd64/server:/usr/lib/jvm/
 java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64:/usr/lib/
 jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/../lib/amd64:/
 usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:java.io.tmpdir=/tmp
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:java.compiler=NA
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:os.name
 =Linux
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:os.arch=amd64
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:os.version=2.6.32-431.11.2.el6.x86_64
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:user.name
 =hbase
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:user.home=/home/hbase
 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
 environment:user.dir=/home/hbase/twitbase-master