Re: BlockCache for large scans.

2014-04-09 Thread gortiz
But I think there's a direct relation between improving performance for 
large scans and the memory given to the memstore. As far as I understand, the 
memstore only works as a cache for write operations.


On 09/04/14 23:44, Ted Yu wrote:

Didn't quite get what you mean, Asaf.

If you're talking about HBASE-5349, please read release note of HBASE-5349.

By default, memstore min/max range is initialized to memstore percent:

  globalMemStorePercentMinRange = conf.getFloat(MEMSTORE_SIZE_MIN_RANGE_KEY,
      globalMemStorePercent);

  globalMemStorePercentMaxRange = conf.getFloat(MEMSTORE_SIZE_MAX_RANGE_KEY,
      globalMemStorePercent);

Cheers
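
For reference, those min/max range values are read straight from configuration, so, as far as the HBASE-5349 release note goes, the tuner only kicks in when the min/max are set apart from the default. A hypothetical hbase-site.xml fragment is below; the property names are assumed from the MEMSTORE_SIZE_MIN_RANGE_KEY / MEMSTORE_SIZE_MAX_RANGE_KEY constants and should be checked against the release note, and the values are placeholders.

<!-- hypothetical fragment: property names assumed from HBASE-5349, values are placeholders -->
<property>
  <name>hbase.regionserver.global.memstore.size.min.range</name>
  <value>0.30</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.size.max.range</name>
  <value>0.50</value>
</property>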


On Wed, Apr 9, 2014 at 3:17 PM, Asaf Mesika  wrote:


The JIRA says it's enabled automatically. Is there an official document
explaining this feature?

On Wednesday, April 9, 2014, Ted Yu  wrote:


Please take a look at http://www.n10k.com/blog/blockcache-101/

For D, hbase.regionserver.global.memstore.size is specified in terms of
percentage of heap. Unless you enable HBASE-5349 'Automagically tweak
global memstore and block cache sizes based on workload'


On Wed, Apr 9, 2014 at 12:24 AM, gortiz >
wrote:


I've been reading the Definitive Guide and HBase in Action a little. I found
this question from Cloudera that I'm not sure about after looking at some
benchmarks and documentation for HBase. Could someone explain it a little?
I think that when you do a large scan you should disable the block cache,
because the blocks are going to be evicted constantly, so you don't get
anything from the cache; I'd guess you're even penalized, since you're
spending memory, GC and CPU on this task.

*You want to do a full table scan on your data. You decide to disable block
caching to see if this improves scan performance. Will disabling block caching
improve scan performance?*

A.
No. Disabling block caching does not improve scan performance.

B.
Yes. When you disable block caching, you free up that memory for other
operations. With a full table scan, you cannot take advantage of block caching
anyway because your entire table won't fit into cache.

C.
No. If you disable block caching, HBase must read each block index from
disk for each scan,
thereby decreasing scan performance.

D.
Yes. When you disable block caching, you free up memory for MemStore, which
improves scan performance.
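
For what it's worth, block caching can also be turned off for a single scan on the client side instead of cluster-wide, which is the usual middle ground for full table scans. A minimal sketch against the 0.94/0.96-era client API; the table name and column family are placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class FullScanNoBlockCache {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "TestTable");      // placeholder table name
    try {
      Scan scan = new Scan();
      scan.addFamily(Bytes.toBytes("info"));           // placeholder column family
      scan.setCaching(500);                            // rows fetched per RPC round trip
      scan.setCacheBlocks(false);                      // keep scanned blocks out of the BlockCache
      ResultScanner scanner = table.getScanner(scan);
      try {
        for (Result r : scanner) {
          // process each row here
        }
      } finally {
        scanner.close();
      }
    } finally {
      table.close();
    }
  }
}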





--
*Guillermo Ortiz*
/Big Data Developer/

Telf.: +34 917 680 490
Fax: +34 913 833 301
C/ Manuel Tovar, 49-53 - 28034 Madrid - Spain

_http://www.bidoop.es_



Re: ZooKeeper available but no active master location found

2014-04-09 Thread Margusja

Yes there is:

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.92.1</version>
</dependency>

Best regards, Margus (Margusja) Roo
+372 51 48 780
http://margus.roo.ee
http://ee.linkedin.com/in/margusroo
skype: margusja
ldapsearch -x -h ldap.sk.ee -b c=EE "(serialNumber=37303140314)"

On 10/04/14 00:57, Ted Yu wrote:

Have you modified pom.xml of twitbase ?
If not, this is the dependency you get:

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.92.1</version>
</dependency>

0.92.1 and 0.96.0 are not compatible.

Cheers
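
For reference, 0.96 also split the old monolithic hbase artifact into modules, so a client project built against 0.96 usually depends on hbase-client instead. A hedged sketch of the pom.xml change; the version string is an assumption and should match the sandbox build, and the twitbase code itself may need API updates beyond the dependency bump:

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.96.0-hadoop2</version>
</dependency>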


On Wed, Apr 9, 2014 at 10:58 AM, Margusja  wrote:


Hi

I downloaded and installed hortonworks sandbox 2.0 for virtualbox.
HBase version is: 0.96.0.2.0.6.0-76-hadoop2, re6d7a56f72914d01e55c0478d74e5
cfd3778f231
[hbase@sandbox twitbase-master]$ cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1   localhost.localdomain localhost
10.0.2.15   sandbox.hortonworks.com sandbox

[hbase@sandbox twitbase-master]$ hostname
sandbox.hortonworks.com

[root@sandbox ~]# netstat -lnp | grep 2181
tcp0  0 0.0.0.0:2181 0.0.0.0:*   LISTEN
  19359/java

[root@sandbox ~]# netstat -lnp | grep 6
tcp0  0 10.0.2.15:6 0.0.0.0:*   LISTEN
28549/java

[hbase@sandbox twitbase-master]$ hbase shell
14/04/05 05:56:44 INFO Configuration.deprecation: hadoop.native.lib is
deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.96.0.2.0.6.0-76-hadoop2, re6d7a56f72914d01e55c0478d74e5cfd3778f231,
Thu Oct 17 18:15:20 PDT 2013

hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/
lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/
slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
ambarismoketest
mytable
simple_hcat_load_table
users
weblogs
5 row(s) in 4.6040 seconds

=> ["ambarismoketest", "mytable", "simple_hcat_load_table", "users",
"weblogs"]
hbase(main):002:0>

So far so good.

I'd like to play with some code: https://github.com/hbaseinaction/twitbase

I downloaded it and built the package: mvn package, and got twitbase-1.0.0.jar.

When I try to execute the code I get:
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:host.name=
sandbox.hortonworks.com
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.version=1.6.0_30
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.vendor=Sun Microsystems Inc.
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.home=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.class.path=target/twitbase-1.0.0.jar
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.library.path=/usr/lib/jvm/java-1.6.0-
openjdk-1.6.0.0.x86_64/jre/lib/amd64/server:/usr/lib/jvm/
java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64:/usr/lib/
jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/../lib/amd64:/
usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.io.tmpdir=/tmp
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:java.compiler=
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:os.name
=Linux
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:os.arch=amd64
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:os.version=2.6.32-431.11.2.el6.x86_64
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:user.name
=hbase
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:user.home=/home/hbase
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
environment:user.dir=/home/hbase/twitbase-master
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Initiating client connection,
connectString=10.0.2.15:2181 sessionTimeout=18 watcher=hconnection
14/04/05 05:59:50 INFO zookeeper.ClientCnxn: Opening socket connection to
server /10.0.2.15:2181
14/04/05 05:59:50 INFO zookeeper.RecoverableZooKeeper: The identifier of
this process is 30...@sandbox.hortonworks.com
14/04/05 05:59:50 INFO client.ZooKeeperSaslClient: Client will not
SASL-authenticate because the default JAAS configuration section 'Client'
could not be found. If you are not using SASL, you may ignore this. On the
other hand, if you expected SASL to work, please fix your JAAS
configuration.
14/04/05 05:59:51 INFO zookeeper.ClientCnxn: Socket connection established
to sandbox.hortonworks.com/10.0.2.15:2181, initiating session
14/04/05 05:59:51 INFO zookeeper.ClientCnxn: Session establishment
complete on server sandbox.hortonworks.com/10.0.2.15:2181, sessio

Re: HBase Unable to find Region Server - No Exception being thrown

2014-04-09 Thread kanwal
No. Couldn't find any error in the log.


On Wed, Apr 9, 2014 at 10:26 PM, Shengjun Xin [via Apache HBase] <
ml-node+s679495n4058051...@n3.nabble.com> wrote:

> Does hbase-regionserver log have some error message?
>
>
> On Thu, Apr 10, 2014 at 3:27 AM, kanwal <[hidden 
> email]>
> wrote:
>
> > I'm currently running into an issue on my local setup where my
> application
> > is
> > unable to connect to the hbase table but I'm successfully able to query
> the
> > table using hbase shell.
> >
> > I'm using HTable client to make the connection and would expect to get
> an
> > error after certain retries when it's unable to establish connection.
> > However I'm seeing the code is continuously retrying and not logging any
> > error exception. I had to turn on debug to find the issue.
> >
> > Is there a setting that we could use to throw an exception after certain
> > number of retries?
> >
> > countersTable = new HTable(hbaseConfig, counters)
> >
> > Using HBase Version - 0.94.15
> >
> > 14-04-09 12:11:36 DEBUG
> HConnectionManager$HConnectionImplementation:1083 -
> > locateRegionInMeta parentTable=.META., metaLocation=null, attempt=0 of
> 10
> > failed; retrying after sleep of 1000 because: Unable to find region for
> > counter,,99 after 10 tries.
> > 14-04-09 12:11:37 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
> > Retrieved 50 byte(s) of data from znode /hbase/root-region-server and
> set
> > watcher; mseakdang.corp.service-now.co...
> > 14-04-09 12:11:37 DEBUG HConnectionManager$HConnectionImplementation:875
> -
> > Looked up root region location,
> >
> >
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421
> > ;
> > serverName=mseakdang.corp.service-now.com,60020,1397067169749
> > 14-04-09 12:12:35 DEBUG
> HConnectionManager$HConnectionImplementation:1083 -
> > locateRegionInMeta parentTable=-ROOT-,
> > metaLocation={region=-ROOT-,,0.70236052,
> > hostname=mseakdang.corp.service-now.com, port=60020}, attempt=0 of 10
> > failed; retrying after sleep of 1002 because: unknown host:
> > mseakdang.corp.service-now.com
> > 14-04-09 12:12:35 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
> > Retrieved 50 byte(s) of data from znode /hbase/root-region-server and
> set
> > watcher; mseakdang.corp.service-now.co...
> > 14-04-09 12:12:35 DEBUG HConnectionManager$HConnectionImplementation:875
> -
> > Looked up root region location,
> >
> >
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421
> > ;
> > serverName=mseakdang.corp.service-now.com,60020,1397067169749
> > 14-04-09 12:12:36 DEBUG
> HConnectionManager$HConnectionImplementation:1083 -
> > locateRegionInMeta parentTable=.META., metaLocation=null, attempt=0 of
> 10
> > failed; retrying after sleep of 1008 because: Unable to find region for
> > counter,,99 after 10 tries.
> > 14-04-09 12:12:37 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
> > Retrieved 50 byte(s) of data from znode /hbase/root-region-server and
> set
> > watcher; mseakdang.corp.service-now.co...
> > 14-04-09 12:12:37 DEBUG HConnectionManager$HConnectionImplementation:875
> -
> > Looked up root region location,
> >
> >
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421
> > ;
> > serverName=mseakdang.corp.service-now.com,60020,1397067169749
> > 14-04-09 12:13:35 DEBUG
> HConnectionManager$HConnectionImplementation:1083 -
> > locateRegionInMeta parentTable=-ROOT-,
> > metaLocation={region=-ROOT-,,0.70236052,
> > hostname=mseakdang.corp.service-now.com, port=60020}, attempt=0 of 10
> > failed; retrying after sleep of 1001 because: unknown host:
> > mseakdang.corp.service-now.com
> > 14-04-09 12:13:35 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
> > Retrieved 50 byte(s) of data from znode /hbase/root-region-server and
> set
> > watcher; mseakdang.corp.service-now.co...
> > 14-04-09 12:13:35 DEBUG HConnectionManager$HConnectionImplementation:875
> -
> > Looked up root region location,
> >
> >
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421
> > ;
> > serverName=mseakdang.corp.service-now.com,60020,1397067169749
> > 14-04-09 12:13:36 DEBUG
> HConnectionManager$HConnectionImplementation:1083 -
> > locateRegionInMeta parentTable=.META., metaLocation=null, attempt=1 of
> 10
> > failed; retrying after sleep of 1000 because: Unable to find region for
> > counter,,99 after 10 tries.
> >
> >
> >
> > --
> > View this message in context:
> >
> http://apache-hbase.679495.n3.nabble.com/HBase-Unable-to-find-Region-Server-No-Exception-being-thrown-tp4058033.html
> > Sent from the HBase User mailing list archive at Nabble.com.
> >
>
>
>
> --
> Regards
> Shengjun
>
>
> --
>  If you reply to this email, your message will be added to the discussion
> below:
>
> http://apache-hbase.679495.n3.nabble.com/HBase

Re: HBase Unable to find Region Server - No Exception being thrown

2014-04-09 Thread Shengjun Xin
Does hbase-regionserver log have some error message?


On Thu, Apr 10, 2014 at 3:27 AM, kanwal  wrote:

> I'm currently running into an issue on my local setup where my application
> is
> unable to connect to the hbase table but I'm successfully able to query the
> table using hbase shell.
>
> I'm using HTable client to make the connection and would expect to get an
> error after certain retries when it's unable to establish connection.
> However I'm seeing the code is continuously retrying and not logging any
> error exception. I had to turn on debug to find the issue.
>
> Is there a setting that we could use to throw an exception after certain
> number of retries?
>
> countersTable = new HTable(hbaseConfig, counters)
>
> Using HBase Version - 0.94.15
>
> 14-04-09 12:11:36 DEBUG HConnectionManager$HConnectionImplementation:1083 -
> locateRegionInMeta parentTable=.META., metaLocation=null, attempt=0 of 10
> failed; retrying after sleep of 1000 because: Unable to find region for
> counter,,99 after 10 tries.
> 14-04-09 12:11:37 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
> Retrieved 50 byte(s) of data from znode /hbase/root-region-server and set
> watcher; mseakdang.corp.service-now.co...
> 14-04-09 12:11:37 DEBUG HConnectionManager$HConnectionImplementation:875 -
> Looked up root region location,
>
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421
> ;
> serverName=mseakdang.corp.service-now.com,60020,1397067169749
> 14-04-09 12:12:35 DEBUG HConnectionManager$HConnectionImplementation:1083 -
> locateRegionInMeta parentTable=-ROOT-,
> metaLocation={region=-ROOT-,,0.70236052,
> hostname=mseakdang.corp.service-now.com, port=60020}, attempt=0 of 10
> failed; retrying after sleep of 1002 because: unknown host:
> mseakdang.corp.service-now.com
> 14-04-09 12:12:35 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
> Retrieved 50 byte(s) of data from znode /hbase/root-region-server and set
> watcher; mseakdang.corp.service-now.co...
> 14-04-09 12:12:35 DEBUG HConnectionManager$HConnectionImplementation:875 -
> Looked up root region location,
>
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421
> ;
> serverName=mseakdang.corp.service-now.com,60020,1397067169749
> 14-04-09 12:12:36 DEBUG HConnectionManager$HConnectionImplementation:1083 -
> locateRegionInMeta parentTable=.META., metaLocation=null, attempt=0 of 10
> failed; retrying after sleep of 1008 because: Unable to find region for
> counter,,99 after 10 tries.
> 14-04-09 12:12:37 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
> Retrieved 50 byte(s) of data from znode /hbase/root-region-server and set
> watcher; mseakdang.corp.service-now.co...
> 14-04-09 12:12:37 DEBUG HConnectionManager$HConnectionImplementation:875 -
> Looked up root region location,
>
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421
> ;
> serverName=mseakdang.corp.service-now.com,60020,1397067169749
> 14-04-09 12:13:35 DEBUG HConnectionManager$HConnectionImplementation:1083 -
> locateRegionInMeta parentTable=-ROOT-,
> metaLocation={region=-ROOT-,,0.70236052,
> hostname=mseakdang.corp.service-now.com, port=60020}, attempt=0 of 10
> failed; retrying after sleep of 1001 because: unknown host:
> mseakdang.corp.service-now.com
> 14-04-09 12:13:35 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
> Retrieved 50 byte(s) of data from znode /hbase/root-region-server and set
> watcher; mseakdang.corp.service-now.co...
> 14-04-09 12:13:35 DEBUG HConnectionManager$HConnectionImplementation:875 -
> Looked up root region location,
>
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421
> ;
> serverName=mseakdang.corp.service-now.com,60020,1397067169749
> 14-04-09 12:13:36 DEBUG HConnectionManager$HConnectionImplementation:1083 -
> locateRegionInMeta parentTable=.META., metaLocation=null, attempt=1 of 10
> failed; retrying after sleep of 1000 because: Unable to find region for
> counter,,99 after 10 tries.
>
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/HBase-Unable-to-find-Region-Server-No-Exception-being-thrown-tp4058033.html
> Sent from the HBase User mailing list archive at Nabble.com.
>



-- 
Regards
Shengjun


Re: restore snapshot on another cluster

2014-04-09 Thread Ted Yu
You can either give user hbase write access to /apps, or use another
directory where user hbase can write to.

Cheers
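
For reference, ExportSnapshot can also set the owner, group and mode on the copied files while exporting, which avoids fixing permissions up by hand on the destination. A sketch; the namenode URL, target directory, snapshot name and mapper count are placeholders, and the -chuser/-chgroup/-chmod flags should be double-checked against the snapshot section of the HBase book for your release:

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot MySnapshot \
  -copy-to hdfs://cluster2-namenode:8020/apps/hbase/data \
  -chuser hbase -chgroup hdfs -chmod 700 \
  -mappers 16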


On Wed, Apr 9, 2014 at 6:28 PM, Artem Ervits  wrote:

> 2014-03-21 11:35:36,998 FATAL [master:server:6] master.HMaster:
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=hbase, access=WRITE, inode="/apps":hdfs:hdfs:drwxr-xr-x
> at
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)
> at
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:214)
> at
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:158)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5202)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5184)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5158)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3405)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3375)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3349)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:724)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:502)
> at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59598)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
>
> -Original Message-
> From: Artem Ervits
> Sent: Wednesday, April 09, 2014 5:58 PM
> To: 'user@hbase.apache.org'
> Subject: Re: restore snapshot on another cluster
>
> I read it thanks. I'm getting privilege access exception hbase user on
> write permission. I don't have the exception handy. I will supply it
> shortly. I'm using 0.96.1.1 on hadoop 2.
>
>
> Artem Ervits
> Data Analyst
> New York Presbyterian Hospital
>
> - Original Message -
> From: Ted Yu [mailto:yuzhih...@gmail.com]
> Sent: Wednesday, April 09, 2014 05:48 PM
> To: user@hbase.apache.org 
> Subject: Re: restore snapshot on another cluster
>
> Can you give us some more detail such as:
>
> the HBase release you're using
> the stack trace of permission error
>
> I assume you have read 15.8.7 and 15.8.8 of:
> http://hbase.apache.org/book.html#ops.snapshots
>
> Cheers
>
>
> On Wed, Apr 9, 2014 at 3:08 PM, Artem Ervits  wrote:
>
> > Hello all,
> >
> > When I take a snapshot on cluster 1, copy it to cluster 2 using
> > ExportSnapshot utility, what permissions should I set on the snapshot
> > to be able to clone it into a new table? I matched the permissions of
> > the external snapshot to the permissions of any local snapshots on
> > cluster 2 and still get hbase user write permission errors. Am I missing
> something?
> >
> > Thanks
> >
> > 
> >
> > This electronic message is intended to be for the use only of the
> > named recipient, and may contain information that is confidential or
> privileged.
> > If you are not the intended recipient, you are hereby notified that
> > any disclosure, copying, distribution or use of the contents of this
> > message is strictly prohibited. If you have received this message in
> > error or are not the named recipient, please notify us immediately by
> > contacting the sender at the electronic mail address noted above, and
> > delete and destroy all copies of this message. Thank you.
> >
> > This electronic message is intended to be for the use only of the
> > named recipient, and may contain information that is confidential or
> privileged.
> >  If you are not the intended recipient, you are hereby notified that
> > any disclosure, copying, distribution or use of the contents of this
> > message is strictly prohibited.  If you have received this message in
> > error or are not the named recipient, please notify us immediately by
> > contacting the sender at the electronic mail address noted above, and
> > delete and destroy all copies of this message.  

RE: restore snapshot on another cluster

2014-04-09 Thread Artem Ervits
2014-03-21 11:35:36,998 FATAL [master:server:6] master.HMaster: Unhandled 
exception. Starting shutdown.
org.apache.hadoop.security.AccessControlException: Permission denied: 
user=hbase, access=WRITE, inode="/apps":hdfs:hdfs:drwxr-xr-x
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:214)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:158)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5202)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5184)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5158)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3405)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3375)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3349)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:724)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:502)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59598)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)

-Original Message-
From: Artem Ervits
Sent: Wednesday, April 09, 2014 5:58 PM
To: 'user@hbase.apache.org'
Subject: Re: restore snapshot on another cluster

I read it thanks. I'm getting privilege access exception hbase user on write 
permission. I don't have the exception handy. I will supply it shortly. I'm 
using 0.96.1.1 on hadoop 2.


Artem Ervits
Data Analyst
New York Presbyterian Hospital

- Original Message -
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Wednesday, April 09, 2014 05:48 PM
To: user@hbase.apache.org 
Subject: Re: restore snapshot on another cluster

Can you give us some more detail such as:

the HBase release you're using
the stack trace of permission error

I assume you have read 15.8.7 and 15.8.8 of:
http://hbase.apache.org/book.html#ops.snapshots

Cheers


On Wed, Apr 9, 2014 at 3:08 PM, Artem Ervits  wrote:

> Hello all,
>
> When I take a snapshot on cluster 1, copy it to cluster 2 using
> ExportSnapshot utility, what permissions should I set on the snapshot
> to be able to clone it into a new table? I matched the permissions of
> the external snapshot to the permissions of any local snapshots on
> cluster 2 and still get hbase user write permission errors. Am I missing 
> something?
>
> Thanks
>
> 
>
> This electronic message is intended to be for the use only of the
> named recipient, and may contain information that is confidential or 
> privileged.
> If you are not the intended recipient, you are hereby notified that
> any disclosure, copying, distribution or use of the contents of this
> message is strictly prohibited. If you have received this message in
> error or are not the named recipient, please notify us immediately by
> contacting the sender at the electronic mail address noted above, and
> delete and destroy all copies of this message. Thank you.
>
> This electronic message is intended to be for the use only of the
> named recipient, and may contain information that is confidential or 
> privileged.
>  If you are not the intended recipient, you are hereby notified that
> any disclosure, copying, distribution or use of the contents of this
> message is strictly prohibited.  If you have received this message in
> error or are not the named recipient, please notify us immediately by
> contacting the sender at the electronic mail address noted above, and
> delete and destroy all copies of this message.  Thank you.



This electronic message is intended to be for the use only of the named 
recipient, and may contain information that is confidential or privileged. If 
you are not the intended recipient, you are hereby notified that any 
disclosure, copying, distribution or use of the contents of this message is 

Re: restore snapshot on another cluster

2014-04-09 Thread Artem Ervits
I read it thanks. I'm getting privilege access exception hbase user on write 
permission. I don't have the exception handy. I will supply it shortly. I'm 
using 0.96.1.1 on hadoop 2.


Artem Ervits
Data Analyst
New York Presbyterian Hospital

- Original Message -
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Wednesday, April 09, 2014 05:48 PM
To: user@hbase.apache.org 
Subject: Re: restore snapshot on another cluster

Can you give us some more detail such as:

the HBase release you're using
the stack trace of permission error

I assume you have read 15.8.7 and 15.8.8 of:
http://hbase.apache.org/book.html#ops.snapshots

Cheers


On Wed, Apr 9, 2014 at 3:08 PM, Artem Ervits  wrote:

> Hello all,
>
> When I take a snapshot on cluster 1, copy it to cluster 2 using
> ExportSnapshot utility, what permissions should I set on the snapshot to be
> able to clone it into a new table? I matched the permissions of the
> external snapshot to the permissions of any local snapshots on cluster 2
> and still get hbase user write permission errors. Am I missing something?
>
> Thanks
>
> 
>
> This electronic message is intended to be for the use only of the named
> recipient, and may contain information that is confidential or privileged.
> If you are not the intended recipient, you are hereby notified that any
> disclosure, copying, distribution or use of the contents of this message is
> strictly prohibited. If you have received this message in error or are not
> the named recipient, please notify us immediately by contacting the sender
> at the electronic mail address noted above, and delete and destroy all
> copies of this message. Thank you.
>
> This electronic message is intended to be for the use only of the named
> recipient, and may contain information that is confidential or privileged.
>  If you are not the intended recipient, you are hereby notified that any
> disclosure, copying, distribution or use of the contents of this message is
> strictly prohibited.  If you have received this message in error or are not
> the named recipient, please notify us immediately by contacting the sender
> at the electronic mail address noted above, and delete and destroy all
> copies of this message.  Thank you.



This electronic message is intended to be for the use only of the named 
recipient, and may contain information that is confidential or privileged. If 
you are not the intended recipient, you are hereby notified that any 
disclosure, copying, distribution or use of the contents of this message is 
strictly prohibited. If you have received this message in error or are not the 
named recipient, please notify us immediately by contacting the sender at the 
electronic mail address noted above, and delete and destroy all copies of this 
message. Thank you.



Re: ZooKeeper available but no active master location found

2014-04-09 Thread Ted Yu
Have you modified pom.xml of twitbase ?
If not, this is the dependency you get:

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.92.1</version>
</dependency>

0.92.1 and 0.96.0 are not compatible.

Cheers


On Wed, Apr 9, 2014 at 10:58 AM, Margusja  wrote:

> Hi
>
> I downloaded and installed hortonworks sandbox 2.0 for virtualbox.
> HBase version is: 0.96.0.2.0.6.0-76-hadoop2, re6d7a56f72914d01e55c0478d74e5
> cfd3778f231
> [hbase@sandbox twitbase-master]$ cat /etc/hosts
> # Do not remove the following line, or various programs
> # that require network functionality will fail.
> 127.0.0.1   localhost.localdomain localhost
> 10.0.2.15   sandbox.hortonworks.com sandbox
>
> [hbase@sandbox twitbase-master]$ hostname
> sandbox.hortonworks.com
>
> [root@sandbox ~]# netstat -lnp | grep 2181
> tcp0  0 0.0.0.0:2181 0.0.0.0:*   LISTEN
>  19359/java
>
> [root@sandbox ~]# netstat -lnp | grep 6
> tcp0  0 10.0.2.15:6 0.0.0.0:*   LISTEN
>28549/java
>
> [hbase@sandbox twitbase-master]$ hbase shell
> 14/04/05 05:56:44 INFO Configuration.deprecation: hadoop.native.lib is
> deprecated. Instead, use io.native.lib.available
> HBase Shell; enter 'help' for list of supported commands.
> Type "exit" to leave the HBase Shell
> Version 0.96.0.2.0.6.0-76-hadoop2, re6d7a56f72914d01e55c0478d74e5cfd3778f231,
> Thu Oct 17 18:15:20 PDT 2013
>
> hbase(main):001:0> list
> TABLE
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/
> lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/
> slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> ambarismoketest
> mytable
> simple_hcat_load_table
> users
> weblogs
> 5 row(s) in 4.6040 seconds
>
> => ["ambarismoketest", "mytable", "simple_hcat_load_table", "users",
> "weblogs"]
> hbase(main):002:0>
>
> So far is good.
>
> I'd like to play with a code: https://github.com/hbaseinaction/twitbase
>
> downloaded and made package: mvn package and  got twitbase-1.0.0.jar.
>
> When I try to exec code I will get:
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:host.name=
> sandbox.hortonworks.com
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:java.version=1.6.0_30
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:java.vendor=Sun Microsystems Inc.
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:java.home=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:java.class.path=target/twitbase-1.0.0.jar
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:java.library.path=/usr/lib/jvm/java-1.6.0-
> openjdk-1.6.0.0.x86_64/jre/lib/amd64/server:/usr/lib/jvm/
> java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64:/usr/lib/
> jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/../lib/amd64:/
> usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:java.io.tmpdir=/tmp
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:java.compiler=
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:os.name
> =Linux
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:os.arch=amd64
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:os.version=2.6.32-431.11.2.el6.x86_64
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:user.name
> =hbase
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:user.home=/home/hbase
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client
> environment:user.dir=/home/hbase/twitbase-master
> 14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Initiating client connection,
> connectString=10.0.2.15:2181 sessionTimeout=18 watcher=hconnection
> 14/04/05 05:59:50 INFO zookeeper.ClientCnxn: Opening socket connection to
> server /10.0.2.15:2181
> 14/04/05 05:59:50 INFO zookeeper.RecoverableZooKeeper: The identifier of
> this process is 30...@sandbox.hortonworks.com
> 14/04/05 05:59:50 INFO client.ZooKeeperSaslClient: Client will not
> SASL-authenticate because the default JAAS configuration section 'Client'
> could not be found. If you are not using SASL, you may ignore this. On the
> other hand, if you expected SASL to work, please fix your JAAS
> configuration.
> 14/04/05 05:59:51 INFO zookeeper.ClientCnxn: Socket connection established
> to sandbox.hortonworks.com/10.0.2.15:2181, initiating session
> 14/04/05 05:59:51 INFO zookeeper.ClientCnxn: Session establishment
> complete on server sandbox.hortonworks.com/10.0.2.15:2181, sessionid =
> 0x1453145e9500038, negotiated timeout = 4
> 14/04/05 05:59:51 INFO client.HConnect

Re: restore snapshot on another cluster

2014-04-09 Thread Ted Yu
Can you give us some more detail such as:

the HBase release you're using
the stack trace of permission error

I assume you have read 15.8.7 and 15.8.8 of:
http://hbase.apache.org/book.html#ops.snapshots

Cheers


On Wed, Apr 9, 2014 at 3:08 PM, Artem Ervits  wrote:

> Hello all,
>
> When I take a snapshot on cluster 1, copy it to cluster 2 using
> ExportSnapshot utility, what permissions should I set on the snapshot to be
> able to clone it into a new table? I matched the permissions of the
> external snapshot to the permissions of any local snapshots on cluster 2
> and still get hbase user write permission errors. Am I missing something?
>
> Thanks
>
> 
>
> This electronic message is intended to be for the use only of the named
> recipient, and may contain information that is confidential or privileged.
> If you are not the intended recipient, you are hereby notified that any
> disclosure, copying, distribution or use of the contents of this message is
> strictly prohibited. If you have received this message in error or are not
> the named recipient, please notify us immediately by contacting the sender
> at the electronic mail address noted above, and delete and destroy all
> copies of this message. Thank you.
>
> This electronic message is intended to be for the use only of the named
> recipient, and may contain information that is confidential or privileged.
>  If you are not the intended recipient, you are hereby notified that any
> disclosure, copying, distribution or use of the contents of this message is
> strictly prohibited.  If you have received this message in error or are not
> the named recipient, please notify us immediately by contacting the sender
> at the electronic mail address noted above, and delete and destroy all
> copies of this message.  Thank you.


Re: BlockCache for large scans.

2014-04-09 Thread Ted Yu
Didn't quite get what you mean, Asaf.

If you're talking about HBASE-5349, please read release note of HBASE-5349.

By default, memstore min/max range is initialized to memstore percent:

globalMemStorePercentMinRange = conf.getFloat(MEMSTORE_SIZE_MIN_RANGE_KEY,
    globalMemStorePercent);

globalMemStorePercentMaxRange = conf.getFloat(MEMSTORE_SIZE_MAX_RANGE_KEY,
    globalMemStorePercent);

Cheers


On Wed, Apr 9, 2014 at 3:17 PM, Asaf Mesika  wrote:

> The JIRA says it's enabled automatically. Is there an official document
> explaining this feature?
>
> On Wednesday, April 9, 2014, Ted Yu  wrote:
>
> > Please take a look at http://www.n10k.com/blog/blockcache-101/
> >
> > For D, hbase.regionserver.global.memstore.size is specified in terms of
> > percentage of heap. Unless you enable HBASE-5349 'Automagically tweak
> > global memstore and block cache sizes based on workload'
> >
> >
> > On Wed, Apr 9, 2014 at 12:24 AM, gortiz >
> > wrote:
> >
> > > I've been reading the Definitive Guide and HBase in Action a little.
> > > I found this question from Cloudera that I'm not sure about after looking
> > > at some benchmarks and documentation for HBase. Could someone explain it a
> > > little? I think that when you do a large scan you should disable the block
> > > cache, because the blocks are going to be evicted constantly, so you don't
> > > get anything from the cache; I'd guess you're even penalized, since you're
> > > spending memory, GC and CPU on this task.
> > >
> > > *You want to do a full table scan on your data. You decide to disable
> > > block caching to see if this**
> > > **improves scan performance. Will disabling block caching improve scan
> > > performance?*
> > >
> > > A.
> > > No. Disabling block caching does not improve scan performance.
> > >
> > > B.
> > > Yes. When you disable block caching, you free up that memory for other
> > > operations. With a full
> > > table scan, you cannot take advantage of block caching anyway because
> > your
> > > entire table won't fit
> > > into cache.
> > >
> > > C.
> > > No. If you disable block caching, HBase must read each block index from
> > > disk for each scan,
> > > thereby decreasing scan performance.
> > >
> > > D.
> > > Yes. When you disable block caching, you free up memory for MemStore,
> > > which improves,
> > > scan performance.
> > >
> > >
> >
>


Re: BlockCache for large scans.

2014-04-09 Thread Asaf Mesika
The JIRA says it's enabled automatically. Is there an official document
explaining this feature?

On Wednesday, April 9, 2014, Ted Yu  wrote:

> Please take a look at http://www.n10k.com/blog/blockcache-101/
>
> For D, hbase.regionserver.global.memstore.size is specified in terms of
> percentage of heap. Unless you enable HBASE-5349 'Automagically tweak
> global memstore and block cache sizes based on workload'
>
>
> On Wed, Apr 9, 2014 at 12:24 AM, gortiz >
> wrote:
>
> > I've been reading the Definitive Guide and HBase in Action a little.
> > I found this question from Cloudera that I'm not sure about after looking at
> > some benchmarks and documentation for HBase. Could someone explain it a
> > little? I think that when you do a large scan you should disable the block
> > cache, because the blocks are going to be evicted constantly, so you don't
> > get anything from the cache; I'd guess you're even penalized, since you're
> > spending memory, GC and CPU on this task.
> >
> > *You want to do a full table scan on your data. You decide to disable
> > block caching to see if this**
> > **improves scan performance. Will disabling block caching improve scan
> > performance?*
> >
> > A.
> > No. Disabling block caching does not improve scan performance.
> >
> > B.
> > Yes. When you disable block caching, you free up that memory for other
> > operations. With a full
> > table scan, you cannot take advantage of block caching anyway because
> your
> > entire table won't fit
> > into cache.
> >
> > C.
> > No. If you disable block caching, HBase must read each block index from
> > disk for each scan,
> > thereby decreasing scan performance.
> >
> > D.
> > Yes. When you disable block caching, you free up memory for MemStore,
> > which improves,
> > scan performance.
> >
> >
>


restore snapshot on another cluster

2014-04-09 Thread Artem Ervits
Hello all,

When I take a snapshot on cluster 1, copy it to cluster 2 using ExportSnapshot 
utility, what permissions should I set on the snapshot to be able to clone it 
into a new table? I matched the permissions of the external snapshot to the 
permissions of any local snapshots on cluster 2 and still get hbase user write 
permission errors. Am I missing something?

Thanks
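
For reference, once the exported snapshot is readable by the hbase user on cluster 2, cloning it into a new table is a shell operation; a minimal sketch with placeholder names:

hbase shell
hbase(main):001:0> list_snapshots
hbase(main):002:0> clone_snapshot 'MySnapshot', 'MyNewTable'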



This electronic message is intended to be for the use only of the named 
recipient, and may contain information that is confidential or privileged. If 
you are not the intended recipient, you are hereby notified that any 
disclosure, copying, distribution or use of the contents of this message is 
strictly prohibited. If you have received this message in error or are not the 
named recipient, please notify us immediately by contacting the sender at the 
electronic mail address noted above, and delete and destroy all copies of this 
message. Thank you.


Re: HBase 0.98.1 split policy change?

2014-04-09 Thread Jean-Marc Spaggiari
Ok. From the logs, I can clearly see that my settings are not getting used:

2014-04-09 15:35:03,737 DEBUG [MemStoreFlusher.0]
regionserver.IncreasingToUpperBoundRegionSplitPolicy: ShouldSplit because
info size=2242479890, sizeToCheck=2147483648, regionsWithCommonTable=2

I will try to figure out why and report.

JM



2014-04-09 15:25 GMT-04:00 Jean-Marc Spaggiari :

> Hi,
>
> I am trying to change my HBase 0.98.1 split policy to be
> ConstantSizeRegionSplitPolicy, so I have updated my hbase-site.xml to be
> this:
>
>   
> hbase.regionserver.region.split.policy
>
> org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy
>   
>
> Then I have altered my table to get this, where the region size should be
> 100GB:
> hbase(main):002:0> describe 'TestTable'
> DESCRIPTION
>
>  'TestTable', {TABLE_ATTRIBUTES => {MAX_FILESIZE => '1073741824000'},
> {NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
> REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '
> 2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false',
> BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE =>
> 'true'}
>
> 1 row(s) in 0.2480 seconds
>
> However, I'm still getting multiple small regions when I try randomWrite
> into the table:
> hadoop@hbasetest1:~$ bin/hadoop fs -du -h /hbase/data/default/TestTable
> 14/04/09 15:08:13 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 320/hbase/data/default/TestTable/.tabledesc
> 0  /hbase/data/default/TestTable/.tmp
> 1.6 G  /hbase/data/default/TestTable/1ad732fefba8ab6080820fb235fd5180
> 2.0 G  /hbase/data/default/TestTable/585b3e11d64cabcff332a2e293a37fee
> 1.5 G  /hbase/data/default/TestTable/6e8c24a2cc03051bb8a0990e22b8ec21
> 1.5 G  /hbase/data/default/TestTable/73fc51f6d6d75e57be29a1f7d54ef1df
> 1.4 G  /hbase/data/default/TestTable/ac69f15956bf413805982439a05527fd
> 2.5 G  /hbase/data/default/TestTable/cfb60af42b92fd9c2abfab80306d257c
>
> With the split policy setting and the alter, I would have expected to have
> a single region of a few GB, not multiple regions. I checked the config in
> the WebUI and I can see my split policy entry. What did I miss? I checked
> ConstantSizeRegionSplitPolicy in 0.98 and it uses MAX_FILESIZE from the
> table, falling back to hbase.hregion.max.filesize, which is not set for me.
> Any idea?
>
> JM
>


HBase Unable to find Region Server - No Exception being thrown

2014-04-09 Thread kanwal
I'm currently running into an issue on my local setup where my application is
unable to connect to the hbase table but I'm successfully able to query the
table using hbase shell.

I'm using HTable client to make the connection and would expect to get an
error after certain retries when it's unable to establish connection.
However I'm seeing the code is continuously retrying and not logging any
error exception. I had to turn on debug to find the issue.

Is there a setting that we could use to throw an exception after certain
number of retries?

countersTable = new HTable(hbaseConfig, counters)

Using HBase Version - 0.94.15
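
For reference, on 0.94 the closest thing to a fail-fast switch is usually to tighten the client retry settings on the Configuration handed to HTable, so the connection problem surfaces as an exception (typically RetriesExhaustedException) sooner. A sketch; the values are placeholders and the defaults differ per release:

// tighten client-side retries so the failure is thrown sooner instead of
// being retried for a long time (values are placeholders)
Configuration hbaseConfig = HBaseConfiguration.create();
hbaseConfig.setInt("hbase.client.retries.number", 3);   // fewer location/RPC retries
hbaseConfig.setLong("hbase.client.pause", 500);         // ms to sleep between retries
HTable countersTable = new HTable(hbaseConfig, "counters");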

14-04-09 12:11:36 DEBUG HConnectionManager$HConnectionImplementation:1083 -
locateRegionInMeta parentTable=.META., metaLocation=null, attempt=0 of 10
failed; retrying after sleep of 1000 because: Unable to find region for
counter,,99 after 10 tries.
14-04-09 12:11:37 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
Retrieved 50 byte(s) of data from znode /hbase/root-region-server and set
watcher; mseakdang.corp.service-now.co...
14-04-09 12:11:37 DEBUG HConnectionManager$HConnectionImplementation:875 -
Looked up root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421;
serverName=mseakdang.corp.service-now.com,60020,1397067169749
14-04-09 12:12:35 DEBUG HConnectionManager$HConnectionImplementation:1083 -
locateRegionInMeta parentTable=-ROOT-,
metaLocation={region=-ROOT-,,0.70236052,
hostname=mseakdang.corp.service-now.com, port=60020}, attempt=0 of 10
failed; retrying after sleep of 1002 because: unknown host:
mseakdang.corp.service-now.com
14-04-09 12:12:35 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
Retrieved 50 byte(s) of data from znode /hbase/root-region-server and set
watcher; mseakdang.corp.service-now.co...
14-04-09 12:12:35 DEBUG HConnectionManager$HConnectionImplementation:875 -
Looked up root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421;
serverName=mseakdang.corp.service-now.com,60020,1397067169749
14-04-09 12:12:36 DEBUG HConnectionManager$HConnectionImplementation:1083 -
locateRegionInMeta parentTable=.META., metaLocation=null, attempt=0 of 10
failed; retrying after sleep of 1008 because: Unable to find region for
counter,,99 after 10 tries.
14-04-09 12:12:37 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
Retrieved 50 byte(s) of data from znode /hbase/root-region-server and set
watcher; mseakdang.corp.service-now.co...
14-04-09 12:12:37 DEBUG HConnectionManager$HConnectionImplementation:875 -
Looked up root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421;
serverName=mseakdang.corp.service-now.com,60020,1397067169749
14-04-09 12:13:35 DEBUG HConnectionManager$HConnectionImplementation:1083 -
locateRegionInMeta parentTable=-ROOT-,
metaLocation={region=-ROOT-,,0.70236052,
hostname=mseakdang.corp.service-now.com, port=60020}, attempt=0 of 10
failed; retrying after sleep of 1001 because: unknown host:
mseakdang.corp.service-now.com
14-04-09 12:13:35 DEBUG ZKUtil:1597 - hconnection-0x14547b13745000e
Retrieved 50 byte(s) of data from znode /hbase/root-region-server and set
watcher; mseakdang.corp.service-now.co...
14-04-09 12:13:35 DEBUG HConnectionManager$HConnectionImplementation:875 -
Looked up root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@300b6421;
serverName=mseakdang.corp.service-now.com,60020,1397067169749
14-04-09 12:13:36 DEBUG HConnectionManager$HConnectionImplementation:1083 -
locateRegionInMeta parentTable=.META., metaLocation=null, attempt=1 of 10
failed; retrying after sleep of 1000 because: Unable to find region for
counter,,99 after 10 tries.



--
View this message in context: 
http://apache-hbase.679495.n3.nabble.com/HBase-Unable-to-find-Region-Server-No-Exception-being-thrown-tp4058033.html
Sent from the HBase User mailing list archive at Nabble.com.


HBase 0.98.1 split policy change?

2014-04-09 Thread Jean-Marc Spaggiari
Hi,

I am trying to change my HBase 0.98.1 split policy to be
ConstantSizeRegionSplitPolicy, so I have updated my hbase-site.xml to be
this:

  
hbase.regionserver.region.split.policy

org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy
  

Then I have altered my table to get this, where the region size should be
100GB:
hbase(main):002:0> describe 'TestTable'
DESCRIPTION

 'TestTable', {TABLE_ATTRIBUTES => {MAX_FILESIZE => '1073741824000'}, {NAME
=> 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL =>
'2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE
=> '65536', IN_MEMORY => 'false', BLOCKCACHE =>
'true'}

1 row(s) in 0.2480 seconds

However, I'm still getting multiple small regions when I try randomWrite
into the table:
hadoop@hbasetest1:~$ bin/hadoop fs -du -h /hbase/data/default/TestTable
14/04/09 15:08:13 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
320/hbase/data/default/TestTable/.tabledesc
0  /hbase/data/default/TestTable/.tmp
1.6 G  /hbase/data/default/TestTable/1ad732fefba8ab6080820fb235fd5180
2.0 G  /hbase/data/default/TestTable/585b3e11d64cabcff332a2e293a37fee
1.5 G  /hbase/data/default/TestTable/6e8c24a2cc03051bb8a0990e22b8ec21
1.5 G  /hbase/data/default/TestTable/73fc51f6d6d75e57be29a1f7d54ef1df
1.4 G  /hbase/data/default/TestTable/ac69f15956bf413805982439a05527fd
2.5 G  /hbase/data/default/TestTable/cfb60af42b92fd9c2abfab80306d257c

With the split policy setting and the alter, I would have expected to have a
single region of a few GB, not multiple regions. I checked the config in the
WebUI and I can see my split policy entry. What did I miss? I checked
ConstantSizeRegionSplitPolicy in 0.98 and it uses MAX_FILESIZE from the table,
falling back to hbase.hregion.max.filesize, which is not set for me. Any idea?

JM
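
For reference, besides the cluster-wide hbase.regionserver.region.split.policy (which region servers only pick up from hbase-site.xml after a restart), the policy can be pinned on the table itself through the SPLIT_POLICY table attribute, which RegionSplitPolicy checks before falling back to the configuration. A sketch with the 0.96/0.98-era admin API; the table name and size are the ones from this thread, and disabling the table around modifyTable is the conservative route:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class PinSplitPolicy {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      TableName name = TableName.valueOf("TestTable");
      HTableDescriptor desc = admin.getTableDescriptor(name);
      // pin the split policy and max region size on the table descriptor
      desc.setValue("SPLIT_POLICY",
          "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy");
      desc.setMaxFileSize(100L * 1024 * 1024 * 1024);   // 100 GB, same as MAX_FILESIZE above
      admin.disableTable(name);
      admin.modifyTable(name, desc);
      admin.enableTable(name);
    } finally {
      admin.close();
    }
  }
}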


Re: CDH5 b2 - Hbase 0.96 - REST not returning data

2014-04-09 Thread stack
On Wednesday, April 9, 2014 6:18:35 AM UTC-7, Juraj jiv wrote:

> Hello,
> I have one table in HBase with 250GB of data and a problem while using the
> HBase REST scanner.
>
> What I do is:
> 1. calling http://:20550//scanner
> with POST and Content-Type: text/xml
> 
>
> 2. then getting the Location header and calling HTTP GET on the scanner
> address with:
> "accept", "application/json"
> And the GET ends with no data returned - HTTP 204.
> In the log file I see - org.apache.hadoop.hbase.rest.ScannerInstanceResource:
> generator exhausted
> What does it mean? It's INFO so I guess it's not a problem.
>
> The HBase shell from the command line works fine; I tried a similar scan
> command. Another table with 7GB of data works fine when scanned via REST
> (also JSON).
> Any ideas what could be wrong? Hbase version - HBase 
> 0.96.1.1-cdh5.0.0-beta-2 
>
> JV
>

You verified from shell that there is data in the table?

What if you do not stipulate end and start values?  Does the scan work?

Try enabling DEBUG on the target server.

You get 'generator exhausted' when no value was returned by your query:

  if (value == null) {
    LOG.info("generator exhausted");
    // respond with 204 (No Content) if an empty cell set would be
    // returned
    if (count == limit) {
      return Response.noContent().build();
    }
    break;
  }

St.Ack
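
For completeness, the two-step protocol the reporter describes looks roughly like this with curl; the host, table name, scanner body and scanner id are placeholders (the original <Scanner> XML body was stripped by the archive):

# 1. create a scanner; the response carries a Location header with the scanner URL
curl -v -X POST \
  -H "Content-Type: text/xml" \
  -d '<Scanner batch="10"/>' \
  http://resthost:20550/mytable/scanner

# 2. GET batches from the returned Location until the server answers 204 (scanner exhausted)
curl -v -H "Accept: application/json" \
  http://resthost:20550/mytable/scanner/145314abc0123456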



 


ZooKeeper available but no active master location found

2014-04-09 Thread Margusja

Hi

I downloaded and installed hortonworks sandbox 2.0 for virtualbox.
HBase version is: 0.96.0.2.0.6.0-76-hadoop2, 
re6d7a56f72914d01e55c0478d74e5cfd3778f231

[hbase@sandbox twitbase-master]$ cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1   localhost.localdomain localhost
10.0.2.15   sandbox.hortonworks.com sandbox

[hbase@sandbox twitbase-master]$ hostname
sandbox.hortonworks.com

[root@sandbox ~]# netstat -lnp | grep 2181
tcp0  0 0.0.0.0:2181 0.0.0.0:*   LISTEN  
19359/java


[root@sandbox ~]# netstat -lnp | grep 6
tcp0  0 10.0.2.15:6 0.0.0.0:*   
LISTEN  28549/java


[hbase@sandbox twitbase-master]$ hbase shell
14/04/05 05:56:44 INFO Configuration.deprecation: hadoop.native.lib is 
deprecated. Instead, use io.native.lib.available

HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.96.0.2.0.6.0-76-hadoop2, 
re6d7a56f72914d01e55c0478d74e5cfd3778f231, Thu Oct 17 18:15:20 PDT 2013


hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
explanation.

ambarismoketest
mytable
simple_hcat_load_table
users
weblogs
5 row(s) in 4.6040 seconds

=> ["ambarismoketest", "mytable", "simple_hcat_load_table", "users", 
"weblogs"]

hbase(main):002:0>

So far so good.

I'd like to play with some code: https://github.com/hbaseinaction/twitbase

I downloaded it and built the package: mvn package, and got twitbase-1.0.0.jar.

When I try to execute the code I get:
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:host.name=sandbox.hortonworks.com
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:java.version=1.6.0_30
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:java.vendor=Sun Microsystems Inc.
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:java.home=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:java.class.path=target/twitbase-1.0.0.jar
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:java.library.path=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:java.io.tmpdir=/tmp
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:java.compiler=

14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:os.version=2.6.32-431.11.2.el6.x86_64
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:user.name=hbase
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:user.home=/home/hbase
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Client 
environment:user.dir=/home/hbase/twitbase-master
14/04/05 05:59:50 INFO zookeeper.ZooKeeper: Initiating client 
connection, connectString=10.0.2.15:2181 sessionTimeout=18 
watcher=hconnection
14/04/05 05:59:50 INFO zookeeper.ClientCnxn: Opening socket connection 
to server /10.0.2.15:2181
14/04/05 05:59:50 INFO zookeeper.RecoverableZooKeeper: The identifier of 
this process is 30...@sandbox.hortonworks.com
14/04/05 05:59:50 INFO client.ZooKeeperSaslClient: Client will not 
SASL-authenticate because the default JAAS configuration section 
'Client' could not be found. If you are not using SASL, you may ignore 
this. On the other hand, if you expected SASL to work, please fix your 
JAAS configuration.
14/04/05 05:59:51 INFO zookeeper.ClientCnxn: Socket connection 
established to sandbox.hortonworks.com/10.0.2.15:2181, initiating session
14/04/05 05:59:51 INFO zookeeper.ClientCnxn: Session establishment 
complete on server sandbox.hortonworks.com/10.0.2.15:2181, sessionid = 
0x1453145e9500038, negotiated timeout = 4
14/04/05 05:59:51 INFO 
client.HConnectionManager$HConnectionImplementation: ZooKeeper available 
but no active master location found
14/04/05 05:59:51 INFO 
client.HConnectionManager$HConnectionImplementation: getMaster attempt 0 
of 10 failed; retrying after sleep of 1000

org.apache.hadoop.hbase.MasterNotRunningException
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionMan
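
For what it's worth, a quick way to check from client code whether a given
configuration can locate the active master at all is
HBaseAdmin.checkHBaseAvailable(). A minimal sketch, assuming the 0.96 client
jars are on the classpath and ZooKeeper runs at 10.0.2.15:2181 as in the log
above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CheckMaster {
        public static void main(String[] args) {
            Configuration conf = HBaseConfiguration.create();
            // Values taken from the log above; adjust for your cluster.
            conf.set("hbase.zookeeper.quorum", "10.0.2.15");
            conf.set("hbase.zookeeper.property.clientPort", "2181");
            try {
                // Throws MasterNotRunningException (among others) when no
                // active master can be found with this config and client jars.
                HBaseAdmin.checkHBaseAvailable(conf);
                System.out.println("Active master found; client can connect.");
            } catch (Exception e) {
                System.out.println("HBase not reachable: " + e);
            }
        }
    }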

Re: Reply: Reply: HBase Replication - Addition alone

2014-04-09 Thread Demai Ni
Manthosh,

I'm not sure doing replication periodically is the right way to go. I assume
you will manually start/stop replication from the Master?

Anyway, if it fits your business needs, give it a shot. Let us know how it
goes.

Cheers.

Demai


On Wed, Apr 9, 2014 at 6:10 AM, Manthosh Kumar T  wrote:

> Hi Demai,
>   I thought making the client communicate with Central Server
> frequently may not be efficient. In case of replication, I can do that
> periodically in bulk.
>
>
> On 7 April 2014 22:00, Demai Ni  wrote:
>
> > It looks to me that you'd like to have the small clusters located close to
> > the client, and then use the smaller clusters as Masters. So there will
> be
> > multi-Masters and one-Slave cluster setup. And the one-Slave cluster is
> the
> > centralized and large HBase server.
> >
> > Well, it works. But I don't get the two points:
> > 1) what's saved here? to get the replication works from the smaller
> > clusters to the centralized large cluster will consume the cpu/storage
> and
> > network resource. So why not get the client to talk directly to the
> > centralized cluster. The added-on layer of the smaller clusters will only
> > reduce the performance.  One factor is that if the network doesn't allow
> > the client to talk directly to centralized cluster, the replication won't
> > work well either due to the lag.
> > 2) I still don't get why allow client to delete on smaller cluster and
> > doesn't allow such transactions replayed on the centralized cluster.
> > Assume you have a good business reason to disallow client to delete
> > existing data, an application layer of permission control may be better.
> > That is don't allow delete on smaller clusters neither.
> >
> > just my 2 cents.
> >
> > Demai
> >
> >
> >
> >
> >
> > On Sun, Apr 6, 2014 at 11:09 PM, Manthosh Kumar T  > >wrote:
> >
> > > My use case is that I need to replicate between two geographically
> > distant
> > > clusters. In the simplest form, I have several geographically distant
> > > clients that needs to add data to a geographically distant centralized
> > > HBase server. So, I thought I'll maintain a small cluster at each
> client
> > > and make that replicate to the central server. But when I delete in the
> > > small cluster, I don't want that to be replicated in the centralized
> > > server. Hence, I think the coproc route is fine. Please correct me if
> I'm
> > > wrong. Or is there a better solution for my use case?
> > >
> > >
> > > On 5 April 2014 05:16, Demai Ni  wrote:
> > >
> > > > agree about the suggestion above. just like to chime in a bit more.
> > > >
> > > > One question, how do you like to treat the 'put' on the existing row?
> > > well,
> > > > it is a delete + addition to some degree.
> > > >
> > > > If you go the coproc route, maybe better not to use replication at
> all.
> > > > Basically, you can have two tables: "source" and "backup", and when
> > ever
> > > a
> > > > 'put' is on 'source', the coproc will replay it to 'backup'. And the
> > > table
> > > > 'backup' can be on the same cluster or another cluster.
> > > >
> > > > Not sure about your use case, maybe the existing 'version' feature (
> > > > http://hbase.apache.org/book/schema.versions.html) can be used with
> a
> > > > large
> > > > max-version. If the same row/cell won't be rewritten a lot, then the
> cost
> > > will
> > > > be similar as the replication or coproc.
> > > >
> > > > Demai
> > > >
> > > >
> > > > On Fri, Apr 4, 2014 at 6:45 AM, Jean-Marc Spaggiari <
> > > > jean-m...@spaggiari.org
> > > > > wrote:
> > > >
> > > > > If you add a coproc on the destination cluster and ask it to reject
> > all
> > > > the
> > > > > deletes, that might do the trick, but you might end up with some
> > issues
> > > > at
> > > > > the end if you want to do some maintenance in the target cluster...
> > > > >
> > > > >
> > > > > 2014-04-04 4:26 GMT-04:00 冯宏华 :
> > > > >
> > > > > > Can't figure out solution to achieve this behavior using existing
> > > means
> > > > > in
> > > > > > HBase immediately.
> > > > > >
> > > > > > But it seems not that hard to implement it by changing some
> code, a
> > > > rough
> > > > > > thought is to filter out delete entries when pushing entries to
> the
> > > > > > according replication peer and this behavior can be made
> > > configurable.
> > > > > > 
> > > > > > From: Manthosh Kumar T [manth...@gmail.com]
> > > > > > Sent: 2014-04-04 16:11
> > > > > > To: user@hbase.apache.org
> > > > > > Subject: Re: Reply: HBase Replication - Addition alone
> > > > > >
> > > > > > Is it possible by any other means in HBase?
> > > > > >
> > > > > >
> > > > > > On 4 April 2014 13:37, 冯宏华  wrote:
> > > > > >
> > > > > > > No
> > > > > > > 
> > > > > > > From: Manthosh Kumar T [manth...@gmail.com]
> > > > > > > Sent: 2014-04-04 16:00
> > > > > > > To: user@hbase.apache.org
> > > > > > > Subject: HBase Replication - Addition alone
> > > > > > >
> > > > > > > Hi Al

Re: BlockCache for large scans.

2014-04-09 Thread gortiz

Pretty interesting link; I'll keep it in my favorites.



On 09/04/14 16:07, Ted Yu wrote:

Please take a look at http://www.n10k.com/blog/blockcache-101/

For D, hbase.regionserver.global.memstore.size is specified in terms of
percentage of heap. Unless you enable HBASE-5349 'Automagically tweak
global memstore and block cache sizes based on workload'


On Wed, Apr 9, 2014 at 12:24 AM, gortiz  wrote:


I've been reading the Definitive Guide and HBase in Action a little.
I found this question from Cloudera that I'm not sure about after looking at some
benchmarks and documentation for HBase. Could someone explain it a little?
I think that when you do a large scan you should disable the
block cache because the blocks are going to be swapped out a lot, so you don't get
anything from the cache; I guess you would be penalized since you're spending
memory, triggering GC and burning CPU on this task.

*You want to do a full table scan on your data. You decide to disable
block caching to see if this improves scan performance. Will disabling
block caching improve scan performance?*

A.
No. Disabling block caching does not improve scan performance.

B.
Yes. When you disable block caching, you free up that memory for other
operations. With a full
table scan, you cannot take advantage of block caching anyway because your
entire table won't fit
into cache.

C.
No. If you disable block caching, HBase must read each block index from
disk for each scan,
thereby decreasing scan performance.

D.
Yes. When you disable block caching, you free up memory for MemStore,
which improves scan performance.





--
*Guillermo Ortiz*
/Big Data Developer/

Telf.: +34 917 680 490
Fax: +34 913 833 301
C/ Manuel Tovar, 49-53 - 28034 Madrid - Spain

_http://www.bidoop.es_



Re: hbase region server reboot steps

2014-04-09 Thread Ted Yu
Rural:
Take a look at:
http://hbase.apache.org/book.html#decommission

especially 15.3.1.1


On Wed, Apr 9, 2014 at 8:28 AM, Jean-Marc Spaggiari  wrote:

> Hum.
>
> Disable load balancer, and move all the regions manually to other hosts
> using the shell? Then hard restart it?
>
> JM
>
>
> 2014-04-09 10:26 GMT-04:00 Rural Hunter :
>
> > Actually I have to do a hard reboot. Let me provide more info about the
> > problem: Except the ssh service(ssh error is:
> ssh_exchange_identification:
> > Connection closed by remote host) and local login problem, other services
> > are running fine on the server(including http/ftp/hbase/hadoop etc).
> >
> > On 2014/4/9 22:14, Rural Hunter wrote:
> >
> >  Thanks. What if I'm not able to login the region server(both ssh and
> >> local)? I have to reboot and check the server because of this serious
> >> problem.
> >>
> >> On 2014/4/9 22:01, Ted Yu wrote:
> >>
> >>> You can use bin/graceful_stop.sh to stop the region server process.
> >>>
> >>> # Move regions off a server then stop it.  Optionally restart and
> reload.
> >>> # Turn off the balancer before running this script.
> >>>
> >>> After that, you can stop hadoop (datanode, etc)
> >>>
> >>>
> >>>
> >>
> >
>
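
If you prefer to drive the first step from client code rather than the shell,
here is a rough sketch (assuming 0.96 client jars) of switching the balancer
off before draining the node, which is what the balance_switch shell command
and bin/graceful_stop.sh do for you:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class PrepareForMaintenance {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            try {
                // Stop the balancer so the master does not move regions back
                // onto the node while it is being drained; the return value is
                // the previous balancer state, handy for restoring it later.
                boolean wasOn = admin.setBalancerRunning(false, true);
                System.out.println("Balancer was on: " + wasOn + ", now off.");
                // Next: move regions off the node (shell 'move' command or
                // bin/graceful_stop.sh), stop the region server, then the datanode.
            } finally {
                admin.close();
            }
        }
    }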


Re: hbase region server reboot steps

2014-04-09 Thread Jean-Marc Spaggiari
Hum.

Disable load balancer, and move all the regions manually to other hosts
using the shell? Then hard restart it?

JM


2014-04-09 10:26 GMT-04:00 Rural Hunter :

> Actually I have to do a hard reboot. Let me provide more info about the
> problem: Except the ssh service(ssh error is: ssh_exchange_identification:
> Connection closed by remote host) and local login problem, other services
> are running fine on the server(including http/ftp/hbase/hadoop etc).
>
> On 2014/4/9 22:14, Rural Hunter wrote:
>
>  Thanks. What if I'm not able to login the region server(both ssh and
>> local)? I have to reboot and check the server because of this serious
>> problem.
>>
>> On 2014/4/9 22:01, Ted Yu wrote:
>>
>>> You can use bin/graceful_stop.sh to stop the region server process.
>>>
>>> # Move regions off a server then stop it.  Optionally restart and reload.
>>> # Turn off the balancer before running this script.
>>>
>>> After that, you can stop hadoop (datanode, etc)
>>>
>>>
>>>
>>
>


Re: hbase region server reboot steps

2014-04-09 Thread Rural Hunter
Actually I have to do a hard reboot. Let me provide more info about the
problem: except for the ssh service (ssh error is:
ssh_exchange_identification: Connection closed by remote host) and a local
login problem, other services are running fine on the server (including
http/ftp/hbase/hadoop etc.).


On 2014/4/9 22:14, Rural Hunter wrote:
Thanks. What if I'm not able to login the region server(both ssh and 
local)? I have to reboot and check the server because of this serious 
problem.


On 2014/4/9 22:01, Ted Yu wrote:

You can use bin/graceful_stop.sh to stop the region server process.

# Move regions off a server then stop it.  Optionally restart and 
reload.

# Turn off the balancer before running this script.

After that, you can stop hadoop (datanode, etc)








Re: hbase region server reboot steps

2014-04-09 Thread Rural Hunter
Thanks. What if I'm not able to log in to the region server (both ssh and
local)? I have to reboot and check the server because of this serious
problem.


On 2014/4/9 22:01, Ted Yu wrote:

You can use bin/graceful_stop.sh to stop the region server process.

# Move regions off a server then stop it.  Optionally restart and reload.
# Turn off the balancer before running this script.

After that, you can stop hadoop (datanode, etc)






Re: BlockCache for large scans.

2014-04-09 Thread Ted Yu
Please take a look at http://www.n10k.com/blog/blockcache-101/

For D, hbase.regionserver.global.memstore.size is specified in terms of
percentage of heap. Unless you enable HBASE-5349 'Automagically tweak
global memstore and block cache sizes based on workload'


On Wed, Apr 9, 2014 at 12:24 AM, gortiz  wrote:

> I've been reading the Definitive Guide and HBase in Action a little.
> I found this question from Cloudera that I'm not sure about after looking at some
> benchmarks and documentation for HBase. Could someone explain it a little?
> I think that when you do a large scan you should disable the
> block cache because the blocks are going to be swapped out a lot, so you don't get
> anything from the cache; I guess you would be penalized since you're spending
> memory, triggering GC and burning CPU on this task.
>
> *You want to do a full table scan on your data. You decide to disable
> block caching to see if this improves scan performance. Will disabling
> block caching improve scan performance?*
>
> A.
> No. Disabling block caching does not improve scan performance.
>
> B.
> Yes. When you disable block caching, you free up that memory for other
> operations. With a full
> table scan, you cannot take advantage of block caching anyway because your
> entire table won't fit
> into cache.
>
> C.
> No. If you disable block caching, HBase must read each block index from
> disk for each scan,
> thereby decreasing scan performance.
>
> D.
> Yes. When you disable block caching, you free up memory for MemStore,
> which improves scan performance.
>
>
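
Whichever answer the exam wants, block caching can be controlled per scan from
the client rather than globally. A small sketch (the table name 'users' is just
a placeholder) of a full scan that stays out of the block cache:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;

    public class FullScanNoCache {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "users");   // placeholder table name
            try {
                Scan scan = new Scan();
                scan.setCaching(500);         // rows fetched per RPC
                scan.setCacheBlocks(false);   // don't evict hot data blocks for a one-off full scan
                ResultScanner scanner = table.getScanner(scan);
                long rows = 0;
                for (Result r : scanner) {
                    rows++;                   // process the row here
                }
                scanner.close();
                System.out.println("Scanned " + rows + " rows");
            } finally {
                table.close();
            }
        }
    }

As far as I understand, setCacheBlocks(false) only keeps the scanned data
blocks out of the cache; index and bloom blocks are still cached, which is the
point option C is circling around.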


Re: hbase region server reboot steps

2014-04-09 Thread Ted Yu
You can use bin/graceful_stop.sh to stop the region server process.

# Move regions off a server then stop it.  Optionally restart and reload.
# Turn off the balancer before running this script.

After that, you can stop hadoop (datanode, etc)


On Wed, Apr 9, 2014 at 7:57 AM, Rural Hunter  wrote:

> Hi,
>
> I have one region server which needs to be rebooted for server
> maintenance. The server hosts both the hadoop and hbase slave(hadoop2-hbase
> 0.96). What is the recommended steps to reboot it without impacting hbase
> service?
>


Re: hbase region server reboot steps

2014-04-09 Thread Jean-Marc Spaggiari
Hi Rural,

Decomission the node, stop the processes, and reboot. You can look at the
scripts in bin/ to help you with that. Like bin/graceful_stop.sh.

JM


2014-04-09 9:57 GMT-04:00 Rural Hunter :

> Hi,
>
> I have one region server which needs to be rebooted for server
> maintenance. The server hosts both the hadoop and hbase slave(hadoop2-hbase
> 0.96). What is the recommended steps to reboot it without impacting hbase
> service?
>


hbase region server reboot steps

2014-04-09 Thread Rural Hunter

Hi,

I have one region server which needs to be rebooted for server 
maintenance. The server hosts both the hadoop and hbase 
slave (hadoop2-hbase 0.96). What are the recommended steps to reboot it
without impacting the hbase service?


CDH5 b2 - Hbase 0.96 - REST not returning data

2014-04-09 Thread Juraj jiv
Hello,
I have one table in HBase with 250 GB of data and a problem while using the
HBase REST scanner.

What I do is:
1. Call http://:20550//scanner
with POST and Content-Type: text/xml


2. Then get the Location header and call HTTP GET on the scanner address
with:
"accept", "application/json"
The GET ends with no data returned - HTTP 204.
In the log file I see - org.apache.hadoop.hbase.rest.ScannerInstanceResource:
generator exhausted
What does that mean? It's INFO so I guess it's not a problem.

The HBase shell from the command line works fine; I tried a similar scan command.
Another table with 7 GB of data works fine when scanned via REST (also JSON).
Any ideas what could be wrong? HBase version - HBase
0.96.1.1-cdh5.0.0-beta-2

JV
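
For comparison, here is a bare-bones sketch of the same flow in plain Java
(host, port and table name are placeholders). The "generator exhausted" INFO
message appears to just mean that the scanner has no more rows to hand out, so
a 204 on the very first GET suggests the scanner spec itself matched nothing;
comparing it against the shell scan's columns and filters is a good first check.

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class RestScanDemo {
        public static void main(String[] args) throws Exception {
            String base = "http://resthost:20550/mytable";   // placeholder host/table

            // 1. Create a scanner: POST an XML scanner spec, read the Location header.
            HttpURLConnection post = (HttpURLConnection) new URL(base + "/scanner").openConnection();
            post.setDoOutput(true);
            post.setRequestMethod("POST");
            post.setRequestProperty("Content-Type", "text/xml");
            OutputStream out = post.getOutputStream();
            out.write("<Scanner batch=\"100\"/>".getBytes("UTF-8"));
            out.close();
            String scannerUrl = post.getHeaderField("Location");
            System.out.println("scanner: " + scannerUrl + " (" + post.getResponseCode() + ")");

            // 2. Fetch batches until the server answers 204 (scanner exhausted).
            while (true) {
                HttpURLConnection get = (HttpURLConnection) new URL(scannerUrl).openConnection();
                get.setRequestProperty("Accept", "application/json");
                if (get.getResponseCode() == 204) break;   // no more rows
                InputStream in = get.getInputStream();
                byte[] buf = new byte[8192];
                for (int n; (n = in.read(buf)) > 0; ) {
                    System.out.write(buf, 0, n);           // raw JSON (cell values are base64)
                }
                in.close();
                System.out.println();
            }

            // 3. Release the scanner on the server.
            HttpURLConnection del = (HttpURLConnection) new URL(scannerUrl).openConnection();
            del.setRequestMethod("DELETE");
            del.getResponseCode();
        }
    }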


Re: Reply: Reply: HBase Replication - Addition alone

2014-04-09 Thread Manthosh Kumar T
Hi Demai,
  I thought making the client communicate with the central server
frequently may not be efficient. With replication, I can do that
periodically, in bulk.


On 7 April 2014 22:00, Demai Ni  wrote:

> It looks to me that you'd like to have the small clusters located close to
> the client, and then use the smaller clusters as Masters. So there will be
> multi-Masters and one-Slave cluster setup. And the one-Slave cluster is the
> centralized and large HBase server.
>
> Well, it works. But I don't get the two points:
> 1) what's saved here? to get the replication works from the smaller
> clusters to the centralized large cluster will consume the cpu/storage and
> network resource. So why not get the client to talk directly to the
> centralized cluster. The added-on layer of the smaller clusters will only
> reduce the performance.  One factor is that if the network doesn't allow
> the client to talk directly to centralized cluster, the replication won't
> work well either due to the lag.
> 2) I still don't get why allow client to delete on smaller cluster and
> doesn't allow such transactions replayed on the centralized cluster.
> Assume you have a good business reason to disallow client to delete
> existing data, an application layer of permission control may be better.
> That is don't allow delete on smaller clusters neither.
>
> just my 2 cents.
>
> Demai
>
>
>
>
>
> On Sun, Apr 6, 2014 at 11:09 PM, Manthosh Kumar T  >wrote:
>
> > My use case is that I need to replicate between two geographically
> distant
> > clusters. In the simplest form, I have several geographically distant
> > clients that needs to add data to a geographically distant centralized
> > HBase server. So, I thought I'll maintain a small cluster at each client
> > and make that replicate to the central server. But when I delete in the
> > small cluster, I don't want that to be replicated in the centralized
> > server. Hence, I think the coproc route is fine. Please correct me if I'm
> > wrong. Or is there a better solution for my use case?
> >
> >
> > On 5 April 2014 05:16, Demai Ni  wrote:
> >
> > > agree about the suggestion above. just like to chime in a bit more.
> > >
> > > One question, how do you like to treat the 'put' on the existing row?
> > well,
> > > it is a delete + addition to some degree.
> > >
> > > If you go the coproc route, maybe better not to use replication at all.
> > > Basically, you can have two tables: "source" and "backup", and when
> ever
> > a
> > > 'put' is on 'source', the coproc will replay it to 'backup'. And the
> > table
> > > 'backup' can be on the same cluster or another cluster.
> > >
> > > Not sure about your use case, maybe the existing 'version' feature (
> > > http://hbase.apache.org/book/schema.versions.html) can be used with a
> > > large
> > > max-version. If the same row/cell won't be rewritten a lot, then the cost
> > will
> > > be similar as the replication or coproc.
> > >
> > > Demai
> > >
> > >
> > > On Fri, Apr 4, 2014 at 6:45 AM, Jean-Marc Spaggiari <
> > > jean-m...@spaggiari.org
> > > > wrote:
> > >
> > > > If you add a coproc on the destination cluster and ask it to reject
> all
> > > the
> > > > deletes, that might do the trick, but you might end up with some
> issues
> > > at
> > > > the end if you want to do some maintenance in the target cluster...
> > > >
> > > >
> > > > 2014-04-04 4:26 GMT-04:00 冯宏华 :
> > > >
> > > > > Can't figure out solution to achieve this behavior using existing
> > means
> > > > in
> > > > > HBase immediately.
> > > > >
> > > > > But it seems not that hard to implement it by changing some code, a
> > > rough
> > > > > thought is to filter out delete entries when pushing entries to the
> > > > > corresponding replication peer and this behavior can be made
> > configurable.
> > > > > 
> > > > > From: Manthosh Kumar T [manth...@gmail.com]
> > > > > Sent: 2014-04-04 16:11
> > > > > To: user@hbase.apache.org
> > > > > Subject: Re: Reply: HBase Replication - Addition alone
> > > > >
> > > > > Is it possible by any other means in HBase?
> > > > >
> > > > >
> > > > > On 4 April 2014 13:37, 冯宏华  wrote:
> > > > >
> > > > > > No
> > > > > > 
> > > > > > From: Manthosh Kumar T [manth...@gmail.com]
> > > > > > Sent: 2014-04-04 16:00
> > > > > > To: user@hbase.apache.org
> > > > > > Subject: HBase Replication - Addition alone
> > > > > >
> > > > > > Hi All,
> > > > > >  In a master-slave replication, is it possible to
> replicate
> > > > only
> > > > > > the addition of rows?. If I delete in the master it shouldn't be
> > > > deleted
> > > > > in
> > > > > > the slave.
> > > > > >
> > > > > > --
> > > > > > Cheers,
> > > > > > Manthosh Kumar. T
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Cheers,
> > > > > Manthosh Kumar. T
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Cheers,
> > Manthosh Kumar. T
> >
>



-- 
Cheers,
Manthosh Kumar. T
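
To make the coproc suggestion from earlier in the thread concrete, here is a
rough sketch of a RegionObserver for the centralized cluster that refuses
deletes while letting puts through. It is written against the 0.96-era
coprocessor API and the class and table wiring are up to you, so treat it as a
sketch rather than a drop-in:

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.Durability;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

    // Attach this observer to the table(s) on the central cluster so deletes
    // arriving there (from clients or from replication) are refused.
    public class RejectDeletesObserver extends BaseRegionObserver {

        @Override
        public void preDelete(ObserverContext<RegionCoprocessorEnvironment> ctx,
                              Delete delete, WALEdit edit, Durability durability)
                throws IOException {
            // Fail loudly: the caller (or the replication sink) sees an error.
            throw new IOException("Deletes are not permitted on this table");

            // Alternative (instead of throwing): ctx.bypass() drops the delete silently.
        }
    }

As noted above, blanket-rejecting deletes will also bite maintenance jobs on the
target cluster, and silently dropping them instead makes replicated deletes
vanish without any error, so the failure mode is worth choosing deliberately.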


Lighter Map/Reduce on HBase

2014-04-09 Thread Henning Blohm
We operate a solution that stores large amounts of data in HBase that needs
to be available for online access.

For efficient scanning, there are three pieces of data encoded in row keys
(in particular a time dimension) and for other reasons some columns hold
JSON encoded data.

Currently, analytics data is created in two ways:

a) a non-trivial M/R job that computes pre-aggregated data sets and
offloads them into an analytical database for interactive reporting
b) other M/R jobs that create specialized reports (heuristics) that cannot
be computed from pre-aggregated data

In particular for b) but possibly also for variations of a) I would like to
find more "user friendly" ways than Java implemented M/R jobs - at least
for some cases.

So this is not about interactive querying of data directly from HBase
tables. It is rather about pre-processing HBase stored (large) data sets
into either input to interactive query engines (some other DB, Phoenix,...)
or into some other specialized format.

I spent some time with Hive but found that the HBase integration simply
doesn't cut it (parsing a row key, mapping JSON column content). I know
there is some more out there, but before spending an eternity trying out
various methods, I am shamelessly trying to benefit from your expertise by
asking for some good pointers.

Thanks,
Henning