how to use StochasticLoadBalancer

2016-11-12 Thread Li Li
I have many tables. Some are heavily used and others are not. But the default load balancer balances by region count, so some machines are very busy while other machines get no requests. So I want to try StochasticLoadBalancer, but I can't find any document that tells me how to configure it in hb

Re: can my old hbase client use new hbase version?

2015-12-06 Thread Li Li
thank you. On Sat, Dec 5, 2015 at 11:45 PM, Ted Yu wrote: > I think you can. > > See the following: > http://hbase.apache.org/book.html#_upgrade_paths > > It is advisable to use 1.1.2 client so that you get the full feature set > from 1.1.2 > > Cheers > > On F

can my old hbase client use new hbase version?

2015-12-04 Thread Li Li
I want to set up an hbase cluster. I found the latest stable release is 1.1.2, but I have some old client code written against hbase 0.98 that I don't want to rewrite. Is it possible to use the 0.98 client code to interact with a 1.1.2 server?

Re: OutOfOrderScannerNextException

2015-09-09 Thread Li Li
> than the scanner timeout; i.e. > hbase.client.scanner.timeout.period > > > St.Ack > > On Tue, Sep 8, 2015 at 11:18 PM, Li Li wrote: > >> is it possible setting it using hbase-site.xml? >> I can't modify titan codes. it only read hbase configuration file. >

Re: OutOfOrderScannerNextException

2015-09-08 Thread Li Li
in smaller batches, smaller than > 100. > St.Ack > > On Mon, Sep 7, 2015 at 3:56 AM, Li Li wrote: > >> I am using titan which use hbase as it's storage engine. The hbase >> version is 1.0.0-cdh5.4.4. >> it's a full table scan over a large table. Is there an

OutOfOrderScannerNextException

2015-09-07 Thread Li Li
I am using titan, which uses hbase as its storage engine. The hbase version is 1.0.0-cdh5.4.4. It's a full table scan over a large table. Is there any configuration I can change to tackle this problem? The exception stack is: Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.

Re: OutOfOrderScannerNextException when export table

2015-01-15 Thread Li Li
trimming down hbase.client.scanner.caching to 100 server side and > increase hbase.regionserver.lease.period, > hbase.client.scanner.timeout.period to 5 minutes and see how it goes. > > Geo > On Jan 15, 2015 6:01 AM, "Li Li" wrote: > >> I am using hbase-0.98.5-h

OutOfOrderScannerNextException when export table

2015-01-15 Thread Li Li
I am using hbase-0.98.5-hadoop1 with hadoop 1.2.1. And I want to export a table by: hbase org.apache.hadoop.hbase.mapreduce.Export table file:///data/table but the mapreduce job failed: Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException: org.apache.hadoop.hbase.exceptions.OutOfOrderS

Re: Unable to load native-hadoop library in java hbase client

2015-01-05 Thread Li Li
.apache.org/book.html#hadoop.native.lib ? It was updated > recently. > > You have symlinked or copied the native libs under your client? List > out the links for us here. > > St.Ack > > > On Mon, Jan 5, 2015 at 12:26 AM, Li Li wrote: > >> WARN main org.apache.hadoop.util

Unable to load native-hadoop library in java hbase client

2015-01-05 Thread Li Li
WARN main org.apache.hadoop.util.NativeCodeLoader Unable to load native-hadoop library for your platform... using builtin-java classes where applicable I have correctly installed native lib in hadoop and hbase. I can verify it by ./bin/hbase --config ~/conf_hbase org.apache.hadoop.util.NativeLibr

Re: copy from one cluster to another of different version

2014-11-28 Thread Li Li
thank you. How can I specify replication factor with this command? will hbase -Ddfs.replication=1 org.apache.hadoop.hbase.mapreduce.CopyTable work? On Fri, Nov 28, 2014 at 3:32 PM, Vineet Mishra wrote: > Hi Li Li, > > You can copy the Hbase Tables Remotely to another machine

copy from one cluster to another of different version

2014-11-27 Thread Li Li
I have a hbase cluster of version 0.98.5 with hadoop-1.2.1(no mapreduce) I want to copy all the tables to another cluster whose version is 0.98.1-cdh5.1.0 with 2.3.0-cdh5.1.0. And also I want specify the hdfs replication factor of the files in new cluster. is it possible?

Re: can't start region server after crash

2014-11-19 Thread Li Li
Nov 20 13:57:44 CST 2014 in 9065 milliseconds On Thu, Nov 20, 2014 at 11:25 AM, Ted Yu wrote: > Have you tried using fsck ? > > Cheers > > On Wed, Nov 19, 2014 at 6:56 PM, Li Li wrote: > >> also in hdfs ui, I found Number of Under-Replicated Blocks : 497741 >> it seems

Re: can't start region server after crash

2014-11-19 Thread Li Li
ou tried using fsck ? > > Cheers > > On Wed, Nov 19, 2014 at 6:56 PM, Li Li wrote: > >> also in hdfs ui, I found Number of Under-Replicated Blocks : 497741 >> it seems there are many bad blocks. is there any method to rescue good >> data? >> >> On T

Re: can't start region server after crash

2014-11-19 Thread Li Li
also in hdfs ui, I found Number of Under-Replicated Blocks : 497741 it seems there are many bad blocks. is there any method to rescue good data? On Thu, Nov 20, 2014 at 10:52 AM, Li Li wrote: > I am running a single node pseudo hbase cluster on top of a pseudo hadoop. > hadoop is 1.2

can't start region server after crash

2014-11-19 Thread Li Li
I am running a single node pseudo hbase cluster on top of a pseudo hadoop. hadoop is 1.2.1 and replication factor of hdfs is 1. And the hbase version is 0.98.5 Last night, I found the region server crashed (the process is gone) I found many logs say [JvmPauseMonitor] util.JvmPauseMonitor: Detected

can't start hbase.

2014-10-30 Thread Li Li
hi all, I am using hbase and also phoenix(some tables are managed by myself and some are created by phoenix). Last night, the disk is full . I killed the hbase and hadoop related processes. But After that I can't start hbase anymore. I am using ubuntu 12.04 and hadoop-1.2.1 and hbase 0.98.5 wi

question about incremental backup and cluster replication

2014-09-04 Thread Li Li
hi all, in my application, most of the time, we do not write to hbase but only read. Every a few hours(or even a day), We do a lot of write operations to hbase intensively. To avoid data lost, we also want backup our data every day(or week) I know hbase has advanced features like backup and

Re: how to do parallel scanning in map reduce using hbase as input?

2014-06-26 Thread Li Li
> > Cheers > > On Jun 26, 2014, at 12:34 AM, Li Li wrote: > >> my table has about 700 million rows and about 80 regions. each task >> tracker is configured with 4 mappers and 4 reducers at the same time. >> The hadoop/hbase cluster has 5 nodes so at the same time, i

how to do parallel scanning in map reduce using hbase as input?

2014-06-26 Thread Li Li
my table has about 700 million rows and about 80 regions. each task tracker is configured with 4 mappers and 4 reducers at the same time. The hadoop/hbase cluster has 5 nodes so at the same time, it has 20 mappers running. it takes more than an hour to finish mapper stage. The hbase cluster's load

Re: TableInputFormatBase Cannot resolve the host name

2014-06-25 Thread Li Li
thanks, I got it. On Wed, Jun 25, 2014 at 5:25 PM, Ted Yu wrote: > Please see https://issues.apache.org/jira/browse/HBASE-10906 > > Cheers > > On Jun 25, 2014, at 2:12 AM, Li Li wrote: > >> there is not any DNS server for me . you mean find name by ip? if no >

Re: TableInputFormatBase Cannot resolve the host name

2014-06-25 Thread Li Li
there is no reverse > DNS setup. I believe that TableInputFormatBase class requires revers DNS > name resolution. > > Regards > Samir > > > On Wed, Jun 25, 2014 at 10:57 AM, Li Li wrote: > >> I have many map reduce jobs using hbase table as input. Others are all

Re: TableInputFormatBase Cannot resolve the host name

2014-06-25 Thread Li Li
n you ping vc141 from this machine ? > > Cheers > > On Jun 25, 2014, at 1:29 AM, Li Li wrote: > >> I have a map reduce job using hbase table as input. when the job >> starts, it says: >> >> ERROR main org.apache.hadoop.hbase.mapreduce.TableInputFormatBase >

Re: TableInputFormatBase Cannot resolve the host name

2014-06-25 Thread Li Li
yes [hadoop@vc138 ~]$ ping vc141 PING vc141 (172.16.10.141) 56(84) bytes of data. 64 bytes from vc141 (172.16.10.141): icmp_seq=1 ttl=64 time=0.118 ms On Wed, Jun 25, 2014 at 4:49 PM, Ted Yu wrote: > Can you ping vc141 from this machine ? > > Cheers > > On Jun 25, 2014, at 1:29 A

TableInputFormatBase Cannot resolve the host name

2014-06-25 Thread Li Li
I have a map reduce job using hbase table as input. when the job starts, it says: ERROR main org.apache.hadoop.hbase.mapreduce.TableInputFormatBase Cannot resolve the host name for vc141/172.16.10.141 because of javax.naming.CommunicationException: DNS error [Root exception is java.net.PortUnreach

Re: how to know run major compact successfully?

2014-06-19 Thread Li Li
> > 110% !!! Mind raising a jira with the details and ur observation? > > -Anoop- > > On Wed, Jun 18, 2014 at 8:56 AM, Li Li wrote: > >> I found the information in web ui >> http://mphbase-master1:60010/master-status#compactStas >> but the Compaction Progre

Re: speed control on the server side

2014-06-18 Thread Li Li
ommend to set hbase.client.max.perserver. > tasks to 1 in the client. You may also want to change the buffer size ( > hbase.client.write.buffer)... > > > > > On Wed, Jun 18, 2014 at 12:58 PM, Li Li wrote: > >> and also there so many Puts maintained by background h

Re: speed control on the server side

2014-06-18 Thread Li Li
and also there so many Puts maintained by background hbase threads that consuming too much resources On Wed, Jun 18, 2014 at 6:54 PM, Li Li wrote: > I mean client slow itself down. e.g. > my client code(one of many threads) > while(true){ > // process data and generate
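One way a client can "slow itself down", as asked in this thread, is to pace its own requests. The following is only a minimal sketch, not something taken from the thread: the class name RequestPacer and its methods are hypothetical, and the commented-out put call stands in for whatever the client actually sends.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: spaces requests at a fixed minimum interval. */
final class RequestPacer {
    private final long intervalNanos;
    private long nextSlotNanos;

    RequestPacer(double maxPerSecond, long startNanos) {
        if (maxPerSecond <= 0) {
            throw new IllegalArgumentException("maxPerSecond must be positive");
        }
        this.intervalNanos = (long) (1_000_000_000L / maxPerSecond);
        this.nextSlotNanos = startNanos;
    }

    /** Reserves the next send slot and returns how many nanoseconds the caller should wait first. */
    synchronized long reserve(long nowNanos) {
        long slot = Math.max(nextSlotNanos, nowNanos);
        nextSlotNanos = slot + intervalNanos;
        return slot - nowNanos;
    }

    public static void main(String[] args) throws InterruptedException {
        RequestPacer pacer = new RequestPacer(5, System.nanoTime());
        for (int i = 0; i < 3; i++) {
            TimeUnit.NANOSECONDS.sleep(pacer.reserve(System.nanoTime()));
            // Placeholder for the real write, e.g. a put against the table.
            System.out.println("send request " + i);
        }
    }
}
```

Because reserve is synchronized, several writer threads could share one pacer and the combined rate would still stay under the limit.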

Re: speed control on the server side

2014-06-18 Thread Li Li
r.tasks). > > Cheers, > > Nicolas > > > On Wed, Jun 18, 2014 at 11:59 AM, Li Li wrote: > >> hi all, >> the hbase client send too much requests and the some region server >> down. >> 1. region server down because of gc pause >>

speed control on the server side

2014-06-18 Thread Li Li
hi all, the hbase client send too much requests and the some region server down. 1. region server down because of gc pause I can see it from log: [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3056ms I ca

Re: how to know run major compact successfully?

2014-06-17 Thread Li Li
I found the information in web ui http://mphbase-master1:60010/master-status#compactStas but the Compaction Progress is confusing, which 110.55% On Wed, Jun 18, 2014 at 11:19 AM, Li Li wrote: > I have deleted many rows in a table. as far as I know, only after > major compaction, they are

how to know run major compact successfully?

2014-06-17 Thread Li Li
I have deleted many rows in a table. as far as I know, only after major compaction, they are really deleted. So I run major compact 'tablename' in hbase shell it returns very fast. is the real compact operation running in background? how do I know it's done?

high load average in one region server

2014-06-11 Thread Li Li
I have a 5 region server hbase cluster. Today I found one region server's load average is above 100 while the other 4 are below 1. Using vmstat and dstat, I found that this high-load machine has a large amount of reads (about 30M/s) and network sends. Does that mean the cluster suffers from a hot spot?

Re: can't start hbase in distributed mode in a single machine

2014-06-09 Thread Li Li
If I use external zookeeper it's ok. 1. I modified hbase-env.sh and add export HBASE_MANAGES_ZK=false 2. start zookeeper in standalone mode 3. start-hbase.sh what's the problem? On Mon, Jun 9, 2014 at 2:35 PM, Li Li wrote: > letting start-hbase.sh do it for me > > On Mon,

Re: can't start hbase in distributed mode in a single machine

2014-06-08 Thread Li Li
letting start-hbase.sh do it for me On Mon, Jun 9, 2014 at 2:05 PM, Dima Spivak wrote: > Dear Li, > > Are you managing your own ZK instance or just letting start-hbase.sh handle > it? > > -Dima > > > On Sun, Jun 8, 2014 at 10:17 PM, Li Li wrote: > >> I run st

can't start hbase in distributed mode in a single machine

2014-06-08 Thread Li Li
I run start-hbase.sh and after a few minutes, the HMaster process disappears, but the HQuorumPeer process is ok. I can telnet localhost 2181. I am using hadoop-1.2.1(which is ok by visit http://DC-TEST-1:50070 and http://DC-TEST-1:50030) hbase version is hbase-0.96.2-hadoop1 content of hbase-site.x

Re: last mapper of mapreduce on hbase very slow

2014-05-16 Thread Li Li
); will code snippet 2 be better? On Thu, May 15, 2014 at 1:29 AM, Stack wrote: > On Tue, May 13, 2014 at 6:45 PM, Li Li wrote: > ... > >> I found that at the beginning, the map-reduce will make about 3,000 >> hbase requests per second. But when there is only the last mapper

how to optimize my cluster setting?

2014-05-15 Thread Li Li
I have a small six-node cluster. One node runs the master and namenode, another runs the secondary namenode. The other 4 nodes are datanodes and region servers. Each node has 16GB memory and a 4 core cpu. My application is very simple. I use hbase to store data for a web spider. the table is: 1. url_db

map reduce become much slower when upgrading from 0.94.11 to 0.96.2-hadoop1

2014-05-15 Thread Li Li
today I upgraded hbase 0.94.11 to 0.96.2-hadoop1. I have not changed any client code except replacing the 0.94.11 client jar with the 0.96.2 one. With the old version, when running a mapreduce task, the requests per second was about 10,000. But with the new one, the value is 300. What's wrong with it? The hbase put an

last mapper of mapreduce on hbase very slow

2014-05-13 Thread Li Li
I use two hbase tables as mapper input. one is url table, the other is links between url sample rows of url table: http://abc.com/index.htm, content1 http://abc.com/news/123.htm,content sample rows of linkstable http://abc.com/index.htm++http://abc.com/news/123.htm anchor1 mapper will aggregate url

Re: HRegionInfo was null or empty in Meta

2014-05-13 Thread Li Li
itor, tableName, row, > this.prefetchRegionLimit, HConstants.META_TABLE_NAME); > } catch (IOException e) { > LOG.warn("Encountered problems when prefetch META table: ", e); > } > > Can you scan / write to vc2.out_link ? > > Cheers > > > On Tue, May

can't get hbase-0.96.2-hadoop1 from central maven repository

2014-05-12 Thread Li Li
Downloading: http://repo.maven.apache.org/maven2/org/apache/hbase/hbase/0.96.2-hadoop1/hbase-0.96.2-hadoop1.pom Downloaded: http://repo.maven.apache.org/maven2/org/apache/hbase/hbase/0.96.2-hadoop1/hbase-0.96.2-hadoop1.pom (76 KB at 35.2 KB/sec) Downloading: http://repo.maven.apache.org/maven2/o

two rs nodes crashed

2014-05-12 Thread Li Li
by reading log, it seems the region server suffered from gc pause time. my region server jvm arguments: r -XX:OnOutOfMemoryError=kill -9 %p -Xmx6000m -server -XX:NewSize=512m -XX:MaxNewSize=1024m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=70 2014-05-12 13:31:52,21

Re: how to pre split a table whose row key is MD5(url)?

2014-05-11 Thread Li Li
4-127,128-191,191-255) > > And hope that you’ll have an even split. > > In theory, over time you will. > > > On May 8, 2014, at 1:58 PM, Li Li wrote: > >> say I have 4 region server. How to pre split a table using MD5 as row key? >> >

how to pre split a table whose row key is MD5(url)?

2014-05-11 Thread Li Li
say I have 4 region server. How to pre split a table using MD5 as row key?
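Since MD5 output is spread roughly uniformly, evenly spaced split points are a natural fit here. A minimal sketch, assuming the row key starts with the raw 16-byte digest (not a hex string); the class name Md5SplitKeys is hypothetical and not part of HBase:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: evenly spaced split points for keys that start with uniformly distributed bytes. */
final class Md5SplitKeys {

    /**
     * Returns numRegions - 1 two-byte split keys that divide the 16-bit
     * prefix space 0x0000..0xFFFF into numRegions roughly equal ranges.
     */
    static List<byte[]> splitKeys(int numRegions) {
        if (numRegions < 2 || numRegions > 65536) {
            throw new IllegalArgumentException("numRegions must be between 2 and 65536");
        }
        List<byte[]> keys = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            int boundary = (int) ((65536L * i) / numRegions);
            keys.add(new byte[] { (byte) (boundary >>> 8), (byte) boundary });
        }
        return keys;
    }

    public static void main(String[] args) {
        // For 4 regions the boundaries fall at 0x4000, 0x8000 and 0xC000.
        for (byte[] k : splitKeys(4)) {
            System.out.printf("%02x%02x%n", k[0] & 0xFF, k[1] & 0xFF);
        }
    }
}
```

The resulting keys could then be handed to whichever table-creation call accepts an array of split keys. If the row keys are hex-encoded text instead of raw bytes, the boundaries would need to be hex strings rather than these byte values.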

Re: HRegionInfo was null or empty in Meta

2014-05-07 Thread Li Li
-1,60020,1398226921318 app-hbase-2,60020,1398226921328 app-hbase-4,60020,1398226920856 app-hbase-5,60020,1398226920317 0 inconsistencies detected. Status: OK On Tue, May 6, 2014 at 9:40 PM, Ted Yu wrote: > Have you run hbck on vc2.out_link ? > > Cheers > > On May 6, 2014, at 6:33 A

HRegionInfo was null or empty in Meta

2014-05-06 Thread Li Li
I am using 0.94.11 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation Encountered problems when prefetch META table: java.io.IOException: HRegionInfo was null or empty in Meta for vc2.out_link, row=vc2.out_link,,99 at org.apache.hadoop.hbase.client.Me

Re: question about threads count

2014-04-28 Thread Li Li
14 at 9:44 AM, Jean-Marc Spaggiari wrote: > Simply don't set your status to 0 when you write it first. > > Absence mean not read. > 1 mean read. > So there is no risk that someone try to set 0 and someone else try to set 1. > > Will that be an option? > > > 2014

Re: question about threads count

2014-04-28 Thread Li Li
the first one and > HBase will take care of the versions. > > regarding the codes fragments, I don't think the autoflush is going to do a > big difference compared to the cost of the check & put... > > > 2014-04-28 20:50 GMT-04:00 Li Li : > >> I must use checkAn

Re: question about threads count

2014-04-28 Thread Li Li
> but the result will not be the same... a batch of puts will not do any > check... > > > 2014-04-28 20:17 GMT-04:00 Li Li : > >> but I have many checkAndPut operations. >> will use batch a better solution? >> >> On Mon, Apr 28, 2014 at 8:01 PM, Jean-Mar

Re: question about threads count

2014-04-28 Thread Li Li
but I have many checkAndPut operations. will use batch a better solution? On Mon, Apr 28, 2014 at 8:01 PM, Jean-Marc Spaggiari wrote: > Hi Li Li, > > Yes, threads will impact the performances. If you send all you writes with > a single thread, a single HBase handler will take care

question about threads count

2014-04-28 Thread Li Li
hi all, with the same read/write data, will threads count affect performance? e.g. I have 10,000 write request/second. I don't care the order very much. how many writer threads should I use to obtain maximum throughput?

Re: how to split region in hbase shell?

2014-04-23 Thread Li Li
I am using 0.94.11 On Thu, Apr 24, 2014 at 11:48 AM, Ted Yu wrote: > Are you using 0.94.2 or newer release ? > > See HBASE-6643 Accept encoded region name in compacting/spliting region > from shell > > > On Wed, Apr 23, 2014 at 8:39 PM, Li Li wrote: > >> because th

Re: how to split region in hbase shell?

2014-04-23 Thread Li Li
looks like you were specifying end key. Can you try > specifying table name only ? > > Cheers > > > On Wed, Apr 23, 2014 at 8:20 PM, Li Li wrote: > >> I found one of 4 region server is heavy load than other. and I want to >> split region ma

how to split region in hbase shell?

2014-04-23 Thread Li Li
I found one of 4 region servers is more heavily loaded than the others, and I want to split a region manually. from the web ui name: vc2.url_db,,1398174763371.35a8599a5eb457b9e0210f86d8b6d19f. region server app-hbase-1:60030 start key end key \x1F\xFE\x9B\xFA\x95\x91\xB7\xF0\x9FX\x83\xC9\xBFw\xBD\xDE request 1073606

Re: is my hbase cluster overloaded?

2014-04-22 Thread Li Li
DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on vc2.url_db,\x7F\xE8\xDB\xACq\xC0R\x109\x96\xF1\x08\xA5\xD3X\x1D,1398152426132.8e596f645f035ba44991e77cd73e1b01. On Tue, Apr 22, 2014 at 4:11 PM, Li Li wrote: > jmap -heap of app-hbase-1 why oldSize so small? >

Re: is my hbase cluster overloaded?

2014-04-22 Thread Li Li
912 (1658.1095809936523MB) free = 1119039272 (1067.1990127563477MB) 60.841168034923655% used Perm Generation: capacity = 40685568 (38.80078125MB) used = 24327624 (23.20063018798828MB) free = 16357944 (15.600151062011719MB) 59.79423465342797% used On Tue, Apr 22, 2014 at 4:00 PM, Li

Re: is my hbase cluster overloaded?

2014-04-22 Thread Li Li
, maxHeapMB=7948 Total: servers: 4 requestsPerSecond=666992, numberOfOnlineRegions=20 hbase-2 and 4 only have 3 region. how to balance them? On Tue, Apr 22, 2014 at 3:53 PM, Li Li wrote: > hbase current statistics: > > Region Servers > ServerName Start time Load > app-hbase-1,60020,139814

Re: is my hbase cluster overloaded?

2014-04-22 Thread Li Li
=2, usedHeapMB=328, maxHeapMB=7948 Total: servers: 4 requestsPerSecond=11372, numberOfOnlineRegions=18 On Tue, Apr 22, 2014 at 3:40 PM, Li Li wrote: > I am now restart the sever and running. maybe an hour later the load > will become high > > On Tue, Apr 22, 2014 at 3:02 PM, Azuryy Yu

Re: is my hbase cluster overloaded?

2014-04-22 Thread Li Li
how much MaxNewSize needed for my configuration? On Tue, Apr 22, 2014 at 3:02 PM, Azuryy Yu wrote: > Do you still have the same issue? > > and: > -Xmx8000m -server -XX:NewSize=512m -XX:MaxNewSize=512m > > the Eden size is too small. > > > > On Tue, Apr 22,

Re: is my hbase cluster overloaded?

2014-04-22 Thread Li Li
> > > On Tue, Apr 22, 2014 at 2:55 PM, Li Li wrote: > >> >> dfs.datanode.handler.count >> 100 >> The number of server threads for the datanode. >> >> >> >> 1. namenode/master 192.168.10.48 >> http://pastebin.com/7M0zzAAc >

Re: is my hbase cluster overloaded?

2014-04-21 Thread Li Li
ode, namenode, region servers JVM options? if > they are all by default, then there is also have this issue. > > > > > On Tue, Apr 22, 2014 at 2:20 PM, Li Li wrote: > >> my cluster setup: both 6 machines are virtual machine. each machine: >> 4CPU Intel(R) Xeon(R

Re: is my hbase cluster overloaded?

2014-04-21 Thread Li Li
://pastebin.com/cGtpbTLz 192.168.10.50 region log http://pastebin.com/bD6h5T6p(very strange, not log at 20:33, but have log at 20:32 and 20:34) On Tue, Apr 22, 2014 at 12:25 PM, Ted Yu wrote: > Can you post more of the data node log, around 20:33 ? > > Cheers > > > On Mon, Apr 21, 2014

Re: is my hbase cluster overloaded?

2014-04-21 Thread Li Li
> On Mon, Apr 21, 2014 at 7:39 PM, Li Li wrote: > >> I have a small hbase cluster with 1 namenode, 1 secondary namenode, 4 >> datanode. >> and the hbase master is on the same machine with namenode, 4 hbase >> slave on datanode machine. >> I found average requests p

is my hbase cluster overloaded?

2014-04-21 Thread Li Li
I have a small hbase cluster with 1 namenode, 1 secondary namenode, 4 datanode. and the hbase master is on the same machine with namenode, 4 hbase slave on datanode machine. I found average requests per seconds is about 10,000. and the clusters crashed. and I found the reason is one datanode failed

Re: hbase exception: Could not reseek StoreFileScanner

2014-04-14 Thread Li Li
t seen this one before. I assume HDFS was running at the time... > > > -- Lars > > > > ________ > From: Li Li > To: user@hbase.apache.org; lars hofhansl > Sent: Monday, April 14, 2014 9:09 PM > Subject: Re: hbase exception: Could not reseek

Re: hbase exception: Could not reseek StoreFileScanner

2014-04-14 Thread Li Li
ks. > > -- Lars > > > > ________ > From: Li Li > To: user@hbase.apache.org > Sent: Monday, April 14, 2014 5:32 PM > Subject: hbase exception: Could not reseek StoreFileScanner > > > Mon Apr 14 23:54:40 CST 2014, > org.apache.hadoop.hbase.client.HTable$9@14923f6b, jav

Re: hbase exception: Could not reseek StoreFileScanner

2014-04-14 Thread Li Li
Cheers > > > On Mon, Apr 14, 2014 at 5:32 PM, Li Li wrote: > >> Mon Apr 14 23:54:40 CST 2014, >> org.apache.hadoop.hbase.client.HTable$9@14923f6b, java.io.IOException: >> java.io.IOException: Could not reseek StoreFileScanner[HFileScanner >> for

hbase exception: Could not reseek StoreFileScanner

2014-04-14 Thread Li Li
Mon Apr 14 23:54:40 CST 2014, org.apache.hadoop.hbase.client.HTable$9@14923f6b, java.io.IOException: java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for reader reader=hdfs://192.168.11.150:8020/hbase/vc2.in_link/6 b879cb43205cdae084a280c38fab34a/cf/4dc235709de44f53b2484d2903f1bb

Re: Scan vs map-reduce

2014-04-14 Thread Li Li
oing a lot of IO for a web-app so this is going to be tough to > make "fast", but there are ways to make it "faster." > > But since you only have 1,000,000 rows you might not have many regions, so > this might wind up all going on the same RegionServer. > > > > > On 4/14

Re: Scan vs map-reduce

2014-04-14 Thread Li Li
or me? something like sql statement where rowkey in('abc', 'abd' ). a very long in statement On Mon, Apr 14, 2014 at 7:46 PM, Jean-Marc Spaggiari wrote: > Hi Li Li, > > If you have more than one region, might be useful. MR will scan all the > regions in parallel. If

Scan vs map-reduce

2014-04-13 Thread Li Li
I have a full table scan which takes about 10 minutes. It seems to be a bottleneck for our application. If I rewrite it with map-reduce, will it be faster?

use hbase as a global cache

2014-04-13 Thread Li Li
hi all I want to use hbase as a global cache. I need some advice . getFromCache(String key){ get from hbase; if exist return result; value="working..."; boolean res=checkAndPut; if(res){ value=doWork(key); Put(key, value);
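The claim-then-compute idea in the pseudocode above can be tried out without a cluster. In the sketch below a ConcurrentHashMap stands in for the HBase table and putIfAbsent stands in for a checkAndPut that expects the cell to be absent; the class name ClaimThenComputeCache is hypothetical and this is an illustration of the pattern under those assumptions, not HBase client code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical in-memory stand-in for the "global cache" pattern described in the post. */
final class ClaimThenComputeCache {
    static final String WORKING = "working...";

    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();
    private final Function<String, String> doWork;

    ClaimThenComputeCache(Function<String, String> doWork) {
        this.doWork = doWork;
    }

    /** Returns the stored value, or computes it if this caller wins the claim. */
    String getFromCache(String key) {
        String existing = store.get(key);
        if (existing != null) {
            return existing; // may be the WORKING marker if another caller is still computing
        }
        // Assumption: putIfAbsent plays the role of checkAndPut with "cell must be absent".
        String previous = store.putIfAbsent(key, WORKING);
        if (previous != null) {
            return previous; // someone else claimed or finished in the meantime
        }
        String value = doWork.apply(key);
        store.put(key, value);
        return value;
    }

    public static void main(String[] args) {
        ClaimThenComputeCache cache = new ClaimThenComputeCache(k -> "value-for-" + k);
        System.out.println(cache.getFromCache("a"));
        System.out.println(cache.getFromCache("a"));
    }
}
```

One thing this sketch does not handle: if the worker that wrote the marker dies before storing the result, the marker stays forever, so a real design would probably need some expiry on the claim.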

Re: how to reverse an integer for rowkey?

2014-03-27 Thread Li Li
great feature but I am using 0.94 now On Thu, Mar 27, 2014 at 4:49 PM, haosdent wrote: > How about Reverse Scan? https://issues.apache.org/jira/browse/HBASE-4811 > > > On Thu, Mar 27, 2014 at 4:24 PM, Li Li wrote: > >> my rowkey is >> I want to scan it by decreasing

how to reverse an integer for rowkey?

2014-03-27 Thread Li Li
my rowkey is I want to scan it by decreasing order of the int field, how to make it reversed? if the row key is Bytes.toBytes(intField) + Bytes.toBytes(strField), then the order is increasing. one solution is replace intField with -intField. but if intField==Integer.MIN_VALUE, what will happen?
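On the Integer.MIN_VALUE question: in Java, -Integer.MIN_VALUE overflows back to Integer.MIN_VALUE, so negating the field does not reverse the order for that value. HBase compares row keys as unsigned bytes, so one alternative is to flip bits instead of negating. A minimal sketch under that assumption; the class name DescendingIntKey is hypothetical:

```java
/** Hypothetical helper: 4-byte encoding whose unsigned byte order is the reverse of int order. */
final class DescendingIntKey {

    /** Larger ints map to smaller byte arrays (compared as unsigned bytes). */
    static byte[] encode(int v) {
        // Flipping the sign bit gives ascending unsigned order; inverting all bits
        // then reverses it. Combined: ~(v ^ 0x80000000) == v ^ 0x7FFFFFFF.
        int flipped = v ^ 0x7FFFFFFF;
        return new byte[] {
            (byte) (flipped >>> 24), (byte) (flipped >>> 16),
            (byte) (flipped >>> 8), (byte) flipped
        };
    }

    static int decode(byte[] b) {
        int flipped = ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
                | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
        return flipped ^ 0x7FFFFFFF;
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // 7 is larger than 3, so its key should sort first (negative result).
        System.out.println(compareUnsigned(encode(7), encode(3)) < 0);
    }
}
```

The encoded bytes would replace Bytes.toBytes(intField) at the front of the row key, with the string part appended unchanged.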

Re: how to calculate stop key for a scan?

2014-03-24 Thread Li Li
> ROW COLUMN+CELL > 7fc56270e7a70fa81a5935b72eacbe297fc56270e column=c1:, > timestamp=1395721490980, value=A > 7a70fa81a5935b72eacbe29 > 7fc56270e7a70fa81a5935b72eacbe299d5ed678f column=c1:, > timestamp=1395721814374, value=AB > e57bcca

Re: how to calculate stop key for a scan?

2014-03-24 Thread Li Li
1. byte[] md5=DigestUtils.md5(k1); Scan scan=new Scan(); scan.setStartRow(md5); scan.setFilter(new PrefixFilter(md5)); 2. byte[] md5=DigestUtils.md5(k1); Scan scan=new Scan(); scan.setFilter(new PrefixFilter(md5)); will code snippet 1 be faster than 2? On Tue, Mar 25, 2014 at 12:38 PM, Li Li

Re: how to calculate stop key for a scan?

2014-03-24 Thread Li Li
is it slower than a scanner? On Tue, Mar 25, 2014 at 11:48 AM, Ted Yu wrote: > Please consider using PrefixFilter where MD5(key1) is the prefix. > > > On Mon, Mar 24, 2014 at 8:45 PM, Li Li wrote: > >> sorry, I want to get all the rows startsWith k1 >> example: >>

Re: how to calculate stop key for a scan?

2014-03-24 Thread Li Li
6270e7a70fa81a5935b72eacbe297fc56270e column=c1:, > timestamp=1395721490980, value=A > 7a70fa81a5935b72eacbe29 > 7fc56270e7a70fa81a5935b72eacbe299d5ed678f column=c1:, > timestamp=1395721814374, value=AB > e57bcca610140957afab571 > 2 row(s) in 0.0210 seconds > >

Re: how to calculate stop key for a scan?

2014-03-24 Thread Li Li
d28e17f72" and set > stopkey to "900150983cd24fb0d6963f7d28e17f73". > > > On Tue, Mar 25, 2014 at 11:45 AM, Li Li wrote: > >> sorry, I want to get all the rows startsWith k1 >> example: >> k1k2 rowKey >> abc aaa -> MD5(abc)MD5(aaa) >> abc bbb ->

Re: how to calculate stop key for a scan?

2014-03-24 Thread Li Li
thank you On Tue, Mar 25, 2014 at 11:48 AM, Ted Yu wrote: > Please consider using PrefixFilter where MD5(key1) is the prefix. > > > On Mon, Mar 24, 2014 at 8:45 PM, Li Li wrote: > >> sorry, I want to get all the rows startsWith k1 >> example: >> k1k2

Re: how to calculate stop key for a scan?

2014-03-24 Thread Li Li
et all the rows equals k1. > > Use Get(MD5(k1)MD5(k1)) without set startkey and stopkey. > > > On Tue, Mar 25, 2014 at 11:36 AM, Li Li wrote: > >> I have two string as primary key(k1,k2) >> and my row key in hbase is MD5(k1)MD5(k1) >> I want to get all the rows

how to calculate stop key for a scan?

2014-03-24 Thread Li Li
I have two string as primary key(k1,k2) and my row key in hbase is MD5(k1)MD5(k1) I want to get all the rows equals k1.I can set startRowKey easily. But How can I calculate stopRowKey? is following correct? what if the last byte of md5 is 127? what about overflow? any tools for this? Scan scan=new
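On the overflow question: row keys are compared as unsigned bytes, so a last byte of 127 (0x7F) simply becomes 0x80; only 0xFF needs a carry. A minimal sketch of computing the stop key for a prefix, assuming the start key is the raw MD5(k1) bytes; the class name PrefixStopKey is hypothetical and not an HBase API:

```java
import java.util.Arrays;

/** Hypothetical helper: computes the exclusive stop key for a prefix scan. */
final class PrefixStopKey {

    /**
     * Returns the smallest key that is greater than every key starting with
     * prefix. An empty array means there is no upper bound (scan to the end).
     */
    static byte[] stopKeyForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so drop them and carry.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // 0x7F becomes 0x80, which is fine under unsigned comparison
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = stopKeyForPrefix(new byte[] { 0x01, 0x7F });
        System.out.printf("%02x%02x%n", stop[0] & 0xFF, stop[1] & 0xFF);
    }
}
```

The prefix itself would be the start row and the returned array the stop row. The PrefixFilter suggested later in the thread reaches the same rows without computing a stop key by hand.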

Re: hbase shell can't connect to server

2014-03-04 Thread Li Li
0.94.11 On Wed, Mar 5, 2014 at 1:45 PM, Ted Yu wrote: > What version of HBase are you using ? > > Take a look at http://hbase.apache.org/book.html#trouble.tools.builtin.zkcli > > > On Tue, Mar 4, 2014 at 9:23 PM, Li Li wrote: > >> hi all, >> when I run ./bin

Re: hbase shell can't connect to server

2014-03-04 Thread Li Li
er.java:303) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) On Wed, Mar 5, 2014 at 1:23 PM, Li Li wrote: > hi all, > when I run ./bin/hbase shell. it's ok. but when I execute 'list

hbase shell can't connect to server

2014-03-04 Thread Li Li
hi all, when I run ./bin/hbase shell. it's ok. but when I execute 'list', it hangs. I have tested it with telnet zookeeper 2181 and it's ok. but I use netstat -lnp can't find any outgoing tcp connections. I use jstack to check the status: "main" prio=10 tid=0x7ff648009800 ni

HTablePool is deprecated, any alternatives?

2014-02-12 Thread Li Li
I am using hbase 0.94.11. It says HTablePool is deprecated. Are there any alternatives for it?

Re: is this rowkey schema feasible?

2014-01-12 Thread Li Li
hat, >> just add " SALT_BUCKETS=" on to your query, where is the number of >> machines in your cluster. You can read more about salting here: >> http://phoenix.incubator.apache.org/salted.html >> >> >> On Thu, Jan 2, 2014 at 11:36 PM, Li Li wrote: >

Re: use hbase as distributed crawl's scheduler

2014-01-12 Thread Li Li
notonically increasing date in the key. To do that, >> just add " SALT_BUCKETS=" on to your query, where is the number of >> machines in your cluster. You can read more about salting here: >> http://phoenix.incubator.apache.org/salted.html >> >> >> O

Re: is this rowkey schema feasible?

2014-01-09 Thread Li Li
(path10,000,000) On Fri, Jan 10, 2014 at 2:02 AM, Stack wrote: > On Thu, Jan 9, 2014 at 2:42 AM, Li Li wrote: > >> hi all, >> I want to use hbase to store all urls for a distributed crawler. >> there is a central scheduler to schedule all unCrawled urls by >> pr

is this rowkey schema feasible?

2014-01-09 Thread Li Li
hi all, I want to use hbase to store all urls for a distributed crawler. there is a central scheduler to schedule all unCrawled urls by priority. Following is my design of rowkey and common data access pattern, is there any better rowkey design for my usecase? the row key is: reverse_host-

Re: [HbaseInAction]twitbase-async build fail

2014-01-06 Thread Li Li
> > Thanks for your interest, > Nick > > On Monday, January 6, 2014, Li Li wrote: > >> hi all, >>I am trying to build >> twitbase-async(https://github.com/hbaseinaction/twitbase-async) but >> failed. it's the source code of the book "HBase in Actio

[HbaseInAction]twitbase-async build fail

2014-01-06 Thread Li Li
hi all, I am trying to build twitbase-async(https://github.com/hbaseinaction/twitbase-async) but failed. it's the source code of the book "HBase in Action". I can't find the authors' emails, so I post it here. the error message is : [ERROR] Failed to execute goal on project twitbase-async:

Re: how to delete all input characters in hbase shell?

2014-01-06 Thread Li Li
thanks On Mon, Jan 6, 2014 at 8:45 PM, Jean-Marc Spaggiari wrote: > Hi Li Li, > > Ctrl-C in shell doesn't clear the line, it kills the current command. Like > in HBase shell. > > In bash if you want to clear the last word you will use Ctrl-w. If you want > to clear th

how to delete all input characters in hbase shell?

2014-01-06 Thread Li Li
hi all, in the bash shell, we can use ctrl+c to cancel the current command. But in the hbase shell, ctrl+c terminates the hbase shell itself.

Re: Java Client can't connect to a remote standalone hbase server

2014-01-05 Thread Li Li
> > successfully after I set "hbase.master" to "x.x.x.x:xxx" in >> "Configuration". >> > >> > >> > On Sun, Jan 5, 2014 at 11:00 PM, Jean-Marc Spaggiari < >> > jean-m...@spaggiari.org> wrote: >> > >> >> Wha

Re: Java Client can't connect to a remote standalone hbase server

2014-01-05 Thread Li Li
'll try it tomorrow On Sun, Jan 5, 2014 at 10:25 PM, Haosong Huang wrote: > Could you connect zookeeper correctly? > > > On Sun, Jan 5, 2014 at 8:28 PM, Li Li wrote: > >> yes, I just want to setup a test environment >> >> On Sun, Jan 5, 2014 at 6:48 PM, Ted Yu wrote

Re: Java Client can't connect to a remote standalone hbase server

2014-01-05 Thread Li Li
yes, I just want to setup a test environment On Sun, Jan 5, 2014 at 6:48 PM, Ted Yu wrote: > For hbase.rootdir, hdfs was not used. > > Is that intended ? > > Thanks > > On Jan 4, 2014, at 10:46 PM, Li Li wrote: > >> hi all, >> I am new to hbase and enco

Java Client can't connect to a remote standalone hbase server

2014-01-04 Thread Li Li
hi all, I am new to hbase and encounter a problem of client connection. I download latest stable version(0.94.15) and start the server successfully. And I can use ./bin/hbase shell to connect to server in local, But I can't connect to the server using a remote java client. My setup config

Re: use hbase as distributed crawl's scheduler

2014-01-03 Thread Li Li
'm employed, Salesforce.com) > use it in production today. > Thanks, > James > > > On Fri, Jan 3, 2014 at 11:39 PM, Li Li wrote: > >> hi James, >> phoenix seems great but it's now only a experimental project. I >> want to use only hbase. could you
