Re: Problem with performance with many columns in column familie

2010-05-11 Thread Ted Yu
jstack is a handy tool: http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jstack.html On Tue, May 11, 2010 at 9:50 AM, Sebastian Bauer wrote: > Ram is not a problem, second region server using about 550mB and first > about 300mB problem is with CPU, when i making queries to both column > famiel

Re: Type mismatch in key from map

2010-04-26 Thread Ted Yu
You are mixing old mapred calls with new mapreduce calls. Please look for a sample with the following structure: @Override public void map(Text key, Writable value, Context context) ... context.write(key, objWrite); On Sun, Apr 25, 2010 at 5:12 PM, iman453 wrot

Re: Region server goes away

2010-04-14 Thread Ted Yu
How much memory did you allocate to the regionservers ? Cheers On Wed, Apr 14, 2010 at 8:27 PM, Geoff Hendrey wrote: > Hi, > > I have posted previously about issues I was having with HDFS when I was > running HBase and HDFS on the same box both pseudoclustered. Now I have > two very capable ser

Re: org.apache.hadoop.hbase.mapreduce.Export fails with an NPE

2010-04-11 Thread Ted Yu
Hi, I added initial proposal to https://issues.apache.org/jira/browse/HADOOP-6695 which should address George's use case. Cheers On Sat, Apr 10, 2010 at 4:01 PM, Ted Yu wrote: > For the reason why NPE wasn't thrown from TableInputFormatBase.getSplits(), > I think job tracker

Re: set number of map tasks for HBase MR

2010-04-11 Thread Ted Yu
https://issues.apache.org/jira/browse/HBASE-2434 has been logged. On Sun, Apr 11, 2010 at 7:09 AM, Jean-Daniel Cryans wrote: > Yes an option could be added, along with a write buffer option for Import. > > J-D > > On Sun, Apr 11, 2010 at 3:30 PM, Ted Yu wrote

Re: set number of map tasks for HBase MR

2010-04-11 Thread Ted Yu
I noticed mapreduce.Export.createSubmittableJob() doesn't call setCaching() in 0.20.3 Should call to setCaching() be added ? Thanks On Sun, Apr 11, 2010 at 2:14 AM, Jean-Daniel Cryans wrote: > A map against a HBase table by default cannot have more tasks than the > number of regions in that tab

Re: org.apache.hadoop.hbase.mapreduce.Export fails with an NPE

2010-04-10 Thread Ted Yu
For the reason why NPE wasn't thrown from TableInputFormatBase.getSplits(), I think job tracker sent the job to 192.168.1.16 and TableInputFormatBase.table was null on that machine. I guess TableInputFormatBase.table depends on zookeeper to initialize. My two cents. On Sat, Apr 10, 2010 at 3:03 P

Re: HTable Client RS caching

2010-04-08 Thread Ted Yu
> J-D > > On Thu, Apr 8, 2010 at 10:38 AM, Ted Yu wrote: > > What if there is no region information in NSRE ? > > > > 2010-04-08 10:26:38,385 ERROR [IPC Server handler 60 on 60020] > > regionserver.HRegionServer(846): Failed openScanner > > org.apach

Re: HTable Client RS caching

2010-04-08 Thread Ted Yu
What if there is no region information in NSRE ? 2010-04-08 10:26:38,385 ERROR [IPC Server handler 60 on 60020] regionserver.HRegionServer(846): Failed openScanner org.apache.hadoop.hbase.NotServingRegionException: domaincrawltable,,1270600690648 at org.apache.hadoop.hbase.regionserver.HRe

Re: Failed to create /hbase.... KeeperErrorCode = ConnectionLoss for /hbase

2010-04-01 Thread Ted Yu
Please check the following entry in hbase-env.sh: hbase-env.sh:# The directory where pid files are stored. /tmp by default. hbase-env.sh:# export HBASE_PID_DIR=/var/hadoop/pids If pid file is stored under /tmp, it might have been cleaned up. On Thu, Apr 1, 2010 at 11:44 AM, Jean-Daniel Cryans wr

Re: slow response in hbase shell

2010-03-13 Thread Ted Yu
een the time that the get > works and the get doesn't work? > > > > Also, I recommend upgrading to 0.20.3, there are critical fixes. > > > > From: Ted Yu [mailto:yuzhih...@gmail.com] > Sent: Saturday, March 13, 2010 3:48 AM > To: hbase-user@hadoop.apache.org >

Re: slow response in hbase shell

2010-03-13 Thread Ted Yu
's going on? Is there repetitive balancing going on > that > never seems to reach steady-state? > > How many regions and how many nodes on which version of HBase? > > > -Original Message- > > From: Ted Yu [mailto:yuzhih...@gmail.com] > > Sent: Friday

slow response in hbase shell

2010-03-12 Thread Ted Yu
Hi, > We sometimes saw over 5 second delay running get in hbase 0.20.1 shell: > > hbase(main):002:0> get 'ruletable', 'ca.tsn.www' > 0 row(s) in 10.1330 seconds > > From our 3 region servers there are a lot of such messages: > > 2010-03-12 00:00:00,996 INFO [regionserver/10.10.31.135:60020] > r

Re: NoSuchColumnFamilyException

2010-03-12 Thread Ted Yu
hbase logs for ruletable,,1268431015006 and see if there's > any exception related to that region? Can you identify exactly when it > happened and what was happening? > > Thx > > J-D > > On Fri, Mar 12, 2010 at 2:33 PM, Ted Yu wrote: > > Hi, > > When I

NoSuchColumnFamilyException

2010-03-12 Thread Ted Yu
Hi, When I tried to insert into ruletable, I saw: hbase(main):003:0> put 'ruletable', 'com.yahoo.www', 'lpm_1.0:category', '1123:1' NativeException: org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: org.apache.hadoop.hba se.regionserver.NoSuchCol

importing into hbase 0.20.4

2010-03-11 Thread Ted Yu
Hi, We may upgrade to hbase 0.20.4 after it is released. This means we will have to export from hbase 0.20.1 and import into hbase 0.20.4 Has anybody gone through this path ? Thanks

Re: region server appearing twice on HBase Master page

2010-03-11 Thread Ted Yu
> That usually happens after a DNS hiccup. There's a fix for that in > https://issues.apache.org/jira/browse/HBASE-2174 > > J-D > > On Wed, Mar 10, 2010 at 1:41 PM, Ted Yu wrote: > > I noticed two lines for the same region server on HBase Master page: > > X.com:60

Re: Table broken: NoServerForRegionException

2010-03-09 Thread Ted Yu
Can I apply those two commands to resolve the following (hbase 0.20.1) ? 2010-03-09 11:06:49,229 DEBUG [pool-1-thread-1] hfile.LruBlockCache(551): Cache Stats: Sizes: Total=12.646942MB (13261280), Free=1212.828MB (1271742368), Max=1225.475MB (1285003648), Counts: Blocks=43, Access=5141, Hit=5098,

Re: Zookeeper issue, please help

2010-03-09 Thread Ted Yu
dark: Permission issue? (i.e. zkStart failed to overwrite an > old/existing pid file due to file permissions). Perhaps something started > the server twice before stopping? > > Can you reproduce this? If you do please report the issue here: > https://issues.apache.org/jira/browse/ZOOKEE

Re: Zookeeper issue, please help

2010-03-09 Thread Ted Yu
Hi, We use zookeeper-3.2.1 When I tested ruok : 2010-03-09 08:56:36,711 INFO [NIOServerCxn.Factory:2181] server.NIOServerCnxn Processing ruok command from /10.10.31.135:57084 2010-03-09 08:56:36,711 WARN [NIOServerCxn.Factory:2181] server.NIOServerCnxn Exception causing close of session 0x0 due

Re: SafeModeException

2010-03-08 Thread Ted Yu
> > On Mon, Mar 8, 2010 at 10:02 PM, Ted Yu wrote: >> Hi, >> I saw this in master server log: >> 2010-03-08 21:13:47,428 INFO  [Thread-14] >> master.ServerManager$ServerMonitor(130): 3 region servers, 0 dead, average >> load 0.0 >> 2010-03-08 21:13:50,50

SafeModeException

2010-03-08 Thread Ted Yu
Hi, I saw this in master server log: 2010-03-08 21:13:47,428 INFO [Thread-14] master.ServerManager$ServerMonitor(130): 3 region servers, 0 dead, average load 0.0 2010-03-08 21:13:50,505 INFO [WrapperSimpleAppMain-EventThread] master.ServerManager$ServerExpirer(813): snv-it-lin-010.projectrialto.c

Re: HBaseClient call doesn't timeout

2010-03-08 Thread Ted Yu
> > J-D > > On Mon, Mar 8, 2010 at 10:23 AM, Ted Yu wrote: > > Hi, > > We use the following code to retrieve data from hbase 0.20.1 which didn't > > return after 30 minutes: > > > >ResultScanner _scanner = _data; > >try { >

HBaseClient call doesn't timeout

2010-03-08 Thread Ted Yu
Hi, We use the following code to retrieve data from hbase 0.20.1 which didn't return after 30 minutes: ResultScanner _scanner = _data; try { Result[] _results = _scanner.next(defaultPageSize); updateResults(pTable, _scanner, _results); There is no exception thr

Re: Task attempt failed to report status

2010-03-06 Thread Ted Yu
You can introduce a second thread in the reducer which periodically reports status to hadoop. At the same time, you can record the longest put operation to see the amount of time it takes. Lowering the number of cells in a put to some value under 1000 may help as well. On Saturday, March 6, 2010,
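The second-thread idea above could look roughly like the sketch below. It is plain Java with no Hadoop dependency: `ProgressHeartbeat`, `timed` and `longestMillis` are hypothetical names made up for illustration, and the `Runnable` passed in is assumed to wrap whatever progress call your job has (for example `reporter.progress()` in the old API or `context.progress()` in the new one).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: runs a progress callback on a background thread
// and remembers the slowest operation wrapped with timed().
final class ProgressHeartbeat implements AutoCloseable {
    private final ScheduledExecutorService timer;
    private final AtomicLong longestNanos = new AtomicLong();

    ProgressHeartbeat(Runnable reportProgress, long periodMillis) {
        // Daemon thread so a forgotten heartbeat never keeps the JVM alive.
        timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "progress-heartbeat");
            t.setDaemon(true);
            return t;
        });
        // Note: if reportProgress throws, later runs are suppressed.
        timer.scheduleAtFixedRate(reportProgress, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    // Run one slow operation (e.g. a put) and record its duration.
    void timed(Runnable slowOperation) {
        long start = System.nanoTime();
        try {
            slowOperation.run();
        } finally {
            long elapsed = System.nanoTime() - start;
            longestNanos.accumulateAndGet(elapsed, Math::max);
        }
    }

    long longestMillis() {
        return TimeUnit.NANOSECONDS.toMillis(longestNanos.get());
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }

    public static void main(String[] args) {
        // Stand-in for the real progress call; here it only prints.
        try (ProgressHeartbeat hb =
                new ProgressHeartbeat(() -> System.out.println("still alive"), 50)) {
            hb.timed(() -> {
                try {
                    Thread.sleep(200); // stand-in for a slow put
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            System.out.println("longest operation (ms): " + hb.longestMillis());
        }
    }
}
```

In a reducer you would presumably create the heartbeat when the task starts, wrap each put in `timed(...)`, and close it when the task finishes; the period should be well below the task timeout.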

Re: WrongRegionException

2010-03-06 Thread Ted Yu
what choice(s) do I have ? On Fri, Mar 5, 2010 at 3:21 PM, Ted Yu wrote: > I want to hear other people's experience handling WrongRegionException, > especially in production. > Thanks > > > On Wed, Feb 17, 2010 at 1:54 PM, Zhenyu Zhong wrote: > >> Stack, >> >

region server shutdown, possibly due to datanode error

2010-03-06 Thread Ted Yu
I found this in regionserver log on one machine - the region server shutdown shortly after: 2010-03-05 23:44:37,859 WARN [DataStreamer for file /hbase/.logs/ snv-it-lin-010.projectrialto.com,60020,1267695848448/hlog.dat.1267860383622] hdfs.DFSClient$DFSOutputStream(2589): Error Recovery for block

Re: UnknownScannerException

2010-03-05 Thread Ted Yu
ou can just create a > new scan and set the start row to the latest one you saw. > > J-D > > On Fri, Mar 5, 2010 at 4:13 PM, Ted Yu wrote: > > Hi, > > We use HBase 0.20.1 > > I saw the following in regionserver log: > > 2010-03-05 16:0
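The "new scan from the latest row you saw" suggestion can be sketched without HBase on the classpath. Below, a `TreeMap` stands in for the table and `scanFrom` is a hypothetical stand-in for opening a scanner at a start row; both are assumptions for illustration, not HBase API. The one detail worth carrying over is that the start row of an HBase `Scan` is inclusive, so the resumed scan returns the last row again and the caller has to skip it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for resuming a scan after the previous scanner is gone.
final class ResumableScan {
    // Returns up to `limit` row keys at or after startRow (inclusive start,
    // mirroring the inclusive start row of a scan).
    static List<String> scanFrom(NavigableMap<String, String> table,
                                 String startRow, int limit) {
        List<String> rows = new ArrayList<>();
        for (String row : table.tailMap(startRow, true).keySet()) {
            if (rows.size() == limit) {
                break;
            }
            rows.add(row);
        }
        return rows;
    }

    // Reads the whole table in pages, opening a fresh "scan" for every page
    // and restarting from the last row that was seen.
    static List<String> readAll(NavigableMap<String, String> table, int batch) {
        List<String> seen = new ArrayList<>();
        String lastSeen = null;
        while (true) {
            String start = (lastSeen == null) ? "" : lastSeen;
            // Ask for one extra row on resume: the first one is a repeat.
            int limit = (lastSeen == null) ? batch : batch + 1;
            int added = 0;
            for (String row : scanFrom(table, start, limit)) {
                if (row.equals(lastSeen)) {
                    continue; // already processed before the restart
                }
                seen.add(row);
                lastSeen = row;
                added++;
            }
            if (added == 0) {
                return seen; // nothing new after the last seen row
            }
        }
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (String key : new String[] {"a", "b", "c", "d", "e"}) {
            table.put(key, "value-" + key);
        }
        System.out.println(readAll(table, 2)); // prints [a, b, c, d, e]
    }
}
```

Java strings here compare by character rather than by raw bytes as HBase row keys do, so treat the ordering as illustrative only.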

UnknownScannerException

2010-03-05 Thread Ted Yu
Hi, We use HBase 0.20.1 I saw the following in regionserver log: 2010-03-05 16:02:57,952 ERROR [IPC Server handler 60 on 60020] regionserver.HRegionServer(844): org.apache.hadoop.hbase.UnknownScannerException: Name: -1 at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1

Re: WrongRegionException

2010-03-05 Thread Ted Yu
I want to hear other people's experience handling WrongRegionException, especially in production. Thanks On Wed, Feb 17, 2010 at 1:54 PM, Zhenyu Zhong wrote: > Stack, > > As you described, I found a hole after the region with problem. > After the problem region(3d9d1175a7f8bf861bf75638bb1eb231, >

Re: [Indexed HBase] Can I add index in an existing table?

2010-03-04 Thread Ted Yu
should see something like the following logged in the region server > logs > for each region: > > > Filled indices for region: 'ruletable,,1267641828807' with > entries > > in 00:05:99 > > > Cheers, > Dan > > 2010/3/5 Ted Yu >

Re: [Indexed HBase] Can I add index in an existing table?

2010-03-04 Thread Ted Yu
2010/3/3 Ted Yu > Hi, > I wrote a utility to add index to my table. > After running it, I couldn't see the rows in that table I saw before. > > hbase(main):007:0> count 'ruletable' > NativeException: org.apache.hadoop.hbase.client.NoServerForRegionException:

Re: NotServingRegionException

2010-03-03 Thread Ted Yu
I use org.apache.hadoop.hbase.mapreduce.Import to import which is launched on the same VM. On Wed, Mar 3, 2010 at 11:37 AM, Jean-Daniel Cryans wrote: > Yes that's one thing, also make sure your client has connectivity... > doesn't seem so. > > J-D > > On Wed, Mar

Re: HFile backup while cluster running

2010-03-03 Thread Ted Yu
If you disable writing, you can use org.apache.hadoop.hbase.mapreduce.Export to export all your data, copy them to your new HDFS, then use org.apache.hadoop.hbase.mapreduce.Import, finally switch your clients to the new HBase cluster. On Wed, Mar 3, 2010 at 11:27 AM, Kevin Peterson wrote: > My cu

Re: NotServingRegionException

2010-03-03 Thread Ted Yu
hat address for that region. Look at the logs for when 1) > the master assigns the region and 2) when the region server opens the > region. In between I expect you should see some exceptions. > > You can also put your 2 logs somewhere and post a link here so someone > can take a look

Re: NotServingRegionException

2010-03-03 Thread Ted Yu
Previous attempt wasn't delivered. On Wed, Mar 3, 2010 at 9:30 AM, Ted Yu wrote: > Hi, > I started hbase 0.20.3 successfully on my Linux VM. Master and regionserver > are on the same VM. > There're two empty tables. > > Soon I saw the following in regionserver.lo

Re: How to back up HBase data

2010-03-02 Thread Ted Yu
You can use Export class. Please take a look at hbase-2225 as well. On Tuesday, March 2, 2010, wrote: > Hi, everyone. Recently I encountered a problem about data loss of HBase. So > it comes to the question that how to back up HBase data to recover table > records if HBase or HDFS crash. What

Cannot open filename error

2010-03-01 Thread Ted Yu
Hi, I saw this in our HBase 0.20.1 master log: 2010-03-01 12:38:42,451 INFO [HMaster] master.ProcessRegionOpen(80): Updated row domaincrawltable,,1267475905927 in region .META.,,1 with startcode=1267475746189, server=10.10.31.135:60020 2010-03-01 12:39:06,088 INFO [Thread-10] master.ServerManage

duplicate regionserver entries

2010-03-01 Thread Ted Yu
Hi, We use hbase 0.20.1 On http://snv-it-lin-006:60010/master.jsp, I see two rows for the same region server: snv-it-lin-010.projectrialto.com:60030 1267038448430 requests=0, regions=25, usedHeap=1280, maxHeap=6127 snv-it-lin-010.projectrialto.com:60030 1267466540070 requests=0, regions=2, usedHeap=12

Re: RegExpRowFilter

2010-02-27 Thread Ted Yu
> On 28 February 2010 04:05, Ted Yu wrote: > > > Hi, > > I use org.apache.hadoop.hbase. > > filter.PrefixFilter in my export utility. > > I like the flexibility of RegExpRowFilter but it cannot be used in > > Scan.setFilter(org.apache.hadoop.hbase.filter.Filter)

RegExpRowFilter

2010-02-27 Thread Ted Yu
Hi, I use org.apache.hadoop.hbase. filter.PrefixFilter in my export utility. I like the flexibility of RegExpRowFilter but it cannot be used in Scan.setFilter(org.apache.hadoop.hbase.filter.Filter) call. Is there Filter implementation that does regex filtering ? Thanks

Multiple Masters

2010-02-26 Thread Ted Yu
I read http://wiki.apache.org/hadoop/Hbase/MultipleMasters If someone uses Multiple Masters in production, please share your experience. Thanks On Tue, Feb 23, 2010 at 10:31 PM, Stack wrote: > What version are you on? There is no hbase.master in hbase 0.20.x. > Its a vestige of 0.19 hbase. No

region server Java wrapper

2010-02-24 Thread Ted Yu
Hi, We use Java wrapper from Tanuki Software Inc for our region server. Here are wrapper parameters: wrapper.startup.timeout=301 wrapper.ping.timeout=3000 wrapper.cpu.timeout=300 wrapper.shutdown.timeout=301 wrapper.jvm_exit.timeout=301 wrapper.restart.delay=301 This morning 2 hours after one of

Re: WrongRegionException

2010-02-17 Thread Ted Yu
In ASCII, '5' sorts ahead of 'a', so the rowkey is outside the region. On Wed, Feb 17, 2010 at 8:33 AM, Zhenyu Zhong wrote: > Hi, > > I came across this problem recently. > > I tried to query a table with rowkey '3ec1aa5a50307aed20a222af92a53ad1'. > > The query hits on a region with startkey > '3d9d1175a
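To make the byte-order point concrete, here is a small sketch in plain Java. Row keys are compared byte by byte, and a region covers keys from its start key (inclusive) up to its end key (exclusive). The short keys used below are made up for illustration and are not the keys from the thread; `RowKeyOrder`, `compareKeys` and `inRegion` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of lexicographic row key ordering.
final class RowKeyOrder {
    // Compare two keys byte by byte as unsigned values; on a shared prefix
    // the shorter key sorts first.
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    static byte[] bytes(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    // A region holds startKey <= rowKey < endKey; an empty endKey means
    // the region is the last one and has no upper bound.
    static boolean inRegion(String startKey, String endKey, String rowKey) {
        byte[] row = bytes(rowKey);
        boolean afterStart = compareKeys(row, bytes(startKey)) >= 0;
        boolean beforeEnd = endKey.isEmpty() || compareKeys(row, bytes(endKey)) < 0;
        return afterStart && beforeEnd;
    }

    public static void main(String[] args) {
        // '5' is 0x35 and 'a' is 0x61, so any key with '5' at the first
        // differing position sorts before one with a letter there.
        System.out.println((int) '5' + " < " + (int) 'a'); // prints 53 < 97

        // Made-up region ["3d9d", "3e5f"): "3ec1" is past the end key
        // because 'c' sorts after '5' at the third position.
        System.out.println(inRegion("3d9d", "3e5f", "3ec1")); // prints false
        System.out.println(inRegion("3d9d", "3e5f", "3e1a")); // prints true
    }
}
```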

Re: request for mapreduce with hbase examples

2010-02-11 Thread Ted Yu
You can find plenty of examples in nutchbase: http://github.com/apache/nutch/tree/nutchbase On Thu, Feb 11, 2010 at 12:38 PM, David Hawthorne wrote: > I was under the impression that you could read from/write to an hbase table > from within a mapreduce job. Import and Export look like methods f

multibyte support in Hbase

2010-02-08 Thread Ted Yu
Hi, If you have experience storing multi-byte contents in HBase, please share. Thanks

Re: concurrency in exporting HBase contents

2010-01-23 Thread Ted Yu
that your import will miss some > writes unless you block them. > > In 0.21 this will be a lot easier using multi datacenter replication > along with the ability to replay logs from one cluster to another > starting from a certain point in time. > > J-D > > On Fri, Jan 22, 201

Re: Help on HBase shell get command usage

2010-01-22 Thread Ted Yu
How can I find all the rows which have value for certain qualifier ? For example, all rows which have value for 'stt:page_content'. Thanks On Tue, Dec 15, 2009 at 12:09 PM, stack wrote: > Try: > > hbase(main):005:0> get 'crawltable', 'com.onsoft.www:http/', { COLUMNS => > 'stt:'} > > i.e. '=>'

concurrency in exporting HBase contents

2010-01-22 Thread Ted Yu
Hi, Suppose during export there is ongoing write operation to HBase table I am exporting, which snapshot does export use ? Is there special action I should take ? Thanks

HBase Admin Web UI

2010-01-18 Thread Ted Yu
Hi, I am working on Admin Web UI which queries/updates data stored in HBase. This admin tool should provide role-based authorization for update operations. If you have experience working on such admin tool, please share. Thanks

Re: HBaseAdmin and ConcurrentModificationException

2010-01-15 Thread Ted Yu
instances of HBaseAdmin floating around? If so, why? > Please paste full stack trace. That'd help. > Yours, > St.Ack > > On Fri, Jan 15, 2010 at 10:22 AM, Ted Yu wrote: > > > Hi, > > We use hbase-0.20.1 > > I have seen this randomly fail with a Concurrent

HBaseAdmin and ConcurrentModificationException

2010-01-15 Thread Ted Yu
Hi, We use hbase-0.20.1 I have seen this randomly fail with a ConcurrentModificationException: HBaseAdmin admin = new HBaseAdmin(hbaseConf); Has anybody else seen this behavior ? Thanks

HBase bulk load

2010-01-13 Thread Ted Yu
Jonathan: Since you implemented https://issues.apache.org/jira/si/jira.issueviews:issue-html/HBASE-48/HBASE-48.html, maybe you can point me to some document how bulk load is used ? I found bin/loadtable.rb and assume that can be used to import data back into HBase. Thanks

Re: Help on HBase shell alter command usage

2009-12-15 Thread Ted Yu
nsoft.www:http/', { COLUMNS => > 'stt:'} > > i.e. '=>' rather than '='. Also, its COLUMNS (uppercase I believe) rather > than column. > > Run 'help' in the shell for help and examples. > > St.Ack > > On Tue, De

Re: Help on HBase shell alter command usage

2009-12-15 Thread Ted Yu
Hi, I saw the following from scan 'crawltable' command in hbase shell: ... com.onsoft.www:http/ column=stt:, timestamp=1260405530801, value=\003 3 row(s) in 0.2490 seconds How do I query the value for stt column ? hbase(main):005:0> get 'crawltable', 'com.onsoft.www:http/', { column='stt: