Re: HBase: scanner with custom filter causes Exception

2009-02-09 Thread Dru Jensen
Can you do a new() on an abstract class? On Feb 9, 2009, at 9:07 AM, stack wrote: Your new class needs to be on the server's CLASSPATH as well as on the client-side. St.Ack On Mon, Feb 9, 2009 at 4:26 AM, Michael Seibold wrote: Hi, I want to create a scanner with a custom filter, b

Re: Missing file

2009-01-13 Thread Dru Jensen
, but seems fragile. I have found that increasing the number of datanode handlers via dfs.datanode.handler.count improves DFS/DFSClient stability under heavy load. - Andy From: Dru Jensen Subject: Missing file java.io.IOException: java.io.IOException: Cannot open filename /hbase/webmaps

Missing file

2009-01-12 Thread Dru Jensen
I have an MR process that populates a table but fails with this error: java.io.IOException: java.io.IOException: Cannot open filename /hbase/webmaps/743469791/header/mapfiles/4735713346568547172/data at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1395) at org.apa

Re: Accessing rows with number indexes

2009-01-10 Thread Dru Jensen
I'm not sure this will work or is a good idea, but is it possible to use the tableindexed feature in 0.19 and create an IndexKeyGenerator that does an auto increment? http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/client/tableindexed/package.html?view=markup On
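
The auto-increment idea can be pictured with the toy sketch below. It is not the tableindexed API: `AutoIncrementKeys` and `nextKey` are hypothetical names, and the class does not implement the real `IndexKeyGenerator` interface. It only illustrates that a counter-based key should be zero-padded so that HBase's lexicographic row ordering matches numeric order. A single in-process counter is assumed; a cluster would need a shared counter instead.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of HBase: hands out increasing row keys.
class AutoIncrementKeys {
    private final AtomicLong counter = new AtomicLong();

    // Zero-pad to 19 digits (the width of Long.MAX_VALUE) so that
    // string order and numeric order agree, e.g. "...09" sorts before "...10".
    String nextKey() {
        return String.format("%019d", counter.incrementAndGet());
    }

    public static void main(String[] args) {
        AutoIncrementKeys keys = new AutoIncrementKeys();
        for (int i = 0; i < 3; i++) {
            System.out.println(keys.nextKey());
        }
    }
}
```

Without the padding, a key such as "10" would sort before "9", and a scan would no longer return rows in insertion order.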

Re: corrupt hbase meta

2009-01-09 Thread Dru Jensen
t is there an index? St.Ack to at least partially recover the region. Don't try this until someone else weighs in on this approach. Since you've already suffered data loss, possibly as a result of HBASE-1104, I do not think there is any harm to try it. - Andy From: Dru Jensen Su

corrupt hbase meta

2009-01-08 Thread Dru Jensen
I have a MR process that is failing at the same point every time I try to run it. I have restarted hadoop and hbase but that didn't fix it (unlike last time). I checked dfs and the file that hbase is looking for does not exist. Is there a tool I can run to fix this? Thanks, Dru org.apac

Re: what is considered as best / worst practice?

2008-12-22 Thread Dru Jensen
JSON+ Question: Is it an acceptable design to use the timestamp as a data element? I am currently adding the date to the column name and setting the number of versions in the table to 1. Current: htable.put('table','family:date', 'JSON'); What I would like to do is use the timestamp as a
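
The two layouts in this question can be contrasted with a toy in-memory model. This is a sketch only, not the HBase client API: `TimestampLayouts`, the column names, and the sample values are all made up for illustration. Layout A keeps the date in the column name with one value per column (the "current" approach); layout B keeps a single column and treats the date as the cell's version timestamp.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of one row; not the HBase client API.
class TimestampLayouts {
    // Layout A: the date lives in the column name, one value per column.
    static Map<String, String> dateInColumnName() {
        Map<String, String> row = new TreeMap<>();
        row.put("family:2008-12-21", "{\"hits\":1}");
        row.put("family:2008-12-22", "{\"hits\":2}");
        return row;
    }

    // Layout B: one column; the date is the version timestamp, newest first.
    static TreeMap<Long, String> dateAsTimestamp() {
        TreeMap<Long, String> versions = new TreeMap<>(Collections.reverseOrder());
        versions.put(20081221L, "{\"hits\":1}");
        versions.put(20081222L, "{\"hits\":2}");
        return versions;
    }

    public static void main(String[] args) {
        System.out.println(dateInColumnName().keySet());
        System.out.println(dateAsTimestamp().firstEntry().getValue());
    }
}
```

One consequence worth noting: layout B only keeps history if the table retains more than one version per cell, so it would not fit a table whose number of versions is set to 1.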

Upgrade to 0.19.0

2008-12-17 Thread Dru Jensen
I am upgrading to 0.19.0. Do I still need these settings in hadoop-site.xml? dfs.datanode.max.xcievers = 1024; dfs.datanode.socket.write.timeout = 0; mapred.child.java.opts = -Xmx1024m. Thanks, Dru

Re: implementing selection/projection using mapreduce

2008-12-04 Thread Dru Jensen
Try passing the variables into the mapper through the JobConf.set and .get methods. On Dec 4, 2008, at 3:45 AM, abhinit wrote: Hi, I am implementing basic selection/projection using mapreduce on an HBase table. I have an outer class SelectProject which implements the tool interface and
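
The suggestion can be sketched roughly as follows. To keep the sketch runnable without Hadoop on the classpath, `MiniConf` is a hypothetical stand-in for `JobConf`, and `SelectMapper`, `select.column`, and `select.value` are made-up names. In a real job the driver would call `JobConf.set(...)` and the mapper would read the values back with `get(...)` inside `configure(JobConf)`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hadoop's JobConf: string keys to string values.
class MiniConf {
    private final Map<String, String> values = new HashMap<>();
    void set(String name, String value) { values.put(name, value); }
    String get(String name) { return values.get(name); }
}

// Hypothetical mapper: reads its parameters once, before any rows arrive.
class SelectMapper {
    private String column;
    private String wanted;

    void configure(MiniConf job) {
        column = job.get("select.column");
        wanted = job.get("select.value");
    }

    // Selection: keep the row only if the configured column holds the wanted value.
    boolean keep(Map<String, String> row) {
        return wanted != null && wanted.equals(row.get(column));
    }
}

class SelectDriver {
    public static void main(String[] args) {
        // Driver side: put the parameters into the job configuration.
        MiniConf job = new MiniConf();
        job.set("select.column", "info:lang");
        job.set("select.value", "en");

        // Task side: the mapper picks them up in configure().
        SelectMapper mapper = new SelectMapper();
        mapper.configure(job);

        Map<String, String> row = new HashMap<>();
        row.put("info:lang", "en");
        System.out.println(mapper.keep(row));
    }
}
```

The point of the pattern is that fields of the outer driver class are not visible to mapper instances running in other JVMs, whereas the job configuration is shipped to every task.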

Re: rowcounter fails - Region closed

2008-11-20 Thread Dru Jensen
cular region. Might give you a clue as to what happened. When you scan the .META., does this region appear at all? Is there a 'hole' in the .META. where this region should be? We may have to reinsert if so. To do this, will need old value for HRI. Did the upping of xceiverco

rowcounter fails - Region closed

2008-11-20 Thread Dru Jensen
I have a table that has 20+ million rows. I tried to run the rowcounter MR process against it but one of the task attempts fails on the following exception: java.io.IOException: java.io.IOException: Region metrics,Game Face| News,1226620888277 closed at org.apache.hadoop.hbase.regionserver.HR

xceiverCount 257 exceeds the limit of concurrent xcievers 256

2008-11-12 Thread Dru Jensen
hbase-users, I have been running MR processes for several days against HBase with success until recently the region servers shut themselves down. Hadoop 0.18.1, HBase 0.18.1, 3-node cluster. Checking the region server logs, I see the following Exception before shutdown: 2008-11-11 19:55:52,4

Re: correct performance evaluation results?

2008-10-28 Thread Dru Jensen
Did your scan have a column specified that doesn't exist? On Oct 28, 2008, at 3:10 PM, Krzysztof Szlapinski wrote: stack wrote: Yeah. Unless you got some magic going on in that Xeon of yours. No magic noticed ;) But 4 real - Any ideas why the scan test goes wrong? I got no warnings, no err

Re: NotServingRegionException - Map/Reduce process fails

2008-10-24 Thread Dru Jensen
St.Ack and J-D, Thanks for your help. Upgrading to the latest 0.19.0 and changing the region size back to 256MB along with the Premature EOF settings from Jean Adrien fixed the issues I was seeing. Dru On Oct 23, 2008, at 4:04 PM, stack wrote: Dru Jensen wrote: Stack, Sorry for the

Re: NotServingRegionException - Map/Reduce process fails

2008-10-23 Thread Dru Jensen
Stack, Sorry for the confusion, I am not using the old implementation of TableReduce. The new 0.19.0 changed this to an interface. The reduce process is performing calculations. It's not just writing to the table and requires the sort. I will change the region size back and see if that

Re: NotServingRegionException - Map/Reduce process fails

2008-10-23 Thread Dru Jensen
your logs (where the regionserver puts up temporary block of updates because it isn't able to flush fast enough). You're on recent hbase? Have you altered flush or maximum region file sizes? St.Ack Dru Jensen wrote: Stack and J-D, Thanks for your responses. It looks like the RetriesExha

Re: NotServingRegionException - Map/Reduce process fails

2008-10-23 Thread Dru Jensen
e qualifier you know is not present: e.g. if you have columnfamily 'page' and you know there is no column 'page:xyz', scan with that (Enable DEBUG in log4j so you can see regions being loaded as scan progresses): "scan 'TABLENAME', ['page:xyz']"

NotServingRegionException - Map/Reduce process fails

2008-10-23 Thread Dru Jensen
Hi hbase-users, During a fairly large MR process, on the Reduce cycle as it's writing its results to a table, I see org.apache.hadoop.hbase.NotServingRegionException in the region server log several times and then I see a split reporting it was successful. Eventually, the Reduce process fa

Re: set split size per table

2008-10-14 Thread Dru Jensen
nm. Found the workaround in HBASE-903. Thanks, Dru On Oct 14, 2008, at 11:35 AM, Dru Jensen wrote: Sorry if this was already answered. How do you specify a split size for a table? Can this be done in hbase shell with the create command? I saw hbase-42 which Andrew Purtell worked on

set split size per table

2008-10-14 Thread Dru Jensen
Sorry if this was already answered. How do you specify a split size for a table? Can this be done in hbase shell with the create command? I saw hbase-42 which Andrew Purtell worked on and released in 0.17 but I can't seem to find documentation on how to do this. Thanks, Dru

Re: hbase 0.2.1 failing to start

2008-09-29 Thread Dru Jensen
On Sep 29, 2008, at 1:27 PM, stack wrote: Dru Jensen wrote: HBase was not responding to Thrift requests so I tried to restart but it still looks frozen. I am seeing several error messages in the hmaster logs after I attempted to restart hbase: 2008-09-29 12:55:23

hbase 0.2.1 failing to start

2008-09-29 Thread Dru Jensen
HBase was not responding to Thrift requests so I tried to restart but it still looks frozen. I am seeing several error messages in the hmaster logs after I attempted to restart hbase: 2008-09-29 12:55:23,744 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: error opening region {

Re: Duplicate rows being processed when one MR task completes. Endless Loop in MR task.

2008-09-23 Thread Dru Jensen
the stuck task is taking too long to finish, and so on...) St.Ack Dru Jensen wrote: More information: When I first launch the Job, 6 MR "tasks" are created on 3 different servers in the cluster. Each "task" has 1 "task attempt" started. Hadoop map task lis

Duplicate rows being processed when one MR task completes. Endless Loop in MR task.

2008-09-19 Thread Dru Jensen
cess the same keys, they start processing the same keys over and over in an endless loop. On Sep 19, 2008, at 10:52 AM, Dru Jensen wrote: Sorry. Hadoop 0.17.2.1 - Hbase 0.2.1 On Sep 19, 2008, at 10:40 AM, Jean-Daniel Cryans wrote: Dru, Which versions? Thx J-D On Fri, Sep 19, 2008 at 1:

Re: MR process in endless loop

2008-09-19 Thread Dru Jensen
Sorry. Hadoop 0.17.2.1 - Hbase 0.2.1 On Sep 19, 2008, at 10:40 AM, Jean-Daniel Cryans wrote: Dru, Which versions? Thx J-D On Fri, Sep 19, 2008 at 1:38 PM, Dru Jensen <[EMAIL PROTECTED]> wrote: I have a MR process that gets stuck in an endless loop. It looks like the same

MR process in endless loop

2008-09-19 Thread Dru Jensen
I have a MR process that gets stuck in an endless loop. It looks like the same set of keys are being sent to one of the tasks in an endless loop. Unfortunately, it's not consistent. Sometimes it works fine. Only 1 of the 6 MR processes gets in this state and never completes. After the disk

Re: How to add a column family from hirb?

2008-09-18 Thread Dru Jensen
't alter an enabled table, so we have to disable the table first. On the other hand, we can't get the table descriptor if the table is disabled. Deadlock here. On Wed, Sep 17, 2008 at 12:28 AM, Dru Jensen <[EMAIL PROTECTED]> wrote: Maybe check to see if the column exists; If

Re: build hadoop in eclipse

2008-09-17 Thread Dru Jensen
You need to set up your Eclipse Java project and make sure your source path and library paths are properly set up. 0. Unzip the download in your Eclipse workspace directory. It should create the hbase-0.2.1 directory. 1. Create a new java project and select the hbase-0.2.1 directory that yo

Re: How to add a column family from hirb?

2008-09-16 Thread Dru Jensen
Maybe check to see if the column exists; If it does, modify otherwise add?

  def alter(tableName, args)
    now = Time.now
    raise TypeError.new("Table name must be of type String") \
      unless tableName.instance_of? String
    descriptor = hcd(args)
    table = HTable.new(tabl

Re: missing rows in MR process

2008-09-08 Thread Dru Jensen
Aaahh Yes. Thanks. On Sep 8, 2008, at 10:59 AM, stack wrote: Dru Jensen wrote: Hi StAck, No, i don't think I'm hitting this. The first MR process is using in: SequenceInputFileFormat out: TableReduce. The second is using in: TableMap out TableReduce. I don't think the

Re: missing rows in MR process

2008-09-05 Thread Dru Jensen
3:59 PM, stack wrote: This is odd Dru. Do you think you are seeing https://issues.apache.org/jira/browse/HBASE-856? Are you using filters? St.Ack Dru Jensen wrote: hbase-users, I have two MR processes that run one right after the other in a script. The first reads from a file and populate

missing rows in MR process

2008-09-05 Thread Dru Jensen
hbase-users, I have two MR processes that run one right after the other in a script. The first reads from a file and populates a table. The second uses a TableMap over that table that was just populated. The first MR process inserted 1950 rows successfully and everything looked correct.

Release Candidate 0.2.1 errors in irb shell

2008-09-05 Thread Dru Jensen
I am testing the release candidate with hadoop 0.17.2.1 release. I am curious if others are seeing this or if I have something mis-configured. I reformatted the dfs and recreated everything from scratch. hbase(main):009:0> version Version: 0.2.1, r691710, Wed Sep 3 11:50:24 PDT 2008 I occas

Re: Unknown Scanner Exception

2008-08-13 Thread Dru Jensen
Excellent information. Thank You! On Aug 13, 2008, at 6:22 AM, Andrew Purtell wrote: From: Dru Jensen <[EMAIL PROTECTED]> I am putting data in the same row and column family as I am scanning. According to St.Ack's response, I need to put the data in a separate column family. I

Re: Unknown Scanner Exception

2008-08-12 Thread Dru Jensen
Hi Andy and St.Ack, I'd be interested to hear if logging turns up anything. Table commits have sub-second response times. It looks like crawling is causing the slowness. Inside the map task definitely. Job failure at the map stage would force you to redo anything that might be in the col

Re: Unknown Scanner Exception

2008-08-12 Thread Dru Jensen
ong. Might make for a useful design note. I'd be curious to know more details about what you are trying to accomplish if you are willing to share them... - Andy From: Dru Jensen <[EMAIL PROTECTED]> Subject: Re: Unknown Scanner Exception To: hbase-user@hadoop.apache.org Date: Tuesday,

Re: Unknown Scanner Exception

2008-08-12 Thread Dru Jensen
ult is 30 seconds but I recommend 60 seconds for MR jobs. See if this works for you. J-D On Mon, Aug 11, 2008 at 6:34 PM, Dru Jensen <[EMAIL PROTECTED]> wrote: Hi J-D, I watched as one of the map tasks completed successfully. Another one was launched as a child of the

Re: Unknown Scanner Exception

2008-08-11 Thread Dru Jensen
, Jean-Daniel Cryans wrote: Dru, Using another table, but we're still not so sure that it does something bad apart from restarting the Map task (see the discussion between sebastien and stack in 810). J-D On Mon, Aug 11, 2008 at 3:58 PM, Dru Jensen <[EMAIL PROTECTED]> wrote:

Re: Unknown Scanner Exception

2008-08-11 Thread Dru Jensen
: Dru, Apart from doing a 'tail' on all your region server logs, watching your number of regions in the web UI. How many regions do you have currently in your table? J-D On Mon, Aug 11, 2008 at 2:52 PM, Dru Jensen <[EMAIL PROTECTED]> wrote: Hi J-D, I am writing to the

Re: Unknown Scanner Exception

2008-08-11 Thread Dru Jensen
on is the same one as in https://issues.apache.org/jira/browse/HBASE-810 Thx, J-D On Mon, Aug 11, 2008 at 2:35 PM, Dru Jensen <[EMAIL PROTECTED]> wrote: What causes the following error? It's coming from the TableInputFormatBase class. It seems to be fairly random for me. From w

Unknown Scanner Exception

2008-08-11 Thread Dru Jensen
What causes the following error? It's coming from the TableInputFormatBase class. It seems to be fairly random for me. From what I've read so far, it's related to a timeout error? I verified my map task doesn't take long to process. Anything else that may cause this? org.apache.hadoop

Re: HBase MapReduce: wrong HStoreKey's row value passed into the map method

2008-08-04 Thread Dru Jensen
I saw this behavior using the "Text" object in 0.1.3. When I replaced it with "HStoreKey", it fixed my problem. I haven't tried "Text" in 0.2.0. On Aug 4, 2008, at 1:52 PM, stack wrote: That's odd Ruslan. Were you able to figure out what's going on? Any chance of your upgrading to 0.1.3,

Re: newbie - map reduce not distributing

2008-08-04 Thread Dru Jensen
J-D, Andy, St.Ack, The patch fixed my problem. Thanks again for your help. Dru On Aug 4, 2008, at 9:32 AM, Dru Jensen wrote: Thanks Andrew. I will test the patch and verify everything is working. On Aug 3, 2008, at 11:57 PM, Andrew Purtell wrote: Opened HBASE-793. https

Re: Question about how queries are distributed

2008-08-04 Thread Dru Jensen
What about launching a map/reduce task from the thrift API? For example, launch RowCount and either get the results sync in the response or async from a results table. Would this be a good way of handling load distribution of queries? On Aug 4, 2008, at 9:50 AM, Andrew Purtell wrote: Someth

Re: newbie - map reduce not distributing

2008-08-04 Thread Dru Jensen
persistent. St.Ack Dru Jensen wrote: J-D, I found what is causing the same rows to be sent to multiple map tasks. If you have the same column family name in other tables, the Test will send the same rows to multiple map reducers. I'm attaching the DEBUG logs and the test clas

Re: newbie - map reduce not distributing

2008-08-04 Thread Dru Jensen
hbase-user@hadoop.apache.org Date: Saturday, August 2, 2008, 1:08 PM Thank you for persevering Dru. Indeed, a bug in getStartKeys will make us process all tables that have a column family name in common. [...] Dru Jensen wrote: I found what is causing the same rows to be sent to multiple map

Re: newbie - map reduce not distributing

2008-07-31 Thread Dru Jensen
UPDATE: I modified the RowCounter example and verified that it is sending the same row to multiple map tasks also. Is this a known bug or am I doing something truly as(s)inine? Any help is appreciated. On Jul 30, 2008, at 3:02 PM, Dru Jensen wrote: J-D, Again, thank you for your help on

Re: newbie - map reduce not distributing

2008-07-30 Thread Dru Jensen
sk is seeing the same rows. Any help to prevent this is appreciated. Thanks, Dru On Jul 30, 2008, at 2:22 PM, Jean-Daniel Cryans wrote: Dru, It is not supposed to process many times the same rows. Can I see the log you're talking about? Also, how many regions do you have in

Re: newbie - map reduce not distributing

2008-07-30 Thread Dru Jensen
running only 1 mapper? thanks, Dru On Jul 30, 2008, at 1:44 PM, Jean-Daniel Cryans wrote: Dru, The regions will split when achieving a certain threshold so if you want your computing to be distributed, you will have to have more data. Regards, J-D On Wed, Jul 30, 2008 at 4:36 PM, Dru

newbie - map reduce not distributing

2008-07-30 Thread Dru Jensen
Hello, I created a map/reduce process by extending the TableMap and TableReduce API but for some reason, when I run multiple mappers, the logs show that the same rows are being processed by each Mapper. When I say logs, I mean in the hadoop task tracker (localhost:50030) and dril