Re: How to use DBInputFormat?

2009-02-21 Thread Amandeep Khurana
Thanks Brian. Sorry about getting back so late. Your input made it work.
Now, I can pull data out of Oracle as well.

Thanks
Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Thu, Feb 12, 2009 at 9:05 AM, Brian MacKay
wrote:

> Amandeep,
>
> I spoke w/ one of our Oracle DBAs, and he suggested changing the query
> statement as follows:
>
> MySql Stmt:
> select * from <table> limit <splitlength> offset <splitstart>
> ---
> Oracle Stmt:
> select *
>   from (select a.*, rownum rno
>           from (your_query_here must contain order by) a
>          where rownum <= splitstart + splitlength)
>  where rno >= splitstart;
>
> This can be put into a function, but would require a type as well.
> -
>
> If you edit getSelectQuery() in org.apache.hadoop.mapred.lib.db.DBInputFormat,
> it should work with Oracle:
>
> protected String getSelectQuery() {
>
>   ... edit to include check for driver and create Oracle Stmt
>
>   return query.toString();
> }
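>
> Something along these lines, as a rough sketch (the names baseQuery, job,
> and split are assumptions about what is in scope in the 0.19 source, so
> treat this as an outline rather than a drop-in patch):
>
> protected String getSelectQuery() {
>   // baseQuery = the plain "SELECT ... FROM ... ORDER BY ..." statement
>   // the existing code already builds; split = this task's input split.
>   StringBuilder query = new StringBuilder();
>   String driver = job.get(DBConfiguration.DRIVER_CLASS_PROPERTY, "");
>   if (driver.toLowerCase().contains("oracle")) {
>     // Oracle has no LIMIT/OFFSET; page through ROWNUM instead.
>     query.append("SELECT * FROM (SELECT a.*, ROWNUM rno FROM (")
>          .append(baseQuery)
>          .append(") a WHERE ROWNUM <= ")
>          .append(split.getStart() + split.getLength())
>          .append(") WHERE rno >= ").append(split.getStart());
>   } else {
>     query.append(baseQuery)
>          .append(" LIMIT ").append(split.getLength())
>          .append(" OFFSET ").append(split.getStart());
>   }
>   return query.toString();
> }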
>
>
> Brian
>
> ==
> >> On Feb 5, 2009, at 11:37 AM, Stefan Podkowinski wrote:
> >>
> >>
> >>> The 0.19 DBInputFormat class implementation is IMHO only suitable for
> >>> very simple queries working on only a few data sets. That's due to the
> >>> fact that it tries to create splits from the query by
> >>> 1) getting a count of all rows using the specified count query (huge
> >>> performance impact on large tables)
> >>> 2) creating splits by issuing an individual query for each split with
> >>> a "limit" and "offset" parameter appended to the input sql query
> >>>
> >>> Effectively your input query "select * from orders" would become
> >>> "select * from orders limit <split length> offset <split start>" and be
> >>> executed until count has been reached. I guess this is not valid sql
> >>> syntax for oracle.
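> >>>
> >>> To make the mechanics concrete, the splitting amounts to roughly this
> >>> (an illustrative sketch, not the actual 0.19 source; the count helper
> >>> is hypothetical):
> >>>
> >>> long count = runCountQuery();          // one COUNT query up front
> >>> long chunk = count / numSplits;
> >>> for (int i = 0; i < numSplits; i++) {
> >>>   long start = i * chunk;
> >>>   long length = (i == numSplits - 1) ? count - start : chunk;
> >>>   // this split's query: inputQuery + " LIMIT " + length
> >>>   //                                + " OFFSET " + start
> >>> }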
> >>>
> >>> Stefan
> >>>
> >>>
> >>> 2009/2/4 Amandeep Khurana :
> >>>
>  Adding a semicolon gives me the error "ORA-00911: Invalid character"
> 
>  Amandeep
> 
> 
>  Amandeep Khurana
>  Computer Science Graduate Student
>  University of California, Santa Cruz
> 
> 
>  On Wed, Feb 4, 2009 at 6:46 AM, Rasit OZDAS 
> wrote:
> 
> 
> > Amandeep,
> > "SQL command not properly ended"
> > I get this error whenever I forget the semicolon at the end.
> > I know, it doesn't make sense, but I recommend giving it a try
> >
> > Rasit
> >
> 2009/2/4 Amandeep Khurana :
>
> >> The same query is working if I write a simple JDBC client and query the
> >> database. So, I'm probably doing something wrong in the connection
> >> settings. But the error looks to be on the query side more than the
> >> connection side.
> >>
> >> Amandeep
> >>
> >>
> >> Amandeep Khurana
> >> Computer Science Graduate Student
> >> University of California, Santa Cruz
> >>
> >>
> >> On Tue, Feb 3, 2009 at 7:25 PM, Amandeep Khurana 
> >> wrote:
> >>
> >>> Thanks Kevin
> >>>
> >>> I couldn't get it to work. Here's the error I get:
> >>>
> >>> bin/hadoop jar ~/dbload.jar LoadTable1
> >>> 09/02/03 19:21:17 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> >>> processName=JobTracker, sessionId=
> >>> 09/02/03 19:21:20 INFO mapred.JobClient: Running job: job_local_0001
> >>> 09/02/03 19:21:21 INFO mapred.JobClient:  map 0% reduce 0%
> >>> 09/02/03 19:21:22 INFO mapred.MapTask: numReduceTasks: 0
> >>> 09/02/03 19:21:24 WARN mapred.LocalJobRunner: job_local_0001
> >>> java.io.IOException: ORA-00933: SQL command not properly ended
> >>>   at
> >>> org.apache.hadoop.mapred.lib.db.DBInputFormat.getRecordReader(DBInputFormat.java:289)
> >>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:321)
> >>>   at
> >>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
> >>> java.io.IOException: Job failed!
> >>>   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
> >>>   at LoadTable1.run(LoadTable1.java:130)
> >>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>>   at LoadTable1.main(LoadTable1.java:107)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> >>>   at java.lang.reflect.Method.invoke(Unknown Source)
> >>>   at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
> >>>   at org.apache.hadoop.ma

Re: Super-long reduce task timeouts in hadoop-0.19.0

2009-02-21 Thread Devaraj Das
Bryan, the message

2009-02-19 22:48:19,380 INFO org.apache.hadoop.mapred.TaskTracker:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
taskTracker/jobcache/job_200902061117_3388/
attempt_200902061117_3388_r_66_0/output/file.out in any of the
configured local directories

is spurious. That had been reported in 
https://issues.apache.org/jira/browse/HADOOP-4963 and the fix is there in the 
trunk. I guess I should commit that fix to the 0.20 and 0.19 branches too. 
Meanwhile, please apply the patch on your repository if you can.
Regarding the tasks timing out, do you know whether the reduce tasks were in 
the shuffle phase or the reducer phase? That you can deduce by looking at the 
task web UI for the failed tasks, or, the task logs.
Also, does your reduce method ensure that progress reports are sent every so 
often? By default, a progress report is sent for every record-group the reducer 
method is invoked with and for every record the reducer emits. If the timeout 
is not happening in the shuffle, then the problematic part is the reduce method 
itself: the timeout could be happening because a lot of time is spent processing 
a particular record-group, or because the write of the output record to HDFS is 
taking a long time.
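
If the reduce method is the culprit, you can call Reporter.progress() yourself 
from inside the long-running loop. A minimal sketch against the 0.19 mapred API 
(the Text types and process() are placeholders for your job's actual types and 
work):

public void reduce(Text key, Iterator<Text> values,
                   OutputCollector<Text, Text> output, Reporter reporter)
    throws IOException {
  while (values.hasNext()) {
    Text value = values.next();
    process(value);        // stand-in for the expensive per-record work
    reporter.progress();   // tells the tasktracker the task is still alive
  }
}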


On 2/21/09 5:28 AM, "Bryan Duxbury"  wrote:

(Repost from the dev list)

I noticed some really odd behavior today while reviewing the job
history of some of our jobs. Our Ganglia graphs showed really long
periods of inactivity across the entire cluster, which should
definitely not be the case - we have a really long string of jobs in
our workflow that should execute one after another. I figured out
which jobs were running during those periods of inactivity, and
discovered that almost all of them had 4-5 failed reduce tasks, with
the reason for failure being something like:

Task attempt_200902061117_3382_r_38_0 failed to report status for
1282 seconds. Killing!

The actual timeout reported varies from 700-5000 seconds. Virtually
all of our longer-running jobs were affected by this problem. The
period of inactivity on the cluster seems to correspond to the amount
of time the job waited for these reduce tasks to fail.

I checked out the tasktracker log for the machines with timed-out
reduce tasks looking for something that might explain the problem,
but the only thing I came up with that actually referenced the failed
task was this log message, which was repeated many times:

2009-02-19 22:48:19,380 INFO org.apache.hadoop.mapred.TaskTracker:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
taskTracker/jobcache/job_200902061117_3388/
attempt_200902061117_3388_r_66_0/output/file.out in any of the
configured local directories

I'm not sure what this means; can anyone shed some light on this
message?

Further confusing the issue, on the affected machines, I looked in
logs/userlogs/, and to my surprise, the directory and log
files existed, and the syslog file seemed to contain logs of a
perfectly good reduce task!

Overall, this seems like a pretty critical bug. It's consuming up to
50% of the runtime of our jobs in some instances, killing our
throughput. At the very least, it seems like the reduce task timeout
period should be MUCH shorter than the current 10-20 minutes.

-Bryan



Re: Hadoop build error

2009-02-21 Thread Matei Zaharia
Forrest is used just for building documentation, by the way. If you want to
compile the Hadoop core jars you can do ant jar and it won't require
Forrest.

On Fri, Feb 20, 2009 at 10:41 PM, Abdul Qadeer wrote:

> >
> >
> >
> > java5.check:
> >
> > BUILD FAILED
> > /home/raghu/src-hadoop/trunk/build.xml:890: 'java5.home' is not defined.
> >  Forrest requires Java5.  Please pass -Djava5.home=<base of Java 5
> > distribution> to Ant on the command-line.
>
>
>
> I think the error is self-explanatory.  Forrest needs JDK 1.5, and you can
> pass it using the -Djava5.home argument.
> Maybe something like the following:
>
> ant -Djavac.args="-Xlint  -Xmaxwarns 1000"  -Djava5.home={base of Java 5
> distribution} tar
>


Re: Connection problem during data import into hbase

2009-02-21 Thread Amandeep Khurana
Here's what's happening in the logs...

I get these messages pretty often:
2009-02-21 10:47:27,252 INFO org.apache.hadoop.hdfs.DFSClient: Could not
complete file
/hbase/in_table/compaction.dir/29712919/b2b/mapfiles/6353513045069085254/data
retrying...


Sometimes I get these too:
2009-02-21 10:48:46,273 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Blocking updates for 'IPC Server handler 5 on 60020' on region
in_table,,1235241411727: Memcache size 128.0m is >= than blocking 128.0m
size

Here's what it logs when the job starts to fail:
2009-02-21 10:50:52,510 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
Exception: java.net.SocketTimeoutException: 5000 millis timeout while
waiting for channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/171.69.102.51:8270 remote=/
171.69.102.51:50010]
at
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162)
at
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
at
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
at java.io.BufferedOutputStream.write(Unknown Source)
at java.io.DataOutputStream.write(Unknown Source)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2209)

2009-02-21 10:50:52,511 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Finished memcache flush of ~64.0m for region in_table,,1235241411727 in
5144ms, sequence id=30842181, compaction requested=true
2009-02-21 10:50:52,511 DEBUG
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
requested for region in_table,,1235241411727/29712919 because:
regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher
2009-02-21 10:50:52,512 WARN org.apache.hadoop.hdfs.DFSClient: Error
Recovery for block blk_-2896903198415069285_18306 bad datanode[0]
171.69.102.51:50010
2009-02-21 10:50:52,513 FATAL
org.apache.hadoop.hbase.regionserver.LogRoller: Log rolling failed with ioe:
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,513 FATAL org.apache.hadoop.hbase.regionserver.HLog:
Could not append. Requesting close of log
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,515 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: java.io.IOException: All
datanodes 171.69.102.51:50010 are bad. Aborting...
2009-02-21 10:50:52,515 FATAL org.apache.hadoop.hbase.regionserver.HLog:
Could not append. Requesting close of log
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,515 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
request=11, regions=2, stores=33, storefiles=167, storefileIndexSize=0,
memcacheSize=1, usedHeap=156, maxHeap=963
2009-02-21 10:50:52,515 INFO org.apache.hadoop.hbase.regionserver.LogRoller:
LogRoller exiting.
2009-02-21 10:50:52,516 FATAL org.apache.hadoop.hbase.regionserver.HLog:
Could not append. Requesting close of log
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,516 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: java.io.IOException: All
datanodes 171.69.102.51:50010 are bad. Aborting...
2009-02-21 10:50:52,516 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: java.io.IOException: All
datanodes 171.69.102.51:50010 are bad. Aborting...
2009-02-21 10:50:52,516 FATAL org.apache.hadoop.hbase.regionserver.HLog:
Could not append. Requesting close of log
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
at
org.a

Re: Super-long reduce task timeouts in hadoop-0.19.0

2009-02-21 Thread Rasit OZDAS
I agree about the timeout period, Bryan.
Reporter has a progress() method to tell the framework that the task is still
working, so there's no need to kill the job.


2009/2/21 Bryan Duxbury 

> We didn't customize this value, to my knowledge, so I'd suspect it's the
> default.
> -Bryan
>
>
> On Feb 20, 2009, at 5:00 PM, Ted Dunning wrote:
>
>  How often do your reduce tasks report status?
>>
>> On Fri, Feb 20, 2009 at 3:58 PM, Bryan Duxbury  wrote:
>>
>>  (Repost from the dev list)
>>>
>>>
>>> I noticed some really odd behavior today while reviewing the job history
>>> of
>>> some of our jobs. Our Ganglia graphs showed really long periods of
>>> inactivity across the entire cluster, which should definitely not be the
>>> case - we have a really long string of jobs in our workflow that should
>>> execute one after another. I figured out which jobs were running during
>>> those periods of inactivity, and discovered that almost all of them had
>>> 4-5
>>> failed reduce tasks, with the reason for failure being something like:
>>>
>>> Task attempt_200902061117_3382_r_38_0 failed to report status for
>>> 1282
>>> seconds. Killing!
>>>
>>> The actual timeout reported varies from 700-5000 seconds. Virtually all
>>> of
>>> our longer-running jobs were affected by this problem. The period of
>>> inactivity on the cluster seems to correspond to the amount of time the
>>> job
>>> waited for these reduce tasks to fail.
>>>
>>> I checked out the tasktracker log for the machines with timed-out reduce
>>> tasks looking for something that might explain the problem, but the only
>>> thing I came up with that actually referenced the failed task was this
>>> log
>>> message, which was repeated many times:
>>>
>>> 2009-02-19 22:48:19,380 INFO org.apache.hadoop.mapred.TaskTracker:
>>> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
>>>
>>> taskTracker/jobcache/job_200902061117_3388/attempt_200902061117_3388_r_66_0/output/file.out
>>> in any of the configured local directories
>>>
>>> I'm not sure what this means; can anyone shed some light on this message?
>>>
>>> Further confusing the issue, on the affected machines, I looked in
>>> logs/userlogs/, and to my surprise, the directory and log files
>>> existed, and the syslog file seemed to contain logs of a perfectly good
>>> reduce task!
>>>
>>> Overall, this seems like a pretty critical bug. It's consuming up to 50%
>>> of
>>> the runtime of our jobs in some instances, killing our throughput. At the
>>> very least, it seems like the reduce task timeout period should be MUCH
>>> shorter than the current 10-20 minutes.
>>>
>>> -Bryan
>>>
>>>
>>
>>
>> --
>> Ted Dunning, CTO
>> DeepDyve
>>
>> 111 West Evelyn Ave. Ste. 202
>> Sunnyvale, CA 94086
>> www.deepdyve.com
>> 408-773-0110 ext. 738
>> 858-414-0013 (m)
>> 408-773-0220 (fax)
>>
>
>


-- 
M. Raşit ÖZDAŞ


Re: Connection problem during data import into hbase

2009-02-21 Thread Amandeep Khurana
Yes, I noticed it this time. The regionserver gets slow or stops responding,
and then this error comes. How do I get this to work? Is there a way of
limiting the resources that the MapReduce job takes?

I did make changes to the site config similar to Larry Compton's config.
It only made the job go from dying at 7% to dying at 12% this time.
Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Sat, Feb 21, 2009 at 1:14 AM, stack  wrote:

> It looks like regionserver hosting root crashed:
>
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region
>
> How many servers you running?
>
> You made similar config. to that reported by Larry Compton in a mail from
> earlier today?  (See FAQ and Troubleshooting page for more on his listed
> configs.)
>
> St.Ack
>
>
> On Sat, Feb 21, 2009 at 1:01 AM, Amandeep Khurana 
> wrote:
>
> > Yes, the table exists before I start the job.
> >
> > I am not using TableOutputFormat. I picked up the sample code from the
> docs
> > and am using it.
> >
> > Here's the job conf:
> >
> > JobConf conf = new JobConf(getConf(), IN_TABLE_IMPORT.class);
> >FileInputFormat.setInputPaths(conf, new Path("import_data"));
> >conf.setMapperClass(MapClass.class);
> >conf.setNumReduceTasks(0);
> >conf.setOutputFormat(NullOutputFormat.class);
> >JobClient.runJob(conf);
> >
> > Interestingly, the hbase shell isnt working now either. Its giving errors
> > even when I give the command "list"...
> >
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> >
> > On Sat, Feb 21, 2009 at 12:10 AM, stack  wrote:
> >
> > > The table exists before you start the MR job?
> > >
> > > When you say 'midway through the job', are you using tableoutputformat
> to
> > > insert into your table?
> > >
> > > Which version of hbase?
> > >
> > > St.Ack
> > >
> > > On Fri, Feb 20, 2009 at 9:55 PM, Amandeep Khurana 
> > > wrote:
> > >
> > > > I dont know if this is related or not, but it seems to be. After this
> > map
> > > > reduce job, I tried to count the number of entries in the table in
> > hbase
> > > > through the shell. It failed with the following error:
> > > >
> > > > hbase(main):002:0> count 'in_table'
> > > > NativeException: java.lang.NullPointerException: null
> > > >from java.lang.String:-1:in `'
> > > >from org/apache/hadoop/hbase/util/Bytes.java:92:in `toString'
> > > >from
> > > org/apache/hadoop/hbase/client/RetriesExhaustedException.java:50:in
> > > > `getMessage'
> > > >from
> > > org/apache/hadoop/hbase/client/RetriesExhaustedException.java:40:in
> > > > `'
> > > >from org/apache/hadoop/hbase/client/HConnectionManager.java:841:in
> > > > `getRegionServerWithRetries'
> > > >from org/apache/hadoop/hbase/client/MetaScanner.java:56:in
> > `metaScan'
> > > >from org/apache/hadoop/hbase/client/MetaScanner.java:30:in
> > `metaScan'
> > > >from org/apache/hadoop/hbase/client/HConnectionManager.java:411:in
> > > > `getHTableDescriptor'
> > > >from org/apache/hadoop/hbase/client/HTable.java:219:in
> > > > `getTableDescriptor'
> > > >from sun.reflect.NativeMethodAccessorImpl:-2:in `invoke0'
> > > >from sun.reflect.NativeMethodAccessorImpl:-1:in `invoke'
> > > >from sun.reflect.DelegatingMethodAccessorImpl:-1:in `invoke'
> > > >from java.lang.reflect.Method:-1:in `invoke'
> > > >from org/jruby/javasupport/JavaMethod.java:250:in
> > > > `invokeWithExceptionHandling'
> > > >from org/jruby/javasupport/JavaMethod.java:219:in `invoke'
> > > >from org/jruby/javasupport/JavaClass.java:416:in `execute'
> > > > ... 145 levels...
> > > >from org/jruby/internal/runtime/methods/DynamicMethod.java:74:in
> > > `call'
> > > >from org/jruby/internal/runtime/methods/CompiledMethod.java:48:in
> > > `call'
> > > >from org/jruby/runtime/CallSite.java:123:in `cacheAndCall'
> > > >from org/jruby/runtime/CallSite.java:298:in `call'
> > > >from
> > > >
> > > >
> > >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:429:in
> > > > `__file__'
> > > >from
> > > >
> > > >
> > >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> > > > `__file__'
> > > >from
> > > >
> > > >
> > >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> > > > `load'
> > > >from org/jruby/Ruby.java:512:in `runScript'
> > > >from org/jruby/Ruby.java:432:in `runNormally'
> > > >from org/jruby/Ruby.java:312:in `runFromMain'
> > > >from org/jruby/Main.java:144:in `run'
> > > >from org/jruby/Main.java:89:in `run'
> > > >from org/jruby/Main.java:80:in `main'
> > > >from /hadoop/install/hbase/bin/../bin/HBase.rb:444:in `count'
> > > >from /hadoop/install/hbase/bin/../bin/hirb.rb:348:in `count'
> > > >from 

Re: Connection problem during data import into hbase

2009-02-21 Thread Amandeep Khurana
Here's another thing that's happening. I was trying to truncate the table.

hbase(main):001:0> truncate 'in_table'
Truncating in_table; it may take a while
Disabling table...
NativeException: org.apache.hadoop.hbase.RegionException: Retries exhausted,
it took too long to wait for the table in_table to be disabled.
from org/apache/hadoop/hbase/client/HBaseAdmin.java:387:in
`disableTable'
from org/apache/hadoop/hbase/client/HBaseAdmin.java:348:in
`disableTable'
from sun.reflect.NativeMethodAccessorImpl:-2:in `invoke0'
from sun.reflect.NativeMethodAccessorImpl:-1:in `invoke'
from sun.reflect.DelegatingMethodAccessorImpl:-1:in `invoke'
from java.lang.reflect.Method:-1:in `invoke'
from org/jruby/javasupport/JavaMethod.java:250:in
`invokeWithExceptionHandling'
from org/jruby/javasupport/JavaMethod.java:219:in `invoke'
from org/jruby/javasupport/JavaClass.java:416:in `execute'
from org/jruby/internal/runtime/methods/SimpleCallbackMethod.java:67:in
`call'
from org/jruby/internal/runtime/methods/DynamicMethod.java:78:in `call'
from org/jruby/runtime/CallSite.java:155:in `cacheAndCall'
from org/jruby/runtime/CallSite.java:332:in `call'
from org/jruby/evaluator/ASTInterpreter.java:649:in `callNode'
from org/jruby/evaluator/ASTInterpreter.java:324:in `evalInternal'

I left it for a few minutes and tried again. It worked. There was no load on
the cluster at all. I changed the config (both) and added the
dfs.datanode.socket.write.timeout property with value 0. I also defined the
property in the job config.
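
For reference, defining it in the job config amounts to something like this
(a sketch; the value 0 disables the DFS client's socket write timeout):

JobConf conf = new JobConf(getConf(), IN_TABLE_IMPORT.class);
// 0 = never time out on socket writes to the datanode
conf.setInt("dfs.datanode.socket.write.timeout", 0);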

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Sat, Feb 21, 2009 at 1:23 AM, Amandeep Khurana  wrote:

> I have 1 master + 2 slaves.
> Am using 0.19.0 for both Hadoop and Hbase.
> I didnt change any config from the default except the hbase.rootdir and the
> hbase.master.
>
> I have gone through the FAQs but couldnt find anything. What exactly are
> you pointing to?
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Sat, Feb 21, 2009 at 1:14 AM, stack  wrote:
>
>> It looks like regionserver hosting root crashed:
>>
>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
>> trying
>> to locate root region
>>
>> How many servers you running?
>>
>> You made similar config. to that reported by Larry Compton in a mail from
>> earlier today?  (See FAQ and Troubleshooting page for more on his listed
>> configs.)
>>
>> St.Ack
>>
>>
>> On Sat, Feb 21, 2009 at 1:01 AM, Amandeep Khurana 
>> wrote:
>>
>> > Yes, the table exists before I start the job.
>> >
>> > I am not using TableOutputFormat. I picked up the sample code from the
>> docs
>> > and am using it.
>> >
>> > Here's the job conf:
>> >
>> > JobConf conf = new JobConf(getConf(), IN_TABLE_IMPORT.class);
>> >FileInputFormat.setInputPaths(conf, new Path("import_data"));
>> >conf.setMapperClass(MapClass.class);
>> >conf.setNumReduceTasks(0);
>> >conf.setOutputFormat(NullOutputFormat.class);
>> >JobClient.runJob(conf);
>> >
>> > Interestingly, the hbase shell isnt working now either. Its giving
>> errors
>> > even when I give the command "list"...
>> >
>> >
>> >
>> > Amandeep Khurana
>> > Computer Science Graduate Student
>> > University of California, Santa Cruz
>> >
>> >
>> > On Sat, Feb 21, 2009 at 12:10 AM, stack  wrote:
>> >
>> > > The table exists before you start the MR job?
>> > >
>> > > When you say 'midway through the job', are you using tableoutputformat
>> to
>> > > insert into your table?
>> > >
>> > > Which version of hbase?
>> > >
>> > > St.Ack
>> > >
>> > > On Fri, Feb 20, 2009 at 9:55 PM, Amandeep Khurana 
>> > > wrote:
>> > >
>> > > > I dont know if this is related or not, but it seems to be. After
>> this
>> > map
>> > > > reduce job, I tried to count the number of entries in the table in
>> > hbase
>> > > > through the shell. It failed with the following error:
>> > > >
>> > > > hbase(main):002:0> count 'in_table'
>> > > > NativeException: java.lang.NullPointerException: null
>> > > >from java.lang.String:-1:in `'
>> > > >from org/apache/hadoop/hbase/util/Bytes.java:92:in `toString'
>> > > >from
>> > > org/apache/hadoop/hbase/client/RetriesExhaustedException.java:50:in
>> > > > `getMessage'
>> > > >from
>> > > org/apache/hadoop/hbase/client/RetriesExhaustedException.java:40:in
>> > > > `'
>> > > >from
>> org/apache/hadoop/hbase/client/HConnectionManager.java:841:in
>> > > > `getRegionServerWithRetries'
>> > > >from org/apache/hadoop/hbase/client/MetaScanner.java:56:in
>> > `metaScan'
>> > > >from org/apache/hadoop/hbase/client/MetaScanner.java:30:in
>> > `metaScan'
>> > > >from
>> org/apache/hadoop/hbase/client/HConnectionManager.java:411:in
>> > > > `getHTableDescriptor'
>> > > >from org/apache/hadoop/hbase/client/HTable.java:219:in
>> > > > `getTableDescriptor'
>> > > >from sun.reflect.NativeMetho

Re: Connection problem during data import into hbase

2009-02-21 Thread Amandeep Khurana
I have 1 master + 2 slaves.
I'm using 0.19.0 for both Hadoop and HBase.
I didn't change any config from the default except the hbase.rootdir and the
hbase.master.

I have gone through the FAQs but couldn't find anything. What exactly are you
pointing to?


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Sat, Feb 21, 2009 at 1:14 AM, stack  wrote:

> It looks like regionserver hosting root crashed:
>
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region
>
> How many servers you running?
>
> You made similar config. to that reported by Larry Compton in a mail from
> earlier today?  (See FAQ and Troubleshooting page for more on his listed
> configs.)
>
> St.Ack
>
>
> On Sat, Feb 21, 2009 at 1:01 AM, Amandeep Khurana 
> wrote:
>
> > Yes, the table exists before I start the job.
> >
> > I am not using TableOutputFormat. I picked up the sample code from the
> docs
> > and am using it.
> >
> > Here's the job conf:
> >
> > JobConf conf = new JobConf(getConf(), IN_TABLE_IMPORT.class);
> >FileInputFormat.setInputPaths(conf, new Path("import_data"));
> >conf.setMapperClass(MapClass.class);
> >conf.setNumReduceTasks(0);
> >conf.setOutputFormat(NullOutputFormat.class);
> >JobClient.runJob(conf);
> >
> > Interestingly, the hbase shell isnt working now either. Its giving errors
> > even when I give the command "list"...
> >
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> >
> > On Sat, Feb 21, 2009 at 12:10 AM, stack  wrote:
> >
> > > The table exists before you start the MR job?
> > >
> > > When you say 'midway through the job', are you using tableoutputformat
> to
> > > insert into your table?
> > >
> > > Which version of hbase?
> > >
> > > St.Ack
> > >
> > > On Fri, Feb 20, 2009 at 9:55 PM, Amandeep Khurana 
> > > wrote:
> > >
> > > > I dont know if this is related or not, but it seems to be. After this
> > map
> > > > reduce job, I tried to count the number of entries in the table in
> > hbase
> > > > through the shell. It failed with the following error:
> > > >
> > > > hbase(main):002:0> count 'in_table'
> > > > NativeException: java.lang.NullPointerException: null
> > > >from java.lang.String:-1:in `'
> > > >from org/apache/hadoop/hbase/util/Bytes.java:92:in `toString'
> > > >from
> > > org/apache/hadoop/hbase/client/RetriesExhaustedException.java:50:in
> > > > `getMessage'
> > > >from
> > > org/apache/hadoop/hbase/client/RetriesExhaustedException.java:40:in
> > > > `'
> > > >from org/apache/hadoop/hbase/client/HConnectionManager.java:841:in
> > > > `getRegionServerWithRetries'
> > > >from org/apache/hadoop/hbase/client/MetaScanner.java:56:in
> > `metaScan'
> > > >from org/apache/hadoop/hbase/client/MetaScanner.java:30:in
> > `metaScan'
> > > >from org/apache/hadoop/hbase/client/HConnectionManager.java:411:in
> > > > `getHTableDescriptor'
> > > >from org/apache/hadoop/hbase/client/HTable.java:219:in
> > > > `getTableDescriptor'
> > > >from sun.reflect.NativeMethodAccessorImpl:-2:in `invoke0'
> > > >from sun.reflect.NativeMethodAccessorImpl:-1:in `invoke'
> > > >from sun.reflect.DelegatingMethodAccessorImpl:-1:in `invoke'
> > > >from java.lang.reflect.Method:-1:in `invoke'
> > > >from org/jruby/javasupport/JavaMethod.java:250:in
> > > > `invokeWithExceptionHandling'
> > > >from org/jruby/javasupport/JavaMethod.java:219:in `invoke'
> > > >from org/jruby/javasupport/JavaClass.java:416:in `execute'
> > > > ... 145 levels...
> > > >from org/jruby/internal/runtime/methods/DynamicMethod.java:74:in
> > > `call'
> > > >from org/jruby/internal/runtime/methods/CompiledMethod.java:48:in
> > > `call'
> > > >from org/jruby/runtime/CallSite.java:123:in `cacheAndCall'
> > > >from org/jruby/runtime/CallSite.java:298:in `call'
> > > >from
> > > >
> > > >
> > >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:429:in
> > > > `__file__'
> > > >from
> > > >
> > > >
> > >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> > > > `__file__'
> > > >from
> > > >
> > > >
> > >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> > > > `load'
> > > >from org/jruby/Ruby.java:512:in `runScript'
> > > >from org/jruby/Ruby.java:432:in `runNormally'
> > > >from org/jruby/Ruby.java:312:in `runFromMain'
> > > >from org/jruby/Main.java:144:in `run'
> > > >from org/jruby/Main.java:89:in `run'
> > > >from org/jruby/Main.java:80:in `main'
> > > >from /hadoop/install/hbase/bin/../bin/HBase.rb:444:in `count'
> > > >from /hadoop/install/hbase/bin/../bin/hirb.rb:348:in `count'
> > > >from (hbase):3:in `binding'
> > > >
> > > >
> > > > Amandeep Khurana
> > > > Computer Science Graduate S

Re: Connection problem during data import into hbase

2009-02-21 Thread stack
It looks like regionserver hosting root crashed:

org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
to locate root region

How many servers you running?

You made similar config. to that reported by Larry Compton in a mail from
earlier today?  (See FAQ and Troubleshooting page for more on his listed
configs.)

St.Ack


On Sat, Feb 21, 2009 at 1:01 AM, Amandeep Khurana  wrote:

> Yes, the table exists before I start the job.
>
> I am not using TableOutputFormat. I picked up the sample code from the docs
> and am using it.
>
> Here's the job conf:
>
> JobConf conf = new JobConf(getConf(), IN_TABLE_IMPORT.class);
>FileInputFormat.setInputPaths(conf, new Path("import_data"));
>conf.setMapperClass(MapClass.class);
>conf.setNumReduceTasks(0);
>conf.setOutputFormat(NullOutputFormat.class);
>JobClient.runJob(conf);
>
> Interestingly, the hbase shell isnt working now either. Its giving errors
> even when I give the command "list"...
>
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Sat, Feb 21, 2009 at 12:10 AM, stack  wrote:
>
> > The table exists before you start the MR job?
> >
> > When you say 'midway through the job', are you using tableoutputformat to
> > insert into your table?
> >
> > Which version of hbase?
> >
> > St.Ack
> >
> > On Fri, Feb 20, 2009 at 9:55 PM, Amandeep Khurana 
> > wrote:
> >
> > > I dont know if this is related or not, but it seems to be. After this
> map
> > > reduce job, I tried to count the number of entries in the table in
> hbase
> > > through the shell. It failed with the following error:
> > >
> > > hbase(main):002:0> count 'in_table'
> > > NativeException: java.lang.NullPointerException: null
> > >from java.lang.String:-1:in `'
> > >from org/apache/hadoop/hbase/util/Bytes.java:92:in `toString'
> > >from
> > org/apache/hadoop/hbase/client/RetriesExhaustedException.java:50:in
> > > `getMessage'
> > >from
> > org/apache/hadoop/hbase/client/RetriesExhaustedException.java:40:in
> > > `'
> > >from org/apache/hadoop/hbase/client/HConnectionManager.java:841:in
> > > `getRegionServerWithRetries'
> > >from org/apache/hadoop/hbase/client/MetaScanner.java:56:in
> `metaScan'
> > >from org/apache/hadoop/hbase/client/MetaScanner.java:30:in
> `metaScan'
> > >from org/apache/hadoop/hbase/client/HConnectionManager.java:411:in
> > > `getHTableDescriptor'
> > >from org/apache/hadoop/hbase/client/HTable.java:219:in
> > > `getTableDescriptor'
> > >from sun.reflect.NativeMethodAccessorImpl:-2:in `invoke0'
> > >from sun.reflect.NativeMethodAccessorImpl:-1:in `invoke'
> > >from sun.reflect.DelegatingMethodAccessorImpl:-1:in `invoke'
> > >from java.lang.reflect.Method:-1:in `invoke'
> > >from org/jruby/javasupport/JavaMethod.java:250:in
> > > `invokeWithExceptionHandling'
> > >from org/jruby/javasupport/JavaMethod.java:219:in `invoke'
> > >from org/jruby/javasupport/JavaClass.java:416:in `execute'
> > > ... 145 levels...
> > >from org/jruby/internal/runtime/methods/DynamicMethod.java:74:in
> > `call'
> > >from org/jruby/internal/runtime/methods/CompiledMethod.java:48:in
> > `call'
> > >from org/jruby/runtime/CallSite.java:123:in `cacheAndCall'
> > >from org/jruby/runtime/CallSite.java:298:in `call'
> > >from
> > >
> > >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:429:in
> > > `__file__'
> > >from
> > >
> > >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> > > `__file__'
> > >from
> > >
> > >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> > > `load'
> > >from org/jruby/Ruby.java:512:in `runScript'
> > >from org/jruby/Ruby.java:432:in `runNormally'
> > >from org/jruby/Ruby.java:312:in `runFromMain'
> > >from org/jruby/Main.java:144:in `run'
> > >from org/jruby/Main.java:89:in `run'
> > >from org/jruby/Main.java:80:in `main'
> > >from /hadoop/install/hbase/bin/../bin/HBase.rb:444:in `count'
> > >from /hadoop/install/hbase/bin/../bin/hirb.rb:348:in `count'
> > >from (hbase):3:in `binding'
> > >
> > >
> > > Amandeep Khurana
> > > Computer Science Graduate Student
> > > University of California, Santa Cruz
> > >
> > >
> > > On Fri, Feb 20, 2009 at 9:46 PM, Amandeep Khurana 
> > > wrote:
> > >
> > > > Here's what it throws on the console:
> > > >
> > > > 09/02/20 21:45:29 INFO mapred.JobClient: Task Id :
> > > > attempt_200902201300_0019_m_06_0, Status : FAILED
> > > > java.io.IOException: table is null
> > > > at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:33)
> > > > at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:1)
> > > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> > > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:33

Re: Connection problem during data import into hbase

2009-02-21 Thread Amandeep Khurana
I just did a "count 'in_table'" and it did work. So I truncated the table and
started the map-reduce job again to load the entire file. For some reason, I
had to kill the job. After that, the count command gives the following
error:


count 'in_table'

NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
Trying to contact region server 171.69.102.51:60020 for region
in_table,,1235194191883, row '', but failed after 5 attempts.
Exceptions:
org.apache.hadoop.hbase.NotServingRegionException:
org.apache.hadoop.hbase.NotServingRegionException: in_table,,1235194191883
at
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1699)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)

org.apache.hadoop.hbase.NotServingRegionException:
org.apache.hadoop.hbase.NotServingRegionException: in_table,,1235194191883
at
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1699)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)


It seems like the daemon is getting corrupted by the MapReduce job somehow...
I can't really understand what's happening.

Amandeep

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Sat, Feb 21, 2009 at 1:01 AM, Amandeep Khurana  wrote:

> Yes, the table exists before I start the job.
>
> I am not using TableOutputFormat. I picked up the sample code from the docs
> and am using it.
>
> Here's the job conf:
>
> JobConf conf = new JobConf(getConf(), IN_TABLE_IMPORT.class);
> FileInputFormat.setInputPaths(conf, new Path("import_data"));
> conf.setMapperClass(MapClass.class);
> conf.setNumReduceTasks(0);
> conf.setOutputFormat(NullOutputFormat.class);
> JobClient.runJob(conf);
>
>
> Interestingly, the hbase shell isnt working now either. Its giving errors
> even when I give the command "list"...
>
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Sat, Feb 21, 2009 at 12:10 AM, stack  wrote:
>
>> The table exists before you start the MR job?
>>
>> When you say 'midway through the job', are you using tableoutputformat to
>> insert into your table?
>>
>> Which version of hbase?
>>
>> St.Ack
>>
>> On Fri, Feb 20, 2009 at 9:55 PM, Amandeep Khurana 
>> wrote:
>>
>> > I dont know if this is related or not, but it seems to be. After this
>> map
>> > reduce job, I tried to count the number of entries in the table in hbase
>> > through the shell. It failed with the following error:
>> >
>> > hbase(main):002:0> count 'in_table'
>> > NativeException: java.lang.NullPointerException: null
>> >from java.lang.String:-1:in `'
>> >from org/apache/hadoop/hbase/util/Bytes.java:92:in `toString'
>> >from
>> org/apache/hadoop/hbase/client/RetriesExhaustedException.java:50:in
>> > `getMessage'
>> >from
>> org/apache/hadoop/hbase/client/RetriesExhaustedException.java:40:in
>> > `'
>> >from org/apache/hadoop/hbase/client/HConnectionManager.java:841:in
>> > `getRegionServerWithRetries'
>> >from org/apache/hadoop/hbase/client/MetaScanner.java:56:in `metaScan'
>> >from org/apache/hadoop/hbase/client/MetaScanner.java:30:in `metaScan'
>> >from org/apache/hadoop/hbase/client/HConnectionManager.java:411:in
>> > `getHTableDescriptor'
>> >from org/apache/hadoop/hbase/client/HTable.java:219:in
>> > `getTableDescriptor'
>> >from sun.reflect.NativeMethodAccessorImpl:-2:in `invoke0'
>> >from sun.reflect.NativeMethodAccessorImpl:-1:in `invoke'
>> >from sun.reflect.DelegatingMethodAccessorImpl:-1:in `invoke'
>> >from java.lang.reflect.Method:-1:in `invoke'
>> >from org/jruby/javasupport/JavaMethod.java:250:in
>> > `invokeWithExceptionHandling'
>> >from org/jruby/javasupport/JavaMethod.java:219:in `invoke'
>> >from org/jruby/javasupport/JavaClass.java:416:in `execute'
>> > ... 145 levels...
>> >from org/jruby/internal/runtime/methods/DynamicMethod.java:74:in
>> `call'
>> >from org/jruby/internal/runtime/methods/CompiledMethod.java:48:in
>> `call'
>> >

Re: Connection problem during data import into hbase

2009-02-21 Thread Amandeep Khurana
Yes, the table exists before I start the job.

I am not using TableOutputFormat. I picked up the sample code from the docs
and am using it.

Here's the job conf:

JobConf conf = new JobConf(getConf(), IN_TABLE_IMPORT.class);
FileInputFormat.setInputPaths(conf, new Path("import_data"));
conf.setMapperClass(MapClass.class);
conf.setNumReduceTasks(0);
conf.setOutputFormat(NullOutputFormat.class);
JobClient.runJob(conf);
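
For reference, the map class looks roughly like this (a sketch, not the
exact code; the row key logic and the b2b column are stand-ins). If the
HTable can't be created in configure(), the field stays null and map()
fails with the "table is null" error quoted below:

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class IN_TABLE_IMPORT {
  public static class MapClass extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    private HTable table;

    public void configure(JobConf job) {
      try {
        // Fails (leaving table null) if e.g. the root region can't be found.
        table = new HTable(new HBaseConfiguration(job), "in_table");
      } catch (IOException e) {
        table = null;
      }
    }

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, Text> output, Reporter reporter)
        throws IOException {
      if (table == null) {
        throw new IOException("table is null");
      }
      // Row key and column family:qualifier are illustrative only.
      BatchUpdate update = new BatchUpdate(value.toString().split("\t")[0]);
      update.put("b2b:data", value.getBytes());
      table.commit(update);
    }
  }
}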

Interestingly, the hbase shell isn't working now either. It's giving errors
even when I give the command "list"...



Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Sat, Feb 21, 2009 at 12:10 AM, stack  wrote:

> The table exists before you start the MR job?
>
> When you say 'midway through the job', are you using tableoutputformat to
> insert into your table?
>
> Which version of hbase?
>
> St.Ack
>
> On Fri, Feb 20, 2009 at 9:55 PM, Amandeep Khurana 
> wrote:
>
> > I dont know if this is related or not, but it seems to be. After this map
> > reduce job, I tried to count the number of entries in the table in hbase
> > through the shell. It failed with the following error:
> >
> > hbase(main):002:0> count 'in_table'
> > NativeException: java.lang.NullPointerException: null
> >from java.lang.String:-1:in `'
> >from org/apache/hadoop/hbase/util/Bytes.java:92:in `toString'
> >from
> org/apache/hadoop/hbase/client/RetriesExhaustedException.java:50:in
> > `getMessage'
> >from
> org/apache/hadoop/hbase/client/RetriesExhaustedException.java:40:in
> > `'
> >from org/apache/hadoop/hbase/client/HConnectionManager.java:841:in
> > `getRegionServerWithRetries'
> >from org/apache/hadoop/hbase/client/MetaScanner.java:56:in `metaScan'
> >from org/apache/hadoop/hbase/client/MetaScanner.java:30:in `metaScan'
> >from org/apache/hadoop/hbase/client/HConnectionManager.java:411:in
> > `getHTableDescriptor'
> >from org/apache/hadoop/hbase/client/HTable.java:219:in
> > `getTableDescriptor'
> >from sun.reflect.NativeMethodAccessorImpl:-2:in `invoke0'
> >from sun.reflect.NativeMethodAccessorImpl:-1:in `invoke'
> >from sun.reflect.DelegatingMethodAccessorImpl:-1:in `invoke'
> >from java.lang.reflect.Method:-1:in `invoke'
> >from org/jruby/javasupport/JavaMethod.java:250:in
> > `invokeWithExceptionHandling'
> >from org/jruby/javasupport/JavaMethod.java:219:in `invoke'
> >from org/jruby/javasupport/JavaClass.java:416:in `execute'
> > ... 145 levels...
> >from org/jruby/internal/runtime/methods/DynamicMethod.java:74:in
> `call'
> >from org/jruby/internal/runtime/methods/CompiledMethod.java:48:in
> `call'
> >from org/jruby/runtime/CallSite.java:123:in `cacheAndCall'
> >from org/jruby/runtime/CallSite.java:298:in `call'
> >from
> >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:429:in
> > `__file__'
> >from
> >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> > `__file__'
> >from
> >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> > `load'
> >from org/jruby/Ruby.java:512:in `runScript'
> >from org/jruby/Ruby.java:432:in `runNormally'
> >from org/jruby/Ruby.java:312:in `runFromMain'
> >from org/jruby/Main.java:144:in `run'
> >from org/jruby/Main.java:89:in `run'
> >from org/jruby/Main.java:80:in `main'
> >from /hadoop/install/hbase/bin/../bin/HBase.rb:444:in `count'
> >from /hadoop/install/hbase/bin/../bin/hirb.rb:348:in `count'
> >from (hbase):3:in `binding'
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> >
> > On Fri, Feb 20, 2009 at 9:46 PM, Amandeep Khurana 
> > wrote:
> >
> > > Here's what it throws on the console:
> > >
> > > 09/02/20 21:45:29 INFO mapred.JobClient: Task Id :
> > > attempt_200902201300_0019_m_06_0, Status : FAILED
> > > java.io.IOException: table is null
> > > at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:33)
> > > at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:1)
> > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> > > at org.apache.hadoop.mapred.Child.main(Child.java:155)
> > >
> > > attempt_200902201300_0019_m_06_0:
> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
> > trying
> > > to locate root region
> > > attempt_200902201300_0019_m_06_0:   at
> > >
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:768)
> > > attempt_200902201300_0019_m_06_0:   at
> > >
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:448)
> > > attempt_200902201300_0019_m_06_0:   at
> > >
> >
> org.apac

Re: Connection problem during data import into hbase

2009-02-21 Thread Amandeep Khurana
Yes, the table exists before I start the job.

I am not using TableOutputFormat. I picked up the sample code from the docs
and am using it.

Here's the job conf:

JobConf conf = new JobConf(getConf(), IN_TABLE_IMPORT.class);
FileInputFormat.setInputPaths(conf, new Path("import_data"));
conf.setMapperClass(MapClass.class);
conf.setNumReduceTasks(0);
conf.setOutputFormat(NullOutputFormat.class);
JobClient.runJob(conf);


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Sat, Feb 21, 2009 at 12:10 AM, stack  wrote:

> The table exists before you start the MR job?
>
> When you say 'midway through the job', are you using tableoutputformat to
> insert into your table?
>
> Which version of hbase?
>
> St.Ack
>
> On Fri, Feb 20, 2009 at 9:55 PM, Amandeep Khurana 
> wrote:
>
> > I dont know if this is related or not, but it seems to be. After this map
> > reduce job, I tried to count the number of entries in the table in hbase
> > through the shell. It failed with the following error:
> >
> > hbase(main):002:0> count 'in_table'
> > NativeException: java.lang.NullPointerException: null
> >from java.lang.String:-1:in `'
> >from org/apache/hadoop/hbase/util/Bytes.java:92:in `toString'
> >from
> org/apache/hadoop/hbase/client/RetriesExhaustedException.java:50:in
> > `getMessage'
> >from
> org/apache/hadoop/hbase/client/RetriesExhaustedException.java:40:in
> > `'
> >from org/apache/hadoop/hbase/client/HConnectionManager.java:841:in
> > `getRegionServerWithRetries'
> >from org/apache/hadoop/hbase/client/MetaScanner.java:56:in `metaScan'
> >from org/apache/hadoop/hbase/client/MetaScanner.java:30:in `metaScan'
> >from org/apache/hadoop/hbase/client/HConnectionManager.java:411:in
> > `getHTableDescriptor'
> >from org/apache/hadoop/hbase/client/HTable.java:219:in
> > `getTableDescriptor'
> >from sun.reflect.NativeMethodAccessorImpl:-2:in `invoke0'
> >from sun.reflect.NativeMethodAccessorImpl:-1:in `invoke'
> >from sun.reflect.DelegatingMethodAccessorImpl:-1:in `invoke'
> >from java.lang.reflect.Method:-1:in `invoke'
> >from org/jruby/javasupport/JavaMethod.java:250:in
> > `invokeWithExceptionHandling'
> >from org/jruby/javasupport/JavaMethod.java:219:in `invoke'
> >from org/jruby/javasupport/JavaClass.java:416:in `execute'
> > ... 145 levels...
> >from org/jruby/internal/runtime/methods/DynamicMethod.java:74:in
> `call'
> >from org/jruby/internal/runtime/methods/CompiledMethod.java:48:in
> `call'
> >from org/jruby/runtime/CallSite.java:123:in `cacheAndCall'
> >from org/jruby/runtime/CallSite.java:298:in `call'
> >from
> >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:429:in
> > `__file__'
> >from
> >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> > `__file__'
> >from
> >
> >
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> > `load'
> >from org/jruby/Ruby.java:512:in `runScript'
> >from org/jruby/Ruby.java:432:in `runNormally'
> >from org/jruby/Ruby.java:312:in `runFromMain'
> >from org/jruby/Main.java:144:in `run'
> >from org/jruby/Main.java:89:in `run'
> >from org/jruby/Main.java:80:in `main'
> >from /hadoop/install/hbase/bin/../bin/HBase.rb:444:in `count'
> >from /hadoop/install/hbase/bin/../bin/hirb.rb:348:in `count'
> >from (hbase):3:in `binding'
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> >
> > On Fri, Feb 20, 2009 at 9:46 PM, Amandeep Khurana 
> > wrote:
> >
> > > Here's what it throws on the console:
> > >
> > > 09/02/20 21:45:29 INFO mapred.JobClient: Task Id :
> > > attempt_200902201300_0019_m_06_0, Status : FAILED
> > > java.io.IOException: table is null
> > > at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:33)
> > > at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:1)
> > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> > > at org.apache.hadoop.mapred.Child.main(Child.java:155)
> > >
> > > attempt_200902201300_0019_m_06_0:
> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
> > trying
> > > to locate root region
> > > attempt_200902201300_0019_m_06_0:   at
> > >
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:768)
> > > attempt_200902201300_0019_m_06_0:   at
> > >
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:448)
> > > attempt_200902201300_0019_m_06_0:   at
> > >
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:430)
> > > attempt_20

Re: Connection problem during data import into hbase

2009-02-21 Thread stack
The table exists before you start the MR job?

When you say 'midway through the job', are you using tableoutputformat to
insert into your table?

Which version of hbase?

St.Ack

On Fri, Feb 20, 2009 at 9:55 PM, Amandeep Khurana  wrote:

> I dont know if this is related or not, but it seems to be. After this map
> reduce job, I tried to count the number of entries in the table in hbase
> through the shell. It failed with the following error:
>
> hbase(main):002:0> count 'in_table'
> NativeException: java.lang.NullPointerException: null
>from java.lang.String:-1:in `'
>from org/apache/hadoop/hbase/util/Bytes.java:92:in `toString'
>from org/apache/hadoop/hbase/client/RetriesExhaustedException.java:50:in
> `getMessage'
>from org/apache/hadoop/hbase/client/RetriesExhaustedException.java:40:in
> `'
>from org/apache/hadoop/hbase/client/HConnectionManager.java:841:in
> `getRegionServerWithRetries'
>from org/apache/hadoop/hbase/client/MetaScanner.java:56:in `metaScan'
>from org/apache/hadoop/hbase/client/MetaScanner.java:30:in `metaScan'
>from org/apache/hadoop/hbase/client/HConnectionManager.java:411:in
> `getHTableDescriptor'
>from org/apache/hadoop/hbase/client/HTable.java:219:in
> `getTableDescriptor'
>from sun.reflect.NativeMethodAccessorImpl:-2:in `invoke0'
>from sun.reflect.NativeMethodAccessorImpl:-1:in `invoke'
>from sun.reflect.DelegatingMethodAccessorImpl:-1:in `invoke'
>from java.lang.reflect.Method:-1:in `invoke'
>from org/jruby/javasupport/JavaMethod.java:250:in
> `invokeWithExceptionHandling'
>from org/jruby/javasupport/JavaMethod.java:219:in `invoke'
>from org/jruby/javasupport/JavaClass.java:416:in `execute'
> ... 145 levels...
>from org/jruby/internal/runtime/methods/DynamicMethod.java:74:in `call'
>from org/jruby/internal/runtime/methods/CompiledMethod.java:48:in `call'
>from org/jruby/runtime/CallSite.java:123:in `cacheAndCall'
>from org/jruby/runtime/CallSite.java:298:in `call'
>from
>
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:429:in
> `__file__'
>from
>
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> `__file__'
>from
>
> ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in
> `load'
>from org/jruby/Ruby.java:512:in `runScript'
>from org/jruby/Ruby.java:432:in `runNormally'
>from org/jruby/Ruby.java:312:in `runFromMain'
>from org/jruby/Main.java:144:in `run'
>from org/jruby/Main.java:89:in `run'
>from org/jruby/Main.java:80:in `main'
>from /hadoop/install/hbase/bin/../bin/HBase.rb:444:in `count'
>from /hadoop/install/hbase/bin/../bin/hirb.rb:348:in `count'
>from (hbase):3:in `binding'
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Fri, Feb 20, 2009 at 9:46 PM, Amandeep Khurana 
> wrote:
>
> > Here's what it throws on the console:
> >
> > 09/02/20 21:45:29 INFO mapred.JobClient: Task Id :
> > attempt_200902201300_0019_m_06_0, Status : FAILED
> > java.io.IOException: table is null
> > at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:33)
> > at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:1)
> > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> > at org.apache.hadoop.mapred.Child.main(Child.java:155)
> >
> > attempt_200902201300_0019_m_06_0:
> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
> trying
> > to locate root region
> > attempt_200902201300_0019_m_06_0:   at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:768)
> > attempt_200902201300_0019_m_06_0:   at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:448)
> > attempt_200902201300_0019_m_06_0:   at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:430)
> > attempt_200902201300_0019_m_06_0:   at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:557)
> > attempt_200902201300_0019_m_06_0:   at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:457)
> > attempt_200902201300_0019_m_06_0:   at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:430)
> > attempt_200902201300_0019_m_06_0:   at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:557)
> > attempt_200902201300_0019_m_06_0:   at
> >
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:461)
> > attempt_200902201300_0019