OK, that explains it. The problem is that you extended IdentityTableMap but tried to override its map with the wrong method name, so your method was never called; it was the parent's map that was called instead.
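Both failure modes in this thread — a typo'd method name that silently fails to override, and the "same erasure" name clash quoted later — can be reproduced in plain Java. The classes below are hypothetical stand-ins, not the real org.apache.hadoop.hbase.mapred API: a generic collector, a pass-through parent mapper, and a child that overrides correctly.

```java
import java.util.ArrayList;
import java.util.List;

class OverrideDemo {
    // Stand-in for OutputCollector<K, V>.
    interface Collector<K, V> { void collect(K key, V value); }

    // Stand-in for IdentityTableMap: its map() just forwards the row.
    static class IdentityMap {
        public void map(String row, String value, Collector<String, String> out) {
            out.collect(row, value);
        }
    }

    // The override must repeat the parent's parameter types exactly.
    // Declaring map(..., Collector<Integer, Integer> out) here instead would
    // not compile: "name clash: ... have the same erasure, yet neither
    // overrides the other", because generics erase to the same raw signature.
    // And naming the method "mapp" would compile silently but never be
    // called; @Override turns that typo into a compile-time error.
    static class UpperCaseMap extends IdentityMap {
        @Override
        public void map(String row, String value, Collector<String, String> out) {
            out.collect(row, value.toUpperCase());
        }
    }

    // Drive a mapper through the parent type, as the framework would.
    static List<String> run(IdentityMap mapper) {
        List<String> sink = new ArrayList<>();
        mapper.map("row1", "cell", (k, v) -> sink.add(k + "=" + v));
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(new IdentityMap()));   // [row1=cell]
        System.out.println(run(new UpperCaseMap()));  // [row1=CELL]
    }
}
```

Because the framework dispatches through the parent type, only a true override is ever invoked — which is why a typo'd mapp produces no visible map output at all.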
The error it's now giving you is pretty much self-explanatory and is not related to Hadoop or HBase: you must override the map method with exactly the same parameter types as the parent's, and this is done with @Override. You should also take a look at this doc to learn how to build your jobs: http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapred/package-summary.html

J-D

On Thu, Jul 23, 2009 at 1:04 PM, bharath vissapragada<[email protected]> wrote:
> I think this is the problem, but when I changed it, it gave me a weird
> error:
>
> name clash:
> map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>,org.apache.hadoop.mapred.Reporter)
> in MR_DS_Scan_Case1 and
> map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult>,org.apache.hadoop.mapred.Reporter)
> in org.apache.hadoop.hbase.mapred.IdentityTableMap have the same erasure,
> yet neither overrides the other
>
> I must override the map function in IdentityTableMap, but other map
> signatures also seem to be in play, so what must I do?
>
> On Thu, Jul 23, 2009 at 10:26 PM, Jean-Daniel Cryans <[email protected]> wrote:
>
>> Think I found your problem. Is this a typo?
>>
>> public void mapp(ImmutableBytesWritable row, RowResult value,
>> OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>>
>> It should read map, not mapp.
>>
>> J-D
>>
>> On Thu, Jul 23, 2009 at 12:42 PM, bharath vissapragada<[email protected]> wrote:
>> > I have tried apache commons-logging. Instead of printing the row, I
>> > wrote log.error(row), and even then I got the same output, as follows:
>> >
>> > 09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
>> > 09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
>> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>> > 09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
>> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>> > 09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
>> > 09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
>> > 09/07/24 03:41:40 INFO mapred.MapTask: data buffer = 79691776/99614720
>> > 09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
>> > 09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
>> > 09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> > 09/07/24 03:41:40 INFO mapred.Merger: Merging 1 sorted segments
>> > 09/07/24 03:41:40 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 333 bytes
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner: reduce > reduce
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
>> > 09/07/24 03:41:40 INFO mapred.JobClient: Job complete: job_local_0001
>> > 09/07/24 03:41:40 INFO mapred.JobClient: Counters: 11
>> > 09/07/24 03:41:40 INFO mapred.JobClient:   File Systems
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes read=38933
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes written=78346
>> > 09/07/24 03:41:40 INFO mapred.JobClient:   Map-Reduce Framework
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input groups=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Combine output records=0
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map input records=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce output records=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map output bytes=315
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map input bytes=0
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Combine input records=0
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map output records=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input records=8
>> >
>> > On Thu, Jul 23, 2009 at 9:32 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >
>> >> And you don't need any more config to run local MR jobs on HBase. But
>> >> you do need Hadoop when running MR jobs on HBase on a cluster.
>> >>
>> >> Also, your code is running fine, as you could see; the real question is
>> >> where the stdout goes when in local mode. When you ran your other MR
>> >> jobs, it was on a working Hadoop setup, right? So you were looking at
>> >> the logs in the web UI? One simple thing is to do your debugging with a
>> >> logger, so you are sure to see your output, as I already proposed.
>> >> Another simple thing is to get a pseudo-distributed setup, run your
>> >> HBase MR jobs with Hadoop, and get your logs like I'm sure you did
>> >> before.
>> >>
>> >> J-D
>> >>
>> >> On Thu, Jul 23, 2009 at 11:54 AM, bharath vissapragada<[email protected]> wrote:
>> >> > I am really thankful to you, J-D, for replying in spite of your busy
>> >> > schedule. I am still at a learning stage and there are no good guides
>> >> > on HBase other than its own, so please bear with me; I really
>> >> > appreciate your help.
>> >> >
>> >> > Now I get your point that there is no need of Hadoop while running
>> >> > HBase MR programs, but I am confused about the config. I have only
>> >> > set JAVA_HOME in "hbase-env.sh" and other than that I didn't do
>> >> > anything, so I wonder if my conf is wrong or there is some error in
>> >> > that simple code, because stdout worked for me while writing
>> >> > mapreduce programs.
>> >> >
>> >> > Thanks once again!
>> >> >
>> >> > On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >> >
>> >> >> The code itself is very simple; I was referring to your own
>> >> >> description of your situation. You say you use standalone HBase, yet
>> >> >> you talk about Hadoop configuration. You also talk about the
>> >> >> JobTracker web UI, which is of no use since you run local jobs
>> >> >> directly on HBase.
>> >> >>
>> >> >> J-D
>> >> >>
>> >> >> On Thu, Jul 23, 2009 at 11:41 AM, bharath vissapragada<[email protected]> wrote:
>> >> >> > I used stdout for debugging while writing Hadoop MR programs and
>> >> >> > it worked fine. Can you please tell me which part of the code you
>> >> >> > found confusing so that I can explain it a bit more clearly?
>> >> >> >
>> >> >> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >> >> >
>> >> >> >> What you wrote is a bit confusing to me, sorry.
>> >> >> >>
>> >> >> >> The usual way to debug MR jobs is to define a logger and post at
>> >> >> >> either info or debug level, not sysout like you did. I'm not even
>> >> >> >> sure where the standard output is logged when using a local job.
>> >> >> >> Also, since this is local, you won't see anything in your
>> >> >> >> host:50030 web UI. So use apache commons logging and you should
>> >> >> >> see your output.
>> >> >> >>
>> >> >> >> J-D
>> >> >> >>
>> >> >> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath vissapragada<[email protected]> wrote:
>> >> >> >> > Thanks for your reply, J-D. I'm doing it from the command line.
>> >> >> >> > I am pasting some part of the code here:
>> >> >> >> >
>> >> >> >> > public void mapp(ImmutableBytesWritable row, RowResult value,
>> >> >> >> >     OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>> >> >> >> >   System.out.println(row);
>> >> >> >> > }
>> >> >> >> >
>> >> >> >> > public JobConf createSubmittableJob(String[] args) throws IOException {
>> >> >> >> >   JobConf c = new JobConf(getConf(), MR_DS_Scan_Case1.class);
>> >> >> >> >   c.set("col.name", args[1]);
>> >> >> >> >   c.set("operator.name", args[2]);
>> >> >> >> >   c.set("val.name", args[3]);
>> >> >> >> >   IdentityTableMap.initJob(args[0], args[1], this.getClass(), c);
>> >> >> >> >   c.setOutputFormat(NullOutputFormat.class);
>> >> >> >> >   return c;
>> >> >> >> > }
>> >> >> >> >
>> >> >> >> > As you can see, I'm just printing the value of row in the map,
>> >> >> >> > but I can't see it in the terminal. I only want the map phase,
>> >> >> >> > so I didn't write any reduce phase. Is my jobConf correct?
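For readers following along: a map-only version of the jobConf quoted above can be sketched against the old org.apache.hadoop.hbase.mapred API this thread uses. This is an untested sketch, not the poster's actual fix; setNumReduceTasks(0) is the standard Hadoop way to skip the reduce phase the poster says he doesn't want.

```java
public JobConf createSubmittableJob(String[] args) throws IOException {
    JobConf c = new JobConf(getConf(), MR_DS_Scan_Case1.class);
    c.set("col.name", args[1]);
    c.set("operator.name", args[2]);
    c.set("val.name", args[3]);
    // Wires the table name, columns and mapper class into the JobConf.
    IdentityTableMap.initJob(args[0], args[1], this.getClass(), c);
    c.setNumReduceTasks(0);                    // map-only: skip the reduce phase
    c.setOutputFormat(NullOutputFormat.class); // discard output; debug via logs
    return c;
}
```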
>> >> >> >> >
>> >> >> >> > Also, as I have already asked: how do I check the job logs and
>> >> >> >> > a web interface like "localhost:<port>/jobTracker.jsp", since
>> >> >> >> > I'm running in local mode?
>> >> >> >> >
>> >> >> >> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >> >> >> >
>> >> >> >> >> What output do you need exactly? I see that you have 8 output
>> >> >> >> >> records in your reduce task, so if you take a look in your
>> >> >> >> >> output folder or table (I don't know which sink you used) you
>> >> >> >> >> should see them.
>> >> >> >> >>
>> >> >> >> >> Also, did you run your MR inside Eclipse or from the command line?
>> >> >> >> >>
>> >> >> >> >> Thx,
>> >> >> >> >>
>> >> >> >> >> J-D
>> >> >> >> >>
>> >> >> >> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath vissapragada<[email protected]> wrote:
>> >> >> >> >> > This is the output I got; everything seems fine, but no output!
>> >> >> >> >> >
>> >> >> >> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
>> >> >> >> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
>> >> >> >> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 333 bytes
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
>> >> >> >> >> >
>> >> >> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <[email protected]> wrote:
>> >> >> >> >> >
>> >> >> >> >> >> Since I haven't started the cluster, I can't even see the
>> >> >> >> >> >> details in "localhost:<port>/jobTracker.jsp". I didn't even
>> >> >> >> >> >> add anything to hadoop/conf/hadoop-site.xml.
>> >> >> >> >> >>
>> >> >> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <[email protected]> wrote:
>> >> >> >> >> >>
>> >> >> >> >> >>> Hi all,
>> >> >> >> >> >>>
>> >> >> >> >> >>> I wanted to run HBase in standalone mode to check my HBase
>> >> >> >> >> >>> MR programs. I have downloaded a built version of
>> >> >> >> >> >>> hbase-0.20, and I have Hadoop 0.19.3.
>> >> >> >> >> >>>
>> >> >> >> >> >>> I have set JAVA_HOME in both of them. Then I started HBase
>> >> >> >> >> >>> and inserted some tables using the Java API. Now I have
>> >> >> >> >> >>> written some MR programs on HBase, and when I run them they
>> >> >> >> >> >>> run perfectly without any errors, and all the map-reduce
>> >> >> >> >> >>> statistics are displayed correctly, but I get no output.
>> >> >> >> >> >>>
>> >> >> >> >> >>> I have one doubt now: how does HBase recognize Hadoop in
>> >> >> >> >> >>> standalone mode (I haven't even started my Hadoop)? Even
>> >> >> >> >> >>> simple print statements do not work; no output is displayed
>> >> >> >> >> >>> on the screen. I doubt my config.
>> >> >> >> >> >>>
>> >> >> >> >> >>> Do I need to add some config to run them? Please reply.
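For anyone landing on this thread: the logger-based debugging J-D recommends can be sketched with the JDK's own java.util.logging as a stand-in for Apache Commons Logging (the thread's actual suggestion); the class name and message below are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class MapDebugLogging {
    // One logger per mapper class, mirroring LogFactory.getLog(MyMap.class)
    // in commons-logging.
    private static final Logger LOG =
        Logger.getLogger(MapDebugLogging.class.getName());

    // Hypothetical stand-in for the body of map(): send the row key through
    // the logger instead of System.out.println, whose output can be
    // swallowed when the job runs in local mode.
    static String logRow(String rowKey) {
        String msg = "map saw row: " + rowKey;
        LOG.log(Level.INFO, msg);
        return msg; // returned only so the sketch is easy to check
    }

    public static void main(String[] args) {
        logRow("row1");
    }
}
```

The same pattern applies unchanged on a pseudo-distributed setup, where the log lines end up in the per-task logs visible from the JobTracker web UI.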
