But now it is strangely saying that

method does not override or implement a method from a supertype
@Override
^

Previously it had pointed out that the two map methods in
"org.apache.hadoop.hbase.mapred.IdentityTableMap
have the same erasure"
:(
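Both compiler errors come from the same rule: a subclass method overrides the parent's only if the signature matches exactly, including the generic type arguments of OutputCollector; otherwise the two methods merely share an erasure (name clash), and adding @Override then fails. A minimal self-contained sketch of the rule, using hypothetical stand-in classes since the real IdentityTableMap needs the HBase 0.20 jars:

```java
import java.io.IOException;

// Stand-in for OutputCollector<K, V>
class Collector<K, V> {
    void collect(K key, V value) { /* no-op sink for the sketch */ }
}

// Stand-in for IdentityTableMap: it fixes the collector's type arguments
class ParentMap {
    public void map(String row, String value,
                    Collector<String, String> output) throws IOException {
        output.collect(row, value);
    }
}

class ChildMap extends ParentMap {
    // Compiles only because the signature, including Collector<String, String>,
    // matches the parent exactly. Declaring Collector<Integer, Integer> here
    // would give "name clash ... have the same erasure, yet neither overrides
    // the other", and with @Override it would give "method does not override
    // or implement a method from a supertype".
    @Override
    public void map(String row, String value,
                    Collector<String, String> output) throws IOException {
        System.out.println(row);
        super.map(row, value, output);
    }
}

public class ErasureDemo {
    public static void main(String[] args) throws IOException {
        new ChildMap().map("row1", "val1", new Collector<String, String>());
    }
}
```

Running it prints the row key via the child's map, confirming the override took effect.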
On Thu, Jul 23, 2009 at 10:41 PM, Jean-Daniel Cryans <[email protected]>wrote:
> Ok that explains. The problem you have is that you extended
> IdentityTableMap but tried to override it with the wrong method name
> so it was never called. Instead it was the parent's map that was
> called.
>
> The error it's now giving you is pretty much self-explanatory and is
> not related to Hadoop or HBase: you must override the map method, and
> that is what @Override checks.
>
> You should also take at look at this doc to learn how to build your
> jobs
> http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapred/package-summary.html
>
> J-D
>
> On Thu, Jul 23, 2009 at 1:04 PM, bharath
> vissapragada<[email protected]> wrote:
> > I think this is the problem .. but when I changed it .. it gave me a
> > weird error
> >
> > name clash:
> >
> map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>,org.apache.hadoop.mapred.Reporter)
> > in MR_DS_Scan_Case1 and
> >
> map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult>,org.apache.hadoop.mapred.Reporter)
> > in org.apache.hadoop.hbase.mapred.IdentityTableMap have the same erasure,
> > yet neither overrides the other
> >
> > I must override the map function in IdentityTableMap ... but other
> > classes also seem to have a map function ..
> > so what must I do ..
> >
> > On Thu, Jul 23, 2009 at 10:26 PM, Jean-Daniel Cryans <
> [email protected]>wrote:
> >
> >> I think I found your problem. Is this a typo?
> >>
> >> public void mapp(ImmutableBytesWritable row, RowResult value,
> >> OutputCollector<Text, Text> output, Reporter reporter) throws
> IOException {
> >>
> >> It should read map, not mapp
> >>
> >> J-D
> >>
> >> On Thu, Jul 23, 2009 at 12:42 PM, bharath
> >> vissapragada<[email protected]> wrote:
> >> > I have tried apache commons-logging ...
> >> >
> >> > instead of printing the row ... I have written log.error(row) ...
> >> > even then I got the same output as follows ...
> >> >
> >> > 09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> >> > processName=JobTracker, sessionId=
> >> > 09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set. User
> >> classes
> >> > may not be found. See JobConf(Class) or JobConf#setJar(String).
> >> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
> >> > 0->localhost.localdomain:,
> >> > 09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
> >> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
> >> > 0->localhost.localdomain:,
> >> > 09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
> >> > 09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
> >> > 09/07/24 03:41:40 INFO mapred.MapTask: data buffer = 79691776/99614720
> >> > 09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
> >> > 09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
> >> > 09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
> >> > 09/07/24 03:41:40 INFO mapred.TaskRunner:
> >> Task:attempt_local_0001_m_000000_0
> >> > is done. And is in the process of commiting
> >> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
> >> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
> >> > 'attempt_local_0001_m_000000_0' done.
> >> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
> >> > 09/07/24 03:41:40 INFO mapred.Merger: Merging 1 sorted segments
> >> > 09/07/24 03:41:40 INFO mapred.Merger: Down to the last merge-pass,
> with 1
> >> > segments left of total size: 333 bytes
> >> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
> >> > 09/07/24 03:41:40 INFO mapred.TaskRunner:
> >> Task:attempt_local_0001_r_000000_0
> >> > is done. And is in the process of commiting
> >> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner: reduce > reduce
> >> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
> >> > 'attempt_local_0001_r_000000_0' done.
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Job complete: job_local_0001
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Counters: 11
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: File Systems
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Local bytes read=38933
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Local bytes written=78346
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Map-Reduce Framework
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Reduce input groups=8
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Combine output records=0
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Map input records=8
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Reduce output records=8
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Map output bytes=315
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Map input bytes=0
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Combine input records=0
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Map output records=8
> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Reduce input records=8
> >> >
> >> >
> >> > On Thu, Jul 23, 2009 at 9:32 PM, Jean-Daniel Cryans <
> [email protected]
> >> >wrote:
> >> >
> >> >> And you don't need any more config to run local MR jobs on HBase. But
> >> >> you do need Hadoop when running MR jobs on HBase on a cluster.
> >> >>
> >> >> Also your code is running fine as you could see, the real question is
> >> >> where is the stdout going when in local mode. When you ran your other
> >> >> MR jobs, it was on a working Hadoop setup right? So you were looking
> >> >> at the logs in the web UI? One simple thing to do is to do your
> >> >> debugging with a logger so you are sure to see your output as I
> >> >> already proposed. Another simple thing is to get a pseudo-distributed
> >> >> setup and run your HBase MR jobs with Hadoop and get your logs like
> >> >> I'm sure you did before.
> >> >>
> >> >> J-D
> >> >>
> >> >> On Thu, Jul 23, 2009 at 11:54 AM, bharath
> >> >> vissapragada<[email protected]> wrote:
> >> >> > I am really thankful to you J-D for replying to me in spite of your
> >> >> > busy schedule. I am still in a learning stage and there are no good
> >> >> > guides on HBase other than its own one .. so please spare me, and I
> >> >> > really appreciate your help.
> >> >> >
> >> >> > Now I got your point that there is no need of Hadoop while running
> >> >> > HBase MR programs .... But I am confused about the config. I have
> >> >> > only set JAVA_HOME in "hbase-env.sh" and other than that I didn't do
> >> >> > anything .. so I wonder if my conf was wrong or there is some error
> >> >> > in that simple code ... because stdout worked for me while writing
> >> >> > MapReduce programs ...
> >> >> >
> >> >> > Thanks once again!
> >> >> >
> >> >> > On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <
> >> [email protected]
> >> >> >wrote:
> >> >> >
> >> >> >> The code itself is very simple, I was referring to your own
> >> >> >> description of your situation. You say you use standalone HBase
> yet
> >> >> >> you talk about Hadoop configuration. You also talk about the
> >> >> >> JobTracker web UI, which is of no use since you run local jobs
> >> directly
> >> >> >> on HBase.
> >> >> >>
> >> >> >> J-D
> >> >> >>
> >> >> >> On Thu, Jul 23, 2009 at 11:41 AM, bharath
> >> >> >> vissapragada<[email protected]> wrote:
> >> >> >> > I used stdout for debugging while writing code in Hadoop MR
> >> >> >> > programs and it worked fine ...
> >> >> >> > Can you please tell me which part of the code you found confusing
> >> >> >> > so that I can explain it a bit more clearly ...
> >> >> >> >
> >> >> >> >
> >> >> >> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <
> >> >> [email protected]
> >> >> >> >wrote:
> >> >> >> >
> >> >> >> >> What you wrote is a bit confusing to me, sorry.
> >> >> >> >>
> >> >> >> >> The usual way to debug MR jobs is to define a logger and log at
> >> >> >> >> either info or debug level, not sysout like you did. I'm not even
> >> >> >> >> sure where the standard output is logged when using a local job.
> >> >> >> >> Also, since this is local, you won't see anything in your
> >> >> >> >> host:50030 web UI. So use Apache commons-logging and you should
> >> >> >> >> see your output.
> >> >> >> >>
> >> >> >> >> J-D
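The logger-based debugging suggested above can be sketched with java.util.logging standing in for commons-logging, so the example runs without extra jars; the class name LogDemo is hypothetical:

```java
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

public class LogDemo {
    // one static logger per class, analogous to commons-logging's
    // LogFactory.getLog(LogDemo.class)
    private static final Logger LOG = Logger.getLogger(LogDemo.class.getName());

    public static void main(String[] args) {
        // route log records to stdout so they are visible here; in a real MR
        // job the task runner collects logger output into the task logs
        StreamHandler handler = new StreamHandler(System.out, new SimpleFormatter());
        LOG.setUseParentHandlers(false);
        LOG.addHandler(handler);

        LOG.info("row=row1"); // instead of System.out.println(row)
        handler.flush();      // StreamHandler buffers, so flush before exit
    }
}
```

With commons-logging the call site is the same shape: declare the log once, then log.info(...) or log.debug(...) inside map instead of printing.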
> >> >> >> >>
> >> >> >> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath
> >> >> >> >> vissapragada<[email protected]> wrote:
> >> >> >> >> > Thanks for your reply J-D ...
> >> >> >> >> >
> >> >> >> >> > I'm doing it from the command line .. I'm pasting some part of
> >> >> >> >> > the code here ....
> >> >> >> >> >
> >> >> >> >> > public void mapp(ImmutableBytesWritable row, RowResult
> value,
> >> >> >> >> > OutputCollector<Text, Text> output, Reporter reporter) throws
> >> >> >> IOException
> >> >> >> >> {
> >> >> >> >> > System.out.println(row);
> >> >> >> >> > }
> >> >> >> >> >
> >> >> >> >> > public JobConf createSubmittableJob(String[] args) throws
> >> >> IOException
> >> >> >> {
> >> >> >> >> > JobConf c = new JobConf(getConf(),
> >> >> >> >> MR_DS_Scan_Case1.class);
> >> >> >> >> > c.set("col.name", args[1]);
> >> >> >> >> > c.set("operator.name",args[2]);
> >> >> >> >> > c.set("val.name",args[3]);
> >> >> >> >> > IdentityTableMap.initJob(args[0], args[1],
> >> >> >> >> this.getClass(),
> >> >> >> >> > c);
> >> >> >> >> > c.setOutputFormat(NullOutputFormat.class);
> >> >> >> >> > return c
> >> >> >> >> > }
> >> >> >> >> >
> >> >> >> >> > As you can see ... I'm just printing the value of row in the
> >> >> >> >> > map .. I can't see it in the terminal .....
> >> >> >> >> > I only want the map phase ... so I didn't write any reduce
> >> >> >> >> > phase .. is my JobConf correct??
> >> >> >> >> >
> >> >> >> >> > Also, as I have already asked, how do I check the job logs and
> >> >> >> >> > the web interface like "localhost:<port>/jobTracker.jsp" ...
> >> >> >> >> > since I'm running in local mode ...
> >> >> >> >> >
> >> >> >> >> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <
> >> >> >> [email protected]
> >> >> >> >> >wrote:
> >> >> >> >> >
> >> >> >> >> >> What output do you need exactly? I see that you have 8
> output
> >> >> records
> >> >> >> >> >> in your reduce task so if you take a look in your output
> folder
> >> or
> >> >> >> >> >> table (I don't know which sink you used) you should see
> them.
> >> >> >> >> >>
> >> >> >> >> >> Also did you run your MR inside Eclipse or in command line?
> >> >> >> >> >>
> >> >> >> >> >> Thx,
> >> >> >> >> >>
> >> >> >> >> >> J-D
> >> >> >> >> >>
> >> >> >> >> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath
> >> >> >> >> >> vissapragada<[email protected]> wrote:
> >> >> >> >> >> > This is the output I got .. seems everything is fine .. but
> >> >> >> >> >> > no output!!
> >> >> >> >> >> >
> >> >> >> >> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM
> >> Metrics
> >> >> >> with
> >> >> >> >> >> > processName=JobTracker, sessionId=
> >> >> >> >> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file
> set.
> >> >> User
> >> >> >> >> >> classes
> >> >> >> >> >> > may not be found. See JobConf(Class) or
> >> JobConf#setJar(String).
> >> >> >> >> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
> >> >> >> >> >> > 0->localhost.localdomain:,
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job:
> >> >> >> job_local_0001
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
> >> >> >> >> >> > 0->localhost.localdomain:,
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer =
> >> >> >> 79691776/99614720
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer =
> >> >> >> 262144/327680
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of
> map
> >> >> output
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> >> >> >> >> >> Task:attempt_local_0001_m_000000_0
> >> >> >> >> >> > is done. And is in the process of commiting
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> >> >> >> >> >> > 'attempt_local_0001_m_000000_0' done.
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted
> >> segments
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last
> >> >> merge-pass,
> >> >> >> >> with 1
> >> >> >> >> >> > segments left of total size: 333 bytes
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> >> >> >> >> >> Task:attempt_local_0001_r_000000_0
> >> >> >> >> >> > is done. And is in the process of commiting
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce >
> reduce
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> >> >> >> >> >> > 'attempt_local_0001_r_000000_0' done.
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete:
> >> >> >> job_local_0001
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: File Systems
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Local bytes
> >> >> read=38949
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Local bytes
> >> >> >> written=78378
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Map-Reduce
> >> Framework
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Reduce input
> >> >> groups=8
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Combine
> output
> >> >> >> records=0
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Map input
> >> records=8
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Reduce output
> >> >> >> records=8
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Map output
> >> >> bytes=315
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Map input
> >> bytes=0
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Combine input
> >> >> >> records=0
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Map output
> >> >> records=8
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Reduce input
> >> >> records=8
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
> >> >> >> >> >> > [email protected]> wrote:
> >> >> >> >> >> >
> >> >> >> >> >> >> since I haven't started the cluster .. I can't even see the
> >> >> >> >> >> >> details in "localhost:<port>/jobTracker.jsp" .. I didn't
> >> >> >> >> >> >> even add anything to hadoop/conf/hadoop-site.xml
> >> >> >> >> >> >>
> >> >> >> >> >> >>
> >> >> >> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
> >> >> >> >> >> >> [email protected]> wrote:
> >> >> >> >> >> >>
> >> >> >> >> >> >>> Hi all ,
> >> >> >> >> >> >>>
> >> >> >> >> >> >>> I wanted to run HBase in standalone mode to check my HBase
> >> >> >> >> >> >>> MR programs ...
> >> >> >> >> >> >>> I have downloaded a built version of hbase-0.20 and I have
> >> >> >> >> >> >>> hadoop 0.19.3
> >> >> >> >> >> >>>
> >> >> >> >> >> >>> "I have set JAVA_HOME in both of them" .. then I started
> >> >> >> >> >> >>> HBase and inserted some tables using the Java API .. Now I
> >> >> >> >> >> >>> have written some MR programs on HBase, and when I run them
> >> >> >> >> >> >>> they run perfectly without any errors and all the
> >> >> >> >> >> >>> Map-Reduce statistics are displayed correctly, but I get no
> >> >> >> >> >> >>> output .
> >> >> >> >> >> >>>
> >> >> >> >> >> >>> I have one doubt now .. how does HBase recognize Hadoop in
> >> >> >> >> >> >>> standalone mode (I haven't even started my Hadoop) .. Even
> >> >> >> >> >> >>> simple print statements do not work .. no output is
> >> >> >> >> >> >>> displayed on the screen ... I doubt my config ....
> >> >> >> >> >> >>>
> >> >> >> >> >> >>> Do I need to add some config to run them ... Please reply
> >> >> >> >> >> >>> ...
> >> >> >> >> >> >>>
> >> >> >> >> >> >>
> >> >> >> >> >> >>
> >> >> >> >> >> >
> >> >> >> >> >>
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >
> >> >> >>
> >> >> >
> >> >>
> >> >
> >>
> >
>