Re: Debugging M/R job with tez

Manuel Godbert Wed, 05 Oct 2016 05:45:58 -0700

Hello,

I just opened TEZ-3459, with attached code addressing 3 of the issues I
encountered, including the embedded jars one.
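
For readers following along, here is a minimal sketch of the workaround for the embedded jars issue, i.e. expanding the jars nested under lib/ so they can be listed on the classpath explicitly. This is not the code attached to TEZ-3459; the class name NestedJarExpander and the method expandNestedJars are hypothetical, and it only assumes the fat jar keeps its dependencies directly under lib/.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Tez or Hadoop.
final class NestedJarExpander {

    /** Copies every direct lib/*.jar entry of fatJar into targetDir and returns the copies. */
    static List<Path> expandNestedJars(Path fatJar, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        List<Path> extracted = new ArrayList<>();
        try (JarFile jar = new JarFile(fatJar.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                // Keep only jars sitting directly under lib/ (the last '/' is the one in "lib/").
                boolean directChild = name.startsWith("lib/") && name.lastIndexOf('/') == 3;
                if (entry.isDirectory() || !directChild || !name.endsWith(".jar")) {
                    continue;
                }
                // Use the bare file name so the copy always lands inside targetDir.
                Path out = targetDir.resolve(name.substring(4));
                try (InputStream in = jar.getInputStream(entry)) {
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                }
                extracted.add(out);
            }
        }
        return extracted;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java NestedJarExpander <fat-jar> <target-dir>
        for (Path p : expandNestedJars(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(p);
        }
    }
}
```

The printed paths can then be added to the job classpath alongside the application jar itself.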


I have not yet managed to provide an example showing the issue I had with
multiple outputs. It would definitely help me if I could run my jobs
locally with Tez to understand the specificity of these jobs. Would it be
possible to get some support to set up my workstation to achieve this?
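
For context on the multiple outputs issue (described in the quoted message below), here is a rough sketch of where a stock Hadoop 2.x FileOutputCommitter places files before and after task commit, as far as I understand it. The class TaskAttemptPaths is hypothetical and uses plain strings instead of the Hadoop Path API; the attempt ID in the usage note is illustrative only, and whether Tez hands the committer the same IDs is exactly what remains to be checked.

```java
// Hypothetical illustration of the FileOutputCommitter directory layout
// (Hadoop 2.x, commit algorithm version 1); not Tez or Hadoop code.
final class TaskAttemptPaths {

    // Name of the pending directory used by FileOutputCommitter.
    static final String PENDING_DIR = "_temporary";

    /** Where a running task attempt writes: out/_temporary/appAttempt/_temporary/attemptId */
    static String taskAttemptPath(String outputDir, int appAttemptId, String taskAttemptId) {
        return String.join("/", outputDir, PENDING_DIR,
                String.valueOf(appAttemptId), PENDING_DIR, taskAttemptId);
    }

    /** Where commitTask moves the files: out/_temporary/appAttempt/taskId */
    static String committedTaskPath(String outputDir, int appAttemptId, String taskAttemptId) {
        // A task ID is the attempt ID with the "attempt" prefix replaced by "task"
        // and the trailing attempt number dropped.
        String taskId = "task" + taskAttemptId.substring(
                "attempt".length(), taskAttemptId.lastIndexOf('_'));
        return String.join("/", outputDir, PENDING_DIR, String.valueOf(appAttemptId), taskId);
    }
}
```

With the illustrative ID attempt_1474992804027_0003_m_000000_0, an output directory /out and application attempt 0, taskAttemptPath gives /out/_temporary/0/_temporary/attempt_1474992804027_0003_m_000000_0 and committedTaskPath gives /out/_temporary/0/task_1474992804027_0003_m_000000. A committer subclass that builds these paths from a differently formed attempt ID would look for its files in the wrong place.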

Brgds

Manuel

On Wed, Sep 28, 2016 at 8:37 PM, Hitesh Shah <[email protected]> wrote:

> Thanks for the context, Manuel.
>
> Full compat with MR is something that has not really been fully tested
> with Tez. We believe that it works for the most part but there are probably
> cases out there which have either not been addressed or some which we are
> not aware of.
>
> It is great that you are trying this out. We can definitely help you
> figure out these issues and get the fixes into Tez to allow more users to
> seamlessly run MR jobs on Tez. It will be great if you can file a jira for
> the MR distributed cache handling of archives in Tez. A simple example to
> reproduce it would help a lot too so as to allow any of the Tez
> contributors to quickly debug and fix. I am assuming you are passing in
> archives/fat-jars to the distributed cache which MR implicitly applies ./*
> + ./lib/* pattern against to add to the runtime classpath? I am guessing
> this is something we may not have handled correctly in the translation
> layer.
>
> thanks
> — Hitesh
>
> > On Sep 28, 2016, at 9:38 AM, Manuel Godbert <[email protected]>
> wrote:
> >
> > Hello,
> >
> > In non local mode my M/R jobs generally behave as expected with Tez.
> However some still resist, and I am trying to have them running locally to
> understand if they can work with some changes (either in my code or in
> Tez code, and in that latter case I planned to contribute some way to the
> Tez effort). Running the WordCount locally is only a first step.
> >
> > I won't be able to provide source code easily for the real problematic
> jobs, as we use a quite big home made framework on top of hadoop and that
> is not open source... in a few words most of my issues actually seem to
> come from the task attempts IDs management. We have subclassed the output
> committers to manage multiple outputs, and when we reach the commit task
> step the produced files are not always where expected in the temporary task
> attempt paths. It is hard to say what happens exactly, and this is why I
> wanted to reproduce the issue locally before sharing it.
> >
> > Besides this, another minor issue we got is that we used to package our
> applicative jars with nested dependencies in /lib and these are ignored by
> Tez. We could easily work around this by expanding them and adapting our
> classpath.
> >
> > Regards
> >
> > On Wed, Sep 28, 2016 at 5:46 PM, Hitesh Shah <[email protected]> wrote:
> > Hello Manuel,
> >
> > Thanks for reporting the issue. Let me try and reproduce this locally to
> see what is going on.
> >
> > A quick question in general though - are you hitting issues when running
> in non-local mode too? Would you mind sharing the details on the issues
> you hit?
> >
> > thanks
> > — Hitesh
> >
> >
> > > On Sep 27, 2016, at 9:53 AM, Manuel Godbert <[email protected]>
> wrote:
> > >
> > > Hello,
> > >
> > > I have map/reduce jobs that work as expected within YARN, and I want
> to see if Tez can help me improve their performance. Alas, I am
> experiencing issues and I want to understand what happens, to see if I can
> adapt my code or if I can suggest Tez enhancements. For this I need to be
> able to debug jobs from within eclipse, with breakpoints in Tez source code
> etc.
> > >
> > > I am working on a linux (ubuntu) platform
> > > I use the latest Tez version I found, i.e. 0.9.0-SNAPSHOT (also tried
> with 0.7.0)
> > > I have set up the hortonworks mini dev cluster
> > > https://github.com/hortonworks/mini-dev-cluster
> > > I am trying to run the basic WordCount2 code found here
> > > https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v2.0
> > > I added the following code to have tez running locally:
> > >     conf.set("mapreduce.framework.name", "yarn-tez");
> > >     conf.setBoolean("tez.local.mode", true);
> > >     conf.set("fs.default.name", "file:///");
> > >     conf.setBoolean("tez.runtime.optimize.local.fetch", true);
> > >
> > > And I am getting the following error:
> > >
> > > 2016-09-27 18:32:34 Running Dag: dag_1474992804027_0003_1
> > > 2016-09-27 18:32:34 Running Dag: dag_1474992804027_0003_1
> > > Exception in thread "main" java.lang.NullPointerException
> > >       at org.apache.tez.client.LocalClient.getApplicationReport(LocalClient.java:153)
> > >       at org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.getAppReport(DAGClientRPCImpl.java:231)
> > >       at org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.createAMProxyIfNeeded(DAGClientRPCImpl.java:251)
> > >       at org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.getDAGStatus(DAGClientRPCImpl.java:96)
> > >       at org.apache.tez.dag.api.client.DAGClientImpl.getDAGStatusViaAM(DAGClientImpl.java:360)
> > >       at org.apache.tez.dag.api.client.DAGClientImpl.getDAGStatusInternal(DAGClientImpl.java:220)
> > >       at org.apache.tez.dag.api.client.DAGClientImpl.getDAGStatus(DAGClientImpl.java:268)
> > >       at org.apache.tez.dag.api.client.MRDAGClient.getDAGStatus(MRDAGClient.java:58)
> > >       at org.apache.tez.mapreduce.client.YARNRunner.getJobStatus(YARNRunner.java:710)
> > >       at org.apache.tez.mapreduce.client.YARNRunner.submitJob(YARNRunner.java:650)
> > >       at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
> > >       at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
> > >       at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
> > >       at java.security.AccessController.doPrivileged(Native Method)
> > >       at javax.security.auth.Subject.doAs(Subject.java:422)
> > >       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> > >       at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
> > >       at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
> > >       at WordCount2.main(WordCount2.java:136)
> > >
> > > Please help me understand what I am doing wrong!
> > >
> > > Regards
> >
> >
>
>
