Sachin,
Can you attach your pig script and pig client log as well as I asked
earlier?
Regards,
Rohini
On Tue, Jul 7, 2015 at 2:43 AM, Sachin Sabbarwal <[email protected]>
wrote:
> Hi Guys
> I'm using Apache Pig version 0.14.0 (r1640057) and 0.5.3 TEZ.
> I am running a pig script in following 2 sceniors:
> 1. Without any data. In this case when run with TEZ mode script runs with
> in 1 min. With Mapred it takes around 7 minutes.
> 2. With little data, In tez mode same pig script takes around 10 mins and
> with mapred it takes around 14-15 mins.
>
> I had sent a mail to [email protected] and got to know that the problem
> is that pig auto-reduce isn't coming in to play for few of my scopes and
> hence lots of tasks are getting created. This causes the run in tez mode to
> become slow. As suggested i tried setting pig.exec.reducers.max to a
> smaller value(tried with 150,25 and 10). And with this property set to 10,
> my script ran under 4 minutes. Also from logs we could see that auto
> -reducer is set to false for few scopes. For example:
>
> 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler]
> vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings
> minFrac:0.25 maxFrac:0.75* auto:false* desiredTaskIput:104857600 minTasks:1
> 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler]
> impl.VertexImpl: Creating 999 for vertex:
>
> Now the problem here is that in my pig.properties i have
> set pig.tez.auto.parallelism=true and also in my pig script i have set this
> propetry to true. But still we are seeing this getting set to false for few
> scopes.
>
> You can find the My conversation with Rajesh from TEZ below. Please lemme
> know what i am missing here. Lemme know if any other information is
> required.
>
> Thanks in advance
>
>
>
>
>
> ---------- Forwarded message ----------
> From: Rajesh Balamohan <[email protected]>
> Date: Tue, Jul 7, 2015 at 2:12 PM
> Subject: Re: Same pig script running slower with Tez as compared with run
> in Mapred mode
> To: [email protected]
>
>
> Hi Sachin,
>
> That was just a temporary workaround to ensure that was the issue. Ideally
> user does not need to set this parameter. Real issue is why auto-reducer
> is set to false in certain vertices in pig-tez. Will wait for Pig folks to
> chime in.
>
> For doc/tutorial related, you can start off with the following
> - http://tez.apache.org/talks.html
> - couple of youtube videos are available from hadoop summits and meetups.
> -
>
> http://hortonworks.com/blog/apache-tez-a-new-chapter-in-hadoop-data-processing/
> (this is pretty old)
> - Pig on tez
>
> http://www.slideshare.net/Hadoop_Summit/pig-on-tez-low-latency-data-processing-with-big-data
>
> ~Rajesh.B
>
> On Tue, Jul 7, 2015 at 1:54 PM, Sachin Sabbarwal <
> [email protected]>
> wrote:
>
> > Hi Rajesh
> > Thanks for your response. *This seems to be working for me.*
> > By setting pig.exec.reducers.max to 10 i am able to complete my run in
> > under 4 mins.(Initially it was running in 14-15 mins).
> > I'm new to pig/tez/hadoop world. Do you write any blogs about
> > pig/tez/hadoop etc? Can you suggest any tutorials/links to read about
> tez?
> > I need to understand concepts like scope, DAG, parallelism etc. I just
> > have a very basic understanding of tez. If i understand all these
> concepts
> > i'll be able to tune my job.
> >
> > Thanks
> >
> >
> > On Tue, Jul 7, 2015 at 1:23 PM, Rajesh Balamohan <
> > [email protected]> wrote:
> >
> >> Forgot to add the following. Ideally auto-reduce implementation should
> >> have kicked-in and on need basis, it should have decreased the number of
> >> reducers needed. However, for the vertices of concern (scope-2037 &
> >> scope-2162), auto-reducer has been turned off in configuration by Pig
> and
> >> for the rest of the vertices it is turned on.
> >>
> >> Pig folks would be able to help in terms of providing details on why
> >> auto-reduce parallelism is turned off in certain vertices.
> >>
> >> 2015-07-06 14:11:35,109 INFO [AsyncDispatcher event handler]
> >> impl.VertexImpl: Setting vertexManager to ShuffleVertexManager for
> >> vertex_1436152736518_0210_1_28* [scope-2037]*
> >> 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler]
> >> vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings
> >> minFrac:0.25 maxFrac:0.75* auto:false* desiredTaskIput:104857600
> >> minTasks:1
> >> 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler]
> >> impl.VertexImpl: Creating 999 for vertex: vertex_1436152736518_0210_1_28
> >> [scope-2037]
> >> ....
> >>
> >> 2015-07-06 14:11:35,245 INFO [AsyncDispatcher event handler]
> >> impl.VertexImpl: Setting vertexManager to ShuffleVertexManager for
> >> vertex_1436152736518_0210_1_39 *[scope-2162]*
> >> 2015-07-06 14:11:35,257 INFO [AsyncDispatcher event handler]
> >> vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings
> >> minFrac:0.25 maxFrac:0.75 *auto:false* desiredTaskIput:104857600
> >> minTasks:1
> >> 2015-07-06 14:11:35,257 INFO [AsyncDispatcher event handler]
> >> impl.VertexImpl: Creating 999 for vertex: vertex_1436152736518_0210_1_39
> >> [scope-2162]
> >> ....
> >>
> >> 2015-07-06 14:11:35,417 INFO [AsyncDispatcher event handler]
> >> impl.VertexImpl: Setting user vertex manager plugin:
> >> org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager on
> vertex:*
> >> scope-2185*
> >> 2015-07-06 14:11:35,419 INFO [AsyncDispatcher event handler]
> >> vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings
> >> minFrac:0.25 maxFrac:0.75 *auto:true* desiredTaskIput:104857600
> >> minTasks:1
> >> ...
> >>
> >>
> >> On Tue, Jul 7, 2015 at 12:47 PM, Rajesh Balamohan <
> >> [email protected]> wrote:
> >>
> >>>
> >>> Attaching the DAG and the swimlane for the job.
> >>>
> >>> scope-2052 which had to give the data to other vertices slowed down (~
> >>> 150-180 seconds) due to multiple spills and NumberFormatExceptions in
> >>> data. You might want to try setting
> >>>
> "tez.task.scale.memory.additional-reservation.fraction.max='PARTITIONED_UNSORTED_OUTPUT:12,UNSORTED_INPUT:1,UNSORTED_OUTPUT:12,SORTED_OUTPUT:12,SORTED_MERGED_INPUT:1,PROCESSOR:1,OTHER:1'
> >>> " to allocate more memory for unordered outputs.
> >>> Following are the details for this scope.
> >>> - attempt_1436152736518_0210_1_31_000000_0,
> >>> PigLatin:dmwith1tapin.pig-0_scope-0, VertexName: scope-2052,
> >>> VertexParallelism: 1,
> >>> TaskAttemptID:attempt_1436152736518_0210_1_31_000000_0,
> >>> - numInputs=1, numOutputs=4, JVM.maxFree=734527488
> >>> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor:
> >>> Informing: INPUT, scope-546, org.apache.tez.mapreduce.input.MRInput:
> >>> requested=0, allocated=0
> >>> - Small allocation of ~7 MB allocation to unordered output lead to
> >>> multiple spills.
> >>> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor:
> >>> Informing: OUTPUTPUT_RECORDS, scope-2117,
> >>> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:
> >>> requested=268435456, allocated=222303401
> >>> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor:
> >>> Informing: OUTPUT, scope-2251,
> >>> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:
> >>> requested=268435456, allocated=222303401
> >>> - 2015-07-06 14:11:40,048 INFO [TezChild] resources.MemoryDistributor:
> >>> Informing: OUTPUT, scope-2063,
> >>> org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput:
> >>> requested=104857600, allocated=7236438
> >>> - 2015-07-06 14:11:40,048 INFO [TezChild] resources.MemoryDistributor:
> >>> Informing: OUTPUT, scope-2068,
> >>> org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput:
> >>> requested=104857600, allocated=7236438
> >>> - Too many number of records had issues in NumberFormatException
> >>> leading to large amount of logs. This dragged the runtime of this task
> to
> >>> around
> >>> e.g "mapReduceLayer.PigHadoopLogger:
> >>> java.lang.Class(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to
> >>> interpret value in field being converted to long, caught
> >>> NumberFormatException <empty String> field discarded"
> >>>
> >>>
> >>> - scope-2037 and scope-2162 had set the vertex parallelism to "999"
> >>> affecting subsequent execution.
> >>> - VertexName: scope-2037, VertexParallelism: 999 vertex:
> >>> vertex_1436152736518_0210_1_28 finished in *410* seconds. Tasks
> >>> themselves were small, but due to large number of tasks that had to be
> >>> executed in small containers (pretty much used the same container to
> >>> execute this) it took time.
> >>> - VertexName: scope-2162, VertexParallelism: 999 vertex:
> >>> vertex_1436152736518_0210_1_39 finished in *697* seconds. Similar
> >>> observation as previous vertex.
> >>> *Above 2 vertices have caused the entire job to slow down.*
> >>>
> >>> "999" is set as the reducer parallelism at compile time by Pig. This is
> >>> not for the input. I am not sure how pig sets the parallelism at
> compile
> >>> time. You can possibly try setting "pig.exec.reducers.max=50" in your
> case
> >>> and give it a try. Pig folks would be in a better position to explain
> that.
> >>>
> >>>
> >>> On Tue, Jul 7, 2015 at 11:22 AM, Sachin Sabbarwal <
> >>> [email protected]> wrote:
> >>>
> >>>>
> >>>> logs.gz
> >>>> <
> https://drive.google.com/file/d/0B-RFcYxUIHzzUVJpRzVDZXB5TUk/view?usp=drive_web
> >
> >>>> Hi Rajesh
> >>>>
> >>>> PFA the gziped logs.
> >>>> FYI It's a single file, when you'll gunzip it, it'll be around 1.5gb
> in
> >>>> size.
> >>>> One more thing which you might find useful:
> >>>>
> >>>> In the dmOutputTez file i could see following line, which suggests
> that
> >>>> TEZ created a total of 7660 tasks. This is surprising as my data is
> only
> >>>> few mbs(10-15 mb max). How is this number of tasks decided? is there
> any
> >>>> property to tune it?
> >>>>
> >>>> 2015-07-07 05:37:02,647 [Timer-0] INFO
> >>>> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG
> >>>> Status: status=RUNNING, progress=TotalTasks: 7660 Succeeded: 0
> Running:
> >>>> 0 Failed: 0 Killed: 0, diagnostics=
> >>>>
> >>>> Thanks
> >>>>
> >>>> On Mon, Jul 6, 2015 at 8:34 PM, Rajesh Balamohan <
> >>>> [email protected]> wrote:
> >>>>
> >>>>> yarn logs -applicationId application_1436152736518_0210
> >>>>>
> >>>>> You can possibly send the output to a log file, gzip it and post it.
> >>>>>
> >>>>> ~Rajesh.B
> >>>>>
> >>>>> On Mon, Jul 6, 2015 at 8:12 PM, Sachin Sabbarwal <
> >>>>> [email protected]> wrote:
> >>>>>
> >>>>>> Hi
> >>>>>> Thanks for reply. My tez-site.xml contains only following:
> >>>>>>
> >>>>>> <configuration>
> >>>>>> <property>
> >>>>>> <name>tez.lib.uris</name>
> >>>>>> <value>${fs.defaultFS}/apps/tez-0.5/tez-0.5.3.tar.gz,
> >>>>>>
> ${fs.defaultFS}/apps/tez-0.5/*,${fs.defaultFS}/apps/tez-0.5/lib/*</value>
> >>>>>> </property>
> >>>>>> </configuration>
> >>>>>>
> >>>>>> PFA the application logs. Here is the version information:
> >>>>>> 1. Hadoop version: Hadoop 2.5.0-cdh5.3.1
> >>>>>> 2. Pig: Apache Pig version 0.14.0 (r1640057)
> >>>>>> 3. Tez: 0.5.3
> >>>>>>
> >>>>>> Lemme know if anything else is needed.
> >>>>>>
> >>>>>> Thanks in advance
> >>>>>>
> >>>>>> On Mon, Jul 6, 2015 at 7:07 PM, Rajesh Balamohan <
> >>>>>> [email protected]> wrote:
> >>>>>>
> >>>>>>> Can you post the application logs, tez-site.xml and also the
> version
> >>>>>>> details?
> >>>>>>>
> >>>>>>> ~Rajesh.B
> >>>>>>>
> >>>>>>> On Mon, Jul 6, 2015 at 6:38 PM, Sachin Sabbarwal <
> >>>>>>> [email protected]> wrote:
> >>>>>>>
> >>>>>>>>
> >>>>>>>> ---------- Forwarded message ----------
> >>>>>>>> From: Sachin Sabbarwal <[email protected]>
> >>>>>>>> Date: Mon, Jul 6, 2015 at 5:34 PM
> >>>>>>>> Subject: Same pig script running slower with Tez as compared with
> >>>>>>>> run in Mapred mode
> >>>>>>>> To: [email protected]
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Hello Guys
> >>>>>>>> Trying Apache Tez.
> >>>>>>>> I've setup to use pig in TEZ mode.
> >>>>>>>> I'm running a pig script against i) no data and ii) with some
> data.
> >>>>>>>> In case i) when i run with pig using TEZ mode my pig scripts
> >>>>>>>> completes run in ~40secs. Whereas when i run case i) with mapred
> it takes
> >>>>>>>> around 7-8 mins.
> >>>>>>>> in case ii) when run with pig using TEZ, same pig script takes
> >>>>>>>> around 14-15 mins but with mapred it takes around 10 mins.
> >>>>>>>> When i'm running same pig script with production data(which is
> much
> >>>>>>>> more than the data i used here to run case i) and (ii) ) the job
> takes
> >>>>>>>> hours to complete.
> >>>>>>>> Hence I'm trying tez to run my pig job in a faster mode. I'm not
> >>>>>>>> really sure what i might be missing here. Please help, ask for
> any further
> >>>>>>>> info if required.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> --
> >>>>>>>> Sachin Sabbarwal
> >>>>>>>> Linkedin:
> >>>>>>>> https://www.linkedin.com/profile?viewProfile=&key=95777265
> >>>>>>>> Facebook: facebook.com/sachinsabbarwal
> >>>>>>>> Quora: http://www.quora.com/Sachin-Sabbarwal
> >>>>>>>> Blog: http://sachinsabbarwal.tumblr.com/
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Sachin Sabbarwal
> >>>>>>>> Linkedin:
> >>>>>>>> https://www.linkedin.com/profile?viewProfile=&key=95777265
> >>>>>>>> Facebook: facebook.com/sachinsabbarwal
> >>>>>>>> Quora: http://www.quora.com/Sachin-Sabbarwal
> >>>>>>>> Blog: http://sachinsabbarwal.tumblr.com/
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> ~Rajesh.B
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Sachin Sabbarwal
> >>>>>> Linkedin:
> https://www.linkedin.com/profile?viewProfile=&key=95777265
> >>>>>> Facebook: facebook.com/sachinsabbarwal
> >>>>>> Quora: http://www.quora.com/Sachin-Sabbarwal
> >>>>>> Blog: http://sachinsabbarwal.tumblr.com/
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> ~Rajesh.B
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Sachin Sabbarwal
> >>>> Linkedin: https://www.linkedin.com/profile?viewProfile=&key=95777265
> >>>> Facebook: facebook.com/sachinsabbarwal
> >>>> Quora: http://www.quora.com/Sachin-Sabbarwal
> >>>> Blog: http://sachinsabbarwal.tumblr.com/
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> ~Rajesh.B
> >>>
> >>
> >>
> >>
> >> --
> >> ~Rajesh.B
> >>
> >
> >
> >
> > --
> > Sachin Sabbarwal
> > Linkedin: https://www.linkedin.com/profile?viewProfile=&key=95777265
> > Facebook: facebook.com/sachinsabbarwal
> > Quora: http://www.quora.com/Sachin-Sabbarwal
> > Blog: http://sachinsabbarwal.tumblr.com/
> >
>
>
>
> --
> ~Rajesh.B
>
>
>
> --
> Sachin Sabbarwal
> Linkedin: https://www.linkedin.com/profile?viewProfile=&key=95777265
> Facebook: facebook.com/sachinsabbarwal
> Quora: http://www.quora.com/Sachin-Sabbarwal
> Blog: http://sachinsabbarwal.tumblr.com/
>