Hi all,
I couldn't send the Pig script earlier because I needed to ask my colleagues whether I was allowed to share it. Please help, as my problem is still unresolved.
Thanks in advance

On Wed, Jul 8, 2015 at 11:42 AM, Sachin Sabbarwal <[email protected]> wrote:
> Hi Rohini,
> The attached zip contains main.pig and the other macros used in this script. Please let me know if anything else is missing or required.
>
> Thanks
>
> On Wed, Jul 8, 2015 at 4:47 AM, Rohini Palaniswamy <[email protected]> wrote:
>> Sachin,
>> Can you attach your Pig script and Pig client log as well, as I asked earlier?
>>
>> Regards,
>> Rohini
>>
>> On Tue, Jul 7, 2015 at 2:43 AM, Sachin Sabbarwal <[email protected]> wrote:
>>> Hi all,
>>> I'm using Apache Pig 0.14.0 (r1640057) and Tez 0.5.3.
>>> I am running a Pig script in the following two scenarios:
>>> 1. Without any data: in Tez mode the script runs within 1 minute; with MapReduce it takes around 7 minutes.
>>> 2. With a little data: in Tez mode the same script takes around 10 minutes; with MapReduce it takes around 14-15 minutes.
>>>
>>> I had sent a mail to [email protected] and learned that the problem is that Pig's auto-reduce parallelism isn't kicking in for a few of my scopes, so a lot of tasks get created, which makes the Tez run slow. As suggested, I tried setting pig.exec.reducers.max to a smaller value (150, 25 and 10). With this property set to 10, my script ran in under 4 minutes. The logs also show that the auto-reducer is set to false for a few scopes. For example:
>>>
>>> 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler] vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings minFrac:0.25 maxFrac:0.75 *auto:false* desiredTaskIput:104857600 minTasks:1
>>> 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler] impl.VertexImpl: Creating 999 for vertex:
>>>
>>> The problem is that I have set pig.tez.auto.parallelism=true in my pig.properties, and I have also set this property to true in my Pig script, yet it is still being set to false for a few scopes.
>>>
>>> You can find my conversation with Rajesh from the Tez list below. Please let me know what I am missing, and whether any other information is required.
>>>
>>> Thanks in advance
>>>
>>> ---------- Forwarded message ----------
>>> From: Rajesh Balamohan <[email protected]>
>>> Date: Tue, Jul 7, 2015 at 2:12 PM
>>> Subject: Re: Same pig script running slower with Tez as compared with run in Mapred mode
>>> To: [email protected]
>>>
>>> Hi Sachin,
>>>
>>> That was just a temporary workaround to confirm the issue; ideally the user does not need to set this parameter. The real issue is why the auto-reducer is set to false for certain vertices in Pig on Tez. I will wait for the Pig folks to chime in.
>>>
>>> For docs and tutorials, you can start with the following:
>>> - http://tez.apache.org/talks.html
>>> - A couple of YouTube videos are available from Hadoop Summits and meetups.
>>> - http://hortonworks.com/blog/apache-tez-a-new-chapter-in-hadoop-data-processing/ (this is pretty old)
>>> - Pig on Tez: http://www.slideshare.net/Hadoop_Summit/pig-on-tez-low-latency-data-processing-with-big-data
>>>
>>> ~Rajesh.B
>>>
>>> On Tue, Jul 7, 2015 at 1:54 PM, Sachin Sabbarwal <[email protected]> wrote:
>>>> Hi Rajesh,
>>>> Thanks for your response. *This seems to be working for me.*
>>>> By setting pig.exec.reducers.max to 10 I am able to complete my run in under 4 minutes (initially it was taking 14-15 minutes).
>>>> I'm new to the Pig/Tez/Hadoop world. Do you write any blogs about Pig/Tez/Hadoop? Can you suggest any tutorials or links to read about Tez? I need to understand concepts like scopes, DAGs, parallelism, etc. I have only a very basic understanding of Tez; once I understand these concepts I'll be able to tune my job.
>>>>
>>>> Thanks
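For reference, both properties discussed above can be set in pig.properties or directly at the top of the script with SET. A minimal sketch (the values shown are illustrative, not recommendations):

    -- ask Pig on Tez to adjust reducer counts at runtime
    set pig.tez.auto.parallelism true;
    -- cap the number of reducers Pig requests at compile time (illustrative value)
    set pig.exec.reducers.max 10;

Capping pig.exec.reducers.max is the workaround referred to above; as Rajesh notes, it should not be necessary once the auto-parallelism issue is understood.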
>>>> On Tue, Jul 7, 2015 at 1:23 PM, Rajesh Balamohan <[email protected]> wrote:
>>>>> Forgot to add the following. Ideally the auto-reduce implementation should have kicked in and, on an as-needed basis, decreased the number of reducers. However, for the vertices of concern (scope-2037 and scope-2162), the auto-reducer has been turned off in the configuration by Pig, while it is turned on for the rest of the vertices.
>>>>>
>>>>> The Pig folks would be able to explain why auto-reduce parallelism is turned off for certain vertices.
>>>>>
>>>>> 2015-07-06 14:11:35,109 INFO [AsyncDispatcher event handler] impl.VertexImpl: Setting vertexManager to ShuffleVertexManager for vertex_1436152736518_0210_1_28 *[scope-2037]*
>>>>> 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler] vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings minFrac:0.25 maxFrac:0.75 *auto:false* desiredTaskIput:104857600 minTasks:1
>>>>> 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler] impl.VertexImpl: Creating 999 for vertex: vertex_1436152736518_0210_1_28 [scope-2037]
>>>>> ....
>>>>>
>>>>> 2015-07-06 14:11:35,245 INFO [AsyncDispatcher event handler] impl.VertexImpl: Setting vertexManager to ShuffleVertexManager for vertex_1436152736518_0210_1_39 *[scope-2162]*
>>>>> 2015-07-06 14:11:35,257 INFO [AsyncDispatcher event handler] vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings minFrac:0.25 maxFrac:0.75 *auto:false* desiredTaskIput:104857600 minTasks:1
>>>>> 2015-07-06 14:11:35,257 INFO [AsyncDispatcher event handler] impl.VertexImpl: Creating 999 for vertex: vertex_1436152736518_0210_1_39 [scope-2162]
>>>>> ....
>>>>>
>>>>> 2015-07-06 14:11:35,417 INFO [AsyncDispatcher event handler] impl.VertexImpl: Setting user vertex manager plugin: org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager on vertex: *scope-2185*
>>>>> 2015-07-06 14:11:35,419 INFO [AsyncDispatcher event handler] vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings minFrac:0.25 maxFrac:0.75 *auto:true* desiredTaskIput:104857600 minTasks:1
>>>>> ...
>>>>>
>>>>> On Tue, Jul 7, 2015 at 12:47 PM, Rajesh Balamohan <[email protected]> wrote:
>>>>>> Attaching the DAG and the swimlane for the job.
>>>>>>
>>>>>> scope-2052, which had to feed data to the other vertices, slowed down (~150-180 seconds) due to multiple spills and NumberFormatExceptions in the data.
>>>>>> You might want to try setting "tez.task.scale.memory.additional-reservation.fraction.max='PARTITIONED_UNSORTED_OUTPUT:12,UNSORTED_INPUT:1,UNSORTED_OUTPUT:12,SORTED_OUTPUT:12,SORTED_MERGED_INPUT:1,PROCESSOR:1,OTHER:1'" to allocate more memory for unordered outputs.
>>>>>>
>>>>>> Details for this scope:
>>>>>> - attempt_1436152736518_0210_1_31_000000_0, PigLatin:dmwith1tapin.pig-0_scope-0, VertexName: scope-2052, VertexParallelism: 1, TaskAttemptID: attempt_1436152736518_0210_1_31_000000_0
>>>>>> - numInputs=1, numOutputs=4, JVM.maxFree=734527488
>>>>>> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor: Informing: INPUT, scope-546, org.apache.tez.mapreduce.input.MRInput: requested=0, allocated=0
>>>>>> - The small allocation of ~7 MB to the unordered outputs led to multiple spills.
>>>>>> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor: Informing: OUTPUTPUT_RECORDS, scope-2117, org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput: requested=268435456, allocated=222303401
>>>>>> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor: Informing: OUTPUT, scope-2251, org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput: requested=268435456, allocated=222303401
>>>>>> - 2015-07-06 14:11:40,048 INFO [TezChild] resources.MemoryDistributor: Informing: OUTPUT, scope-2063, org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput: requested=104857600, allocated=7236438
>>>>>> - 2015-07-06 14:11:40,048 INFO [TezChild] resources.MemoryDistributor: Informing: OUTPUT, scope-2068, org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput: requested=104857600, allocated=7236438
>>>>>> - A large number of records hit a NumberFormatException, producing a large volume of logs and dragging out the runtime of this task, e.g. "mapReduceLayer.PigHadoopLogger: java.lang.Class(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value in field being converted to long, caught NumberFormatException <empty String> field discarded"
>>>>>>
>>>>>> - scope-2037 and scope-2162 had their vertex parallelism set to "999", affecting subsequent execution.
>>>>>> - VertexName: scope-2037, VertexParallelism: 999, vertex: vertex_1436152736518_0210_1_28 finished in *410* seconds. The tasks themselves were small, but the large number of tasks that had to be executed in small containers (pretty much the same container was reused to execute them) took time.
>>>>>> - VertexName: scope-2162, VertexParallelism: 999, vertex: vertex_1436152736518_0210_1_39 finished in *697* seconds. Same observation as for the previous vertex.
>>>>>> *The above two vertices have caused the entire job to slow down.*
>>>>>>
>>>>>> "999" is set as the reducer parallelism at compile time by Pig; this is not driven by the input. I am not sure how Pig sets the parallelism at compile time; the Pig folks would be in a better position to explain that. You could try setting "pig.exec.reducers.max=50" in your case and see how it goes.
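Regarding the FIELD_DISCARDED_TYPE_CONVERSION_FAILED warnings above: they come from casting values (here, empty strings) that cannot be parsed as long. A minimal sketch of one way to clean the field before the cast so such rows do not flood the logs; the file, relation and field names are hypothetical, and the tab delimiter is an assumption:

    -- load the numeric column as chararray, drop empty values, then cast
    -- 'input_data', id and amount are placeholder names
    raw     = LOAD 'input_data' USING PigStorage('\t') AS (id:chararray, amount:chararray);
    cleaned = FILTER raw BY amount IS NOT NULL AND TRIM(amount) != '';
    typed   = FOREACH cleaned GENERATE id, (long)amount AS amount;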
>>>>>> On Tue, Jul 7, 2015 at 11:22 AM, Sachin Sabbarwal <[email protected]> wrote:
>>>>>>> logs.gz
>>>>>>> <https://drive.google.com/file/d/0B-RFcYxUIHzzUVJpRzVDZXB5TUk/view?usp=drive_web>
>>>>>>>
>>>>>>> Hi Rajesh,
>>>>>>>
>>>>>>> PFA the gzipped logs. FYI, it's a single file; when you gunzip it, it will be around 1.5 GB in size.
>>>>>>> One more thing which you might find useful: in the dmOutputTez file I can see the following line, which suggests that Tez created a total of 7660 tasks. This is surprising, as my data is only a few MB (10-15 MB at most). How is this number of tasks decided? Is there any property to tune it?
>>>>>>>
>>>>>>> 2015-07-07 05:37:02,647 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 7660 Succeeded: 0 Running: 0 Failed: 0 Killed: 0, diagnostics=
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Mon, Jul 6, 2015 at 8:34 PM, Rajesh Balamohan <[email protected]> wrote:
>>>>>>>> yarn logs -applicationId application_1436152736518_0210
>>>>>>>>
>>>>>>>> You can send the output to a log file, gzip it and post it.
>>>>>>>>
>>>>>>>> ~Rajesh.B
>>>>>>>>
>>>>>>>> On Mon, Jul 6, 2015 at 8:12 PM, Sachin Sabbarwal <[email protected]> wrote:
>>>>>>>>> Hi,
>>>>>>>>> Thanks for the reply. My tez-site.xml contains only the following:
>>>>>>>>>
>>>>>>>>> <configuration>
>>>>>>>>>   <property>
>>>>>>>>>     <name>tez.lib.uris</name>
>>>>>>>>>     <value>${fs.defaultFS}/apps/tez-0.5/tez-0.5.3.tar.gz,${fs.defaultFS}/apps/tez-0.5/*,${fs.defaultFS}/apps/tez-0.5/lib/*</value>
>>>>>>>>>   </property>
>>>>>>>>> </configuration>
>>>>>>>>>
>>>>>>>>> PFA the application logs. Here is the version information:
>>>>>>>>> 1. Hadoop: 2.5.0-cdh5.3.1
>>>>>>>>> 2. Pig: Apache Pig 0.14.0 (r1640057)
>>>>>>>>> 3. Tez: 0.5.3
>>>>>>>>>
>>>>>>>>> Let me know if anything else is needed.
>>>>>>>>>
>>>>>>>>> Thanks in advance
>>>>>>>>>
>>>>>>>>> On Mon, Jul 6, 2015 at 7:07 PM, Rajesh Balamohan <[email protected]> wrote:
>>>>>>>>>> Can you post the application logs, tez-site.xml and also the version details?
>>>>>>>>>>
>>>>>>>>>> ~Rajesh.B
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 6, 2015 at 6:38 PM, Sachin Sabbarwal <[email protected]> wrote:
>>>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>>>> From: Sachin Sabbarwal <[email protected]>
>>>>>>>>>>> Date: Mon, Jul 6, 2015 at 5:34 PM
>>>>>>>>>>> Subject: Same pig script running slower with Tez as compared with run in Mapred mode
>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>
>>>>>>>>>>> Hello all,
>>>>>>>>>>> I am trying out Apache Tez and have set Pig up to run in Tez mode.
>>>>>>>>>>> I'm running a Pig script against i) no data and ii) some data.
>>>>>>>>>>> In case i), when I run the script with Pig in Tez mode it completes in ~40 seconds, whereas with MapReduce it takes around 7-8 minutes.
>>>>>>>>>>> In case ii), with Tez the same script takes around 14-15 minutes, but with MapReduce it takes around 10 minutes.
>>>>>>>>>>> When I run the same Pig script with production data (which is much more than the data used for cases i) and ii)), the job takes hours to complete; that is why I am trying Tez, to run my Pig job faster. I'm not really sure what I might be missing here. Please help, and ask for any further information if required.
>>>>>>>>>>>
>>>>>>>>>>> Thanks

--
Sachin Sabbarwal
Linkedin: https://www.linkedin.com/profile?viewProfile=&key=95777265
Facebook: facebook.com/sachinsabbarwal
Quora: http://www.quora.com/Sachin-Sabbarwal
Blog: http://sachinsabbarwal.tumblr.com/
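A closing note on the earlier question of how the task counts (the 7660 total tasks and the "999" reducer parallelism) are decided: map-side task counts generally come from the input splits, while reduce-side parallelism is chosen by Pig at compile time unless it is set explicitly. A minimal sketch of pinning reducer parallelism by hand while the auto-parallelism issue is being investigated (relation names and values are illustrative):

    -- script-wide default for reduce-side parallelism (illustrative value)
    set default_parallel 20;

    -- override for one specific heavy operator; 'events' and 'userid' are placeholder names
    by_user = GROUP events BY userid PARALLEL 10;

PARALLEL and default_parallel only affect reduce-side vertices; they do not change the number of load/map tasks.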
