Julien removed a dozen or so loader/storer instantiations.
That can do it if you do work in constructors.
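
To illustrate the point, here is a minimal sketch in plain Java. It is not Pig code: `EagerFunc`, `LazyFunc`, and `buildTable` are hypothetical stand-ins for a loader or UDF, and the count of 400 is borrowed from the instantiation count Jonathan reports further down the thread. It only shows that setup done in a constructor is paid once per instantiation, while setup deferred to the first exec() call costs nothing for instances that are created and then thrown away.

```java
public class ConstructorCostSketch {
    // Counters showing how often the (hypothetical) expensive setup ran.
    static int eagerSetups = 0;
    static int lazySetups = 0;

    // Stand-in for real setup work such as parsing config or compiling patterns.
    static int[] buildTable() {
        int[] t = new int[1024];
        for (int i = 0; i < t.length; i++) {
            t[i] = i * i;
        }
        return t;
    }

    // Does its setup in the constructor: every instantiation pays for it.
    static class EagerFunc {
        private final int[] table;

        EagerFunc() {
            table = buildTable();
            eagerSetups++;
        }

        int exec(int i) {
            return table[i % table.length];
        }
    }

    // Defers setup until the first exec() call: unused instances cost nothing.
    static class LazyFunc {
        private int[] table;

        int exec(int i) {
            if (table == null) {
                table = buildTable();
                lazySetups++;
            }
            return table[i % table.length];
        }
    }

    public static void main(String[] args) {
        // Assumption: roughly the instantiation count reported in this thread.
        int instantiations = 400;
        for (int n = 0; n < instantiations; n++) {
            new EagerFunc(); // setup runs here, even though exec() is never called
            new LazyFunc();  // no setup runs here
        }
        System.out.println("eager setups: " + eagerSetups);
        System.out.println("lazy setups:  " + lazySetups);
    }
}
```

If the front end instantiates a function many times while planning, the eager variant multiplies its setup cost by that count, so moving the work out of the constructor is one way to limit the damage.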

D

On Fri, Aug 10, 2012 at 1:15 PM, Prashant Kommireddi
<prash1...@gmail.com> wrote:
> Thanks Chun.
>
> Jon, any idea what on 0.11 might have fixed it?
>
> On Thu, Aug 9, 2012 at 3:32 PM, Chun Yang
> <cy...@contractor.salesforce.com> wrote:
>
>> I tried with pig11 (from git); timings for the two variants are more
>> comparable.
>>
>> stats for `pig11 -b -e 'explain -script students-a.pig'`
>> 6.33s user 0.74s system 153% cpu 4.611 total
>> 6.55s user 0.68s system 155% cpu 4.664 total
>> 6.40s user 0.79s system 157% cpu 4.560 total
>> 6.47s user 0.62s system 155% cpu 4.560 total
>>
>> stats for `pig11 -b -e 'explain -script students-b.pig'`
>> 5.66s user 0.62s system 169% cpu 3.707 total
>> 5.69s user 0.53s system 165% cpu 3.758 total
>> 5.44s user 0.70s system 165% cpu 3.706 total
>> 5.68s user 0.51s system 166% cpu 3.708 total
>>
>> So looks like it was fixed somewhere for 0.11?
>> ________________________________________
>> From: Jonathan Coveney [jcove...@gmail.com]
>> Sent: Thursday, August 09, 2012 11:00 AM
>> To: user@pig.apache.org
>> Subject: Re: Pig 0.10.0 slow startup
>>
>> Can you do me a favor and run the exact same stuff with pig11? Just to
>> isolate if this is an issue that has been removed. I will also try and run
>> this on pig10, to see if I can see the same issue.
>>
>> 2012/8/8 Chun Yang <cy...@contractor.salesforce.com>
>>
>> > Thanks Jonathan,
>> >
>> > Here are some numbers that I'm getting from Pig 0.10 and Pig 0.9.1:
>> >
>> > pig10 -b -e 'explain -script students-a.pig'  35.35s user 8.52s system 63% cpu 1:08.77 total
>> >
>> > pig10 -b -e 'explain -script students-b.pig'  5.32s user 0.48s system 130% cpu 4.460 total
>> >
>> > pig9 -b -e 'explain -script students-a.pig'  4.93s user 0.51s system 131% cpu 4.153 total
>> >
>> > pig9 -b -e 'explain -script students-b.pig'  3.86s user 0.41s system 131% cpu 3.254 total
>> >
>> > Seems like the first run is always slower, but subsequent runs are about
>> > the same:
>> >
>> > pig10 -b -e 'explain -script students-a.pig'  35.17s user 8.20s system 123% cpu 35.017 total
>> >
>> > pig10 -b -e 'explain -script students-a.pig'  35.41s user 8.55s system 122% cpu 35.803 total
>> >
>> > A little more than 1.5s slowdown :)
>> >
>> > Thanks,
>> > Chun
>> >
>> > On 8/8/12 5:38 PM, "Jonathan Coveney" <jcove...@gmail.com> wrote:
>> >
>> > > Thanks for putting that together, Chun.
>> > >
>> > > So, it looks like there are ~400 instantiations of the class, and the time
>> > > from the first instantiation to the last one is about 1.5s. Is that on the
>> > > order of the slowdown you're experiencing?
>> > >
>> > > (note: I'm testing with Pig 11...if your slowdown is much higher than that,
>> > > I'll test on Pig 10)
>> > >
>> > > Either way, it seems like the slowdown is directly attributable to UDF
>> > > invocations. Have you seen slowdowns much larger than this?
>> > >
>> > > 2012/8/8 Chun Yang <cy...@contractor.salesforce.com>
>> > >
>> > >> Hi Jonathan,
>> > >>
>> > >> Here is a more self-contained example than what I had before:
>> > >> http://ews.illinois.edu/~yang43/shared/students.tar.gz
>> > >>
>> > >> I wrote a trivial GFV class, but the slowdown still exists.
>> > >> students-a.pig starts up noticeably slower than students-b.pig.
>> > >>
>> > >> Thanks,
>> > >> Chun
>> > >>
>> > >> On 8/8/12 12:22 PM, "Jonathan Coveney" <jcove...@gmail.com> wrote:
>> > >>
>> > >>> Thanks for this info. Can you go ahead and paste the whole GFV class?
>> > >>>
>> > >>> Thanks
>> > >>>
>> > >>> 2012/8/8 Chun Yang <cy...@contractor.salesforce.com>
>> > >>>
>> > >>>> Thanks Jonathan,
>> > >>>>
>> > >>>> I've tried to produce an example script which exhibits the slowdown and
>> > >>>> posted it on Pastebin: http://pastebin.com/kTSsDUr3
>> > >>>>
>> > >>>> The slowdown seems to occur when we are using a lot of UDFs to parse our
>> > >>>> input data. Variant A in the script is noticeably slower than variant B in
>> > >>>> Pig 0.10, while performance is similar in Pig 0.9.1.
>> > >>>>
>> > >>>> I've pasted the exec() function of the GFV function on Pastebin as well:
>> > >>>> http://pastebin.com/FVnkQCJ5
>> > >>>>
>> > >>>> Please let us know if you need more details.
>> > >>>>
>> > >>>> Thanks,
>> > >>>> Chun
>> > >>>>
>> > >>>> On 8/7/12 10:07 PM, "Jonathan Coveney" <jcove...@gmail.com> wrote:
>> > >>>>
>> > >>>>> Can you guys give a script that has the issue? My tactic would be to use
>> > >>>>> some sort of profiler (we have access to YourKit for open source Pig
>> > >>>>> contribution work) and try to isolate what is triggering GC.
>> > >>>>>
>> > >>>>> 2012/8/7 Prashant Kommireddi <prash1...@gmail.com>
>> > >>>>>
>> > >>>>>> Hi All,
>> > >>>>>>
>> > >>>>>> Just wanted to follow up on Chun's question. Several of our Pig users have
>> > >>>>>> been experiencing slow start-ups with Pig 0.10.0, when the same script runs
>> > >>>>>> fine with 0.9.1. Anyone else facing similar issues?
>> > >>>>>>
>> > >>>>>> Thanks,
>> > >>>>>> Prashant
>> > >>>>>>
>> > >>>>>> Hi all,
>> > >>>>>>
>> > >>>>>> I'm trying to move from Pig 0.9.1 to Pig 0.10.0. When I try to run the same
>> > >>>>>> script using the two Pig versions, 0.9.1 starts off fast and almost
>> > >>>>>> immediately submits the job to the cluster. On the other hand, Pig 0.10.0
>> > >>>>>> takes forever to submit the job. When I use the java option
>> > >>>>>> -XX:+PrintGCDetails, I see that for 0.10.0 the GC is being run many times
>> > >>>>>> before and after the job is submitted to the cluster.
>> > >>>>>>
>> > >>>>>> Does anyone know what is causing this and/or how I might be able to
>> > >>>>>> troubleshoot it?
>> > >>>>>>
>> > >>>>>> I've uploaded truncated output showing when GC happens to
>> > >>>>>> Pastebin: http://pastebin.com/B8WTHW9r
>> > >>>>>>
>> > >>>>>> Thanks,
>> > >>>>>> Chun
>> > >>>>>>
>> > >>>>
>> > >>>>
>> > >>
>> > >>
>> >
>> >
>>
