Thanks Chun.

Jon, any idea what on 0.11 might have fixed it?

On Thu, Aug 9, 2012 at 3:32 PM, Chun Yang
<cy...@contractor.salesforce.com> wrote:

> I tried with pig11 (from git); timings for the two variants are more
> comparable.
>
> stats for `pig11 -b -e 'explain -script students-a.pig'`
> 6.33s user 0.74s system 153% cpu 4.611 total
> 6.55s user 0.68s system 155% cpu 4.664 total
> 6.40s user 0.79s system 157% cpu 4.560 total
> 6.47s user 0.62s system 155% cpu 4.560 total
>
> stats for `pig11 -b -e 'explain -script students-b.pig'`
> 5.66s user 0.62s system 169% cpu 3.707 total
> 5.69s user 0.53s system 165% cpu 3.758 total
> 5.44s user 0.70s system 165% cpu 3.706 total
> 5.68s user 0.51s system 166% cpu 3.708 total
>
> So looks like it was fixed somewhere for 0.11?
> ________________________________________
> From: Jonathan Coveney [jcove...@gmail.com]
> Sent: Thursday, August 09, 2012 11:00 AM
> To: user@pig.apache.org
> Subject: Re: Pig 0.10.0 slow startup
>
> Can you do me a favor and run the exact same stuff with pig11? Just to
> isolate if this is an issue that has been removed. I will also try and run
> this on pig10, to see if I can see the same issue.
>
> 2012/8/8 Chun Yang <cy...@contractor.salesforce.com>
>
> > Thanks Jonathan,
> >
> > Here are some numbers that I'm getting from Pig 0.10 and Pig 0.9.1:
> >
> > pig10 -b -e 'explain -script students-a.pig'  35.35s user 8.52s system 63% cpu 1:08.77 total
> >
> > pig10 -b -e 'explain -script students-b.pig'  5.32s user 0.48s system 130% cpu 4.460 total
> >
> > pig9 -b -e 'explain -script students-a.pig'  4.93s user 0.51s system 131% cpu 4.153 total
> >
> > pig9 -b -e 'explain -script students-b.pig'  3.86s user 0.41s system 131% cpu 3.254 total
> >
> > Seems like the first run is always slower, but subsequent runs are about
> > the same:
> >
> > pig10 -b -e 'explain -script students-a.pig'  35.17s user 8.20s system 123% cpu 35.017 total
> >
> > pig10 -b -e 'explain -script students-a.pig'  35.41s user 8.55s system 122% cpu 35.803 total
> >
> > A little more than 1.5s slowdown :)
> >
> > Thanks,
> > Chun
> >
> > On 8/8/12 5:38 PM, "Jonathan Coveney" <jcove...@gmail.com> wrote:
> >
> > > Thanks for putting that together, Chun.
> > >
> > > So, it looks like there are ~400 instantiations of the class, and the
> > > time from the first instantiation to the last one is about 1.5s. Is
> > > that on the order of the slowdown you're experiencing?
> > >
> > > (note: I'm testing with Pig 11...if your slowdown is much higher than
> > > that, I'll test on Pig 10)
> > >
> > > Either way, it seems like the slowdown is directly attributable to UDF
> > > invocations. Have you seen slowdowns much larger than this?
> > >
> > > 2012/8/8 Chun Yang <cy...@contractor.salesforce.com>
> > >
> > >> Hi Jonathan,
> > >>
> > >> Here is a more self-contained example than what I had before:
> > >> http://ews.illinois.edu/~yang43/shared/students.tar.gz
> > >>
> > >> I wrote a trivial GFV class, but the slowdown still exists.
> > >> students-a.pig starts up noticeably slower than students-b.pig .
> > >>
> > >> Thanks,
> > >> Chun
> > >>
> > >> On 8/8/12 12:22 PM, "Jonathan Coveney" <jcove...@gmail.com> wrote:
> > >>
> > >>> Thanks for this info. Can you go ahead and paste the whole GFV class?
> > >>>
> > >>> Thanks
> > >>>
> > >>> 2012/8/8 Chun Yang <cy...@contractor.salesforce.com>
> > >>>
> > >>>> Thanks Jonathan,
> > >>>>
> > >>>> I've tried to produce an example script which exhibits the slowdown
> > >>>> and posted it on Pastebin: http://pastebin.com/kTSsDUr3
> > >>>>
> > >>>> The slowdown seems to occur when we are using a lot of UDFs to parse
> > >>>> our input data. Variant A in the script is noticeably slower than
> > >>>> variant B in Pig 0.10, while performance is similar in Pig 0.9.1.
> > >>>>
> > >>>> I've pasted the exec() method of the GFV class on Pastebin as well:
> > >>>> http://pastebin.com/FVnkQCJ5
> > >>>>
> > >>>> Please let us know if you need more details.
> > >>>>
> > >>>> Thanks,
> > >>>> Chun
> > >>>>
> > >>>> On 8/7/12 10:07 PM, "Jonathan Coveney" <jcove...@gmail.com> wrote:
> > >>>>
> > >>>>> Can you guys give a script that has the issue? My tactic would be to
> > >>>>> use some sort of profiler (we have access to YourKit for open source
> > >>>>> Pig contribution work) and try and isolate what is triggering GC.
> > >>>>>
> > >>>>> 2012/8/7 Prashant Kommireddi <prash1...@gmail.com>
> > >>>>>
> > >>>>>> Hi All,
> > >>>>>>
> > >>>>>> Just wanted to follow up on Chun's question. Several of our Pig users
> > >>>>>> have been experiencing slow start-ups with Pig 0.10.0, when the same
> > >>>>>> script runs fine with 0.9.1. Anyone else facing similar issues?
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Prashant
> > >>>>>>
> > >>>>>> Hi all,
> > >>>>>>
> > >>>>>> I'm trying to move from Pig 0.9.1 to Pig 0.10.0 . When I try to run the
> > >>>>>> same script using the two Pig versions, 0.9.1 starts off fast and
> > >>>>>> almost immediately submits the job to the cluster. On the other hand,
> > >>>>>> Pig 0.10.0 takes forever to submit the job. When I use the java option
> > >>>>>> -XX:+PrintGCDetails, I see that for 0.10.0 the GC is being run many
> > >>>>>> times before and after the job is submitted to the cluster.
> > >>>>>>
> > >>>>>> Does anyone know what is causing this and/or how I might be able to
> > >>>>>> troubleshoot it?
> > >>>>>>
> > >>>>>> I've uploaded truncated output showing when GC happens to
> > >>>>>> Pastebin: http://pastebin.com/B8WTHW9r
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Chun
> > >>>>>>
> > >>>>
> > >>>>
> > >>
> > >>
> >
> >
>
