Thanks for this info. Can you go ahead and paste the whole GFV class? Thanks
2012/8/8 Chun Yang <cy...@contractor.salesforce.com>

> Thanks Jonathan,
>
> I've tried to produce an example script which exhibits the slowdown and
> posted it on Pastebin: http://pastebin.com/kTSsDUr3
>
> The slowdown seems to occur when we are using a lot of UDFs to parse our
> input data. Variant A in the script is noticeably slower than variant B in
> Pig 0.10 while performance is similar in Pig 0.9.1.
>
> I've pasted the exec() function of the GFV function on Pastebin as well:
> http://pastebin.com/FVnkQCJ5
>
> Please let us know if you need more details.
>
> Thanks,
> Chun
>
> On 8/7/12 10:07 PM, "Jonathan Coveney" <jcove...@gmail.com> wrote:
>
> > Can you guys give a script that has the issue? My tactic would be to use
> > some sort of profiler (we have access to YourKit for open source Pig
> > contribution work) and try and isolate what is triggering GC.
> >
> > 2012/8/7 Prashant Kommireddi <prash1...@gmail.com>
> >
> >> Hi All,
> >>
> >> Just wanted to follow-up on Chun's question. Several of our Pig users
> >> have been experiencing slow start-ups with Pig 0.10.0, when the same
> >> script runs fine with 0.9.1. Anyone else facing similar issues?
> >>
> >> Thanks,
> >> Prashant
> >>
> >> Hi all,
> >>
> >> I'm trying to move from Pig 0.9.1 to Pig 0.10.0. When I try to run the
> >> same script using the two Pig versions, 0.9.1 starts off fast and almost
> >> immediately submits the job to the cluster. On the other hand, Pig 0.10.0
> >> takes forever to submit the job. When I use the java option
> >> -XX:+PrintGCDetails, I see that for 0.10.0 the GC is being run many times
> >> before and after the job is submitted to the cluster.
> >>
> >> Does anyone know what is causing this and/or how I might be able to
> >> troubleshoot it?
> >>
> >> I've uploaded truncated output showing when GC happens to
> >> Pastebin: http://pastebin.com/B8WTHW9r
> >>
> >> Thanks,
> >> Chun