Thank you for the suggestions. I will file a jira and add our discussion
there.


On Fri, Jan 25, 2013 at 4:23 PM, Rohini Palaniswamy <rohini.adi...@gmail.com
> wrote:

> Jon,
>   Those are good areas to check. A few things I have seen regarding those:
>
>  1) JythonScriptEngine - PythonInterpreter is static and is not suitable for
> multiple runs if the script names are the same (hit this issue in PIG-2433
> unit tests).
>  2) QueryParserDriver - There is a static cache with a macro name to macro
> file mapping, so the same macro names with different file locations will
> cause problems.
>  3) FileLocalizer.relativeRoot - No issues with a single cluster. It just
> needs to be reinitialized if supporting multiple clusters.
>
> Regards,
> Rohini
>
>
> On Fri, Jan 25, 2013 at 9:37 AM, Jonathan Coveney <jcove...@gmail.com
> >wrote:
>
> > user to bcc, +dev
> >
> > Cheolsoo,
> >
> > Can you make a JIRA for this? I can imagine a slightly heavier test suite,
> > but I like where you started. If it's not far off, then I think it'll be a
> > win to make it thread safe. But we need to make sure to test the most
> > advanced features...UDFs (esp. the same name but different UDF in different
> > invocations), scripting UDFs (same thing), and so on.
> >
> >
> > 2013/1/25 Cheolsoo Park <cheol...@cloudera.com>
> >
> > > >> if you have multiple threads that run a query via PigServer, there is
> > > >> a great chance of the internals clashing because of the use of static
> > > >> variables within Pig.
> > >
> > > Recently, I spent some time on this, and what I found is that the Pig
> > > front-end is quite thread-safe. Here is how I tested it:
> > >
> > > 1) Wrote a PigUnit test that runs in MR mode.
> > > 2) Executed test cases concurrently in 4 threads using a JUnit extension
> > > called tempus-fugit:
> > > http://tempusfugitlibrary.org/documentation/junit/parallel/
> > >
> > > After fixing PIG-3096, I was able to successfully run Pig queries in
> > > parallel. It's important to note that only the front-end needs to be
> > > thread-safe since that's what is executed in parallel.
> > >
> > > I arbitrarily selected queries from e2e test cases, so they are probably
> > > not complex enough to mimic real-world examples. Nevertheless, my test
> > > program ran without a problem for a few days. I couldn't continue my
> > > experiment because I was pulled into something else. However, I think
> > > that making the front-end thread-safe is an achievable goal.
> > >
> > > Thanks,
> > > Cheolsoo
> > >
> > >
> > >
> > > On Thu, Jan 24, 2013 at 11:18 PM, Ramakrishna Nalam
> > > <nramakris...@gmail.com>wrote:
> > >
> > > > That clarifies it for me, thanks a lot.
> > > >
> > > > Regards,
> > > > Rama.
> > > >
> > > >
> > > > On Fri, Jan 25, 2013 at 10:09 AM, Jonathan Coveney <
> jcove...@gmail.com
> > > > >wrote:
> > > >
> > > > > Well, when I say that Pig is not multi-threaded, what I mean is that
> > > > > if you have multiple threads that run a query via PigServer, there is
> > > > > a great chance of the internals clashing because of the use of static
> > > > > variables within Pig. Pig itself, when running a single query, is
> > > > > multi-threaded. It's just not "multi-threaded" in the sense that
> > > > > multiple instances can safely be run in the same JVM.
> > > > >
> > > > >
> > > > > 2013/1/24 Ramakrishna Nalam <nramakris...@gmail.com>
> > > > >
> > > > > > Hi Jonathan,
> > > > > >
> > > > > > Pardon if it's a naive question, but it's interesting that you say
> > > > > > Pig is not multithreaded.
> > > > > > We're using Pig 0.10.0, and looking at the code, it seems to do the
> > > > > > right things to handle multi-threaded requests (ThreadLocal for
> > > > > > ScriptState, for example).
> > > > > >
> > > > > > It would be great if you could point out the kind of issues there
> > > > > > could be.
> > > > > >
> > > > > >
> > > > > > Regards,
> > > > > > Rama.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Jan 24, 2013 at 8:32 PM, Praveen M <
> > lefthandma...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Are there any plans to make PigServer multi-threaded?
> > > > > > >
> > > > > > > Since there is "PigProgressNotificationListener" to subscribe to
> > > > > > > for async callbacks when the pig job completes, is there any real
> > > > > > > need to keep the pig job submitting thread waiting until the job
> > > > > > > completes?
> > > > > > >
> > > > > > > Is this just a shortcoming today, or are there more concrete
> > > > > > > reasons against providing a PigServer that can submit to the
> > > > > > > cluster in mapreduce mode asynchronously?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Praveen
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jan 23, 2013 at 10:56 PM, Jonathan Coveney <
> > > > jcove...@gmail.com
> > > > > > > >wrote:
> > > > > > >
> > > > > > > > I think whatever way you slice it, handling thousands of pig
> > > > > > > > jobs asynchronously is going to be a bear. I mean, this is
> > > > > > > > essentially what the job tracker does, albeit with a lot less
> > > > > > > > information.
> > > > > > > >
> > > > > > > > Either way, Pig is not multi-threaded so having more than one
> > > > > > > > instance of Pig in the same JVM is going to start causing
> > > > > > > > problems (which is why, I imagine, there is no async way to call
> > > > > > > > Pig). So multiple processes is really the only way around it
> > > > > > > > that I know of.
> > > > > > > >
> > > > > > > > At Twitter we have a deployment of mesos, and our long term
> > > > > > > > solution is going to be running all of our pig jobs on mesos, in
> > > > > > > > the short term by deploying daemons that run pig jobs as local
> > > > > > > > processes.
> > > > > > > >
> > > > > > > >
> > > > > > > > 2013/1/23 Prashant Kommireddi <prash1...@gmail.com>
> > > > > > > >
> > > > > > > > > Both. Think of it as an app server handling all of these
> > > > requests.
> > > > > > > > >
> > > > > > > > > Sent from my iPhone
> > > > > > > > >
> > > > > > > > > On Jan 23, 2013, at 9:09 PM, Jonathan Coveney <
> > > > jcove...@gmail.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Thousands of requests, or thousands of Pig jobs? Or both?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 2013/1/23 Prashant Kommireddi <prash1...@gmail.com>
> > > > > > > > > >
> > > > > > > > > >> Did not want to have several threads launched for this. We
> > > > > > > > > >> might have thousands of requests coming in, and the app is
> > > > > > > > > >> doing a lot more than only Pig.
> > > > > > > > > >>
> > > > > > > > > >> On Wed, Jan 23, 2013 at 5:44 PM, Jonathan Coveney <
> > > > > > > jcove...@gmail.com
> > > > > > > > > >>> wrote:
> > > > > > > > > >>
> > > > > > > > > >>> start a separate Process which runs Pig?
> > > > > > > > > >>>
> > > > > > > > > >>>
> > > > > > > > > >>> 2013/1/23 Prashant Kommireddi <prash1...@gmail.com>
> > > > > > > > > >>>
> > > > > > > > > >>>> Hey guys,
> > > > > > > > > >>>>
> > > > > > > > > >>>> I am trying to do the following:
> > > > > > > > > >>>>
> > > > > > > > > >>>>   1. Launch a pig job asynchronously via a Java program
> > > > > > > > > >>>>   2. Get a notification once the job is complete (something
> > > > > > > > > >>>>   similar to Hadoop callback with a servlet)
> > > > > > > > > >>>>
> > > > > > > > > >>>> I looked at PigServer.executeBatch() and it seems to be
> > > > > > > > > >>>> waiting until the job completes. This is not what I would
> > > > > > > > > >>>> like my app to do.
> > > > > > > > > >>>>
> > > > > > > > > >>>> Any ideas?
> > > > > > > > > >>>>
> > > > > > > > > >>>> Thanks,
> > > > > > > > > >>>>
> > > > > > > > > >>>
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > -Praveen
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

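The suggestion earlier in the thread to "start a separate Process which runs
Pig" can be combined with a completion callback, so that the submitting thread
does not wait for the job. Below is a minimal sketch of that idea in Java,
assuming Java 9 or later for Process.onExit(). The class name AsyncPigLauncher
is hypothetical and not part of Pig, and the command to run is passed in by the
caller rather than assumed.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper, not part of Pig: runs a command (for example the pig
// launcher script) in its own OS process, so no Pig static state is shared
// between jobs inside the calling JVM.
public class AsyncPigLauncher {

    // Starts the command and returns immediately. The returned future
    // completes with the child's exit code once the process terminates.
    public static CompletableFuture<Integer> launch(List<String> command)
            throws IOException {
        Process process = new ProcessBuilder(command)
                .inheritIO() // forward the child's stdout/stderr to ours
                .start();
        return process.onExit().thenApply(Process::exitValue);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: AsyncPigLauncher <command> [args...]");
            return;
        }
        // Register the "job complete" notification as a callback.
        CompletableFuture<Void> notified = launch(List.of(args))
                .thenAccept(code ->
                        System.out.println("job finished, exit code " + code));
        System.out.println("submitted; this thread is free for other work");
        // Only so this demo does not exit before the callback has run.
        notified.join();
    }
}
```

It could be invoked as, say, "java AsyncPigLauncher pig -f myscript.pig",
assuming the pig launcher is on the PATH (the script name is made up). One
process per job is heavy if thousands of requests arrive at once, so a real
service would probably cap how many run concurrently.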