I think whatever way you slice it, handling thousands of pig jobs
asynchronously is going to be a bear. I mean, this is essentially what the
job tracker does, albeit with a lot less information.
Either way, Pig is not multi-threaded so having more than one instance of
Pig in the same JVM is going
Both. Think of it as an app server handling all of these requests.
Sent from my iPhone
On Jan 23, 2013, at 9:09 PM, Jonathan Coveney wrote:
> Thousands of requests, or thousands of Pig jobs? Or both?
>
>
> 2013/1/23 Prashant Kommireddi
>
>> Did not want to have several threads launched for thi
Thousands of requests, or thousands of Pig jobs? Or both?
2013/1/23 Prashant Kommireddi
> Did not want to have several threads launched for this. We might have
> thousands of requests coming in, and the app is doing a lot more than only
> Pig.
>
> On Wed, Jan 23, 2013 at 5:44 PM, Jonathan Coven
Thanks all, it looks like I need to upgrade to Pig 0.10, since UDF(*) does not
appear to be supported in 0.8.1.
Best wishes,
Stanley Xu
On Tue, Jan 22, 2013 at 7:34 PM, Vitalii Tymchyshyn wrote:
> BTW: http://pig.apache.org/docs/r0.10.0/basic.html has the following example:
> C = FOREACH A GENERATE name, age, MyUDF(*);
> Look
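For reference, a UDF invoked as MyUDF(*) receives every field of the record in the single input tuple. Below is a minimal sketch of such a UDF; the class name MyUDF comes from the quoted example, but the field-joining logic is purely illustrative and not from the thread.

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class MyUDF extends EvalFunc<String> {

    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }
        // With MyUDF(*), the tuple holds all fields of the record; join them
        // just to show that '*' passes every field through.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < input.size(); i++) {
            Object field = input.get(i);
            if (i > 0) {
                sb.append('|');
            }
            sb.append(field == null ? "" : field.toString());
        }
        return sb.toString();
    }
}

After REGISTER-ing the jar, it can be called exactly as in the quoted example: C = FOREACH A GENERATE name, age, MyUDF(*);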
Did not want to have several threads launched for this. We might have
thousands of requests coming in, and the app is doing a lot more than only
Pig.
On Wed, Jan 23, 2013 at 5:44 PM, Jonathan Coveney wrote:
> start a separate Process which runs Pig?
>
>
> 2013/1/23 Prashant Kommireddi
>
> > Hey
start a separate Process which runs Pig?
2013/1/23 Prashant Kommireddi
> Hey guys,
>
> I am trying to do the following:
>
>1. Launch a pig job asynchronously via Java program
>2. Get a notification once the job is complete (something similar to
>Hadoop callback with a servlet)
>
> I
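A rough sketch of the "separate Process" suggestion above: run the pig command-line client in a child process and react when it exits. The class name, the script path, and the callback are placeholders, not anything prescribed by Pig.

import java.io.IOException;

public class PigProcessRunner {

    public static void runDetached(final String scriptPath, final Runnable onExit) {
        Thread watcher = new Thread(new Runnable() {
            public void run() {
                try {
                    Process p = new ProcessBuilder("pig", "-f", scriptPath)
                            .inheritIO()        // forward Pig's console output
                            .start();
                    int exitCode = p.waitFor(); // 0 means the script succeeded
                    System.out.println("pig exited with " + exitCode);
                    onExit.run();               // e.g. notify a servlet or queue
                } catch (IOException | InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });
        watcher.setDaemon(true);
        watcher.start();
    }
}

Since each submission gets its own JVM, this also sidesteps the "more than one instance of Pig in the same JVM" concern raised earlier in the thread, at the cost of one process per job.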
You can create an instance of PigProgressNotificationListener that calls
back when the job finishes.
On Wed, Jan 23, 2013 at 4:48 PM, Prashant Kommireddi wrote:
> Hey guys,
>
> I am trying to do the following:
>
>1. Launch a pig job asynchronously via Java program
>2. Get a notification o
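A minimal sketch of such a listener, wired up through PigRunner.run(). The method set below roughly matches the Pig 0.10-era PigProgressNotificationListener interface (later releases add callbacks, so adjust to your version); the class name, the script path, and the println bodies are illustrative only.

import org.apache.pig.PigRunner;
import org.apache.pig.tools.pigstats.JobStats;
import org.apache.pig.tools.pigstats.OutputStats;
import org.apache.pig.tools.pigstats.PigProgressNotificationListener;
import org.apache.pig.tools.pigstats.PigStats;

public class JobCompletionListener implements PigProgressNotificationListener {

    // Fired for each MapReduce job that finishes successfully.
    public void jobFinishedNotification(String scriptId, JobStats stats) {
        System.out.println("Job " + stats.getJobId() + " finished for script " + scriptId);
    }

    // Fired when a MapReduce job fails.
    public void jobFailedNotification(String scriptId, JobStats stats) {
        System.err.println("Job " + stats.getJobId() + " failed for script " + scriptId);
    }

    // Fired once the whole script is done: the natural place for the callback.
    public void launchCompletedNotification(String scriptId, int numJobsSucceeded) {
        System.out.println(scriptId + " completed, " + numJobsSucceeded + " jobs succeeded");
        // notify the servlet / message queue here
    }

    // Remaining notifications are no-ops in this sketch.
    public void launchStartedNotification(String scriptId, int numJobsToLaunch) {}
    public void jobsSubmittedNotification(String scriptId, int numJobsSubmitted) {}
    public void jobStartedNotification(String scriptId, String assignedJobId) {}
    public void outputCompletedNotification(String scriptId, OutputStats outputStats) {}
    public void progressUpdatedNotification(String scriptId, int progress) {}

    public static void main(String[] args) {
        // PigRunner takes the same arguments as the pig command line plus a listener.
        PigStats stats = PigRunner.run(new String[] {"-f", "script.pig"},
                                       new JobCompletionListener());
        System.out.println("isSuccessful: " + stats.isSuccessful());
    }
}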
Hey guys,
I am trying to do the following:
1. Launch a pig job asynchronously via Java program
2. Get a notification once the job is complete (something similar to
Hadoop callback with a servlet)
I looked at PigServer.executeBatch() and it seems to be waiting until the job
completes. This is
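One way to keep executeBatch() off the request-handling threads is to hand each run to a single background worker and invoke a callback when it returns. A sketch under that assumption; AsyncPigLauncher, submit(), and the Runnable callback are made-up names, not Pig API.

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.backend.executionengine.ExecJob;

public class AsyncPigLauncher {

    // A single worker keeps one Pig run per JVM at a time, since Pig is not
    // multi-threaded (as noted elsewhere in this thread).
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    public void submit(final String scriptPath, final Runnable onComplete) {
        worker.submit(new Runnable() {
            public void run() {
                try {
                    PigServer pig = new PigServer(ExecType.MAPREDUCE);
                    pig.setBatchOn();
                    pig.registerScript(scriptPath);
                    // executeBatch() still blocks, but only this worker thread
                    // waits, not the threads serving incoming requests.
                    List<ExecJob> jobs = pig.executeBatch();
                    for (ExecJob job : jobs) {
                        System.out.println("Job status: " + job.getStatus());
                    }
                    onComplete.run();   // e.g. hit a servlet or post to a queue
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }
}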
On Tue, Jan 22, 2013 at 11:31:23AM -0800, Cheolsoo Park wrote:
>
> Try this:
>
> data1 = LOAD '1.txt' USING PigStorage('|') AS (n:int,
> B:bag{(m:int,s:chararray)});
> data2 = FOREACH data1 GENERATE n, FLATTEN(B);
> data3 = FILTER data2 BY B::m <= n;
> data4 = GROUP data3 BY n;
> data5 = FOREACH d
On Wed, Jan 23, 2013 at 01:58:29PM +0800, Dongliang Sun wrote:
> I import a third-party module 'Pandas'.
>
> It works when I run the Python code directly.
> It also works when I run the Pig script in local mode.
>
> But it fails when I run the Pig script in MapReduce mode. To debug, I commented out all of
OK, I was a bit too quick. I'm able to answer my own question now.
from org.apache.pig.scripting import ScriptPigContext

# context of the embedded (Jython) script that is currently running
ctx = ScriptPigContext.get()
# parameters passed on the command line (e.g. -p param1=val1)
params = ctx.getPigContext().getParams()
# parameter files passed on the command line (e.g. -param_file=filepath)
paramFiles = ctx.getPigContext().getParamFiles()
getParamFiles() will give me the path to the file
Hi, is there a way to get access to the params passed with the pig command
in the python code?
pig -p param1=val1 -param_file=filepath script.py
Based on this: https://issues.apache.org/jira/browse/PIG-2165 I know that
those params will be automatically bound.
Is there a way to access those paramet