SAMPLE command should accept parameters
---------------------------------------

                 Key: PIG-1713
                 URL: https://issues.apache.org/jira/browse/PIG-1713
             Project: Pig
          Issue Type: Improvement
            Reporter: Viraj Bhat


I have a script which takes in a command line parameter.

{code}
pig -p number=100 script.pig
{code}

The script uses the parameter as follows:

{code}
A = load '/user/viraj/test' using PigStorage() as (a,b,c);

B = SAMPLE A 1/$number;

dump B;
{code}
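
As a stopgap, the division can be done outside Pig so that SAMPLE only ever 
sees a constant. This is a minimal sketch, not a fix for the issue: the 
parameter name "fraction" is hypothetical and not part of the original script. 
Parameter substitution is textual, so the statement is rewritten before it is 
parsed.

{code}
-- invoked as: pig -p fraction=0.01 script.pig
-- 'fraction' is a hypothetical parameter holding the precomputed value of 1/number
A = load '/user/viraj/test' using PigStorage() as (a,b,c);

-- after substitution this reads: B = SAMPLE A 0.01;
B = SAMPLE A $fraction;

dump B;
{code}

This still requires the caller to know the fraction up front, which is the 
limitation this issue is about.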

In realistic use cases, statisticians need to compute the SAMPLE fraction on 
demand rather than hard-code it.

Ideally, I would like to compute the SAMPLE fraction from within a single Pig 
script, without having to run one Pig script first, collect its results, and 
then pass them to a second script.

Ideal use case:

{code}
A = load '/user/viraj/input' using PigStorage() as (col1, col2, col3);

...
...

W = group X by col1;

Z = foreach Y generate AVG(X);

AA = load '/user/viraj/test' using PigStorage() as (a,b,c);

BB = SAMPLE AA 1/Z;

dump BB;
{code}
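
One shape the requested behaviour could take is a scalar projection from a 
single-row relation inside the SAMPLE expression. The sketch below is an 
assumption: it only runs on Pig versions whose SAMPLE accepts a scalar 
expression, and the aliases G and N, the field num_rows, and the target of 
1000 rows are all hypothetical.

{code}
AA = load '/user/viraj/test' using PigStorage() as (a,b,c);

-- collapse AA into a single group so the count comes back as one row
G = group AA all;
N = foreach G generate COUNT_STAR(AA) as num_rows;

-- N has exactly one row, so N.num_rows can be read as a scalar;
-- the cast to double avoids integer division
BB = SAMPLE AA (double)1000 / N.num_rows;

dump BB;
{code}

With this form the fraction is derived from the data in the same script, so 
no second script or external parameter is needed.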

Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
