[ https://issues.apache.org/jira/browse/PIG-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-1713:
--------------------------------
    Fix Version/s: 0.10

> SAMPLE command should accept parameters
> ---------------------------------------
>
>                 Key: PIG-1713
>                 URL: https://issues.apache.org/jira/browse/PIG-1713
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Viraj Bhat
>             Fix For: 0.10
>
>
> I have a script which takes in a command line parameter.
> {code}
> pig -p number=100 script.pig
> {code}
> The script contains the following statements:
> {code}
> A = load '/user/viraj/test' using PigStorage() as (a,b,c);
> B = SAMPLE A 1/$number;
> dump B;
> {code}
> Realistic use cases of SAMPLE require statisticians to calculate SAMPLE data
> on demand.
> Ideally I would like to calculate SAMPLE from within the Pig script, without
> having to run one Pig script first to get its results and another to pass the
> results.
> Ideal use case:
> {code}
> A = load '/user/viraj/input' using PigStorage() as (col1, col2, col3);
> ...
> ...
> W = group X by col1;
> Z = foreach Y generate AVG(X);
> AA = load '/user/viraj/test' using PigStorage() as (a,b,c);
> BB = SAMPLE AA 1/Z;
> dump BB;
> {code}
> Viraj

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira