[ 
https://issues.apache.org/jira/browse/PIG-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142096#comment-13142096
 ] 

Gianmarco De Francisci Morales commented on PIG-1660:
-----------------------------------------------------

I tested the script (after correcting some small mistakes) and PIG-1926 does 
solve it.
The tests in TestLimitVariable cover the same features, but they are really 
e2e tests, so they would be better placed among the e2e tests.
I can port them.
Two questions: should I open a JIRA for this, and should we remove the Java 
unit tests?
                
> Consider passing result of COUNT/COUNT_STAR to LIMIT 
> -----------------------------------------------------
>
>                 Key: PIG-1660
>                 URL: https://issues.apache.org/jira/browse/PIG-1660
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.7.0
>            Reporter: Viraj Bhat
>
> In realistic scenarios we need to split a dataset into segments using 
> LIMIT, and would like to do that within a single Pig script. Here is an 
> example:
> {code}
> A = load '$DATA' using PigStorage(',') as (id, pvs);
> B = group A ALL;
> C = foreach B generate COUNT_STAR(A) as row_cnt;
> -- get the low 20% segment
> D = order A by pvs;
> E = limit D (C.row_cnt * 0.2);
> store E into '$Eoutput';
> -- get the high 20% segment
> F = order A by pvs DESC;
> G = limit F (C.row_cnt * 0.2);
> store G into '$Goutput';
> {code}
> Since LIMIT only accepts constants, we have to split the operation into two 
> steps in order to pass constants to the LIMIT statements. Please 
> consider adding this feature so the processing can be more efficient.
> Viraj

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
