[ https://issues.apache.org/jira/browse/PIG-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-1660:
--------------------------------

    Fix Version/s:     (was: 0.10)

> Consider passing result of COUNT/COUNT_STAR to LIMIT
> -----------------------------------------------------
>
>                 Key: PIG-1660
>                 URL: https://issues.apache.org/jira/browse/PIG-1660
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.7.0
>            Reporter: Viraj Bhat
>
> In realistic scenarios we need to split a dataset into segments by using
> LIMIT, and would like to achieve that goal within the same Pig script. Here
> is a case:
> {code}
> A = load '$DATA' using PigStorage(',') as (id, pvs);
> B = group A all;
> C = foreach B generate COUNT_STAR(A) as row_cnt;
> -- get the low 20% segment
> D = order A by pvs;
> E = limit D (C.row_cnt * 0.2);
> store E into '$Eoutput';
> -- get the high 20% segment
> F = order A by pvs DESC;
> G = limit F (C.row_cnt * 0.2);
> store G into '$Goutput';
> {code}
> Since LIMIT only accepts constants, we have to split the operation into two
> steps in order to pass in the constants for the LIMIT statements. Please
> consider bringing this feature in so the processing can be more efficient.
> Viraj
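
For illustration, the two-step workaround mentioned in the description might look roughly like the sketch below. This is an assumption about how such a split could be done, not the reporter's actual scripts: the file names count.pig and segment.pig and the parameters $COUNT_OUT and $N are hypothetical, and the value of $N (row count times 0.2, rounded down) is assumed to be computed outside Pig between the two runs and passed in with -param.

{code}
-- Step 1 (hypothetical count.pig): compute the row count and store it
A = load '$DATA' using PigStorage(',') as (id, pvs);
B = group A all;
C = foreach B generate COUNT_STAR(A) as row_cnt;
store C into '$COUNT_OUT';

-- Step 2 (hypothetical segment.pig): N is supplied as a constant, e.g.
--   pig -param DATA=... -param N=... -param Eoutput=... -param Goutput=... segment.pig
A = load '$DATA' using PigStorage(',') as (id, pvs);
-- low 20% segment
D = order A by pvs;
E = limit D $N;
store E into '$Eoutput';
-- high 20% segment
F = order A by pvs desc;
G = limit F $N;
store G into '$Goutput';
{code}

The cost of this approach is that the input is loaded in two separate Pig runs, which is the inefficiency the issue asks to remove.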