Hi, Dmitriy,

Can you give the script you are thinking of?

On Sat, Jun 16, 2012 at 8:43 AM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:
> I don't think a union is required for this to make sense.
>
> On Jun 11, 2012, at 11:58 AM, "Daniel Dai (JIRA)" <j...@apache.org> wrote:
>
> >
> > [ https://issues.apache.org/jira/browse/PIG-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292983#comment-13292983 ]
> >
> > Daniel Dai commented on PIG-2747:
> > ---------------------------------
> >
> > My understanding is there is a union after T1, T2, right?
> >
> > Yes we only merge the consecutive filter into "and" condition. We don't merge "or" condition. So you want
> >
> > filter cond1, filter cond2 -> union ==> filter cond1 or cond2
> >
> >> Support more predicate pushdown to a data source by pulling up multiple predicates from branches using the same data source
> >> ---------------------------------------------------------------------------------------------------------------------------
> >>
> >>                 Key: PIG-2747
> >>                 URL: https://issues.apache.org/jira/browse/PIG-2747
> >>             Project: Pig
> >>          Issue Type: Improvement
> >>            Reporter: Yu Xu
> >>            Priority: Minor
> >>
> >> consider the following example:
> >> T = load ... ;
> >> T1 = filter T by col == 'hello';
> >> T2 = filter T by col =='world';
> >> currently Pig optimizer does not combine the two predicates and cannot push down the predicates to the data sources (via LoadMetadata). Thus the data source cannot do any filtering. A full table/file scan is required.
> >> A current more efficient workaround (by hand) is to rewrite the above script to the following equivalent one:
> >> T = load ...;
> >> T = filter T by col == 'hello' or col == 'world' ;
> >> T1 = filter T by col == 'hello';
> >> T2 = filter T by col == 'world';
> >> the above script enables Pig to push down the predicate (col == 'hello' or col == 'world') to the data source to use available partitions/indexes for potentially much more efficient processing.
> >> This JIRA is created to request PIG optimizer to perform the above type of optimization automatically.
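
For reference, the rewrite requested in the quoted issue can be sketched outside Pig. The Python below is only an illustration and is not Pig optimizer code: the names `pull_up_predicate` and `run_branches` and the sample rows are hypothetical. It models the two filter branches over one source and checks that pre-filtering the source with the OR of the branch predicates leaves each branch's output unchanged while fewer rows are scanned.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, str]
Predicate = Callable[[Row], bool]


def pull_up_predicate(branch_predicates: List[Predicate]) -> Predicate:
    """Combine the filters of sibling branches into a single OR predicate."""
    return lambda row: any(pred(row) for pred in branch_predicates)


def run_branches(
    rows: List[Row],
    branch_predicates: List[Predicate],
    pushed_down: Optional[Predicate] = None,
) -> Tuple[List[List[Row]], int]:
    """Apply every branch filter to the rows the (simulated) source returns.

    If `pushed_down` is given, the source drops non-matching rows first,
    which stands in for a partition or index filter at the data source.
    Returns the per-branch outputs and the number of rows the source returned.
    """
    scanned = [r for r in rows if pushed_down is None or pushed_down(r)]
    outputs = [[r for r in scanned if pred(r)] for pred in branch_predicates]
    return outputs, len(scanned)


# Hypothetical sample data standing in for "T = load ...".
rows = [{"col": "hello"}, {"col": "world"}, {"col": "other"}, {"col": "hello"}]

# The two branch filters from the issue: T1 and T2.
is_hello: Predicate = lambda r: r["col"] == "hello"
is_world: Predicate = lambda r: r["col"] == "world"
branches = [is_hello, is_world]

# Original plan: no pushdown, the source returns every row.
plain_outputs, plain_scanned = run_branches(rows, branches)

# Rewritten plan: (col == 'hello' or col == 'world') is pushed to the source.
combined = pull_up_predicate(branches)
opt_outputs, opt_scanned = run_branches(rows, branches, pushed_down=combined)
```

In this sketch no union is involved: the combined predicate is derived purely from the branches that share the same source, and the original per-branch filters are kept so each branch still sees only its own rows.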