Optimization rule FilterAboveForeach is too restrictive and doesn't handle project * correctly ----------------------------------------------------------------------------------------------
Key: PIG-1567 URL: https://issues.apache.org/jira/browse/PIG-1567 Project: Pig Issue Type: Bug Reporter: Xuefu Zhang Fix For: 0.8.0 FilterAboveForeach rule is to optimize the plan by pushing up filter above previous foreach operator. However, during code review, two major problems were found: 1. Current implementation assumes that if no projection is found in the filter condition then all columns from foreach are projected. This issue prevents the following optimization: A = LOAD 'file.txt' AS (a(u,v), b, c); B = FOREACH A GENERATE $0, b; C = FILTER B BY 8 > 5; STORE C INTO 'empty'; 2. Current implementation doesn't handle * probjection, which means project all columns. As a result, it wasn't able to optimize the following: A = LOAD 'file.txt' AS (a(u,v), b, c); B = FOREACH A GENERATE $0, b; C = FILTER B BY Identity.class.getName(*) > 5; STORE C INTO 'empty'; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.