[jira] [Commented] (PIG-3760) Predicate pushdown for columnar file formats
[ https://issues.apache.org/jira/browse/PIG-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081948#comment-14081948 ]

Julien Le Dem commented on PIG-3760:

FYI, in Parquet the filter is not a hint; it will be applied to records after the metadata.

> Predicate pushdown for columnar file formats
>
>                 Key: PIG-3760
>                 URL: https://issues.apache.org/jira/browse/PIG-3760
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Andrew Musselman
>             Fix For: 0.14.0
>
> From the conversation on dev@pig:
> "Partition pruning for ORC is not addressed in PIG-3558. We will need
> to do partition pruning for both ORC and Parquet in a new ticket.
> Currently there is no interface to deal with this kind of pushdown
> (LoadMetadata.setPartitionFilter pushes the filter to the loader, but removes
> the filter statement; for ORC/Parquet, the filter is a hint, and we need
> to do the filter again in Pig even if it is pushed to the loader), so we will
> need to define a new interface for that. You are welcome to initiate
> the work. I know Aniket is also interested in doing that, so be sure
> to talk with him about this work.
> Thanks,
> Daniel
> On Mon, Feb 10, 2014 at 11:42 AM, Andrew Musselman wrote:
> > I had a chat with a couple people last week about a feature request for
> > Pig: in a "where" or "filter" clause, when loading an ORC file, to skip
> > directly to the right offset instead of scanning the whole file."

--
This message was sent by Atlassian JIRA
(v6.2#6252)
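The difference this thread turns on, a pushed-down filter that is only a hint versus one the loader applies exactly, can be illustrated with a small sketch. This is not Pig, ORC, or Parquet code: `RowGroup`, `scan_with_hint`, and `scan_exact` are hypothetical names, and the min/max statistics per row group are an assumed stand-in for whatever metadata a columnar format keeps. It only shows why the engine has to filter again when the loader treats the predicate as a hint.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class RowGroup:
    """A block of rows plus min/max statistics for one column (hypothetical layout)."""
    min_value: int
    max_value: int
    rows: List[int]


def scan_with_hint(groups: Iterable[RowGroup], lower_bound: int) -> Iterator[int]:
    """Filter as a hint: skip row groups whose statistics rule out `row > lower_bound`.

    Rows from the surviving groups are returned unfiltered, so the result may
    contain rows that fail the predicate and the caller must filter again.
    """
    for group in groups:
        if group.max_value > lower_bound:
            yield from group.rows


def scan_exact(groups: Iterable[RowGroup], lower_bound: int) -> Iterator[int]:
    """Filter applied exactly: prune by statistics first, then test every record."""
    for row in scan_with_hint(groups, lower_bound):
        if row > lower_bound:
            yield row


groups = [
    RowGroup(1, 5, [1, 3, 5]),
    RowGroup(4, 9, [4, 7, 9]),
    RowGroup(10, 12, [10, 12]),
]

hinted = list(scan_with_hint(groups, 6))
exact = list(scan_exact(groups, 6))
# With a hint-only loader the engine re-applies the predicate itself.
refiltered = [row for row in hinted if row > 6]
```

In this sketch the first row group is skipped entirely, which is the saving the feature request asks for, while the second group still contributes a row (4) that only a second filtering pass removes.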
[jira] [Commented] (PIG-3760) Predicate pushdown for columnar file formats
[ https://issues.apache.org/jira/browse/PIG-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081944#comment-14081944 ]

Julien Le Dem commented on PIG-3760:

[~rohini] I added to the description of PIG-4092
[jira] [Commented] (PIG-3760) Predicate pushdown for columnar file formats
[ https://issues.apache.org/jira/browse/PIG-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081715#comment-14081715 ]

Rohini Palaniswamy commented on PIG-3760:

Attached an initial patch in PIG-4091 with the basic functionality required of the Predicate Pushdown interface. The interface needs some more enhancements; filed PIG-4093 and PIG-4094 for that.

[~julienledem] / [~dvryaboy], is there someone at Twitter that we can work with for the Parquet implementation? It would help us flesh out and finalize the APIs.