[jira] [Commented] (PIG-3760) Predicate pushdown for columnar file formats

2014-07-31 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081948#comment-14081948
 ] 

Julien Le Dem commented on PIG-3760:


FYI in Parquet the filter is not a hint and it will be applied to records after 
the metadata

> Predicate pushdown for columnar file formats
> 
>
> Key: PIG-3760
> URL: https://issues.apache.org/jira/browse/PIG-3760
> Project: Pig
> Issue Type: New Feature
> Reporter: Andrew Musselman
> Fix For: 0.14.0
>
>
> From the conversation on dev@pig:
> "Partition pruning for ORC is not addressed in PIG-3558. We will need
> to do partition pruning for both ORC and Parquet in a new ticket.
> Currently there is no interface to deal with this kind of pushdown
> (LoadMetadata.setPartitionFilter pushes the filter to the loader but
> removes the filter statement; for ORC/Parquet, the filter is a hint, and
> we need to apply the filter again in Pig even if it is pushed to the
> loader), so we will need to define a new interface for that. You are
> welcome to initiate the work. I know Aniket is also interested in doing
> that, so be sure to talk with him about this work.
> Thanks,
> Daniel
> On Mon, Feb 10, 2014 at 11:42 AM, Andrew Musselman
>  wrote:
> > I had a chat with a couple people last week about a feature request for
> > Pig:  in a "where" or "filter" clause, when loading an ORC file, to skip
> > directly to the right offset instead of scanning the whole file."
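
To illustrate the "filter is a hint" point in the quoted description, here is a minimal Python sketch. It is not Pig, ORC, or Parquet code; RowGroup, scan_with_hint, and filter_rows are hypothetical names, and the per-block min/max statistics are an assumed simplification of what a columnar file might store. The loader uses the statistics only to skip whole blocks, so the caller still has to apply the predicate to the rows that come back.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class RowGroup:
    """A block of rows plus min/max statistics for one column (hypothetical layout)."""
    min_value: int
    max_value: int
    rows: List[int]


def scan_with_hint(groups: Iterable[RowGroup], lower_bound: int) -> Iterator[int]:
    """Loader side: treat `value > lower_bound` as a hint.

    Blocks whose statistics prove that no row can match are skipped.
    Rows from the remaining blocks are returned unfiltered.
    """
    for group in groups:
        if group.max_value > lower_bound:
            yield from group.rows


def filter_rows(rows: Iterable[int], lower_bound: int) -> List[int]:
    """Caller side: re-apply the exact predicate to every loaded row."""
    return [row for row in rows if row > lower_bound]


groups = [
    RowGroup(1, 5, [1, 3, 5]),      # skipped: max 5 is not > 6
    RowGroup(4, 9, [4, 7, 9]),      # read: some rows may match
    RowGroup(10, 12, [10, 12]),     # read: all rows match
]

loaded = list(scan_with_hint(groups, 6))
result = filter_rows(loaded, 6)
```

In this sketch the second block is read because its statistics cannot rule out a match, yet it still contains the non-matching row 4. That is why the filter statement cannot simply be removed when the loader treats the pushed predicate as a hint.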



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (PIG-3760) Predicate pushdown for columnar file formats

2014-07-31 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081944#comment-14081944
 ] 

Julien Le Dem commented on PIG-3760:


[~rohini] I added to the description of PIG-4092



[jira] [Commented] (PIG-3760) Predicate pushdown for columnar file formats

2014-07-31 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081715#comment-14081715
 ] 

Rohini Palaniswamy commented on PIG-3760:

Attached an initial patch with PIG-4091 that has the basic functionality 
required of the predicate pushdown interface. The interface needs some more 
enhancements. Filed PIG-4093 and PIG-4094 for that. 

[~julienledem]/ [~dvryaboy],
 Is there someone at Twitter that we can work with for the Parquet 
implementation? It would help us flesh out and finalize the APIs. 
