Is this putting lipstick on a pig? (that paper on provenance)

Russell Jurney
twitter.com/rjurney
russell.jur...@gmail.com
datasyndrome.com

On Feb 16, 2012, at 10:08 PM, "Prashant Kommireddi (Commented) (JIRA)"
<j...@apache.org> wrote:

>
>    [ https://issues.apache.org/jira/browse/PIG-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210090#comment-13210090 ]
>
> Prashant Kommireddi commented on PIG-2541:
> ------------------------------------------
>
> Thanks Dmitriy for the review.
>
> 1. I will look at the case of loaded schema. I wonder how the regression
> tests passed; maybe I should take a closer look at the test cases.
>
> 2. Great point. I had the exact same thought while coding it but decided
> against adding the path to the front. The reason is that with this feature
> we want users to make no changes to existing scripts other than enabling
> this capability. For any script written with positional references, adding
> the path to the front would involve shifting all those references.
>
> Maybe add another feature (incorporated into Pig syntax, or a UDF such as
> GETLASTCOLUMN()) to parse the last column from a Tuple? That would make for
> a nice feature, usable in cases where the schema is unknown or variable.
>
> 3. Will make the code style and Javadoc changes.
>
>> Automatic record provenance (source tagging) for PigStorage
>> -----------------------------------------------------------
>>
>>                Key: PIG-2541
>>                URL: https://issues.apache.org/jira/browse/PIG-2541
>>            Project: Pig
>>         Issue Type: Improvement
>>         Components: impl
>>   Affects Versions: 0.9.1
>>           Reporter: Richard Ding
>>           Assignee: Prashant Kommireddi
>>        Attachments: PIG-2541.patch
>>
>>
>> There is a lot of interest in knowing where the data comes from when
>> loading from a directory (or a set of directories). One can do it manually
>> (see https://cwiki.apache.org/confluence/display/PIG/FAQ), but it would be
>> more convenient for users if we implemented this in PigStorage with a
>> command line option (e.g., pig.source.tagging=true/false) to turn it on or
>> off. By default it will be off.
>
>
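
For anyone following point 2 in the quoted comment, here is a minimal sketch of why the position of the path column matters for scripts that use positional references. It is plain Python rather than Pig: a record is modelled as a list, a positional reference such as $1 as a list index, and tag_front, tag_end and get_last_column are hypothetical helper names used only for illustration. get_last_column merely mimics the GETLASTCOLUMN() idea floated above; it is not an existing Pig UDF.

```python
# Illustration only: this is not Pig code and does not use any Pig API.
# A record is a list of fields; "$n" in a script is modelled as fields[n].

def tag_front(fields, path):
    """Return a new record with the source path added as the first field."""
    return [path] + fields


def tag_end(fields, path):
    """Return a new record with the source path added as the last field."""
    return fields + [path]


def get_last_column(fields):
    """Return the last field of a record, or None for an empty record."""
    return fields[-1] if fields else None


record = ["alice", "42"]      # made-up sample row
path = "/data/part-00000"     # made-up source file name

# An existing script that reads $1 still sees the same value when the
# path is appended at the end of the record.
assert tag_end(record, path)[1] == record[1]

# With the path at the front, $1 now points at what used to be $0, so
# every positional reference in the script would have to shift by one.
assert tag_front(record, path)[1] == record[0]

# Reading the last field recovers the path without knowing the row width.
assert get_last_column(tag_end(record, path)) == path
assert get_last_column(tag_end(["bob"], path)) == path
```

Appending keeps $0 and $1 where they were, and the path can still be recovered from rows of different widths by reading the last field.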
