Is this putting lipstick on a pig? (that paper on provenance)

Russell Jurney
twitter.com/rjurney
russell.jur...@gmail.com
datasyndrome.com

On Feb 16, 2012, at 10:08 PM, "Prashant Kommireddi (Commented) (JIRA)" <j...@apache.org> wrote:

> [ https://issues.apache.org/jira/browse/PIG-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210090#comment-13210090 ]
>
> Prashant Kommireddi commented on PIG-2541:
> ------------------------------------------
>
> Thanks Dmitriy for the review.
>
> 1. I will look at the case of a loaded schema. I wonder how regression passed;
> maybe I should take a closer look at the test cases.
>
> 2. Great point. I had the exact same thought while coding it but decided
> against adding the path to the front. The reason is that with this feature we
> would want users to make no changes to existing scripts other than adding
> this capability. For any scripts written with positional references, adding
> the path to the front would involve shifting all those references.
>
> Maybe add another feature (incorporate into Pig syntax or have a UDF,
> GETLASTCOLUMN()) to be able to parse the last column from a Tuple? That
> actually makes for a nice feature which can be used in cases where the
> schema is unknown/variable.
>
> 3. Will make the code style and Javadoc changes.
>
>> Automatic record provenance (source tagging) for PigStorage
>> -----------------------------------------------------------
>>
>> Key: PIG-2541
>> URL: https://issues.apache.org/jira/browse/PIG-2541
>> Project: Pig
>> Issue Type: Improvement
>> Components: impl
>> Affects Versions: 0.9.1
>> Reporter: Richard Ding
>> Assignee: Prashant Kommireddi
>> Attachments: PIG-2541.patch
>>
>>
>> There is a lot of interest in knowing where the data comes from when
>> loading from a directory (or a set of directories). One can do it manually
>> (see https://cwiki.apache.org/confluence/display/PIG/FAQ), but it would be
>> more convenient for users if we implemented this in PigStorage with a
>> command line option (e.g., pig.source.tagging=true/false) to turn it on/off.
>> By default it will be off.
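
For anyone skimming the thread, the trade-off in point 2 of the quoted comment can be sketched roughly as follows. This is only an illustration in plain Python, not Pig code and not the patch itself: tuples stand in for Pig tuples, the file path is made up, and tag_front, tag_end and get_last_column are hypothetical helper names (the last one mimics the GETLASTCOLUMN() idea floated in the comment).

```python
# Illustration only: plain Python tuples stand in for Pig tuples.
# All function names and the path below are hypothetical.

def tag_front(record, path):
    """Prepend the source path; every existing field shifts right by one."""
    return (path,) + tuple(record)


def tag_end(record, path):
    """Append the source path; existing field positions stay the same."""
    return tuple(record) + (path,)


def get_last_column(record):
    """Stand-in for the proposed GETLASTCOLUMN(): needs no knowledge of the schema width."""
    return record[-1]


record = ("alice", 42)
path = "/data/2012/02/part-00000"  # made-up path for the example

front = tag_front(record, path)
end = tag_end(record, path)

# A script that read the name from position 0 (like $0 in Pig) keeps working
# only when the tag is appended.
print("position 0, tag in front:", front[0])
print("position 0, tag at end:  ", end[0])
print("last column, tag at end: ", get_last_column(end))
```

Appending keeps existing positional references valid, at the cost of needing some way to reach the last field when the number of columns is unknown or varies, which is the gap the suggested UDF would fill.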