[ https://issues.apache.org/jira/browse/HIVE-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851718#action_12851718 ]
Zheng Shao commented on HIVE-1131:
----------------------------------

> Look at the DataContainer class. That has a partition in it. And the
> Dependency has a mapping from Partition to the dependencies. Can you explain
> more your concerns on inefficiency?

I see. So the DataContainer captures the output partition information, but we don't have input partition information (BaseColumnInfo/TableAliasInfo). This is reasonable since the input can be lots of partitions.

> For S6 actually the queryplan is the wrong place to store the lineageinfo.
> Because of the dynamic partitioning work that Ning is doing, I have to
> generate the partition to dependency mapping at run time. So I would rather
> store it in a run time structure as opposed to a compile time structure.
> SessionState fits that bill, though I think we should have another structure
> called ExecutionCtx for this. But otherwise I think we want to store this in
> a runtime structure.

+1 on the ExecutionCtx idea. SessionState is at the session level, and LineageInfo is at the query level. It will be great to put LineageInfo into ExecutionCtx.

> Add column lineage information to the pre execution hooks
> ---------------------------------------------------------
>
>                 Key: HIVE-1131
>                 URL: https://issues.apache.org/jira/browse/HIVE-1131
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: HIVE-1131.patch, HIVE-1131_2.patch, HIVE-1131_3.patch,
> HIVE-1131_4.patch
>
>
> We need a mechanism to pass the lineage information of the various columns of
> a table to a pre execution hook so that applications can use that for:
> - auditing
> - dependency checking
> and many other applications.
> The proposal is to expose this through a bunch of classes to the pre
> execution hook interface to the clients and put in the necessary
> transformation logic in the optimizer to generate this information.
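
To make the session-level vs. query-level distinction in the comment above concrete, here is a minimal sketch of what a per-query ExecutionCtx holding LineageInfo might look like. Everything in it is an assumption for illustration: the class shapes, method names, and the string-based dependency list are hypothetical and do not mirror the attached patches or the real DataContainer/Dependency classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch only: these classes do not mirror the HIVE-1131 patch.
final class LineageSketch {

    /** Stand-in for an output partition, used as the lineage map key. */
    static final class OutputPartition {
        final String table;
        final String spec;

        OutputPartition(String table, String spec) {
            this.table = table;
            this.spec = spec;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof OutputPartition)) {
                return false;
            }
            OutputPartition other = (OutputPartition) o;
            return table.equals(other.table) && spec.equals(other.spec);
        }

        @Override
        public int hashCode() {
            return Objects.hash(table, spec);
        }
    }

    /** Query-level lineage: output partition -> input columns it depends on. */
    static final class LineageInfo {
        private final Map<OutputPartition, List<String>> deps = new LinkedHashMap<>();

        /** Called at run time, once the (possibly dynamic) partition is known. */
        void addDependency(OutputPartition partition, String inputColumn) {
            deps.computeIfAbsent(partition, k -> new ArrayList<>()).add(inputColumn);
        }

        List<String> dependenciesOf(OutputPartition partition) {
            List<String> found = deps.get(partition);
            return found == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(found);
        }
    }

    /** One instance per query, so lineage never leaks across queries in a session. */
    static final class ExecutionCtx {
        private final LineageInfo lineageInfo = new LineageInfo();

        LineageInfo getLineageInfo() {
            return lineageInfo;
        }
    }

    /** Session-scoped object that hands out a fresh ExecutionCtx for each query. */
    static final class Session {
        ExecutionCtx startQuery() {
            return new ExecutionCtx();
        }
    }

    public static void main(String[] args) {
        Session session = new Session();
        ExecutionCtx query = session.startQuery();
        OutputPartition part = new OutputPartition("dest", "ds=2010-03-30");
        query.getLineageInfo().addDependency(part, "src.key");
        System.out.println(query.getLineageInfo().dependenciesOf(part));
    }
}
```

The point of the sketch is only the scoping: the partition-to-dependency map is filled in at run time and lives in an object whose lifetime is a single query, rather than in the compile-time query plan or the session.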
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.