[ 
https://issues.apache.org/jira/browse/HIVE-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851664#action_12851664
 ] 

Zheng Shao commented on HIVE-1131:
----------------------------------

> S1. Can we make lineage partition-level instead of table-level?
I don't see this implemented in the new patch. After looking at the code more, 
I agree that this is too hard (and inefficient) to do when the query ranges 
over a lot of partitions.

> S3. Use "{}" even for single statement in "if", "for" etc.
I cannot find any instances of these now.


Still have some questions:
> S2. We might want to define formally the concepts of these levels, especially 
> how they are composited (What will be UDAF of UDF, or UDF of UDAF, like 
> round(sum(col)), or sum(round(col))) 
LineageInfo.java: Can you add some comments on which DependencyType nested 
dependencies like "round(sum(col))" or "sum(round(col))" have?
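
To make the S2 question concrete, here is a minimal sketch of one possible 
composition rule. Everything in it is an assumption for illustration: the 
class name, the enum values SIMPLE/UDF/UDAF, and the rule "a nested expression 
takes the more complex of the two levels" are hypothetical and are not taken 
from the patch.

```java
// Hypothetical sketch: names, levels and the rule below are assumptions,
// not the contents of LineageInfo.java in the patch.
final class DependencyTypeSketch {

  // Declared from least to most complex; ordinal() serves as the level.
  enum DependencyType { SIMPLE, UDF, UDAF }

  // Assumed rule: composing two levels yields the more complex of the two.
  static DependencyType compose(DependencyType outer, DependencyType inner) {
    return outer.ordinal() >= inner.ordinal() ? outer : inner;
  }

  public static void main(String[] args) {
    // round(sum(col)): a UDF applied to the result of a UDAF.
    System.out.println(compose(DependencyType.UDF, DependencyType.UDAF));
    // sum(round(col)): a UDAF applied to the result of a UDF.
    System.out.println(compose(DependencyType.UDAF, DependencyType.UDF));
  }
}
```

Under this assumed rule both nestings get the same type, so the order of 
nesting is not recorded. Whether that is acceptable is exactly what the 
comments in LineageInfo.java should spell out.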

S6. The best place to store LineageInfo is probably the QueryPlan rather than 
SessionState. Otherwise the LineageInfo will be lost when we run a query that 
was compiled earlier. Thoughts?
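
The concern in S6 can be illustrated with a small sketch. The classes below 
(Session, Plan, LineageInfo, compile) are hypothetical stand-ins and not 
Hive's real SessionState, QueryPlan or LineageInfo; they only model a 
session-scoped slot that each compilation overwrites versus lineage carried 
by the plan itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; not Hive's actual classes.
final class PlanLineageSketch {

  // Maps an output column name to the source column it derives from.
  static final class LineageInfo {
    final Map<String, String> columnSources = new HashMap<>();
  }

  // Session-scoped holder: keeps only the most recently compiled lineage.
  static final class Session {
    LineageInfo lineage;
  }

  // A compiled plan that carries its own lineage.
  static final class Plan {
    final String query;
    final LineageInfo lineage;

    Plan(String query, LineageInfo lineage) {
      this.query = query;
      this.lineage = lineage;
    }
  }

  // "Compiles" a query: records lineage both on the session and on the plan.
  static Plan compile(String query, String outCol, String srcCol, Session session) {
    LineageInfo info = new LineageInfo();
    info.columnSources.put(outCol, srcCol);
    session.lineage = info; // overwrites whatever an earlier compile stored
    return new Plan(query, info);
  }

  public static void main(String[] args) {
    Session session = new Session();
    Plan first = compile("q1", "a", "t1.x", session);
    compile("q2", "b", "t2.y", session);

    // Running q1 now: the session slot describes q2, the plan still describes q1.
    System.out.println("session: " + session.lineage.columnSources);
    System.out.println("plan q1: " + first.lineage.columnSources);
  }
}
```

In this model a pre-execution hook that reads the session would see the 
lineage of the last compiled query, while a hook that reads the plan sees 
the lineage of the query actually being run.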


> Add column lineage information to the pre execution hooks
> ---------------------------------------------------------
>
>                 Key: HIVE-1131
>                 URL: https://issues.apache.org/jira/browse/HIVE-1131
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: HIVE-1131.patch, HIVE-1131_2.patch, HIVE-1131_3.patch, 
> HIVE-1131_4.patch
>
>
> We need a mechanism to pass the lineage information of the various columns of 
> a table to a pre execution hook so that applications can use that for:
> - auditing
> - dependency checking
> and many other applications.
> The proposal is to expose this through a bunch of classes to the pre 
> execution hook interface to the clients and put in the necessary 
> transformation logic in the optimizer to generate this information.

