kasakrisz commented on code in PR #5370:
URL: https://github.com/apache/hive/pull/5370#discussion_r1728746348


##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/Generator.java:
##########
@@ -58,12 +67,63 @@
 public class Generator extends Transform {
 
   private static final Logger LOG = LoggerFactory.getLogger(Generator.class);
+  private static final String ALL = "ALL";
+  private static final String NONE = "NONE";
+  private static final Map<HiveOperation, Function<ParseContext, Boolean>> filterMap;
 
-  private final Set<String> hooks;
-  private static final String ATLAS_HOOK_CLASSNAME = "org.apache.atlas.hive.hook.HiveHook";
+  private final Predicate<ParseContext> statementFilter;
 
-  public Generator(Set<String> hooks) {
-    this.hooks = hooks;
+  static {
+    Map<HiveOperation, Function<ParseContext, Boolean>> map = new EnumMap<>(HiveOperation.class);
+    map.put(HiveOperation.CREATETABLE, parseContext -> parseContext.getCreateTable() != null);
+    map.put(HiveOperation.CREATETABLE_AS_SELECT, parseContext -> parseContext.getQueryProperties().isCTAS());
+    map.put(HiveOperation.CREATEVIEW, parseContext -> parseContext.getQueryProperties().isView());
+    map.put(HiveOperation.CREATE_MATERIALIZED_VIEW,
+        parseContext -> parseContext.getQueryProperties().isMaterializedView());
+    map.put(HiveOperation.LOAD,
+        parseContext -> !(parseContext.getLoadTableWork() == null || parseContext.getLoadTableWork().isEmpty()));
+    map.put(HiveOperation.QUERY, parseContext -> parseContext.getQueryProperties().isQuery());
+    filterMap = Collections.unmodifiableMap(map);
+  }
+
+  public static Generator fromConf(HiveConf conf) {
+    return new Generator(createFilterPredicateFromConf(conf));
+  }
+
+  static Predicate<ParseContext> createFilterPredicateFromConf(Configuration conf) {
+    Set<HiveOperation> operations = new HashSet<>();
+    boolean noneSpecified = false;
+    for (String valueText : conf.getStringCollection(HiveConf.ConfVars.HIVE_LINEAGE_STATEMENT_FILTER.varname)) {
+      if (ALL.equalsIgnoreCase(valueText)) {
+        return parseContext -> true;
+      }
+      if (NONE.equalsIgnoreCase(valueText)) {
+        noneSpecified = true;

Review Comment:
   > we didn't log all HiveOperation before, why should we now
   
   Only if `org.apache.atlas.hive.hook.HiveHook` is present:
   
https://github.com/apache/hive/blob/b775c708f7dd4cd2088e68f994460b42a22189a2/ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/Generator.java#L75
   This never happens in any of our tests.
   
   > filterMap contains just a few operations; what happens if an operation is not in the filter list?
   
   Good point. I found that we cannot detect every `HiveOperation` from the parse context alone, so I introduced a new enum that contains only the supported operation types.
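
A minimal sketch of what such an enum could look like, not the code from this PR. All names here (`LineageFilterSketch`, `StatementInfo`, `LineageStatementType`, `createFilter`) are hypothetical, and `StatementInfo` is a stand-in for the few facts that would be read from `ParseContext`. How `NONE` is treated is also an assumption: this sketch simply lets it select nothing. The idea is that each enum constant carries its own predicate, so an unsupported name fails when the configuration is parsed instead of being silently ignored.

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.Predicate;

public class LineageFilterSketch {

  /** Hypothetical stand-in for the facts a real filter would read from ParseContext. */
  static final class StatementInfo {
    final boolean ctas;
    final boolean view;
    final boolean query;

    StatementInfo(boolean ctas, boolean view, boolean query) {
      this.ctas = ctas;
      this.view = view;
      this.query = query;
    }
  }

  /** Hypothetical enum listing only statement types that can be detected. */
  enum LineageStatementType {
    CREATE_TABLE_AS_SELECT(info -> info.ctas),
    CREATE_VIEW(info -> info.view),
    QUERY(info -> info.query);

    private final Predicate<StatementInfo> matcher;

    LineageStatementType(Predicate<StatementInfo> matcher) {
      this.matcher = matcher;
    }

    boolean matches(StatementInfo info) {
      return matcher.test(info);
    }
  }

  /** Builds a filter from a comma-separated list such as "CREATE_VIEW, QUERY". */
  static Predicate<StatementInfo> createFilter(String confValue) {
    Set<LineageStatementType> selected = EnumSet.noneOf(LineageStatementType.class);
    for (String token : confValue.split(",")) {
      String name = token.trim().toUpperCase(Locale.ROOT);
      if (name.isEmpty() || name.equals("NONE")) {
        continue; // assumption: NONE selects nothing
      }
      if (name.equals("ALL")) {
        return info -> true;
      }
      // valueOf throws IllegalArgumentException for names outside the enum
      selected.add(LineageStatementType.valueOf(name));
    }
    return info -> selected.stream().anyMatch(type -> type.matches(info));
  }

  public static void main(String[] args) {
    StatementInfo ctas = new StatementInfo(true, false, false);
    System.out.println(createFilter("CREATE_VIEW, QUERY").test(ctas));
    System.out.println(createFilter("create_table_as_select").test(ctas));
  }
}
```

With this shape, a value that names an operation outside the enum is rejected up front, which answers the "not in the filter list" case directly.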



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

