kasakrisz commented on code in PR #5370:
URL: https://github.com/apache/hive/pull/5370#discussion_r1728746348
##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/Generator.java:
##########
@@ -58,12 +67,63 @@
public class Generator extends Transform {
private static final Logger LOG = LoggerFactory.getLogger(Generator.class);
+ private static final String ALL = "ALL";
+ private static final String NONE = "NONE";
+ private static final Map<HiveOperation, Function<ParseContext, Boolean>> filterMap;
- private final Set<String> hooks;
- private static final String ATLAS_HOOK_CLASSNAME = "org.apache.atlas.hive.hook.HiveHook";
+ private final Predicate<ParseContext> statementFilter;
- public Generator(Set<String> hooks) {
- this.hooks = hooks;
+ static {
+ Map<HiveOperation, Function<ParseContext, Boolean>> map = new EnumMap<>(HiveOperation.class);
+ map.put(HiveOperation.CREATETABLE, parseContext -> parseContext.getCreateTable() != null);
+ map.put(HiveOperation.CREATETABLE_AS_SELECT, parseContext -> parseContext.getQueryProperties().isCTAS());
+ map.put(HiveOperation.CREATEVIEW, parseContext -> parseContext.getQueryProperties().isView());
+ map.put(HiveOperation.CREATE_MATERIALIZED_VIEW,
+ parseContext -> parseContext.getQueryProperties().isMaterializedView());
+ map.put(HiveOperation.LOAD,
+ parseContext -> !(parseContext.getLoadTableWork() == null || parseContext.getLoadTableWork().isEmpty()));
+ map.put(HiveOperation.QUERY, parseContext -> parseContext.getQueryProperties().isQuery());
+ filterMap = Collections.unmodifiableMap(map);
+ }
+
+ public static Generator fromConf(HiveConf conf) {
+ return new Generator(createFilterPredicateFromConf(conf));
+ }
+
+ static Predicate<ParseContext> createFilterPredicateFromConf(Configuration conf) {
+ Set<HiveOperation> operations = new HashSet<>();
+ boolean noneSpecified = false;
+ for (String valueText : conf.getStringCollection(HiveConf.ConfVars.HIVE_LINEAGE_STATEMENT_FILTER.varname)) {
+ if (ALL.equalsIgnoreCase(valueText)) {
+ return parseContext -> true;
+ }
+ if (NONE.equalsIgnoreCase(valueText)) {
+ noneSpecified = true;
Review Comment:
> we didn't log all HiveOperation before, why should we now
Only if `org.apache.atlas.hive.hook.HiveHook` is present:
https://github.com/apache/hive/blob/b775c708f7dd4cd2088e68f994460b42a22189a2/ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/Generator.java#L75
That never happens in any of our tests.
> filterMap contains just a few operations, what happens in operation is not in a filter list?
Good point. I found that we cannot check every HiveOperation based on the parse context, so I introduced a new enum that contains only the supported operation types.
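A minimal sketch of the enum-based idea, to make the shape concrete. It is not the code in the PR: `StatementInfo` is a hypothetical stand-in for `ParseContext`, and `SupportedStatementType`, `StatementFilters`, and `fromValues` are illustrative names. Rejecting `NONE` when it is combined with other values is also an assumption of this sketch, since the hunk above is cut off before that case.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical stand-in for ParseContext: only the flags this sketch needs.
final class StatementInfo {
  final boolean ctas;
  final boolean view;
  final boolean query;

  StatementInfo(boolean ctas, boolean view, boolean query) {
    this.ctas = ctas;
    this.view = view;
    this.query = query;
  }
}

// Hypothetical enum: one constant per statement type that can be detected,
// each carrying its own check, so no unsupported operation can be configured.
enum SupportedStatementType {
  CREATE_TABLE_AS_SELECT(info -> info.ctas),
  CREATE_VIEW(info -> info.view),
  QUERY(info -> info.query);

  private final Predicate<StatementInfo> matcher;

  SupportedStatementType(Predicate<StatementInfo> matcher) {
    this.matcher = matcher;
  }

  boolean matches(StatementInfo info) {
    return matcher.test(info);
  }
}

final class StatementFilters {
  private StatementFilters() {
  }

  // Builds a filter from configured values; ALL short-circuits to "accept everything".
  static Predicate<StatementInfo> fromValues(List<String> values) {
    Set<SupportedStatementType> selected = EnumSet.noneOf(SupportedStatementType.class);
    boolean noneSpecified = false;
    for (String raw : values) {
      String value = raw.trim();
      if ("ALL".equalsIgnoreCase(value)) {
        return info -> true;
      }
      if ("NONE".equalsIgnoreCase(value)) {
        noneSpecified = true;
        continue;
      }
      // valueOf throws IllegalArgumentException for names outside the enum.
      selected.add(SupportedStatementType.valueOf(value.toUpperCase(Locale.ROOT)));
    }
    if (noneSpecified && !selected.isEmpty()) {
      // Assumption of this sketch: mixing NONE with other values is an error.
      throw new IllegalArgumentException("NONE cannot be combined with other values");
    }
    return info -> selected.stream().anyMatch(type -> type.matches(info));
  }
}
```

With this shape, a value that names an operation outside the enum fails fast while the filter is being built, instead of being silently ignored at query time.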
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]