Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/729#discussion_r100676401
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -390,4 +391,15 @@
String DYNAMIC_UDF_SUPPORT_ENABLED = "exec.udf.enable_dynamic_support";
BooleanValidator DYNAMIC_UDF_SUPPORT_ENABLED_VALIDATOR = new BooleanValidator(DYNAMIC_UDF_SUPPORT_ENABLED, true, true);
+
+ /**
+ * Option whose value is a long value representing the number of bits required for computing ndv (using HLL)
+ */
+ LongValidator NDV_MEMORY_LIMIT = new PositiveLongValidator("exec.statistics.ndv_memory_limit", 30, 20);
+
+ /**
+ * Option whose value represents the current version of the statistics. Decreasing the value will generate
+ * the older version of statistics
+ */
+ LongValidator STATISTICS_VERSION = new NonNegativeLongValidator("exec.statistics.capability_version", 1, 1);
--- End diff --
Not sure this is clear, or desirable. When the stats are computed, they use
the version of the code that computes them, right? Are we saying that the user
can choose to compute stats with an older version of the code? Or that the
code has if statements to support all old versions? If so, this would be the
only place in Drill that does so.
On the read side, doesn't the code have to use the version of the code
compatible with the version of the stats in the file? How can I use, say,
version 2 of stats with a version 3 file?
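To make the read-side concern concrete, here is a rough sketch of what I
would expect: the reader is chosen from the version recorded in the stats
file itself, not from a session option. Every name below
(StatsVersionSketch, StatsReader, readStats) is hypothetical and is not
existing Drill code; the string "parsing" is only a placeholder.

```java
import java.util.Map;

// Hypothetical sketch only; none of these names exist in Drill.
public class StatsVersionSketch {

  /** Parses a statistics payload written in one specific format version. */
  interface StatsReader {
    String read(String payload);
  }

  // One reader per on-disk format version that the code still understands.
  // The lambdas stand in for real parsers and just tag the payload.
  private static final Map<Integer, StatsReader> READERS = Map.of(
      1, payload -> "v1:" + payload,
      2, payload -> "v2:" + payload);

  /** Picks the reader from the version stored in the file. */
  static String readStats(int fileVersion, String payload) {
    StatsReader reader = READERS.get(fileVersion);
    if (reader == null) {
      // A file newer (or older) than anything this build knows about.
      throw new IllegalStateException(
          "Unsupported statistics version: " + fileVersion);
    }
    return reader.read(payload);
  }

  public static void main(String[] args) {
    System.out.println(readStats(2, "rowcount=10"));
  }
}
```

Under this assumption a version 3 file simply fails with a clear error on a
build that only knows versions 1 and 2, which is why I don't see what a
user-settable version buys us on the read path.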
Maybe some background explanation is needed (in the spec? Somewhere in the
JIRA or code?) to explain the use case.