> On July 10, 2017, 10:02 p.m., Prasanth_J wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Line 1723 (original), 1723 (patched)
> > <https://reviews.apache.org/r/60753/diff/1/?file=1773281#file1773281line1723>
> >
> >     I am not sure if we need this config. 
> >     Any reason to support storing FM sketch's bitvectors?
> >     
> >     If this is for 3.0.0 release only, we could even remove this config.
> >     
> >     If we need to support FM sketch based NDV then having a separate field 
> > in metastore will be better as Ashutosh suggested.
> 
> pengcheng xiong wrote:
>     This config is just to give user an option to choose whether to use FM or 
> HLL to COMPUTE ndv. This patch does not change the fact that we do not store 
> bit vectors for both FM and NDV in object store.
> 
> Prasanth_J wrote:
>     I don't think it will be useful for user to tune this.
>     IMHO, most users won't care a lot about this. This can be used as 
> fallback but this will only add more checks.
> 
> Ashutosh Chauhan wrote:
>     I agree with Prasanth. Overloading this config seems hacky. It is leading 
> to confusing code, e.g, NDVEstimator uses number of bit vectors to determine 
> ndv algo, thats confusing. Its confusing for end user too that +ve value from 
> [0,100] is useful for one algo, but -ve value is only to switch over to diff 
> algo. I suggest that we leave this config as is and introduce a new config 
> which dictates which algo is used for ndv computation. This config value is 
> than passed to compute_stats() udaf by ColumnStatsSemanticAnalyzer. 
> GenericUDAFComputeStats than uses this config to determine which algo its 
> using. 
>     This config we can store in metastore (either as is or mapped as int) so 
> that we can deserialize the bit vectors correctly when we retrieve them.
> 
> pengcheng xiong wrote:
>     I agree it is a bit confusing but then we need to store two extra 
> information in a metastore: (1) the algorithm that we use, FMSketch or HLL 
> and (2) the number of bitvectors for FMSketch.

We can just store algo. For # of bit vectors for FMSketch we can use existing 
logic of counting '{' in serialized string for bit vectors.


- Ashutosh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60753/#review180119
-----------------------------------------------------------


On July 10, 2017, 9:29 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60753/
> -----------------------------------------------------------
> 
> (Updated July 10, 2017, 9:29 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-16966
> 
> 
> Diffs
> -----
> 
>   common/pom.xml e6722babd8 
>   common/src/java/org/apache/hadoop/hive/common/HiveStatsUtils.java 
> 7c9d72fbd2 
>   
> common/src/java/org/apache/hadoop/hive/common/ndv/NumDistinctValueEstimator.java
>  PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/ndv/NumDistinctValueEstimatorFactory.java
>  PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/ndv/hll/HLLConstants.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/ndv/hll/HLLDenseRegister.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/ndv/hll/HLLRegister.java 
> PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/ndv/hll/HLLSparseRegister.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/ndv/hll/HyperLogLog.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/ndv/hll/HyperLogLogUtils.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 5700fb9325 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java.orig da48a7ccbd 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/NumDistinctValueEstimator.java
>  92f9a845e3 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DecimalColumnStatsAggregator.java
>  36b2c9c56b 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DoubleColumnStatsAggregator.java
>  a88ef84e5c 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/LongColumnStatsAggregator.java
>  8ac6561aec 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/StringColumnStatsAggregator.java
>  2aa4046a46 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/ColumnStatsMerger.java
>  33c7e3e52c 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/ColumnStatsMergerFactory.java
>  fe890e4e27 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/DateColumnStatsMerger.java
>  3179b23438 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/DecimalColumnStatsMerger.java
>  c13add9d9c 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/DoubleColumnStatsMerger.java
>  fbdba24b0a 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/LongColumnStatsMerger.java
>  ac65590505 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/StringColumnStatsMerger.java
>  41587477d3 
>   pom.xml f9fae59a5d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/DecimalNumDistinctValueEstimator.java
>  a05906edfa 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/DoubleNumDistinctValueEstimator.java
>  e76fc74dbc 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
>  2ebfcb2360 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/LongNumDistinctValueEstimator.java
>  1c197a028a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java
>  fa70f49857 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/StringNumDistinctValueEstimator.java
>  601901c163 
>   ql/src/test/queries/clientpositive/hll.q PRE-CREATION 
>   ql/src/test/results/clientpositive/hll.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60753/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>

Reply via email to