[ 
https://issues.apache.org/jira/browse/ORC-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018604#comment-16018604
 ] 

ASF GitHub Bot commented on ORC-184:
------------------------------------

Github user wgtmac commented on a diff in the pull request:

    https://github.com/apache/orc/pull/123#discussion_r117616712
  
    --- Diff: c++/src/Statistics.hh ---
    @@ -106,15 +106,50 @@ namespace orc {
     
         void setMinimum(T min) { _minimum = min; }
     
    -    // GET / SET valueCount_
    +    // GET / SET _valueCount
         uint64_t getNumberOfValues() const { return _valueCount; }
     
         void setNumberOfValues(uint64_t numValues) { _valueCount = numValues; }
     
    -    // GET / SET hasNullValue_
    +    // GET / SET _hasNullValue
         bool hasNull() const { return _hasNull; }
     
         void setHasNull(bool hasNull) { _hasNull = hasNull; }
    +
    +    void reset() {
    +      _hasNull = false;
    +      _hasMinimum = false;
    +      _hasMaximum = false;
    +      _hasSum = false;
    +      _hasTotalLength = false;
    +      _totalLength = 0;
    +      _valueCount = 0;
    +    }
    +
    +    // sum is not merged here as we need to check overflow
    +    void merge(const InternalStatisticsImpl& other) {
    +      _hasNull = _hasNull || other._hasNull;
    +      _valueCount += other._valueCount;
    +
    +      if (other._hasMinimum) {
    +        if (!_hasMinimum) {
    +          _hasMinimum = _hasMaximum = true;
    +          _minimum = other._minimum;
    +          _maximum = other._maximum;
    +        } else {
    +          // all template types should support operator<
    +          if (_maximum < other._maximum) {
    --- End diff --
    
    @majetideepak I agree with you for not putting opeator< into public API. 
The season to implement operator< is to simplify the logic here, otherwise we 
have to specialize InternalStatisticsImpl<Decimal>::merge(const 
InternalStatisticsImpl& other). Now there are two options:
    1. add Decimal::compare() and use the template specilization for Decimal;
    2. move opeator< of decimal here to simplify the merge logic.
    Which do you prefer or you have other ideas? Thanks!


> [C++] Refactor ColumnStatistics classes for writer
> --------------------------------------------------
>
>                 Key: ORC-184
>                 URL: https://issues.apache.org/jira/browse/ORC-184
>             Project: ORC
>          Issue Type: Sub-task
>          Components: C++
>            Reporter: Gang Wu
>            Assignee: Gang Wu
>
> 1. Add setter functions to ColumnStatistics.
> 2. Refactor ColumnStatistics to reduce duplicate code.
> 3. Add more functions in Int128 and Decimal classes for 
> DecimalColumnStatistics.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to