David Mollitor created ORC-848:
----------------------------------

             Summary: Recycle Internal Buffer in StringHashTableDictionary
                 Key: ORC-848
                 URL: https://issues.apache.org/jira/browse/ORC-848
             Project: ORC
          Issue Type: Improvement
            Reporter: David Mollitor
            Assignee: David Mollitor


{code:java|title=StringHashTableDictionary.java}
  private void initHashBuckets(int capacity) {
    DynamicIntArray[] buckets = new DynamicIntArray[capacity];
    for (int i = 0; i < capacity; i++) {
      // We don't need large bucket: If we have more than a handful of collisions,
      // then the table is too small or the function isn't good.
      buckets[i] = createBucket();
    }
    hashBuckets = buckets;
  }
{code}

This code was highlighted for me in a JMH run of the perf test.  The 
{{Dictionary}} is regularly cleared and reset to its default state.  I'm sure 
most of the time is spent creating the {{capacity}} buckets (buffers), but we 
can save one buffer allocation by creating the {{buckets}} array only if the 
requested capacity differs from the current one (which is not the case with a 
{{clear()}}).
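
A minimal sketch of the idea, not the actual patch. {{RecyclingDictionarySketch}} and its nested {{Bucket}} are hypothetical stand-ins for {{StringHashTableDictionary}} and {{DynamicIntArray}}, and I am assuming that {{clear()}} re-runs {{initHashBuckets}} with the current capacity. The outer array is reused whenever the requested capacity matches its length; the individual buckets are still recreated.

```java
import java.util.Arrays;

/** Hypothetical stand-in for StringHashTableDictionary; not the ORC class. */
final class RecyclingDictionarySketch {

  /** Hypothetical stand-in for ORC's DynamicIntArray. */
  static final class Bucket {
    private int[] data = new int[4];
    private int size;

    void add(int value) {
      if (size == data.length) {
        data = Arrays.copyOf(data, size * 2);
      }
      data[size++] = value;
    }

    int size() {
      return size;
    }
  }

  private Bucket[] hashBuckets;

  RecyclingDictionarySketch(int capacity) {
    initHashBuckets(capacity);
  }

  private void initHashBuckets(int capacity) {
    // Recycle the outer array when the capacity is unchanged (the clear() case);
    // allocate a new one only on first use or when the capacity differs.
    Bucket[] buckets = (hashBuckets != null && hashBuckets.length == capacity)
        ? hashBuckets
        : new Bucket[capacity];
    for (int i = 0; i < capacity; i++) {
      // Each bucket is still replaced, so no stale entries survive.
      buckets[i] = new Bucket();
    }
    hashBuckets = buckets;
  }

  /** Records a position in the bucket selected by the hash. */
  void add(int hash, int position) {
    hashBuckets[Math.floorMod(hash, hashBuckets.length)].add(position);
  }

  /** Assumed behaviour: reset to the default state at the same capacity. */
  void clear() {
    initHashBuckets(hashBuckets.length);
  }

  /** Total number of stored positions across all buckets. */
  int entryCount() {
    int total = 0;
    for (Bucket b : hashBuckets) {
      total += b.size();
    }
    return total;
  }

  /** Exposed only so the reuse of the outer array can be observed. */
  Object bucketArray() {
    return hashBuckets;
  }
}
```

With this shape, {{clear()}} keeps the same outer array and only pays for the per-bucket allocations.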



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
