siddharthteotia commented on a change in pull request #4535: Implement DISTINCT 
clause
URL: https://github.com/apache/incubator-pinot/pull/4535#discussion_r315336815
 
 

 ##########
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/common/DataBlockCache.java
 ##########
 @@ -366,4 +366,74 @@ public boolean equals(Object obj) {
       return _column.equals(that._column) && _dataType == that._dataType;
     }
   }
+
+  /**
+   * Row Index based APIs for Single Value columns
+   */
+
+  public int getSVIntAtIndex(final String column, final int index) {
 
 Review comment:
   To judge uniqueness at the end, we need to look at a whole row. So simply 
looking at the return value of getIntValuesForSVColumn (an array containing all 
the projected values for the column) will not help. We need to combine each 
cell of that array with the values at the same row index in the other 
projection columns' arrays, check whether the row has already been stored in 
the hash set, and store it if not.
   
   That's why I introduced the index based APIs.
   
   I see your point though w.r.t. making a function call per row. What we can 
do is use the existing APIs to fetch the values (the array) for each column, 
and then iterate over the arrays to build each row, check for existence, and 
store it if new. That way we avoid the per-row function call overhead.
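   
   A minimal sketch of that alternative, only to illustrate the idea. The 
class and method names below are hypothetical and not part of Pinot, and plain 
int[] arrays stand in for the per-column arrays that the existing bulk APIs 
return. It assumes all projected columns are single-value INT columns:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Pinot.
final class DistinctRowsSketch {

  // columnValues[c][r] is the value of projection column c at row index r,
  // i.e. one array per column, fetched once per block instead of once per row.
  static Set<List<Integer>> distinctRows(int[][] columnValues, int numDocs) {
    Set<List<Integer>> seen = new LinkedHashSet<>();
    for (int row = 0; row < numDocs; row++) {
      // Build the row from the cell at the same index in every column array.
      List<Integer> key = new ArrayList<>(columnValues.length);
      for (int[] column : columnValues) {
        key.add(column[row]);
      }
      // Set.add is a no-op when an equal row has already been stored.
      seen.add(key);
    }
    return seen;
  }

  public static void main(String[] args) {
    // Two projection columns, four rows; rows 0 and 1 are duplicates.
    int[][] block = { {1, 1, 2, 1}, {10, 10, 20, 30} };
    System.out.println(distinctRows(block, 4)); // [[1, 10], [2, 20], [1, 30]]
  }
}
```

   The only per-row work left is the array indexing and the hash set lookup. 
Boxing each cell into a List is just the simplest way to get a row-level 
equals/hashCode here; a real implementation would likely want a cheaper key.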
