[ https://issues.apache.org/jira/browse/HBASE-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549452#comment-13549452 ]
Anil Gupta commented on HBASE-7474: ----------------------------------- License headers in SortingClient.java and BigDecimalSortingColumnInterpreter.java are not properly formatted. Anil: Pending Some log statements, such as the following, can be at debug level. + log.info("Querying only one region for sorting"); Anil: Done + if (sortDecreasing) return instance.sortDecreasing(scan, columnFamily, columnQualifier, + colInterpreter, startIndex, pageSize, true); + else return instance.sortIncreasing(scan, columnFamily, columnQualifier, 'else' keyword is not needed above. Anil: Done. But, personally i like to explicitly use the "else" keyword for readability purpose. I am curious to know if there is any technical reason for not using "else" in the above case? + * This method is used to do the merge sort the rows from multiple regions and produce the final output Remove 'do the'. Wrap long line. Anil: Done + for (Map.Entry<byte[], Result[]> regionResultsEntryMap : regionResultMap.entrySet()) { regionResultsEntryMap -> regionResultsEntry or regionResultsMapEntry Anil: Done + if(totalNoOfRows < startIndex) + { Normally left brace is on the same line as if statement. Insert a space between if and (. Anil: Done currentMaxorMinValueRegion and maxOrMin are used in the if / else blocks. You can move them inside if / else block and give them names that are clearer in meaning. Anil: Done + for (Result[] regionResult : regionResults) { + if ((regionResult.length - 1) < arrayIndex[regionNum]) { regionResults and arrayIndex are both arrays. So you can use the same index to access them - in my opinion the code is more readable. Anil:Pending + finalResult[finalResultCurrentSize++] = regionResults[currentMaxorMinValueRegion][arrayIndex[currentMaxorMinValueRegion]]; Wrap long line above. Anil: Done + if (colInterpreter.compare(tmp, maxOrMin) > 0) { If I read the code correctly, the above comparison is the major difference between ascending and descending sorting. A little abstraction would allow you to unify the two cases. Anil: IMHO, the only way to do that is to put an If(sortDecresing) condition and then either do ">" or "<" comparison on the basis of sortDecreasing. I am worried that this abstraction will make the implementation a tab more slow since the worst case complexity of this sorting is O(n*n). I would prefer performance over few extra lines of code. Let me know your views. Looking at SortingColumnInterpreter, this is the only method which is not present in ColumnInterpreter: + T getValue(KeyValue kv) throws IOException; The following method is already provided by ColumnInterpreter: public abstract T getValue(byte[] colFamily, byte[] colQualifier, KeyValue kv) throws IOException; Anil: I am thinking of adding the missing method "T getValue(KeyValue kv) throws IOException;" in ColumnInterpreter. Is that fine? I dont understand why we need colFamily and colQualifier in getValue method when only a KeyValue is passed. Please consider dropping SortingColumnInterpreter Thanks a lot for doing the code review. ~Anil Gupta Software Engineer II, Intuit, Inc > Endpoint Implementation to support Scans with Sorting of Rows based on column > values(similar to "order by" clause of RDBMS) > --------------------------------------------------------------------------------------------------------------------------- > > Key: HBASE-7474 > URL: https://issues.apache.org/jira/browse/HBASE-7474 > Project: HBase > Issue Type: New Feature > Components: Coprocessors, Scanners > Affects Versions: 0.94.3 > Reporter: Anil Gupta > Assignee: Anil Gupta > Priority: Minor > Labels: coprocessors, scan, sort > Fix For: 0.94.5 > > Attachments: hbase-7474.patch, hbase-7474-v2.patch, > SortingEndpoint_high_level_flowchart.pdf > > > Recently, i have developed an Endpoint which can sort the Results(rows) on > the basis of column values. This functionality is similar to "order by" > clause of RDBMS. I will be submitting this Patch for HBase0.94.3 > I am almost done with the initial development and testing of feature. But, i > need to write the JUnits for this. I will also try to make design doc. > Thanks, > Anil Gupta > Software Engineer II, Intuit, inc -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira