[ https://issues.apache.org/jira/browse/DRILL-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16336949#comment-16336949 ]
ASF GitHub Bot commented on DRILL-6080:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1090#discussion_r163455571

    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/xsort/managed/TestSortImpl.java ---
    @@ -466,10 +469,10 @@ public void runLargeSortTest(OperatorFixture fixture, DataGenerator dataGen,
       public void runJumboBatchTest(OperatorFixture fixture, int rowCount) {
         timer.reset();
    -    DataGenerator dataGen = new DataGenerator(fixture, rowCount, Character.MAX_VALUE);
    -    DataValidator validator = new DataValidator(rowCount, Character.MAX_VALUE);
    +    DataGenerator dataGen = new DataGenerator(fixture, rowCount, ValueVector.MAX_ROW_COUNT);
    +    DataValidator validator = new DataValidator(rowCount, ValueVector.MAX_ROW_COUNT);
         runLargeSortTest(fixture, dataGen, validator);
    -    System.out.println(timer.elapsed(TimeUnit.MILLISECONDS));
    +//    System.out.println(timer.elapsed(TimeUnit.MILLISECONDS));
    --- End diff --

    Removed all the timing & debugging code to avoid the need for commented-out lines. Logging in tests is a no-op; we simply discard the logs.


> Sort incorrectly limits batch size to 65535 records rather than 65536
> ---------------------------------------------------------------------
>
>                 Key: DRILL-6080
>                 URL: https://issues.apache.org/jira/browse/DRILL-6080
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.12.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>             Fix For: 1.13.0
>
>
> Drill places an upper limit on the number of rows in a batch of 64K. That is
> 65,536 decimal. When we index records, the indexes run from 0 to 64K-1 or 0
> to 65,535.
> The sort code incorrectly uses {{Character.MAX_VALUE}} as the maximum row
> count. So, if an incoming batch uses the full 64K size, sort ends up
> splitting batches unnecessarily.
> The fix is to instead use the correct constant `ValueVector.MAX_ROW_COUNT`.
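
The off-by-one described in the issue can be illustrated with a small standalone sketch. This is not Drill code: `MAX_ROW_COUNT` below is a local stand-in for `ValueVector.MAX_ROW_COUNT`, assumed to equal the 65,536-row limit stated in the issue, and `batchesNeeded` is a hypothetical helper that only models how many batches a row limit forces.

```java
// Illustrative sketch only, not Drill code. MAX_ROW_COUNT is a local stand-in
// for ValueVector.MAX_ROW_COUNT; batchesNeeded is a hypothetical helper.
class BatchLimitDemo {
    // Assumed value: the 64K (65,536) row limit stated in the issue.
    static final int MAX_ROW_COUNT = 65536;

    // Number of batches needed to hold rowCount rows when each batch may
    // carry at most `limit` rows (ceiling division).
    static int batchesNeeded(int rowCount, int limit) {
        return (rowCount + limit - 1) / limit;
    }

    public static void main(String[] args) {
        int fullBatch = MAX_ROW_COUNT; // an incoming batch that uses the full 64K size

        // Character.MAX_VALUE is the largest 16-bit index (65,535), not the row count.
        System.out.println("Character.MAX_VALUE = " + (int) Character.MAX_VALUE);

        // Using the largest index as the limit forces one extra batch for a single leftover row.
        System.out.println("batches with Character.MAX_VALUE limit: "
                + batchesNeeded(fullBatch, Character.MAX_VALUE));

        // Using the row count as the limit keeps the full batch intact.
        System.out.println("batches with MAX_ROW_COUNT limit: "
                + batchesNeeded(fullBatch, MAX_ROW_COUNT));
    }
}
```

Under these assumptions, a full 65,536-row batch needs two batches when the limit is `Character.MAX_VALUE` and one when the limit is the row count, which is the unnecessary split the issue reports.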
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)