[ https://issues.apache.org/jira/browse/TEZ-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617946#comment-14617946 ]
Tsuyoshi Ozawa commented on TEZ-2604:
-------------------------------------

[~rajesh.balamohan] Thanks for sharing! Closing this as a duplicate.

> PipelinedSorter doesn't use number of items when creating SortSpan
> -------------------------------------------------------------------
>
>                 Key: TEZ-2604
>                 URL: https://issues.apache.org/jira/browse/TEZ-2604
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Tsuyoshi Ozawa
>            Assignee: Tsuyoshi Ozawa
>         Attachments: TEZ-2604.001.patch
>
>
> {quote}
>     int items = 1024*1024;
>     int perItem = 16;
>     if(span.length() != 0) {
>       items = span.length();
>       perItem = span.kvbuffer.limit()/items;
>       items = (int) ((span.capacity)/(METASIZE+perItem));
>       if(items > 1024*1024) {
>         // our goal is to have 1M splits and sort early
>         items = 1024*1024;
>       }
>     }
>     Preconditions.checkArgument(listIterator.hasNext(), "block iterator should not be empty");
>     span = new SortSpan((ByteBuffer)listIterator.next().clear(), (1024*1024),
>         perItem, ConfigUtils.getIntermediateOutputKeyComparator(this.conf));
> {quote}
> Should we use items instead of (1024*1024)?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
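
For readers following the question in the issue description, the sizing arithmetic from the quoted snippet can be sketched in isolation. This is a minimal illustration, not the Tez implementation: the class name `SortSpanSizing`, the method `estimateItems`, and its plain `int` parameters are hypothetical stand-ins for `span.length()`, `span.kvbuffer.limit()`, and `span.capacity`, and the `METASIZE` value of 16 is assumed for the example only.

```java
public class SortSpanSizing {
    // Assumed value for illustration; the real constant is defined in PipelinedSorter.
    static final int METASIZE = 16;
    // Upper bound taken from the quoted snippet ("1M splits").
    static final int MAX_ITEMS = 1024 * 1024;

    /**
     * Mirrors the arithmetic in the quoted snippet: estimate how many items
     * fit in the next span from the previous span's average item size.
     * Parameter names are hypothetical stand-ins for the span fields.
     */
    static int estimateItems(int prevItemCount, int prevBytesUsed, int nextCapacity) {
        int items = MAX_ITEMS;
        int perItem = 16; // default per-item size used when there is no history
        if (prevItemCount != 0) {
            perItem = prevBytesUsed / prevItemCount;
            items = nextCapacity / (METASIZE + perItem);
            if (items > MAX_ITEMS) {
                items = MAX_ITEMS; // cap at 1M items
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // Previous span held 1000 items in 100,000 bytes; next buffer is 1 MiB.
        int estimated = estimateItems(1000, 100_000, 1 << 20);
        System.out.println("estimated items: " + estimated);
        System.out.println("hard-coded value passed in the snippet: " + MAX_ITEMS);
    }
}
```

With these inputs the estimate is far below the hard-coded 1024*1024 that the quoted code passes to the SortSpan constructor, which is the gap the question points at. Whether the estimate should be passed instead is left to the duplicate issue referenced in the comment.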