Rajesh Balamohan created TEZ-1912:
-------------------------------------

             Summary: Merge exceptions are thrown when enabling 
tez.runtime.shuffle.memory-to-memory.enable && 
tez.runtime.shuffle.memory-to-memory.segments
                 Key: TEZ-1912
                 URL: https://issues.apache.org/jira/browse/TEZ-1912
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Rajesh Balamohan


Merge exceptions are thrown when running a hive query on tez with the following 
setting.  It works fine without mem-to-mem merge setting.

{code}
2015-01-04 20:04:01,371 ERROR [ShuffleAndMergeRunner [Map_1]] 
orderedgrouped.Shuffle: ShuffleRunner failed with error
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError:
 Error while doing final merge
        at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:364)
        at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:327)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Rec# 22630125: Negative value-length: -1
        at 
org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:720)
        at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.InMemoryReader.readRawKey(InMemoryReader.java:104)
        at 
org.apache.tez.runtime.library.common.sort.impl.TezMerger$Segment.readRawKey(TezMerger.java:329)
        at 
org.apache.tez.runtime.library.common.sort.impl.TezMerger$MergeQueue.adjustPriorityQueue(TezMerger.java:500)
        at 
org.apache.tez.runtime.library.common.sort.impl.TezMerger$MergeQueue.next(TezMerger.java:545)
        at 
org.apache.tez.runtime.library.common.sort.impl.TezMerger.writeFile(TezMerger.java:204)
        at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.finalMerge(MergeManager.java:862)
        at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:473)
        at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:362)
        ... 5 more
{code}

{code}
$HIVE_HOME/bin/hive -hiveconf tez.runtime.io.sort.factor=200 --hiveconf 
tez.shuffle-vertex-manager.min-src-fraction=1.0 --hiveconf 
tez.shuffle-vertex-manager.max-src-fraction=1.0 --hiveconf 
hive.tez.auto.reducer.parallelism=false --hiveconf 
tez.am.heartbeat.interval-ms.max=20 --hiveconf tez.runtime.io.sort.mb=1200 
--hiveconf tez.runtime.sort.threads=2 --hiveconf 
tez.history.logging.service.class=org.apache.tez.dag.history.logging.impl.SimpleHistoryLoggingService
 --hiveconf hive.tez.container.size=4096 --hiveconf 
tez.runtime.shuffle.memory-to-memory.enable=true --hiveconf 
tez.runtime.shuffle.memory-to-memory.segments=4

--10 TB dataset
use tpcds4_bin_partitioned_orc_10000;
drop table testData;
create table testData as select 
ss_sold_date_sk,ss_sold_time_sk,ss_item_sk,ss_customer_sk,ss_quantity,ss_sold_date
 from store_sales distribute by ss_sold_date;
{code}






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to