[ https://issues.apache.org/jira/browse/HDFS-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238998#comment-16238998 ]
Daniel Pol commented on HDFS-8198:
----------------------------------

Terasort doesn't seem to work on my system with EC in beta1. Here's a small script to reproduce the issue:

sudo -u hdfs bin/hdfs dfs -rm -r -skipTrash /ectest
sudo -u hdfs bin/hdfs dfs -mkdir /ectest
#sudo -u hdfs bin/hdfs ec -setPolicy -path /ectest -policy RS-3-2-1024k
sleep 5
sudo -u hdfs bin/yarn jar /ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar teragen 100000000 /ectest/Input
sleep 30
sudo -u hdfs bin/yarn jar /ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar teravalidate /ectest/Input /ectest/Validate
sleep 30
sudo -u hdfs bin/yarn jar /ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar terasort /ectest/Input /ectest/Output

It works fine like this (with the set-EC-policy line commented out), but it fails when you uncomment that line. Interestingly enough, it fails only at the Terasort step, when reading the input files; Teravalidate, which runs before it, reads the same files and doesn't fail. Fsck shows everything is fine, and checking the nodes individually, all the files are there. I've tried all default codecs and policies (native and Java), and they all give me the same error: missing blocks. The error shows up only when the amount of data becomes big enough, so make sure you use the number of records I have in my script or higher.

> Erasure Coding: system test of TeraSort
> ---------------------------------------
>
>                 Key: HDFS-8198
>                 URL: https://issues.apache.org/jira/browse/HDFS-8198
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>   Affects Versions: HDFS-7285
>            Reporter: Kai Sasaki
>            Priority: Major
>
> Functional system test of TeraSort on EC files.
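For anyone wanting to compare the passing and failing runs side by side, the reproduction steps in the comment could be wrapped roughly as follows. This is only a sketch: the variables USE_EC, DRY_RUN, TEST_DIR, RECORDS and EXAMPLES_JAR and the functions run and repro are hypothetical names introduced here, not part of the original script. The commands themselves are copied from the comment and are expected to be run from the Hadoop install directory. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps with the EC policy as a toggle.
# All variable and function names below are illustrative assumptions.

EXAMPLES_JAR="${EXAMPLES_JAR:-/ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar}"
TEST_DIR="${TEST_DIR:-/ectest}"
RECORDS="${RECORDS:-100000000}"   # the failure reportedly needs at least this many records
USE_EC="${USE_EC:-1}"             # 1 = set the EC policy (failing case), 0 = baseline
DRY_RUN="${DRY_RUN:-1}"           # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

repro() {
  run sudo -u hdfs bin/hdfs dfs -rm -r -skipTrash "$TEST_DIR"
  run sudo -u hdfs bin/hdfs dfs -mkdir "$TEST_DIR"
  if [ "$USE_EC" = "1" ]; then
    run sudo -u hdfs bin/hdfs ec -setPolicy -path "$TEST_DIR" -policy RS-3-2-1024k
  fi
  run sleep 5
  run sudo -u hdfs bin/yarn jar "$EXAMPLES_JAR" teragen "$RECORDS" "$TEST_DIR/Input"
  run sleep 30
  run sudo -u hdfs bin/yarn jar "$EXAMPLES_JAR" teravalidate "$TEST_DIR/Input" "$TEST_DIR/Validate"
  run sleep 30
  run sudo -u hdfs bin/yarn jar "$EXAMPLES_JAR" terasort "$TEST_DIR/Input" "$TEST_DIR/Output"
}

repro
```

Setting USE_EC=0 corresponds to the run with the policy line commented out; setting DRY_RUN=0 executes the commands instead of printing them.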