[ 
https://issues.apache.org/jira/browse/HDFS-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238998#comment-16238998
 ] 

Daniel Pol commented on HDFS-8198:
----------------------------------

Terasort doesn't seem to work on my system with EC in beta1. Here's a small 
script to reproduce the issue:

sudo -u hdfs bin/hdfs dfs -rm -r -skipTrash /ectest
sudo -u hdfs bin/hdfs dfs -mkdir /ectest
#sudo -u hdfs bin/hdfs ec -setPolicy -path /ectest -policy RS-3-2-1024k
sleep 5
sudo -u hdfs bin/yarn jar \
  /ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar \
  teragen 100000000 /ectest/Input
sleep 30
sudo -u hdfs bin/yarn jar \
  /ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar \
  teravalidate /ectest/Input /ectest/Validate
sleep 30
sudo -u hdfs bin/yarn jar \
  /ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar \
  terasort /ectest/Input /ectest/Output

It works fine like this (with the set-policy line commented out), but it fails
when you uncomment that line. Interestingly, it fails only at the Terasort
step, when reading the input files, even though Teravalidate, which runs
before it, reads the same files without failing. Fsck shows everything is fine,
and checking the nodes individually confirms that all the files are there. I've
tried all the default codecs and policies (native and Java); they all give me
the same error: missing blocks. The error shows up only when the amount of data
becomes big enough, so make sure you use the number of records I have in my
script, or higher.
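
To make the comparison between the replicated and EC runs easier to repeat,
the script above could be wrapped roughly as follows. This is only a sketch:
the names step, run_repro, EXAMPLES_JAR, EC_DIR, EC_POLICY, ROWS and DRY_RUN
are hypothetical and not part of Hadoop, the sleeps are left out, and an fsck
call is added before terasort to capture the state described above. By default
it only prints the commands (DRY_RUN=1).

```shell
# Hypothetical wrapper around the repro script; every name below is an assumption.
EXAMPLES_JAR="${EXAMPLES_JAR:-/ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar}"
EC_DIR="${EC_DIR:-/ectest}"
EC_POLICY="${EC_POLICY:-RS-3-2-1024k}"
ROWS="${ROWS:-100000000}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise run it as the hdfs user.
step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "sudo -u hdfs $*"
  else
    sudo -u hdfs "$@"
  fi
}

# $1 = "ec" sets the erasure coding policy on the directory;
# any other value leaves the directory with default replication.
run_repro() {
  step bin/hdfs dfs -rm -r -skipTrash "$EC_DIR"
  step bin/hdfs dfs -mkdir "$EC_DIR"
  if [ "$1" = "ec" ]; then
    step bin/hdfs ec -setPolicy -path "$EC_DIR" -policy "$EC_POLICY"
  fi
  step bin/yarn jar "$EXAMPLES_JAR" teragen "$ROWS" "$EC_DIR/Input"
  step bin/yarn jar "$EXAMPLES_JAR" teravalidate "$EC_DIR/Input" "$EC_DIR/Validate"
  # Record block health right before the step that fails.
  step bin/hdfs fsck "$EC_DIR/Input" -files -blocks
  step bin/yarn jar "$EXAMPLES_JAR" terasort "$EC_DIR/Input" "$EC_DIR/Output"
}
```

After sourcing this from the Hadoop install directory, "run_repro replicated"
and "run_repro ec" print the two command sequences; setting DRY_RUN=0 runs
them for real.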


> Erasure Coding: system test of TeraSort
> ---------------------------------------
>
>                 Key: HDFS-8198
>                 URL: https://issues.apache.org/jira/browse/HDFS-8198
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Kai Sasaki
>            Priority: Major
>
> Functional system test of TeraSort on EC files.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

