[ https://issues.apache.org/jira/browse/TEZ-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781260#comment-17781260 ]

Ayush Saxena commented on TEZ-4415:
-----------------------------------

This is reproducible for me as well
{code:java}
[hdfs@ayushsaxena-3 root]$ hdfs dfs -ls /dataq.har
Found 2 items
-rw-r--r-- 2 hdfs supergroup 0 2023-10-31 07:28 /dataq.har/_SUCCESS
-rw-r--r-- 3 hdfs supergroup 0 2023-10-31 07:28 /dataq.har/part-0
[hdfs@ayushsaxena-3 root]$ hdfs dfs -ls har:/dataq.har
23/10/31 07:29:21 WARN fs.FileSystem: Failed to initialize fileystem har:/dataq.har: java.io.IOException: Invalid path for the Har Filesystem. No index file in har:/dataq.har
ls: Invalid path for the Har Filesystem. No index file in har:/dataq.har
{code}
Looks like the reducer isn't running in case of tez...

> Hadoop archives created with Tez miss index files
> -------------------------------------------------
>
> Key: TEZ-4415
> URL: https://issues.apache.org/jira/browse/TEZ-4415
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.9.2
> Reporter: Christophe Préaud
> Priority: Minor
>
> When a hadoop archive is created with Tez, the _index and _masterindex files
> are not created:
> {code:java}
> # create hadoop archive with Tez
> hadoop archive -D mapreduce.framework.name=yarn-tez -archiveName data.har -p
> /user/preaudc/data /user/preaudc
> (...)
> 22/05/23 13:04:39 INFO client.TezClient: Tez Client Version: [
> component=tez-api, version=0.9.2,
> revision=10cb3519bd34389210e6511a2ba291b52dcda081,
> SCM-URL=scm:git:https://gitbox.apache.org/repos/asf/tez.git,
> buildTime=2019-03-19T20:44:07Z ]
> (...)
> # _index and _masterindex files are not created
> hdfs dfs -ls /user/preaudc/data.har
> Found 2 items
> -rw-r--r-- 3 preaudc preaudc 0 2022-05-23 13:06
> /user/preaudc/data.har/_SUCCESS
> -rw-r--r-- 3 preaudc preaudc 2537147461 2022-05-23 13:06
> /user/preaudc/data.har/part-0
> # the hadoop archive is thus unreadable
> hdfs dfs -ls har:/user/preaudc/data.har
> ls: Invalid path for the Har Filesystem. No index file in
> har:/user/preaudc/data.har{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
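
A quick way to check whether an archive directory is affected, as a minimal sketch only: the helper below (`har_missing_index`, a hypothetical name, not a Hadoop or Tez command) reads `hdfs dfs -ls` output on stdin and prints whichever of `_index` and `_masterindex` it does not find. It assumes each listing line ends with the file path, as in the listings quoted in this issue.

```shell
#!/usr/bin/env bash
# Sketch: report which HAR index files are absent from an `hdfs dfs -ls` listing.
# har_missing_index is a hypothetical helper name, not part of Hadoop or Tez.
har_missing_index() {
  local listing missing="" f
  listing="$(cat)"   # the listing text is read from stdin
  for f in _index _masterindex; do
    # Assumption: every listing line ends with the full path of the entry,
    # so "/<name>" anchored at end of line identifies the file.
    if ! printf '%s\n' "$listing" | grep -q "/${f}\$"; then
      missing="${missing:+$missing }$f"
    fi
  done
  # Prints the missing names separated by spaces; an empty line means both exist.
  printf '%s\n' "$missing"
}
```

Possible usage against the archive from the report: `hdfs dfs -ls /user/preaudc/data.har | har_missing_index`. Non-empty output names the index files that were not written.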