[ https://issues.apache.org/jira/browse/TEZ-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17565443#comment-17565443 ]
Christophe Préaud commented on TEZ-4415:
----------------------------------------
Hi László,
I can try working on this bug.
Are you suggesting that the fix is more likely to belong in the Hadoop code
than in Tez?
In any case, thanks for the pointers!
> Hadoop archives created with Tez miss index files
> -------------------------------------------------
>
> Key: TEZ-4415
> URL: https://issues.apache.org/jira/browse/TEZ-4415
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.9.2
> Reporter: Christophe Préaud
> Priority: Minor
>
> When a hadoop archive is created with Tez, the _index and _masterindex files
> are not created:
> {code:java}
> # create hadoop archive with Tez
> hadoop archive -D mapreduce.framework.name=yarn-tez -archiveName data.har -p /user/preaudc/data /user/preaudc
> (...)
> 22/05/23 13:04:39 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.9.2, revision=10cb3519bd34389210e6511a2ba291b52dcda081, SCM-URL=scm:git:https://gitbox.apache.org/repos/asf/tez.git, buildTime=2019-03-19T20:44:07Z ]
> (...)
> # _index and _masterindex files are not created
> hdfs dfs -ls /user/preaudc/data.har
> Found 2 items
> -rw-r--r-- 3 preaudc preaudc 0 2022-05-23 13:06 /user/preaudc/data.har/_SUCCESS
> -rw-r--r-- 3 preaudc preaudc 2537147461 2022-05-23 13:06 /user/preaudc/data.har/part-0
> # the hadoop archive is thus unreadable
> hdfs dfs -ls har:/user/preaudc/data.har
> ls: Invalid path for the Har Filesystem. No index file in har:/user/preaudc/data.har
> {code}
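
For anyone reproducing the report, the check shown in the listing above can be scripted. The following is only a minimal sketch: it assumes the text of `hdfs dfs -ls <archive>.har` has already been captured as a string, and the function name `missing_har_index_files` is hypothetical, not part of Hadoop or Tez.

```python
# Hypothetical helper for illustration; not part of Hadoop or Tez.
# The two file names come from the issue description above.
REQUIRED_HAR_FILES = ("_index", "_masterindex")


def missing_har_index_files(ls_output):
    """Return the required HAR index files absent from an `hdfs dfs -ls` listing."""
    present = set()
    for line in ls_output.splitlines():
        fields = line.split()
        # Skip blank lines and the "Found N items" header.
        if not fields or fields[0] == "Found":
            continue
        # The last field of an entry is the full path; keep only the file name.
        present.add(fields[-1].rsplit("/", 1)[-1])
    return [name for name in REQUIRED_HAR_FILES if name not in present]
```

Fed the listing from the report (only `_SUCCESS` and `part-0`), this returns both index file names, which matches the "No index file" error.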
--
This message was sent by Atlassian Jira (v8.20.10#820010)