[ 
https://issues.apache.org/jira/browse/FLINK-20505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17254057#comment-17254057
 ] 

zoucao commented on FLINK-20505:
--------------------------------

Hi [~xintongsong], after reading the `flink-yarn` code and running some tests 
on a YARN cluster, I found something discouraging: it is hard to support http 
paths in `yarn.provided.lib.dirs`. When the file status is fetched in
{code:java}
YarnApplicationFileUploader.getAllFilesInProvidedLibDirs(){code}
an http path cannot pass the existence and `isDirectory` checks. Even if it 
did, the `listFiles` method would throw UnsupportedOperationException, so I 
don't see a good way to get the status of all files under an http dir. Other 
errors also occur even if the validation is skipped. 

As far as I know, we could make some changes to support http files (not http 
dirs) in `yarn.provided.lib.dirs`, but that goes against the meaning of 
`yarn.provided.lib.dirs`. So I have only fixed the pattern and added a unit 
test, so that the pattern no longer prevents Flink from supporting http files. 
If someone wants to upload http files, they can add a new option in 
`YarnConfigOptions`, or Flink can support this itself, but not through 
`yarn.provided.lib.dirs`.
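
To illustrate what fixing the pattern means, here is a minimal sketch, not the actual Flink code. It assumes the descriptor string is parsed with a regular expression whose size group only accepts digits. The class name `DescriptorPatternSketch`, the two pattern constants, the `parseSize` helper, and the example key and URL are all made up for illustration. Allowing an optional minus sign lets the `size=-1` reported for http files parse:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the class, constants, and helper are not Flink's own code.
final class DescriptorPatternSketch {

    // Assumed strict form: size must be a run of digits, so "-1" is rejected.
    static final Pattern STRICT = Pattern.compile(
            "YarnLocalResourceDescriptor\\{key=(\\S+), path=(\\S+), size=(\\d+), "
                    + "modificationTime=(\\d+), visibility=(\\S+)\\}");

    // Relaxed form: size may carry an optional leading minus sign.
    static final Pattern RELAXED = Pattern.compile(
            "YarnLocalResourceDescriptor\\{key=(\\S+), path=(\\S+), size=(-?\\d+), "
                    + "modificationTime=(\\d+), visibility=(\\S+)\\}");

    /** Returns the parsed size, or null when the text does not match the pattern. */
    static Long parseSize(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.matches() ? Long.valueOf(m.group(3)) : null;
    }

    public static void main(String[] args) {
        // Example descriptor with a made-up key and URL.
        String text = "YarnLocalResourceDescriptor{key=app.jar, "
                + "path=https://example.org/app.jar, size=-1, "
                + "modificationTime=0, visibility=APPLICATION}";
        System.out.println("strict:  " + parseSize(STRICT, text));
        System.out.println("relaxed: " + parseSize(RELAXED, text));
    }
}
```

With the strict pattern, `parseSize` returns null for this descriptor; with the relaxed one it returns -1.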

What do you think?

 

> Yarn provided lib does not work with http paths.
> ------------------------------------------------
>
>                 Key: FLINK-20505
>                 URL: https://issues.apache.org/jira/browse/FLINK-20505
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN
>    Affects Versions: 1.12.0, 1.11.2
>            Reporter: Xintong Song
>            Assignee: zoucao
>            Priority: Major
>              Labels: pull-request-available
>
> If an http path is used for provided lib, the following exception will be 
> thrown on the resource manager side:
> {code:java}
> 2020-12-04 17:01:28.955 ERROR org.apache.flink.yarn.YarnResourceManager - Could not start TaskManager in container containerXXXXXX.
> org.apache.flink.util.FlinkException: Error to parse YarnLocalResourceDescriptor from YarnLocalResourceDescriptor{key=XXXXX.jar, path=https://XXXXXXX.jar, size=-1, modificationTime=0, visibility=APPLICATION}
>     at org.apache.flink.yarn.YarnLocalResourceDescriptor.fromString(YarnLocalResourceDescriptor.java:99)
>     at org.apache.flink.yarn.Utils.decodeYarnLocalResourceDescriptorListFromString(Utils.java:721)
>     at org.apache.flink.yarn.Utils.createTaskExecutorContext(Utils.java:626)
>     at org.apache.flink.yarn.YarnResourceManager.getOrCreateContainerLaunchContext(YarnResourceManager.java:746)
>     at org.apache.flink.yarn.YarnResourceManager.createTaskExecutorLaunchContext(YarnResourceManager.java:726)
>     at org.apache.flink.yarn.YarnResourceManager.startTaskExecutorInContainer(YarnResourceManager.java:500)
>     at org.apache.flink.yarn.YarnResourceManager.onContainersOfResourceAllocated(YarnResourceManager.java:455)
>     at org.apache.flink.yarn.YarnResourceManager.lambda$onContainersAllocated$1(YarnResourceManager.java:415)
> {code}
> The problem is that `HttpFileSystem#getFileStatus` returns a file status with 
> length `-1`, while `YarnLocalResourceDescriptor` does not recognize a 
> negative file length.


