[ 
https://issues.apache.org/jira/browse/TEZ-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358351#comment-14358351
 ] 

Siddharth Seth commented on TEZ-2192:
-------------------------------------

bq. Is it possible to do this on the AM side, and not allow this kind of 
container reuse? ... I think if we update the container signature as the 
container is reused, we can stop this kind of container reuse with an LR 
(local resource) conflict.

That would be ideal.

Having AMContainer do the match doesn't really help: by then the container has 
already been scheduled, and there's no way to return it without killing it, or 
to ensure that a similar container does not get assigned again.
Having the container itself replace the file in a second re-localization isn't 
safe either, since the first copy may already have been added to the 
classloader in the running container.
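
To make the AM-side idea concrete, here is a minimal sketch of the check a 
scheduler could run before reusing a container. It is not Tez code: the class 
LrConflictCheck, the Resource type, the method names and the HDFS paths are 
all hypothetical, and it assumes a local resource is identified by its local 
file name plus its source URI and timestamp.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names are actual Tez APIs. */
public class LrConflictCheck {

    /** Origin of a localized file: where it came from and its modification time. */
    public static final class Resource {
        final String sourceUri;
        final long timestamp;

        public Resource(String sourceUri, long timestamp) {
            this.sourceUri = sourceUri;
            this.timestamp = timestamp;
        }

        boolean sameOrigin(Resource other) {
            return sourceUri.equals(other.sourceUri) && timestamp == other.timestamp;
        }
    }

    /**
     * Returns true if the task needs a file whose local name already exists in
     * the container but whose source or timestamp differs. Both maps are keyed
     * by local file name (an assumption of this sketch).
     */
    public static boolean hasConflict(Map<String, Resource> inContainer,
                                      Map<String, Resource> needed) {
        for (Map.Entry<String, Resource> e : needed.entrySet()) {
            Resource existing = inContainer.get(e.getKey());
            if (existing != null && !existing.sameOrigin(e.getValue())) {
                return true;
            }
        }
        return false;
    }

    /** Grows the container's signature once a task's files have been localized. */
    public static void recordLocalized(Map<String, Resource> inContainer,
                                       Map<String, Resource> needed) {
        inContainer.putAll(needed);
    }

    public static void main(String[] args) {
        // The container already holds vertex 1's job.split.
        Map<String, Resource> inContainer = new HashMap<>();
        inContainer.put("job.split", new Resource("hdfs:///tmp/v1/job.split", 100L));

        // A task from vertex 2 needs a different file with the same local name.
        Map<String, Resource> vertex2 = new HashMap<>();
        vertex2.put("job.split", new Resource("hdfs:///tmp/v2/job.split", 200L));

        // Another task from vertex 1 needs exactly the file already present.
        Map<String, Resource> vertex1 = new HashMap<>();
        vertex1.put("job.split", new Resource("hdfs:///tmp/v1/job.split", 100L));

        System.out.println("vertex2 conflicts: " + hasConflict(inContainer, vertex2));
        System.out.println("vertex1 conflicts: " + hasConflict(inContainer, vertex1));
    }
}
```

The point of running this at scheduling time is that a conflicting container is 
never assigned in the first place, so nothing has to be returned or killed. It 
only works if the signature is updated (recordLocalized here) every time 
additional files are localized into a reused container.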


> Relocalization does not check for source
> ----------------------------------------
>
>                 Key: TEZ-2192
>                 URL: https://issues.apache.org/jira/browse/TEZ-2192
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.6.0, 0.5.2
>            Reporter: Rohini Palaniswamy
>            Priority: Blocker
>
>  PIG-4443 spills the input splits to disk if the serialized split size is 
> greater than some threshold. It runs into problems with relocalization when 
> more than one vertex has a job.split file. If a job.split file is already 
> present when a container is reused, that file is used as-is, causing the 
> wrong data to be read.
> We need either a way to turn off relocalization, or a check of the 
> source+timestamp that redownloads the file during relocalization. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
