Alexey Raga created SAMZA-875:
---------------------------------

             Summary: Don't try to access the package during containerStart
                 Key: SAMZA-875
                 URL: https://issues.apache.org/jira/browse/SAMZA-875
             Project: Samza
          Issue Type: Bug
          Components: container
    Affects Versions: 0.10.1
            Reporter: Alexey Raga


When the job is submitted using {{run-job.sh}}, the package file is given to 
{{YARN}}. The job is then accepted, the container is created, and the package 
is unpacked and ready to execute.

However, the {{startContainer}} method ({{ContainerUtil:159}}) then tries to 
access the original package file. 
{code}
    try {
      fileStatus = packagePath.getFileSystem(yarnConfiguration).getFileStatus(packagePath);
    } catch (IOException ioe) {
      log.error("IO Exception when accessing the package status from the filesystem", ioe);
      throw new SamzaException("IO Exception when accessing the package status from the filesystem");
    }
{code}


The only purpose of this access is to set the length and modification time of 
the file on the resource:

{code}
    packageResource.setSize(fileStatus.getLen());
    packageResource.setTimestamp(fileStatus.getModificationTime());
{code}

If these attributes (length and timestamp) are really needed, then I think they 
could be captured and submitted by {{run-job.sh}}, which would avoid this 
issue.
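
To illustrate the idea, here is a minimal sketch of capturing the two attributes once at submission time and reading them back at container start without touching the filesystem. It is not Samza code: the class {{PackageMetadata}}, its methods, and the config keys {{example.package.size}} and {{example.package.timestamp}} are all hypothetical, and it uses {{java.nio.file}} on a local file in place of the Hadoop {{FileSystem}} API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class PackageMetadata {
    // Hypothetical config keys, chosen for this sketch only.
    static final String SIZE_KEY = "example.package.size";
    static final String TIMESTAMP_KEY = "example.package.timestamp";

    // Submission side: read the package attributes once, while the
    // package file is known to be accessible, and store them in the config.
    static Map<String, String> capture(Path packagePath) throws IOException {
        Map<String, String> config = new HashMap<>();
        config.put(SIZE_KEY, Long.toString(Files.size(packagePath)));
        config.put(TIMESTAMP_KEY,
                Long.toString(Files.getLastModifiedTime(packagePath).toMillis()));
        return config;
    }

    // Container-start side: take the attributes from the config
    // instead of asking the filesystem for the file status again.
    static long size(Map<String, String> config) {
        return Long.parseLong(config.get(SIZE_KEY));
    }

    static long timestamp(Map<String, String> config) {
        return Long.parseLong(config.get(TIMESTAMP_KEY));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the job package: a temporary 4-byte file.
        Path pkg = Files.createTempFile("samza-job", ".tgz");
        Files.write(pkg, new byte[] {1, 2, 3, 4});

        Map<String, String> config = capture(pkg);

        // Remove the file to show that the later reads do not need it.
        Files.delete(pkg);

        System.out.println("size=" + size(config));
        System.out.println("timestamp=" + timestamp(config));
    }
}
```

With values like these available in the job config, the two setter calls quoted above could take {{size(config)}} and {{timestamp(config)}} as arguments, so the container start would not depend on the original package file being reachable.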




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
