[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-03-11 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1403:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

TestJobRetire passes on my machine.

+1

I committed this. Thanks, Arun and Luke!

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch, 
> MR-1403-trunk-1.patch, mr-1403-trunk-v2.patch, mr-1403-trunk-v3.patch, 
> mr-1403-trunk-v4.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-03-10 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated MAPREDUCE-1403:
---

Attachment: mr-1403-trunk-v4.patch

Added javadoc for the getConfiguration method in the RunningJob interface.

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch, 
> MR-1403-trunk-1.patch, mr-1403-trunk-v2.patch, mr-1403-trunk-v3.patch, 
> mr-1403-trunk-v4.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-03-09 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1403:
-

Status: Patch Available  (was: Open)

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch, 
> MR-1403-trunk-1.patch, mr-1403-trunk-v2.patch, mr-1403-trunk-v3.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-03-09 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1403:
-

Status: Open  (was: Patch Available)

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch, 
> MR-1403-trunk-1.patch, mr-1403-trunk-v2.patch, mr-1403-trunk-v3.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-03-09 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated MAPREDUCE-1403:
---

Attachment: mr-1403-trunk-v3.patch

Rebased the patch against trunk.

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch, 
> MR-1403-trunk-1.patch, mr-1403-trunk-v2.patch, mr-1403-trunk-v3.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-03-09 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated MAPREDUCE-1403:
---

Status: Patch Available  (was: Open)

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch, 
> MR-1403-trunk-1.patch, mr-1403-trunk-v2.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-03-09 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated MAPREDUCE-1403:
---

Attachment: mr-1403-trunk-v2.patch

Discussed items 2 and 3 with Chris; incorporated the rest of the suggestions.

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch, 
> MR-1403-trunk-1.patch, mr-1403-trunk-v2.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-03-05 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1403:
-

Status: Open  (was: Patch Available)

* The patch includes a whitespace change to {{Job}}
* Can you explain the addition of {{getConfiguration}} to the {{RunningJob}} 
interface? Is the relevant copy in {{JobContextImpl}}?
* Please retain javadoc for {{CACHE_FILES_SIZES}} and {{CACHE_ARCHIVES_SIZES}}, 
rather than code comments
* javadoc referring to classes (as 
{{TrackerDistributedCacheManager::getFileStatus}}) should probably use 
{{{@link org.class.name}}} instead of {{name}}
* The actual/expected in the error message for {{assertEquals}} is redundant

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch, 
> MR-1403-trunk-1.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-02-25 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated MAPREDUCE-1403:
---

Attachment: MR-1403-trunk-1.patch

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch, 
> MR-1403-trunk-1.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-02-25 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated MAPREDUCE-1403:
---

Affects Version/s: 0.22.0
   Status: Patch Available  (was: Open)

Ported to trunk.

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-02-24 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-1403:


Attachment: MAPREDUCE-1403_yhadoop20-2.patch

More up-to-date patch for an older version of Hadoop. Not for commit here.

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20-2.patch, MAPREDUCE-1403_yhadoop20.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-02-15 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-1403:


Release Note: Added private configuration variables: 
mapred.cache.files.filesizes and mapred.cache.archives.filesizes to store sizes 
of distributed cache artifacts per job. This can be used by tools like Gridmix 
in simulation runs. 
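
For illustration only, a tool could read these values roughly as follows. This is a minimal sketch, not code from the patch: it assumes each property holds a comma-separated list of byte counts, and it uses java.util.Properties as a stand-in for the job configuration. The class and method names (CacheFileSizes, parseSizes, totalBytes) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class CacheFileSizes {
    // Property names as given in the release note above.
    public static final String CACHE_FILES_SIZES = "mapred.cache.files.filesizes";
    public static final String CACHE_ARCHIVES_SIZES = "mapred.cache.archives.filesizes";

    /** Parses a comma-separated list of byte counts (assumed encoding). */
    public static List<Long> parseSizes(String value) {
        List<Long> sizes = new ArrayList<>();
        if (value == null || value.trim().isEmpty()) {
            return sizes; // property unset: the job used no cache artifacts
        }
        for (String token : value.split(",")) {
            sizes.add(Long.parseLong(token.trim()));
        }
        return sizes;
    }

    /** Sums the parsed sizes, e.g. to estimate localization cost per job. */
    public static long totalBytes(List<Long> sizes) {
        long total = 0;
        for (long size : sizes) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        // Properties stands in for a JobConf here; the values are made up.
        Properties conf = new Properties();
        conf.setProperty(CACHE_FILES_SIZES, "1024,2048");
        List<Long> sizes = parseSizes(conf.getProperty(CACHE_FILES_SIZES));
        System.out.println(sizes + " total=" + totalBytes(sizes));
    }
}
```

Since the release note calls the variables private, a tool relying on them should probably treat the format as subject to change.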

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-02-13 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-1403:


Attachment: MAPREDUCE-1403_yhadoop20-1.patch

Arun, patch looks fine. There were a few minor nits that I have fixed in the 
attached patch:

- In DistributedCache.java, I refactored getTimestamp to reuse getFileStatus, 
as the entire code was duplicated.
- In JobClient.java, there was an extraneous String.valueOf when constructing 
the modification time buffer. Something like:
{code}
+new StringBuffer(String.valueOf(
+String.valueOf(status.getModificationTime())));
{code}
Fixed that to remove the extraneous call.
- In MRCaching, changed a System.err.println to System.out.println so that 
the debug statement appears inline with the rest of the output, which should 
make debugging easier if required.

Please verify the changes once.
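
To make the String.valueOf point concrete, here is a small standalone sketch (not the JobClient code itself). It assumes the buffer holds comma-separated modification times; the class and method names (TimestampBuffer, joinTimestamps) are hypothetical. Wrapping a String in a second String.valueOf returns an equal string, so the outer call adds nothing.

```java
public class TimestampBuffer {
    /** Joins modification times into one comma-separated string. */
    public static String joinTimestamps(long[] modificationTimes) {
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < modificationTimes.length; i++) {
            if (i > 0) {
                buf.append(',');
            }
            // A single conversion is enough; append(long) would also work.
            buf.append(String.valueOf(modificationTimes[i]));
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        long t = 1265700000000L; // made-up timestamp for illustration
        String once = String.valueOf(t);
        String twice = String.valueOf(String.valueOf(t));
        System.out.println(once.equals(twice)); // the second call is redundant
        System.out.println(joinTimestamps(new long[] {t, t + 1}));
    }
}
```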

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20-1.patch, 
> MAPREDUCE-1403_yhadoop20.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.




[jira] Updated: (MAPREDUCE-1403) Save file-sizes of each of the artifacts in DistributedCache in the JobConf

2010-02-09 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-1403:
-

Attachment: MAPREDUCE-1403_yhadoop20.patch

Patch for y20 distribution. Not to be committed.

> Save file-sizes of each of the artifacts in DistributedCache in the JobConf
> ---
>
> Key: MAPREDUCE-1403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1403
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1403_yhadoop20.patch
>
>
> It would be a useful metric to collect... potentially GridMix could use it to 
> emulate jobs which use the DistributedCache.
