[jira] [Updated] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

2011-04-27 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-2153:
--

         Tags: rumen, job-conf, job-properties
   Resolution: Fixed
Fix Version/s: 0.23.0
 Release Note: Adds job configuration parameters to the job trace. The
               configuration parameters are stored under the 'jobProperties'
               field as key-value pairs.
 Hadoop Flags: [Reviewed]
       Status: Resolved  (was: Patch Available)

I just committed this. Thanks Rajesh and Ravi!

> Bring in more job configuration properties in to the trace file
> ---
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Affects Versions: 0.23.0
> Reporter: Ravi Gummadi
> Assignee: Rajesh Balamohan
> Fix For: 0.23.0
>
> Attachments: MR-2153-patch.txt, MapReduce-2153-trunk.patch, 
> MapReduce-2153-trunk.patch, mr-2153-test-patch-results.txt
>
>
> To emulate distributed cache usage in gridmix jobs, the following 9 
> configuration properties need to be available in the trace file:
> (1) mapreduce.job.cache.files
> (2) mapreduce.job.cache.files.visibilities
> (3) mapreduce.job.cache.files.filesizes
> (4) mapreduce.job.cache.files.timestamps
> (5) mapreduce.job.cache.archives
> (6) mapreduce.job.cache.archives.visibilities
> (7) mapreduce.job.cache.archives.filesizes
> (8) mapreduce.job.cache.archives.timestamps
> (9) mapreduce.job.cache.symlink.create
> To emulate data compression in gridmix jobs, the trace file should contain the 
> following configuration properties:
> (1) mapreduce.map.output.compress
> (2) mapreduce.map.output.compress.codec
> (3) mapreduce.output.fileoutputformat.compress
> (4) mapreduce.output.fileoutputformat.compress.codec
> (5) mapreduce.output.fileoutputformat.compress.type
> Ideally, gridmix should set many job-specific configuration properties, such as 
> io.sort.mb and io.sort.factor, when running simulated jobs, so that they have 
> the same effect as the original job in terms of spilled records, number of 
> merges, etc.
> TraceBuilder should bring all of these properties into the generated trace 
> file.
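
As a rough illustration of how a consumer of the trace could check for the properties listed above, here is a minimal Python sketch. It assumes only what the release note states, namely that each job record carries a 'jobProperties' map of key-value pairs. The sample record, the job ID value, and the helper name missing_properties are hypothetical and are not part of Rumen or gridmix.

```python
import json

# Property names copied from the issue description.
DISTCACHE_KEYS = [
    "mapreduce.job.cache.files",
    "mapreduce.job.cache.files.visibilities",
    "mapreduce.job.cache.files.filesizes",
    "mapreduce.job.cache.files.timestamps",
    "mapreduce.job.cache.archives",
    "mapreduce.job.cache.archives.visibilities",
    "mapreduce.job.cache.archives.filesizes",
    "mapreduce.job.cache.archives.timestamps",
    "mapreduce.job.cache.symlink.create",
]
COMPRESSION_KEYS = [
    "mapreduce.map.output.compress",
    "mapreduce.map.output.compress.codec",
    "mapreduce.output.fileoutputformat.compress",
    "mapreduce.output.fileoutputformat.compress.codec",
    "mapreduce.output.fileoutputformat.compress.type",
]


def missing_properties(job_record, required):
    """Return the required keys that are absent from the record's 'jobProperties' map."""
    props = job_record.get("jobProperties") or {}
    return [key for key in required if key not in props]


# Hypothetical single-job record; a real trace holds many more fields.
sample = json.loads("""
{
  "jobID": "job_0001",
  "jobProperties": {
    "mapreduce.map.output.compress": "true",
    "mapreduce.job.cache.files": "hdfs:///tmp/lookup.txt"
  }
}
""")

for label, keys in (("distributed cache", DISTCACHE_KEYS),
                    ("compression", COMPRESSION_KEYS)):
    print(label, "properties not in trace:", missing_properties(sample, keys))
```

A missing key is not necessarily an error, since a job that never used the distributed cache or compression may simply not have set the property.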

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

2011-04-12 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated MAPREDUCE-2153:


Attachment: MR-2153-patch.txt

Fixed the javac warnings in the earlier patch.

> Bring in more job configuration properties in to the trace file
> ---
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Affects Versions: 0.23.0
> Reporter: Ravi Gummadi
> Assignee: Rajesh Balamohan
> Attachments: MR-2153-patch.txt, MapReduce-2153-trunk.patch, 
> MapReduce-2153-trunk.patch, mr-2153-test-patch-results.txt
>
>


[jira] [Updated] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

2011-04-12 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated MAPREDUCE-2153:


Assignee: Rajesh Balamohan
  Status: Open  (was: Patch Available)

> Bring in more job configuration properties in to the trace file
> ---
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Affects Versions: 0.23.0
> Reporter: Ravi Gummadi
> Assignee: Rajesh Balamohan
> Attachments: MR-2153-patch.txt, MapReduce-2153-trunk.patch, 
> MapReduce-2153-trunk.patch, mr-2153-test-patch-results.txt
>
>


[jira] [Updated] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

2011-04-12 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated MAPREDUCE-2153:


Status: Patch Available  (was: Open)

> Bring in more job configuration properties in to the trace file
> ---
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Affects Versions: 0.23.0
> Reporter: Ravi Gummadi
> Assignee: Rajesh Balamohan
> Attachments: MR-2153-patch.txt, MapReduce-2153-trunk.patch, 
> MapReduce-2153-trunk.patch, mr-2153-test-patch-results.txt
>
>


[jira] [Updated] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

2011-04-11 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-2153:
--

Status: Open  (was: Patch Available)

Cancelling as Hudson picked up the wrong file.

> Bring in more job configuration properties in to the trace file
> ---
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Affects Versions: 0.23.0
> Reporter: Ravi Gummadi
> Attachments: MapReduce-2153-trunk.patch, MapReduce-2153-trunk.patch, 
> mr-2153-test-patch-results.txt
>
>


[jira] [Updated] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

2011-04-11 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated MAPREDUCE-2153:


Status: Patch Available  (was: Open)

> Bring in more job configuration properties in to the trace file
> ---
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Affects Versions: 0.23.0
> Reporter: Ravi Gummadi
> Attachments: MapReduce-2153-trunk.patch, MapReduce-2153-trunk.patch, 
> mr-2153-test-patch-results.txt
>
>


[jira] [Updated] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

2011-04-11 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated MAPREDUCE-2153:


Attachment: MapReduce-2153-trunk.patch

Uploading the same patch again so that it runs via Hudson.

> Bring in more job configuration properties in to the trace file
> ---
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Affects Versions: 0.23.0
> Reporter: Ravi Gummadi
> Attachments: MapReduce-2153-trunk.patch, MapReduce-2153-trunk.patch, 
> mr-2153-test-patch-results.txt
>
>


[jira] [Updated] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

2011-04-11 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated MAPREDUCE-2153:


Affects Version/s: 0.23.0
   Status: Patch Available  (was: Open)

> Bring in more job configuration properties in to the trace file
> ---
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Affects Versions: 0.23.0
> Reporter: Ravi Gummadi
> Attachments: MapReduce-2153-trunk.patch, 
> mr-2153-test-patch-results.txt
>
>


[jira] [Updated] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

2011-04-06 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated MAPREDUCE-2153:


Attachment: mr-2153-test-patch-results.txt

Attaching the ant test-patch results.

The findbugs warnings are not related to this patch.

> Bring in more job configuration properties in to the trace file
> ---
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Reporter: Ravi Gummadi
> Attachments: MapReduce-2153-trunk.patch, 
> mr-2153-test-patch-results.txt
>
>


[jira] [Updated] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

2011-04-06 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated MAPREDUCE-2153:


Attachment: MapReduce-2153-trunk.patch

Attaching the patch for Apache trunk. This patch ensures that all job 
properties are saved in the JSON file under the "jobProperties" tag.
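
To make the idea concrete, the following Python sketch flattens a job configuration in Hadoop's standard <configuration>/<property> XML layout into a key-value map and places it under a "jobProperties" key. This is only an illustration of the resulting shape, not the code in the attached patch or in Rumen's TraceBuilder; the function name conf_xml_to_properties and the sample values are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def conf_xml_to_properties(conf_xml):
    """Flatten a Hadoop-style configuration XML string into a name -> value dict."""
    root = ET.fromstring(conf_xml)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # Store every value as a string; an empty <value/> becomes "".
            props[name] = value if value is not None else ""
    return props


# Hypothetical job configuration with two of the properties named in the issue.
SAMPLE_CONF = """<configuration>
  <property><name>io.sort.mb</name><value>100</value></property>
  <property><name>mapreduce.map.output.compress</name><value>true</value></property>
</configuration>"""

# Illustrative job record; a real trace record has many other fields.
record = {"jobProperties": conf_xml_to_properties(SAMPLE_CONF)}
print(json.dumps(record, indent=2, sort_keys=True))
```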

> Bring in more job configuration properties in to the trace file
> ---
>
> Key: MAPREDUCE-2153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tools/rumen
> Reporter: Ravi Gummadi
> Attachments: MapReduce-2153-trunk.patch
>
>