[jira] [Updated] (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated MAPREDUCE-1461:

             Tags: rumen, folder, skip-jobs
       Resolution: Fixed
    Fix Version/s: 0.23.0
         Assignee: Rajesh Balamohan
     Release Note: Added a '-starts-after' option to Rumen's Folder utility. The time duration specified after the '-starts-after' option is an offset with respect to the submit time of the first job in the input trace. Jobs in the input trace whose submit time (relative to the first job's submit time) is less than the specified offset will be ignored.
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I just committed this. Thanks Rajesh!

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tools/rumen
>    Affects Versions: 0.23.0
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>             Fix For: 0.23.0
>
>         Attachments: MR-1461-trunk.patch, mapreduce-1461--2010-02-05.patch, mapreduce-1461--2010-03-04.patch, mr-1461-trunk-with-testcases.patch
>
>
> JSON outputs of rumen on production logs can be huge, on the order of multiple GB. Rumen's folder utility helps in getting a smaller snapshot of this JSON data.
> It would be helpful to have an option in rumen-folder wherein the user can specify a duration from which rumen-folder should start processing data.
> Related JIRA link: https://issues.apache.org/jira/browse/MAPREDUCE-1295

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
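The skipping rule stated in the release note can be illustrated with a small sketch. This is not the Rumen Folder implementation (which is written in Java); the `Job` record, the `skip_until_offset` function, and the millisecond units are hypothetical names and assumptions chosen for illustration, and the trace is assumed to be ordered by submit time so that the first record is the reference job.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Job:
    # Hypothetical stand-in for one job record in an input trace.
    job_id: str
    submit_time_ms: int


def skip_until_offset(trace: Sequence[Job], starts_after_ms: int) -> List[Job]:
    """Drop jobs submitted earlier than `starts_after_ms` after the first job.

    Assumption: `trace` is ordered by submit time, so trace[0] is the
    "first job in the input trace" that the offset is measured from.
    """
    if not trace:
        return []
    first_submit = trace[0].submit_time_ms
    # A job is ignored when its relative submit time is less than the offset.
    return [
        job for job in trace
        if job.submit_time_ms - first_submit >= starts_after_ms
    ]


trace = [Job("job_1", 1_000), Job("job_2", 61_000), Job("job_3", 121_000)]
kept = skip_until_offset(trace, starts_after_ms=60_000)
print([job.job_id for job in kept])
```

In this sketch a job whose relative submit time equals the offset is kept, which follows the "less than the specified offset will be ignored" wording of the release note.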
[jira] [Updated] (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan updated MAPREDUCE-1461:

    Attachment: mr-1461-trunk-with-testcases.patch

Attaching the patch with a negative test case as well.

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tools/rumen
>    Affects Versions: 0.23.0
>            Reporter: Rajesh Balamohan
>         Attachments: MR-1461-trunk.patch, mapreduce-1461--2010-02-05.patch, mapreduce-1461--2010-03-04.patch, mr-1461-trunk-with-testcases.patch
[jira] [Updated] (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan updated MAPREDUCE-1461:

    Affects Version/s: 0.23.0
               Status: Patch Available  (was: Open)

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tools/rumen
>    Affects Versions: 0.23.0
>            Reporter: Rajesh Balamohan
>         Attachments: MR-1461-trunk.patch, mapreduce-1461--2010-02-05.patch, mapreduce-1461--2010-03-04.patch, mr-1461-trunk-with-testcases.patch
[jira] [Updated] (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan updated MAPREDUCE-1461:

    Status: Open  (was: Patch Available)

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tools/rumen
>            Reporter: Rajesh Balamohan
>         Attachments: MR-1461-trunk.patch, mapreduce-1461--2010-02-05.patch, mapreduce-1461--2010-03-04.patch
[jira] [Updated] (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan updated MAPREDUCE-1461:

    Status: Patch Available  (was: Open)

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tools/rumen
>            Reporter: Rajesh Balamohan
>         Attachments: MR-1461-trunk.patch, mapreduce-1461--2010-02-05.patch, mapreduce-1461--2010-03-04.patch
[jira] [Updated] (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan updated MAPREDUCE-1461:

    Attachment: MR-1461-trunk.patch

Regenerated the patch against the latest Apache trunk codebase.

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tools/rumen
>            Reporter: Rajesh Balamohan
>         Attachments: MR-1461-trunk.patch, mapreduce-1461--2010-02-05.patch, mapreduce-1461--2010-03-04.patch
[jira] Updated: (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated MAPREDUCE-1461:

    Component/s: tools/rumen

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tools/rumen
>            Reporter: Rajesh Balamohan
>             Fix For: 0.22.0
>
>         Attachments: mapreduce-1461--2010-02-05.patch, mapreduce-1461--2010-03-04.patch

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan updated MAPREDUCE-1461:

    Attachment: mapreduce-1461--2010-03-04.patch

I took the trunk version and generated the patch. Please refer to the attached file.

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>             Fix For: 0.22.0
>
>         Attachments: mapreduce-1461--2010-02-05.patch, mapreduce-1461--2010-03-04.patch
[jira] Updated: (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-1461:

    Status: Open  (was: Patch Available)

Please regenerate the patch relative to the root of the source tree

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>             Fix For: 0.22.0
>
>         Attachments: mapreduce-1461--2010-02-05.patch
[jira] Updated: (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hong Tang updated MAPREDUCE-1461:

    Status: Patch Available  (was: Open)

Mark patch available for hudson to pick up.

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>             Fix For: 0.22.0
>
>         Attachments: mapreduce-1461--2010-02-05.patch
[jira] Updated: (MAPREDUCE-1461) Feature to instruct rumen-folder utility to skip jobs worth of specific duration
[ https://issues.apache.org/jira/browse/MAPREDUCE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan updated MAPREDUCE-1461:

    Attachment: mapreduce-1461--2010-02-05.patch

The attached patch implements this feature. The user can specify the time duration to be skipped with the "-starts-after" command-line argument.

> Feature to instruct rumen-folder utility to skip jobs worth of specific duration
>
>                 Key: MAPREDUCE-1461
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1461
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>             Fix For: 0.22.0
>
>         Attachments: mapreduce-1461--2010-02-05.patch