[ https://issues.apache.org/jira/browse/MAPREDUCE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853309#action_12853309 ]
Hong Tang commented on MAPREDUCE-1594: -------------------------------------- - GenerateData should extend from GridmixJob instead of LoadJob. I think we can have a default implementation of buildSplits (as an empty function) in GridmixJob and remove the "abstract" keyword. - The indentation of JobCreator looks weird. - Replace the following with Configuration.getEnum: {noformat} + Configuration conf, JobCreator defaultPolicy) { + String policy = conf.get(GRIDMIX_JOB_TYPE, defaultPolicy.name()); + return valueOf(policy.toUpperCase()); {noformat} - unused and extra imports (CommonConfigurationKeys, FsPermission) in TestGridmixSubmission.java. - There does not seem be any unit tests covering the added feature. - Avoiding directly setting "gridmix.job.seq" in both LoadJob and SleepJob. Instead, refactor the statement to a common method in GridmixJob called setSeqId(Job job). Similarly, adding a method getSeqId(Job job) in GridmixJob and avoid directly calling conf.get("girdmix.job.seq", -1) in {GridmixInputFormat, SleepInputFormat}.getSplits(...). - Should we rename Gridmix{Mapper, Reducer, InputFormat, RecordReader, Split} to Load{Mapper, Reducer, InputFormat, RecordReader, Split}? - Also suggest to refactor the statement of setting the original job id to a method in GridmixJob. - There are some comments in LoadJob that are hard to understand. {noformat} // TODO replace with ThreadLocal submitter? // not nesc when TL {noformat} - reduces in SleepMapper is not used. - SleepJob hacks GridmixKey to pass along the sleep duration from map tasks to reduce tasks. We should document that in the code and file a jira to fix it. - The following code does not seem to do what the comments claim to: {noformat} //This is to stop accumulating deviation from expected sleep time //over a period of time. long now = System.nanoTime() / 1000000; if (now < duration) { duration = duration - now; } long slept = 0L; long sleep = 0L; while (slept < duration) { final long rem = duration - slept; sleep = Math.min(rem, RINTERVAL); context.setStatus("Sleeping... " + rem + " ms left"); slept += sleep; TimeUnit.MILLISECONDS.sleep(sleep); } {noformat} Should it be more like the following? {noformat} //This is to stop accumulating deviation from expected sleep time //over a period of time. long start = System.nanoTime() / 1000000; long slept = 0L; long sleep = 0L; while (slept < duration) { final long rem = duration - slept; sleep = Math.min(rem, RINTERVAL); context.setStatus("Sleeping... " + rem + " ms left"); TimeUnit.MILLISECONDS.sleep(sleep); slept = System.nanoTime() / 1000000 - start; } {noformat} - The following seems incorrect: {noformat} final long RINTERVAL = TimeUnit.MILLISECONDS.convert( context.getConfiguration().getLong( GRIDMIX_SLEEP_INTERVAL, Math.min( 1, duration / 20)), TimeUnit.SECONDS); {noformat} "duration" is in ms not in seconds. It should be changed to {noformat} final long RINTERVAL = TimeUnit.MILLISECONDS.convert( context.getConfiguration().getLong( GRIDMIX_SLEEP_INTERVAL, Math.min( 1, Math.max(1, duration/1000/20))), TimeUnit.SECONDS); {noformat} - I cannot find anywhere outdir is used by SleepJob. Did you encounter an error if FOF.setOutputPath is commented out in SleepJob.call()? - Both SleepJob and GridmixJob calls FileInputFormat.addInputPath(job, new path("ignored")), but one is surrounded with a try-catch block and the other is not. Not sure why. I am also curious to know what would be the error if FIF.addInputPath is not called in both classes. > Support for Sleep Jobs in gridmix > --------------------------------- > > Key: MAPREDUCE-1594 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1594 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: contrib/gridmix > Reporter: rahul k singh > Attachments: 1376-5-yhadoop20-100-3.patch, > 1594-yhadoop-20-1xx-1-2.patch, 1594-yhadoop-20-1xx-1-3.patch, > 1594-yhadoop-20-1xx-1.patch, 1594-yhadoop-20-1xx.patch > > > Support for Sleep jobs in gridmix -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.