Hi Chris,

Thanks for your response. I deeply appreciate it.

I don't know what you mean by that question. I use configuration:

* In the driver:
  Job job = Job.getInstance(new Configuration());
* In the CustomLineRecordReader:
  Configuration job = context.getConfiguration();

One of the biggest issues I have had is staying true to the mapreduce.* format.

Best wishes,

Chris MacKenzie

From: Chris Mawata <chris.maw...@gmail.com>
Reply-To: <user@hadoop.apache.org>
Date: Friday, 27 June 2014 14:11
To: <user@hadoop.apache.org>
Subject: Re: Partitioning and setup errors

The new Configuration() is suspicious. Are you setting configuration information manually?

Chris

On Jun 27, 2014 5:16 AM, "Chris MacKenzie" <stu...@chrismackenziephotography.co.uk> wrote:
> Hi,
>
> I realise my previous question may have been a bit naïve, and I also realise I
> am asking an awful lot here. Any advice would be greatly appreciated.
> * I have been using Hadoop 2.4 in local mode and am sticking to the
>   mapreduce.* side of the track.
> * I am using a Custom Line reader to read each sequence into a Map.
> * I have a partitioner class which is testing the key from the map class.
> * I've tried debugging in Eclipse with a breakpoint in the partitioner class,
>   but getPartition(LongWritable mapKey, Text sequenceString, int numReduceTasks)
>   is not being called.
> Could there be any reason for that?
>
> Because my map and reduce code works in local mode within Eclipse, I wondered
> if I might get the partitioner to work if I changed to Pseudo Distributed Mode,
> exporting a runnable jar from Eclipse (Kepler).
>
> I have several faults, both on my own computer's Pseudo Distributed Mode and on
> the university cluster's Pseudo Distributed Mode, which I set up. I've googled and
> read extensively but am not seeing a solution to any of these issues.
>
> I have this line:
> 14/06/27 11:45:27 WARN mapreduce.JobSubmitter: No job jar file set. User
> classes may not be found. See Job or Job#setJar(String).
>
> My driver code is:
> private void doParallelConcordance() throws Exception {
>
>     Path inDir = new Path("input_sequences/10_sequences.txt");
>     Path outDir = new Path("demo_output");
>
>     Job job = Job.getInstance(new Configuration());
>     job.setJarByClass(ParallelGeneticAlignment.class);
>     job.setOutputKeyClass(Text.class);
>     job.setOutputValueClass(IntWritable.class);
>
>     job.setInputFormatClass(CustomFileInputFormat.class);
>     job.setMapperClass(ConcordanceMapper.class);
>     job.setPartitionerClass(ConcordanceSequencePartitioner.class);
>     job.setReducerClass(ConcordanceReducer.class);
>
>     FileInputFormat.addInputPath(job, inDir);
>     FileOutputFormat.setOutputPath(job, outDir);
>
>     job.waitForCompletion(true);
> }
>
> On the university server I am getting this error:
> 4/06/27 11:45:40 INFO mapreduce.Job: Task Id :
> attempt_1403860966764_0003_m_000000_0, Status : FAILED
> Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
> par.gene.align.concordance.ConcordanceMapper not found
>
> On my machine the error is:
> 4/06/27 12:58:03 INFO mapreduce.Job: Task Id :
> attempt_1403864060032_0004_r_000000_2, Status : FAILED
> Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
> par.gene.align.concordance.ConcordanceReducer not found
>
> On the university server I get total paths to process:
> 14/06/27 11:45:27 INFO input.FileInputFormat: Total input paths to process : 1
> 14/06/27 11:45:28 INFO mapreduce.JobSubmitter: number of splits:1
>
> On my machine I get total paths to process:
> 14/06/27 12:57:09 INFO input.FileInputFormat: Total input paths to process : 0
> 14/06/27 12:57:36 INFO mapreduce.JobSubmitter: number of splits:0
>
> Being new to this community, I thought it polite to introduce myself. I'm
> planning to return to software development via an MSc at Heriot Watt
> University in Edinburgh. My MSc project is based on Fosters Genetic Sequence
> Alignment.
> I have written a sequential version; my goal is now to port it to Hadoop.
>
> Thanks in advance,
> Regards,
>
> Chris MacKenzie
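
A note on the partitioner question in the quoted message: as far as I understand Hadoop 2.x, when a job runs with a single reduce task (the default, and the driver above never calls job.setNumReduceTasks), the map side does not instantiate the configured partitioner at all and simply sends every record to partition 0, so a breakpoint in getPartition would never be hit. The sketch below is a plain-Java model of that dispatch, not Hadoop code. PartitionSketch, SimplePartitioner, CUSTOM and choosePartition are hypothetical names made up for illustration.

```java
/** Plain-Java model (no Hadoop dependency) of how a map task picks a partition. */
public class PartitionSketch {

    /** Hypothetical stand-in for a custom partitioner's getPartition method. */
    public interface SimplePartitioner {
        int getPartition(long key, String value, int numReduceTasks);
    }

    /** Counts how often the custom partitioner is actually consulted. */
    public static int customCalls = 0;

    /** Illustrative custom partitioner: spreads keys by key modulo reducer count. */
    public static final SimplePartitioner CUSTOM = (key, value, n) -> {
        customCalls++;
        return (int) ((key & Long.MAX_VALUE) % n);
    };

    /**
     * Assumed dispatch rule: the custom partitioner is only used when there is
     * more than one reduce task; otherwise every record goes to the last
     * (and only) partition, which is partition 0.
     */
    public static int choosePartition(long key, String value,
                                      int numReduceTasks, SimplePartitioner custom) {
        if (numReduceTasks > 1) {
            return custom.getPartition(key, value, numReduceTasks);
        }
        return numReduceTasks - 1;
    }

    public static void main(String[] args) {
        // One reduce task: the custom partitioner is bypassed.
        int single = choosePartition(7L, "ACGT", 1, CUSTOM);
        System.out.println("reducers=1 partition=" + single + " customCalls=" + customCalls);

        // Four reduce tasks: the custom partitioner is consulted.
        int multi = choosePartition(7L, "ACGT", 4, CUSTOM);
        System.out.println("reducers=4 partition=" + multi + " customCalls=" + customCalls);
    }
}
```

If that is what is happening here, calling job.setNumReduceTasks(n) with n greater than 1 in the driver should cause ConcordanceSequencePartitioner to be used. That is an assumption about this particular job rather than something confirmed in the thread.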