Hello, I was able to fix the aforementioned problem by implementing a custom partitioner instead of using the default hash partitioner. I have another issue, though. After running the post processor, the number of points in each cluster's directory does not match the number of points each cluster should contain, as reported by clusterdumper.
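For anyone hitting the same problem, here is a minimal sketch of why the default hash partitioner can put two clusters in one part file and leave others empty. It is plain Java with no Hadoop dependency. The cluster ids (3, 9, 14), the class name PartitionSketch and the helper methods are hypothetical, and the index-based mapping is only one possible approach, not necessarily the partitioner actually used here. It assumes the key's hash code equals the integer cluster id.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class PartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner.getPartition(),
    // assuming the key's hashCode() is the integer cluster id itself.
    static int hashPartition(int clusterId, int numReduceTasks) {
        return (Integer.hashCode(clusterId) & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical alternative: assign each distinct cluster id its own
    // partition index (0, 1, 2, ...) in sorted order of the ids.
    static Map<Integer, Integer> buildIndex(Collection<Integer> clusterIds) {
        Map<Integer, Integer> index = new HashMap<>();
        for (Integer id : new TreeSet<>(clusterIds)) {
            index.put(id, index.size());
        }
        return index;
    }

    // Look up the partition assigned to a cluster id by buildIndex().
    static int indexPartition(int clusterId, Map<Integer, Integer> index) {
        return index.get(clusterId);
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(3, 9, 14); // made-up cluster ids
        Map<Integer, Integer> index = buildIndex(ids);
        for (int id : ids) {
            System.out.println("cluster " + id
                + " hash=" + hashPartition(id, ids.size())
                + " index=" + indexPartition(id, index));
        }
    }
}
```

With one reducer per cluster, ids 3 and 9 both hash to partition 0 and partition 1 receives nothing, whereas the index-based mapping gives every cluster its own partition.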
The clusterdumper output includes:

MSV-287{ n=90 c=[0.05195, 0.05675, 0.07151, 0.05713, 0.06946,...}
MSV-145{ n=90 c=[0.93685, 0.93071, 0.93641, 0.94629, 0.94409,..}

The n reported in clusters-n-final for each cluster differs from the number of points actually contained in the directory for that cluster. Any idea why this is happening?

PS: The dataset on which I tested the algorithm has 1000 records with 200 attributes per record. I can share the dataset if needed.

Thanks,
Gaurav

On Fri, Jan 6, 2012 at 6:12 PM, Paritosh Ranjan <pran...@xebia.com> wrote:

> ClusterOutputPostProcessorDriver has options to run either sequentially
> or in a mapreduce way.
>
> If the clustering was done sequentially, then ClusterOutputPostProcessor
> should be run sequentially, and if the clustering was done in a mapreduce
> way, then run the ClusterOutputPostProcessor with option mapreduce=true.
>
> If you have already tried this, and it's still not working, then filing a
> bug (as Lance mentioned) would be appropriate.
>
> On 06-01-2012 17:18, gaurav redkar wrote:
>
>> Hello,
>> When I ran the ClusterOutputPostProcessor on synthetic_control_data in
>> mapreduce mode, I observed that one directory contained points belonging
>> to 2 other clusters, and the directories for those 2 clusters were not
>> created because their "part-*" files were empty and the function
>> "movePartFilesToRespectiveDirectories()" was not able to create the
>> directories to put them into. I have converted the sequence file
>> containing the points belonging to those 3 clusters into a text file (by
>> changing the output format to TextOutputFormat). Kindly find the attached
>> part-file, which can be viewed.
>> Any suggestions as to why this might be happening?
>> Note: The program runs fine in sequential mode.
>> Thanks.