[ https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780476#action_12780476 ]
Isabel Drost commented on MAHOUT-11:
------------------------------------

First of all, thanks for the review.

Passing the output collector directly - Yep, makes sense. Will change and resubmit the patch.

Tests with real data: Big thanks for that.

Isabel

> Static fields used throughout clustering code (Canopy, K-Means).
> ----------------------------------------------------------------
>
>                 Key: MAHOUT-11
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-11
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.1
>            Reporter: Dawid Weiss
>             Fix For: 0.3
>
>         Attachments: MAHOUT-11.patch
>
>
> I file this as a bug, even though I'm not 100% sure it is one. In the current
> code the information is exchanged via static fields (for example, the distance
> measure and thresholds for Canopies are static fields). Is it always true in
> Hadoop that one job runs inside one JVM with exclusive access? I haven't seen
> it anywhere in the Hadoop documentation, and my impression was that everything
> uses JobConf to pass configuration to jobs, but jobs are configured on a
> per-object basis (a job is an object, a mapper is an object and everything
> else is basically an object).
> If it's possible for two jobs to run in parallel inside one JVM, then this is
> a limitation and a bug in our code that needs to be addressed.
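
To make the concern in the issue description concrete, here is a minimal sketch in plain Java. It is not the Mahout code and not the attached patch: the class names, the "t1" key, and the use of a plain Map in place of Hadoop's JobConf are all assumptions made for illustration. It only shows why configuration held in static fields breaks once two differently configured jobs share a JVM, and why per-object configuration does not.

```java
import java.util.HashMap;
import java.util.Map;

public class StaticConfigSketch {

    /** Hypothetical clusterer that keeps its threshold in a static field. */
    static class StaticCanopy {
        static double t1; // shared by every instance in the JVM

        static void configure(Map<String, String> conf) {
            t1 = Double.parseDouble(conf.get("t1"));
        }

        boolean covers(double distance) {
            return distance < t1;
        }
    }

    /** Same logic, but the threshold is stored on the instance. */
    static class InstanceCanopy {
        private final double t1;

        InstanceCanopy(Map<String, String> conf) {
            this.t1 = Double.parseDouble(conf.get("t1"));
        }

        boolean covers(double distance) {
            return distance < t1;
        }
    }

    /** Builds a stand-in for a job configuration (hypothetical "t1" key). */
    static Map<String, String> conf(double t1) {
        Map<String, String> m = new HashMap<>();
        m.put("t1", Double.toString(t1));
        return m;
    }

    public static void main(String[] args) {
        // Job A wants t1 = 5.0 and job B wants t1 = 1.0, both in one JVM.
        StaticCanopy.configure(conf(5.0));
        StaticCanopy staticA = new StaticCanopy();
        StaticCanopy.configure(conf(1.0)); // job B overwrites job A's setting
        StaticCanopy staticB = new StaticCanopy();

        InstanceCanopy instanceA = new InstanceCanopy(conf(5.0));
        InstanceCanopy instanceB = new InstanceCanopy(conf(1.0));

        // With static state, job A silently sees job B's threshold.
        System.out.println("static   A covers 3.0: " + staticA.covers(3.0));   // false
        System.out.println("static   B covers 3.0: " + staticB.covers(3.0));   // false
        // With per-object state, each job keeps its own threshold.
        System.out.println("instance A covers 3.0: " + instanceA.covers(3.0)); // true
        System.out.println("instance B covers 3.0: " + instanceB.covers(3.0)); // false
    }
}
```

In this sketch the static variant gives job A the wrong answer without any error being raised, which is the kind of silent interference the issue asks about. Whether Hadoop actually runs two jobs in one JVM is the open question in the report; the sketch does not answer it.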