[ 
https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780476#action_12780476
 ] 

Isabel Drost commented on MAHOUT-11:
------------------------------------

First of all, thanks for the review.

Passing the output collector directly: yes, that makes sense. I will make that 
change and resubmit the patch.

Tests with real data: Big thanks for that.

Isabel

> Static fields used throughout clustering code (Canopy, K-Means).
> ----------------------------------------------------------------
>
>                 Key: MAHOUT-11
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-11
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.1
>            Reporter: Dawid Weiss
>             Fix For: 0.3
>
>         Attachments: MAHOUT-11.patch
>
>
> I file this as a bug, even though I'm not 100% sure it is one. In the current 
> code the information is exchanged via static fields (for example, the distance 
> measure and thresholds for Canopies are static fields). Is it always true in 
> Hadoop that one job runs inside one JVM with exclusive access? I haven't seen 
> it anywhere in Hadoop documentation and my impression was that everything 
> uses JobConf to pass configuration to jobs, but jobs are configured on a 
> per-object basis (a job is an object, a mapper is an object and everything 
> else is basically an object).
> If it's possible for two jobs to run in parallel inside one JVM then this is 
> a limitation and bug in our code that needs to be addressed.
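
The concern in the quoted description can be illustrated with a small, 
Hadoop-free sketch. Everything in it is hypothetical: the class names, the 
"t1" key, and the plain Map standing in for JobConf are assumptions made for 
illustration and are not taken from the Mahout code or the attached patch. It 
only shows why a static field is shared by all instances in one JVM, while an 
instance field set in a configure-style method is not.

```java
import java.util.HashMap;
import java.util.Map;

public class StaticVsInstanceConfig {

    /** Pattern the issue warns about: one threshold shared by the whole JVM. */
    static class StaticCanopy {
        static double t1; // shared by every StaticCanopy; last writer wins

        void configure(Map<String, String> conf) {
            t1 = Double.parseDouble(conf.get("t1"));
        }

        double threshold() {
            return t1;
        }
    }

    /** Per-object pattern: each instance keeps its own copy of the setting. */
    static class InstanceCanopy {
        private double t1; // private to this object

        void configure(Map<String, String> conf) {
            t1 = Double.parseDouble(conf.get("t1"));
        }

        double threshold() {
            return t1;
        }
    }

    /** Builds a stand-in configuration (a plain Map, not a real JobConf). */
    static Map<String, String> conf(double t1) {
        Map<String, String> m = new HashMap<>();
        m.put("t1", Double.toString(t1));
        return m;
    }

    public static void main(String[] args) {
        // Two "jobs" configured with different thresholds in the same JVM.
        StaticCanopy s1 = new StaticCanopy();
        StaticCanopy s2 = new StaticCanopy();
        s1.configure(conf(3.0));
        s2.configure(conf(1.0));
        // s1 now sees the value written for s2.
        System.out.println("static:   " + s1.threshold() + " / " + s2.threshold());

        InstanceCanopy i1 = new InstanceCanopy();
        InstanceCanopy i2 = new InstanceCanopy();
        i1.configure(conf(3.0));
        i2.configure(conf(1.0));
        // Each instance keeps the value it was configured with.
        System.out.println("instance: " + i1.threshold() + " / " + i2.threshold());
    }
}
```

Whether two jobs can actually share a JVM in Hadoop is the open question in 
the description; the sketch only shows what would go wrong if they did.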

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
