Re: [jira] Assigned: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2009-12-10 Thread Isabel Drost
On Thu Sean Owen  wrote:

> Looks like Hudson is saying that broke the build but looks like easily
> addressable stuff.

Fixed it - but only shortly *after* Hudson had already started building
the project :/

Triggered the build on Hudson manually a few minutes ago - now it runs
successfully again.

Isabel



Re: [jira] Assigned: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2009-12-10 Thread Sean Owen
Looks like Hudson is saying that broke the build but looks like easily
addressable stuff.

On Dec 10, 2009 11:10 AM, "Isabel Drost (JIRA)"  wrote:

[
https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.plugin.system.issuet.
..
   Assignee: Drew Farris  (was: Isabel Drost)

Thanks.

> Static fields used throughout clustering code (Canopy, K-Means).
> --...
> Assignee: Drew Farris
> Fix For: 0.3
>
> Attachments: MAHOUT-11-all-cleanup-20091128.patch, MAHOUT-11-...


[jira] Assigned: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2009-12-10 Thread Isabel Drost (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Drost reassigned MAHOUT-11:
--

Assignee: Drew Farris  (was: Isabel Drost)

Thanks.

> Static fields used throughout clustering code (Canopy, K-Means).
> 
>
> Key: MAHOUT-11
> URL: https://issues.apache.org/jira/browse/MAHOUT-11
> Project: Mahout
>  Issue Type: Bug
>  Components: Clustering
>Affects Versions: 0.1
>Reporter: Dawid Weiss
>Assignee: Drew Farris
> Fix For: 0.3
>
> Attachments: MAHOUT-11-all-cleanup-20091128.patch, 
> MAHOUT-11-kmeans-cleanup.patch, MAHOUT-11-RandomSeedGenerator.patch, 
> MAHOUT-11.patch
>
>
> I file this as a bug, even though I'm not 100% sure it is one. In the current
> code, information is exchanged via static fields (for example, the distance
> measure and thresholds for Canopies are static fields). Is it always true in
> Hadoop that one job runs inside one JVM with exclusive access? I haven't seen
> it anywhere in the Hadoop documentation and my impression was that everything
> uses JobConf to pass configuration to jobs, but jobs are configured on a
> per-object basis (a job is an object, a mapper is an object and everything
> else is basically an object).
> If it's possible for two jobs to run in parallel inside one JVM then this is
> a limitation and a bug in our code that needs to be addressed.
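
To make the concern in the issue description concrete, here is a minimal
Java sketch of how static configuration could leak between two jobs if they
ever shared a JVM. StaticCanopy and its methods are hypothetical names made
up for illustration, not Mahout's actual classes, and whether Hadoop really
runs two jobs in one JVM is exactly the open question.

```java
// Hypothetical sketch; class and method names are assumptions, not Mahout code.
public class StaticConfigSketch {

    /** Canopy-like class that keeps its thresholds in static fields. */
    static class StaticCanopy {
        static double t1;
        static double t2;

        /** Overwrites the thresholds for every instance in this JVM. */
        static void configure(double newT1, double newT2) {
            t1 = newT1;
            t2 = newT2;
        }

        /** True if the given distance falls inside the t2 threshold. */
        boolean withinT2(double distance) {
            return distance < t2;
        }
    }

    public static void main(String[] args) {
        // "Job A" sets its thresholds and creates a canopy.
        StaticCanopy.configure(3.0, 1.5);
        StaticCanopy canopyA = new StaticCanopy();

        // "Job B", in the same JVM, sets different thresholds.
        StaticCanopy.configure(10.0, 5.0);

        // Job A's canopy now answers using job B's t2 (5.0), not its own (1.5).
        System.out.println(canopyA.withinT2(2.0)); // prints "true"
    }
}
```

With job A's own threshold (t2 = 1.5) the check would be false; because the
field is static, it reflects whatever was configured last.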

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2009-12-10 Thread Isabel Drost (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Drost reassigned MAHOUT-11:
--

Assignee: Isabel Drost

> Static fields used throughout clustering code (Canopy, K-Means).
> 
>
> Key: MAHOUT-11
> URL: https://issues.apache.org/jira/browse/MAHOUT-11
> Project: Mahout
>  Issue Type: Bug
>  Components: Clustering
>Affects Versions: 0.1
>Reporter: Dawid Weiss
>Assignee: Isabel Drost
> Fix For: 0.3
>
> Attachments: MAHOUT-11-all-cleanup-20091128.patch, 
> MAHOUT-11-kmeans-cleanup.patch, MAHOUT-11-RandomSeedGenerator.patch, 
> MAHOUT-11.patch
>
>
> I file this as a bug, even though I'm not 100% sure it is one. In the current
> code, information is exchanged via static fields (for example, the distance
> measure and thresholds for Canopies are static fields). Is it always true in
> Hadoop that one job runs inside one JVM with exclusive access? I haven't seen
> it anywhere in the Hadoop documentation and my impression was that everything
> uses JobConf to pass configuration to jobs, but jobs are configured on a
> per-object basis (a job is an object, a mapper is an object and everything
> else is basically an object).
> If it's possible for two jobs to run in parallel inside one JVM then this is
> a limitation and a bug in our code that needs to be addressed.




Re: [jira] Assigned: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2008-03-07 Thread Dawid Weiss


I do see a few advantages of using static variables, actually -- I just wasn't 
sure if it's contractual for Hadoop jobs to run in isolation from other jobs. 
This is a refactoring rather than functionality improvement, so I'll leave the 
issue open for some time; once I get a spare minute I'll look at Hadoop's code 
and see what's cooking there.


D.


Jeff Eastman wrote:

Dawid,

I'm not sure either, as it seems to work on deployed jobs where each
process only uses a single configuration of distance measure. I'm sure
one can easily create use cases where different t1 and t2 values are
required and this will break the static approach. I was going to move
the static variables back into the object and require each instance to
be configured individually, but I got sidetracked into vectors and
matrices and have not gotten to it. 


Go for it,
Jeff

-----Original Message-----
From: Dawid Weiss (JIRA) [mailto:[EMAIL PROTECTED] 
Sent: Thursday, March 06, 2008 4:59 AM
To: mahout-dev@lucene.apache.org
Subject: [jira] Assigned: (MAHOUT-11) Static fields used throughout
clustering code (Canopy, K-Means).


 [
https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.
plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss reassigned MAHOUT-11:
-

Assignee: Dawid Weiss


Static fields used throughout clustering code (Canopy, K-Means).


Key: MAHOUT-11
URL: https://issues.apache.org/jira/browse/MAHOUT-11
Project: Mahout
 Issue Type: Bug
 Components: Clustering
   Affects Versions: 0.1
   Reporter: Dawid Weiss
   Assignee: Dawid Weiss

I file this as a bug, even though I'm not 100% sure it is one. In the
current code, information is exchanged via static fields (for example,
the distance measure and thresholds for Canopies are static fields).
Is it always true in Hadoop that one job runs inside one JVM with
exclusive access? I haven't seen it anywhere in the Hadoop documentation
and my impression was that everything uses JobConf to pass configuration
to jobs, but jobs are configured on a per-object basis (a job is an
object, a mapper is an object and everything else is basically an object).

If it's possible for two jobs to run in parallel inside one JVM then
this is a limitation and a bug in our code that needs to be addressed.



RE: [jira] Assigned: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2008-03-06 Thread Jeff Eastman
Dawid,

I'm not sure either, as it seems to work on deployed jobs where each
process only uses a single configuration of distance measure. I'm sure
one can easily create use cases where different t1 and t2 values are
required and this will break the static approach. I was going to move
the static variables back into the object and require each instance to
be configured individually, but I got sidetracked into vectors and
matrices and have not gotten to it. 
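
A rough sketch of what moving the thresholds back into the object could look
like, assuming each instance reads its own t1 and t2 from a per-job
configuration. A plain Map stands in for Hadoop's JobConf here so the example
runs on its own; ConfiguredCanopy and the "sketch.canopy.*" keys are made up
for illustration and are not Mahout's actual names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and config keys are assumptions, not Mahout code.
public class InstanceConfigSketch {

    /** Canopy-like class whose thresholds live in instance fields. */
    static class ConfiguredCanopy {
        private final double t1;
        private final double t2;

        /** Reads the thresholds from this job's own configuration. */
        ConfiguredCanopy(Map<String, String> conf) {
            this.t1 = Double.parseDouble(conf.get("sketch.canopy.t1"));
            this.t2 = Double.parseDouble(conf.get("sketch.canopy.t2"));
        }

        boolean withinT1(double distance) {
            return distance < t1;
        }

        boolean withinT2(double distance) {
            return distance < t2;
        }
    }

    /** Builds a stand-in for a per-job configuration object. */
    static Map<String, String> jobConf(double t1, double t2) {
        Map<String, String> conf = new HashMap<>();
        conf.put("sketch.canopy.t1", Double.toString(t1));
        conf.put("sketch.canopy.t2", Double.toString(t2));
        return conf;
    }

    public static void main(String[] args) {
        // Two jobs with different thresholds coexist without interfering.
        ConfiguredCanopy canopyA = new ConfiguredCanopy(jobConf(3.0, 1.5));
        ConfiguredCanopy canopyB = new ConfiguredCanopy(jobConf(10.0, 5.0));

        System.out.println(canopyA.withinT2(2.0)); // prints "false"
        System.out.println(canopyB.withinT2(2.0)); // prints "true"
    }
}
```

Each instance keeps the values it was constructed with, so different t1 and
t2 settings in the same process no longer overwrite each other.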

Go for it,
Jeff

-----Original Message-----
From: Dawid Weiss (JIRA) [mailto:[EMAIL PROTECTED] 
Sent: Thursday, March 06, 2008 4:59 AM
To: mahout-dev@lucene.apache.org
Subject: [jira] Assigned: (MAHOUT-11) Static fields used throughout
clustering code (Canopy, K-Means).


 [
https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.
plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss reassigned MAHOUT-11:
-

Assignee: Dawid Weiss

> Static fields used throughout clustering code (Canopy, K-Means).
> 
>
> Key: MAHOUT-11
> URL: https://issues.apache.org/jira/browse/MAHOUT-11
> Project: Mahout
>  Issue Type: Bug
>  Components: Clustering
>Affects Versions: 0.1
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>
> I file this as a bug, even though I'm not 100% sure it is one. In the
> current code, information is exchanged via static fields (for example,
> the distance measure and thresholds for Canopies are static fields).
> Is it always true in Hadoop that one job runs inside one JVM with
> exclusive access? I haven't seen it anywhere in the Hadoop documentation
> and my impression was that everything uses JobConf to pass configuration
> to jobs, but jobs are configured on a per-object basis (a job is an
> object, a mapper is an object and everything else is basically an object).
> If it's possible for two jobs to run in parallel inside one JVM then
> this is a limitation and a bug in our code that needs to be addressed.




[jira] Assigned: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2008-03-06 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss reassigned MAHOUT-11:
-

Assignee: Dawid Weiss

> Static fields used throughout clustering code (Canopy, K-Means).
> 
>
> Key: MAHOUT-11
> URL: https://issues.apache.org/jira/browse/MAHOUT-11
> Project: Mahout
>  Issue Type: Bug
>  Components: Clustering
>Affects Versions: 0.1
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>
> I file this as a bug, even though I'm not 100% sure it is one. In the current
> code, information is exchanged via static fields (for example, the distance
> measure and thresholds for Canopies are static fields). Is it always true in
> Hadoop that one job runs inside one JVM with exclusive access? I haven't seen
> it anywhere in the Hadoop documentation and my impression was that everything
> uses JobConf to pass configuration to jobs, but jobs are configured on a
> per-object basis (a job is an object, a mapper is an object and everything
> else is basically an object).
> If it's possible for two jobs to run in parallel inside one JVM then this is
> a limitation and a bug in our code that needs to be addressed.
