Re: [jira] [Commented] (MAHOUT-768) Duplicated DoubleFunction in mahout and mahout-collections (mahout.math package).

2012-01-18 Thread Ted Dunning
I think that this is a fine solution (so is merging). I may have some time tonight. On Wed, Jan 18, 2012 at 6:44 PM, Dawid Weiss (Commented) (JIRA) < j...@apache.org> wrote: > >[ > https://issues.apache.org/jira/browse/MAHOUT-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-t

Build failed in Jenkins: Mahout-Examples-Cluster-Reuters #17

2012-01-18 Thread Apache Jenkins Server
See -- Started by timer Building remotely on solaris1 hudson.util.IOException2: remote file operation failed: at hudson.

[jira] [Commented] (MAHOUT-768) Duplicated DoubleFunction in mahout and mahout-collections (mahout.math package).

2012-01-18 Thread Dawid Weiss (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188605#comment-13188605 ] Dawid Weiss commented on MAHOUT-768: Err... I must have forgotten to attach a comment

Build failed in Jenkins: Mahout-Examples-Cluster-Reuters-II #15

2012-01-18 Thread Apache Jenkins Server
See -- Started by timer Building remotely on solaris1 hudson.util.IOException2: remote file operation failed: at h

[jira] [Updated] (MAHOUT-950) Change BtJob to use new MultipleOutputs API

2012-01-18 Thread Tom White (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated MAHOUT-950: - Attachment: MAHOUT-950.patch Here's a patch which works against 0.23.1-SNAPSHOT (run with "mvn clean test

[jira] [Created] (MAHOUT-950) Change BtJob to use new MultipleOutputs API

2012-01-18 Thread Tom White (Created) (JIRA)
Change BtJob to use new MultipleOutputs API --- Key: MAHOUT-950 URL: https://issues.apache.org/jira/browse/MAHOUT-950 Project: Mahout Issue Type: Improvement Components: Math Repo

[jira] [Resolved] (MAHOUT-854) Add MinHash to build-reuters.sh example

2012-01-18 Thread Grant Ingersoll (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved MAHOUT-854. Resolution: Fixed > Add MinHash to build-reuters.sh example > -

Re: [jira] [Created] (MAHOUT-949) ClusterDumper option to skip vector terms and weights

2012-01-18 Thread Ioan Eugen Stan
On 18.01.2012 17:54, Ioan Eugen Stan (Created) (JIRA) wrote: ClusterDumper option to skip vector terms and weights - Key: MAHOUT-949 URL: https://issues.apache.org/jira/browse/MAHOUT-949 Proje

[jira] [Commented] (MAHOUT-768) Duplicated DoubleFunction in mahout and mahout-collections (mahout.math package).

2012-01-18 Thread Jeff Eastman (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188520#comment-13188520 ] Jeff Eastman commented on MAHOUT-768: - Can this issue be resolved soon or moved to 0.7

[jira] [Commented] (MAHOUT-854) Add MinHash to build-reuters.sh example

2012-01-18 Thread Jeff Eastman (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188518#comment-13188518 ] Jeff Eastman commented on MAHOUT-854: - Can this issue be closed? > A

[jira] [Created] (MAHOUT-949) ClusterDumper option to skip vector terms and weights

2012-01-18 Thread Ioan Eugen Stan (Created) (JIRA)
ClusterDumper option to skip vector terms and weights - Key: MAHOUT-949 URL: https://issues.apache.org/jira/browse/MAHOUT-949 Project: Mahout Issue Type: Improvement Components: C

Re: Extending mahout lucene.vector driver

2012-01-18 Thread Michael Kazekin
Thanks for your advice! I know about this functionality, but my problem is that I need to cluster very different "slices" of a potentially huge index (corpora of texts). So I thought there might be a fast way to obtain such a "slice" while having only one index (instead of creating an index eac

Re: Extending mahout lucene.vector driver

2012-01-18 Thread Frank Scholten
You can use a MatchAllDocsQuery if you want to fetch all documents. On Wed, Jan 18, 2012 at 10:36 AM, Michael Kazekin wrote: > Thank you, Frank! I'll definitely have a look at it. > > As far as I can see, the problem with using Lucene in clustering tasks > is that even with queries you get ac

Re: Extending mahout lucene.vector driver

2012-01-18 Thread Michael Kazekin
Thank you, Frank! I'll definitely have a look at it. As far as I can see, the problem with using Lucene in clustering tasks is that even with queries you get access to the "tip-of-the-iceberg" results only, while clustering tasks need to deal with the results as a whole. On 01/17/2012