[jira] [Commented] (MAHOUT-986) OutOfMemoryError in LanczosState by way of SpectralKMeans

2012-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288222#comment-13288222 ] Grant Ingersoll commented on MAHOUT-986: Do we have a test case for this? What's

[jira] [Resolved] (MAHOUT-1023) TestFuzzyKmeans is throwing NPE

2012-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved MAHOUT-1023. - Resolution: Fixed > TestFuzzyKmeans is throwing NPE > --

[jira] [Commented] (MAHOUT-768) Duplicated DoubleFunction in mahout and mahout-collections (mahout.math package).

2012-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288216#comment-13288216 ] Grant Ingersoll commented on MAHOUT-768: Ted, do you know the status of this? I d

[jira] [Created] (MAHOUT-1023) TestFuzzyKmeans is throwing NPE

2012-06-03 Thread Grant Ingersoll (JIRA)
Grant Ingersoll created MAHOUT-1023: --- Summary: TestFuzzyKmeans is throwing NPE Key: MAHOUT-1023 URL: https://issues.apache.org/jira/browse/MAHOUT-1023 Project: Mahout Issue Type: Bug

[jira] [Assigned] (MAHOUT-598) Downstream steps in the seq2sparse job flow looking in wrong location for output from previous steps when running in Elastic MapReduce (EMR) cluster

2012-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned MAHOUT-598: -- Assignee: Grant Ingersoll (was: Robin Anil) > Downstream steps in the seq2sparse j

[jira] [Resolved] (MAHOUT-1020) The Cluster Evaluator is returning bad results

2012-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved MAHOUT-1020. - Resolution: Fixed I think the build failure is unrelated to this issue. So, closing thi

[jira] [Commented] (MAHOUT-1006) Example from book no longer works - prepare20newsgroups broken with Lucene upgrade

2012-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288207#comment-13288207 ] Grant Ingersoll commented on MAHOUT-1006: - Robin is fixing some other things w/ N

[jira] [Assigned] (MAHOUT-1006) Example from book no longer works - prepare20newsgroups broken with Lucene upgrade

2012-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned MAHOUT-1006: --- Assignee: Robin Anil (was: Grant Ingersoll) > Example from book no longer works

[jira] [Commented] (MAHOUT-1006) Example from book no longer works - prepare20newsgroups broken with Lucene upgrade

2012-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288200#comment-13288200 ] Grant Ingersoll commented on MAHOUT-1006: - Hmm, looks like removing the old bayes

[jira] [Assigned] (MAHOUT-1006) Example from book no longer works - prepare20newsgroups broken with Lucene upgrade

2012-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned MAHOUT-1006: --- Assignee: Grant Ingersoll (was: Ted Dunning) > Example from book no longer work

[jira] [Commented] (MAHOUT-1006) Example from book no longer works - prepare20newsgroups broken with Lucene upgrade

2012-06-03 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288197#comment-13288197 ] Grant Ingersoll commented on MAHOUT-1006: - I've got this one, Ted.

[jira] [Updated] (MAHOUT-848) M/R job launching code should add Oozie's action.xml as a configuration resource of the Hadoop Configuration object

2012-05-31 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-848: --- Resolution: Fixed Status: Resolved (was: Patch Available) Applied. Thanks, Tim

[jira] [Assigned] (MAHOUT-848) M/R job launching code should add Oozie's action.xml as a configuration resource of the Hadoop Configuration object

2012-05-31 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned MAHOUT-848: -- Assignee: Grant Ingersoll > M/R job launching code should add Oozie's action.xml as

[jira] [Commented] (MAHOUT-848) M/R job launching code should add Oozie's action.xml as a configuration resource of the Hadoop Configuration object

2012-05-31 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286471#comment-13286471 ] Grant Ingersoll commented on MAHOUT-848: I'll apply this today. >

[jira] [Updated] (MAHOUT-399) LDA on Mahout 0.3 does not converge to correct solution for overlapping pyramids toy problem.

2012-05-31 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-399: --- Resolution: Duplicate Status: Resolved (was: Patch Available) > LDA on Mahout 0.

[jira] [Updated] (MAHOUT-939) ASF Email Classification Examples don't always produce good results

2012-05-31 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-939: --- Fix Version/s: 0.8 (was: 0.7) > ASF Email Classification Examples

[jira] [Updated] (MAHOUT-941) Improve ConfusionMatrix statistics

2012-05-31 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-941: --- Fix Version/s: 0.8 (was: 0.7) > Improve ConfusionMatrix statistics

[jira] [Updated] (MAHOUT-944) LuceneIndexToSequenceFiles (lucene2seq) utility

2012-05-31 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-944: --- Fix Version/s: 0.8 (was: 0.7) > LuceneIndexToSequenceFiles (lucene

[jira] [Commented] (MAHOUT-944) LuceneIndexToSequenceFiles (lucene2seq) utility

2012-05-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270367#comment-13270367 ] Grant Ingersoll commented on MAHOUT-944: I'll try to get to this patch this week.

[jira] [Updated] (MAHOUT-627) Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.

2012-05-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-627: --- Fix Version/s: 0.8 (was: 0.7) Still think this is useful, but we need

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

2011-09-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108598#comment-13108598 ] Grant Ingersoll commented on MAHOUT-814: +1 on Sean's option #2. The only questio

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

2011-09-18 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107567#comment-13107567 ] Grant Ingersoll commented on MAHOUT-814: Yeah, it is different. It's not actually

[jira] [Updated] (MAHOUT-814) QRFirstStep should use their own tmp space to avoid collisions

2011-09-18 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-814: --- Summary: QRFirstStep should use their own tmp space to avoid collisions (was: LocalSSDSolver

[jira] [Created] (MAHOUT-814) LocalSSDSolver tests should use their own tmp space to avoid collisions

2011-09-18 Thread Grant Ingersoll (JIRA)
LocalSSDSolver tests should use their own tmp space to avoid collisions --- Key: MAHOUT-814 URL: https://issues.apache.org/jira/browse/MAHOUT-814 Project: Mahout Issue Type:

[jira] [Created] (MAHOUT-813) RecommenderJob incorrectly sets io.sort.mb

2011-09-18 Thread Grant Ingersoll (JIRA)
RecommenderJob incorrectly sets io.sort.mb -- Key: MAHOUT-813 URL: https://issues.apache.org/jira/browse/MAHOUT-813 Project: Mahout Issue Type: Bug Affects Versions: 0.6 Reporter: Grant

[jira] [Commented] (MAHOUT-802) Start Phase doesn't properly work in RecommenderJob

2011-09-09 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101564#comment-13101564 ] Grant Ingersoll commented on MAHOUT-802: What's the new RatingMatrix? I guess I s

[jira] [Commented] (MAHOUT-767) Improve RowSimilarityJob performance

2011-09-09 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101543#comment-13101543 ] Grant Ingersoll commented on MAHOUT-767: Why the change in types for the prep work

[jira] [Commented] (MAHOUT-802) Start Phase doesn't properly work in RecommenderJob

2011-09-09 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101525#comment-13101525 ] Grant Ingersoll commented on MAHOUT-802: Sebastian, Can you detail the input chan

[jira] [Commented] (MAHOUT-344) Minhash based clustering

2011-09-09 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101188#comment-13101188 ] Grant Ingersoll commented on MAHOUT-344: Ankur, Could you doc this at https://cw

[jira] [Updated] (MAHOUT-802) Start Phase doesn't properly work in RecommenderJob

2011-09-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-802: --- Attachment: MAHOUT-802b.patch Makes the indexItemId mapping optional > Start Phase doesn't p

[jira] [Commented] (MAHOUT-802) Start Phase doesn't properly work in RecommenderJob

2011-09-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100301#comment-13100301 ] Grant Ingersoll commented on MAHOUT-802: I also don't get the long to int mapping

[jira] [Commented] (MAHOUT-802) Start Phase doesn't properly work in RecommenderJob

2011-09-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100285#comment-13100285 ] Grant Ingersoll commented on MAHOUT-802: Thanks, Sebastian. The hard part is I'm

[jira] [Reopened] (MAHOUT-802) Start Phase doesn't properly work in RecommenderJob

2011-09-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reopened MAHOUT-802: This also doesn't work because it is hardcoded to accept only the item id path. Seems to me, i

[jira] [Resolved] (MAHOUT-802) Start Phase doesn't properly work in RecommenderJob

2011-09-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved MAHOUT-802. Resolution: Fixed Fix Version/s: 0.6 Assignee: Grant Ingersoll > Start Phas

[jira] [Updated] (MAHOUT-802) Start Phase doesn't properly work in RecommenderJob

2011-09-07 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-802: --- Attachment: MAHOUT-802.patch draft patch. Has a step to count the items if they weren't alre

[jira] [Commented] (MAHOUT-802) Start Phase doesn't properly work in RecommenderJob

2011-09-07 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099599#comment-13099599 ] Grant Ingersoll commented on MAHOUT-802: patch coming either tonight or tomorrow a

[jira] [Resolved] (MAHOUT-795) Change prep_asf_mail_archives to not download archives

2011-09-07 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved MAHOUT-795. Resolution: Fixed Fix Version/s: 0.6 > Change prep_asf_mail_archives to not download

[jira] [Commented] (MAHOUT-627) Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.

2011-09-07 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099283#comment-13099283 ] Grant Ingersoll commented on MAHOUT-627: Dhruv, any progress on the last pieces he

[jira] [Created] (MAHOUT-802) Start Phase doesn't properly work in RecommenderJob

2011-09-07 Thread Grant Ingersoll (JIRA)
Start Phase doesn't properly work in RecommenderJob --- Key: MAHOUT-802 URL: https://issues.apache.org/jira/browse/MAHOUT-802 Project: Mahout Issue Type: Bug Reporter: Grant Ingerso

[jira] [Created] (MAHOUT-798) Add Examples for the ASF Mail Archive

2011-08-29 Thread Grant Ingersoll (JIRA)
Add Examples for the ASF Mail Archive - Key: MAHOUT-798 URL: https://issues.apache.org/jira/browse/MAHOUT-798 Project: Mahout Issue Type: New Feature Reporter: Grant Ingersoll Prior

[jira] [Updated] (MAHOUT-627) Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.

2011-08-24 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-627: --- Attachment: MAHOUT-627.patch Some minor changes to move the packaging around to be a bit more

[jira] [Updated] (MAHOUT-795) Change prep_asf_mail_archives to not download archives

2011-08-22 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-795: --- Attachment: MAHOUT-795.patch Removes the S3 downloading, assumes you have the mail files loca

[jira] [Assigned] (MAHOUT-795) Change prep_asf_mail_archives to not download archives

2011-08-22 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned MAHOUT-795: -- Assignee: Grant Ingersoll > Change prep_asf_mail_archives to not download archives > --

[jira] [Created] (MAHOUT-795) Change prep_asf_mail_archives to not download archives

2011-08-22 Thread Grant Ingersoll (JIRA)
Change prep_asf_mail_archives to not download archives -- Key: MAHOUT-795 URL: https://issues.apache.org/jira/browse/MAHOUT-795 Project: Mahout Issue Type: Bug Reporter: Grant I

[jira] [Commented] (MAHOUT-627) Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.

2011-08-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084820#comment-13084820 ] Grant Ingersoll commented on MAHOUT-627: Hey Dhruv, nearing pencils down, how are

[jira] [Commented] (MAHOUT-627) Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.

2011-08-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080901#comment-13080901 ] Grant Ingersoll commented on MAHOUT-627: Dhruv, Any luck yet on the unit tests?

[jira] [Commented] (MAHOUT-688) High Document Frequency pruning for seq2sparse

2011-08-02 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078452#comment-13078452 ] Grant Ingersoll commented on MAHOUT-688: Vasil, Any time to update this? If you

[jira] [Resolved] (MAHOUT-763) Map-Side Distance Comparison

2011-08-02 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved MAHOUT-763. Resolution: Fixed > Map-Side Distance Comparison > > >

[jira] [Commented] (MAHOUT-767) Improve RowSimilarityJob performance for count-based distance measures

2011-07-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068331#comment-13068331 ] Grant Ingersoll commented on MAHOUT-767: Why do we need the norms? Why can't we a

[jira] [Created] (MAHOUT-767) Improve RowSimilarityJob performance for count-based distance measures

2011-07-19 Thread Grant Ingersoll (JIRA)
Improve RowSimilarityJob performance for count-based distance measures -- Key: MAHOUT-767 URL: https://issues.apache.org/jira/browse/MAHOUT-767 Project: Mahout Issue Type: I

[jira] [Commented] (MAHOUT-765) Upgrade Lucene to latest release or wait for LUCENE-3151

2011-07-19 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067601#comment-13067601 ] Grant Ingersoll commented on MAHOUT-765: Lance, I think we would still release the

[jira] [Created] (MAHOUT-765) Upgrade Lucene to latest release or wait for LUCENE-3151

2011-07-18 Thread Grant Ingersoll (JIRA)
Upgrade Lucene to latest release or wait for LUCENE-3151 Key: MAHOUT-765 URL: https://issues.apache.org/jira/browse/MAHOUT-765 Project: Mahout Issue Type: Improvement Repor

[jira] [Commented] (MAHOUT-763) Map-Side Distance Comparison

2011-07-17 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066632#comment-13066632 ] Grant Ingersoll commented on MAHOUT-763: +1 > Map-Side Distance Comparison >

[jira] [Commented] (MAHOUT-763) Map-Side Distance Comparison

2011-07-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066401#comment-13066401 ] Grant Ingersoll commented on MAHOUT-763: The code is more or less a copy of what's

[jira] [Commented] (MAHOUT-763) Map-Side Distance Comparison

2011-07-16 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066398#comment-13066398 ] Grant Ingersoll commented on MAHOUT-763: Helpful label. Put up a patch and I'll t

[jira] [Reopened] (MAHOUT-763) Map-Side Distance Comparison

2011-07-15 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reopened MAHOUT-763: Going to reopen to provide an alternate output form > Map-Side Distance Comparison > -

[jira] [Resolved] (MAHOUT-763) Map-Side Distance Comparison

2011-07-15 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved MAHOUT-763. Resolution: Fixed Fix Version/s: 0.6 Assignee: Grant Ingersoll Committed re

[jira] [Updated] (MAHOUT-763) Map-Side Distance Comparison

2011-07-15 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-763: --- Attachment: MAHOUT-763.patch Handles multiple seed files > Map-Side Distance Comparison > --

[jira] [Updated] (MAHOUT-763) Map-Side Distance Comparison

2011-07-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-763: --- Attachment: MAHOUT-763.patch Fixed some issues w/ the job configuration > Map-Side Distance

[jira] [Updated] (MAHOUT-763) Map-Side Distance Comparison

2011-07-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-763: --- Attachment: MAHOUT-763.patch fix import > Map-Side Distance Comparison > ---

[jira] [Updated] (MAHOUT-763) Map-Side Distance Comparison

2011-07-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-763: --- Attachment: MAHOUT-763.patch First draft of a patch. Input seeds can be vector, Cluster or C

[jira] [Created] (MAHOUT-763) Map-Side Distance Comparison

2011-07-14 Thread Grant Ingersoll (JIRA)
Map-Side Distance Comparison Key: MAHOUT-763 URL: https://issues.apache.org/jira/browse/MAHOUT-763 Project: Mahout Issue Type: New Feature Reporter: Grant Ingersoll Priority: Minor KMean

[jira] [Updated] (MAHOUT-761) Emitting cluster points should have the option of emitting the distance and potentially other related metrics

2011-07-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-761: --- Attachment: MAHOUT-761.patch and better formatting > Emitting cluster points should have the

[jira] [Updated] (MAHOUT-761) Emitting cluster points should have the option of emitting the distance and potentially other related metrics

2011-07-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-761: --- Attachment: MAHOUT-761.patch Hooks it into ClusterDumper > Emitting cluster points should ha

[jira] [Updated] (MAHOUT-761) Emitting cluster points should have the option of emitting the distance and potentially other related metrics

2011-07-14 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-761: --- Attachment: MAHOUT-761.patch Tests pass. > Emitting cluster points should have the option of

[jira] [Updated] (MAHOUT-761) Emitting cluster points should have the option of emitting the distance and potentially other related metrics

2011-07-13 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-761: --- Attachment: MAHOUT-761.patch Rough draft of a patch, haven't really tested yet, but gets at w

[jira] [Created] (MAHOUT-761) Emitting cluster points should have the option of emitting the distance and potentially other related metrics

2011-07-13 Thread Grant Ingersoll (JIRA)
Emitting cluster points should have the option of emitting the distance and potentially other related metrics - Key: MAHOUT-761 URL: https://issues.apache.

[jira] [Created] (MAHOUT-757) RowIdJob does not use Mahout's standard CLI parameters

2011-07-11 Thread Grant Ingersoll (JIRA)
RowIdJob does not use Mahout's standard CLI parameters -- Key: MAHOUT-757 URL: https://issues.apache.org/jira/browse/MAHOUT-757 Project: Mahout Issue Type: Improvement Reporter:

[jira] [Commented] (MAHOUT-627) Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.

2011-06-25 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054871#comment-13054871 ] Grant Ingersoll commented on MAHOUT-627: Hi Dhruv, How goes progress on this? >

[jira] [Commented] (MAHOUT-652) [GSoC Proposal] Parallel Viterbi algorithm for HMM

2011-06-25 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054870#comment-13054870 ] Grant Ingersoll commented on MAHOUT-652: Awesome! How goes the testing? > [GSoC

[jira] [Updated] (MAHOUT-458) The LDA output does not include the topic-probability distribution per document (p(z|d)). It outputs only the topics and corresponding words.

2011-06-06 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-458: --- Fix Version/s: 0.5 (was: 0.6) This was fixed on MAHOUT-682 and MAHOUT-

[jira] [Commented] (MAHOUT-399) LDA on Mahout 0.3 does not converge to correct solution for overlapping pyramids toy problem.

2011-06-06 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044806#comment-13044806 ] Grant Ingersoll commented on MAHOUT-399: Michael, any luck on the unit tests? > L

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-23 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037924#comment-13037924 ] Grant Ingersoll commented on MAHOUT-694: Sean/Drew: +1 on commit. Allan: send an

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-22 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037638#comment-13037638 ] Grant Ingersoll commented on MAHOUT-694: Drew, +1 on committing. > IndexOutOfBoun

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-22 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037625#comment-13037625 ] Grant Ingersoll commented on MAHOUT-694: Not sure what happened, ran again after c

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-22 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037624#comment-13037624 ] Grant Ingersoll commented on MAHOUT-694: bq. Work is done in a directory called ma

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-22 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037622#comment-13037622 ] Grant Ingersoll commented on MAHOUT-694: But, I ran it a second time and then I go

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-22 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037620#comment-13037620 ] Grant Ingersoll commented on MAHOUT-694: Hmm, Drew, I don't see the ClusterDump ou

[jira] [Resolved] (MAHOUT-588) Benchmark Mahout's clustering performance on EC2 and publish the results

2011-05-22 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved MAHOUT-588. Resolution: Fixed Fix Version/s: 0.5 (was: 0.6) > Benchmark M

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-22 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037619#comment-13037619 ] Grant Ingersoll commented on MAHOUT-694: I'm reviewing at the moment, but yeah, if

[jira] [Updated] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-694: --- Attachment: MAHOUT-694.patch Close the Input stream. > IndexOutOfBoundException using build-

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037054#comment-13037054 ] Grant Ingersoll commented on MAHOUT-694: bq. and 0.5 reads from hdfs, not local Y

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036981#comment-13036981 ] Grant Ingersoll commented on MAHOUT-694: Here's the 0.4 code: {code} cd examples/b

[jira] [Created] (MAHOUT-707) Setup Jenkins Jobs to validate our Examples/bin Scripts

2011-05-20 Thread Grant Ingersoll (JIRA)
Setup Jenkins Jobs to validate our Examples/bin Scripts --- Key: MAHOUT-707 URL: https://issues.apache.org/jira/browse/MAHOUT-707 Project: Mahout Issue Type: Task Reporter: Gran

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036965#comment-13036965 ] Grant Ingersoll commented on MAHOUT-694: bq. but perhaps build-reuters.sh was neve

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036956#comment-13036956 ] Grant Ingersoll commented on MAHOUT-694: Drew, does it even run on a cluster? I d

[jira] [Updated] (MAHOUT-706) reuse lucene tokenstreams

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-706: --- Fix Version/s: 0.6 Assignee: Grant Ingersoll > reuse lucene tokenstreams > -

[jira] [Updated] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-694: --- Attachment: MAHOUT-694.patch Here's the fix. Allan, please confirm. > IndexOutOfBoundExcept

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036889#comment-13036889 ] Grant Ingersoll commented on MAHOUT-694: LUCENE-929 broke this. The fix for us is

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036888#comment-13036888 ] Grant Ingersoll commented on MAHOUT-694: In fact, that does the trick. > IndexOut

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036886#comment-13036886 ] Grant Ingersoll commented on MAHOUT-694: might simply be handled by dropping the t

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036884#comment-13036884 ] Grant Ingersoll commented on MAHOUT-694: Ah, I see the -tmp now, it's underneath r

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036881#comment-13036881 ] Grant Ingersoll commented on MAHOUT-694: When I run it, I get the reuters-out crea

[jira] [Commented] (MAHOUT-694) IndexOutOfBoundException using build-reuters.sh

2011-05-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036878#comment-13036878 ] Grant Ingersoll commented on MAHOUT-694: Hmm, the Lucene ExtractReuters code has:

[jira] [Created] (MAHOUT-698) Hook up Automated Patch Checking for Mahout

2011-05-16 Thread Grant Ingersoll (JIRA)
Hook up Automated Patch Checking for Mahout --- Key: MAHOUT-698 URL: https://issues.apache.org/jira/browse/MAHOUT-698 Project: Mahout Issue Type: Task Reporter: Grant Ingersoll It would b

[jira] [Commented] (MAHOUT-688) High Document Frequency pruning for seq2sparse

2011-05-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030481#comment-13030481 ] Grant Ingersoll commented on MAHOUT-688: OK, that makes reasonable sense. Perhaps

[jira] [Updated] (MAHOUT-688) High Document Frequency pruning for seq2sparse

2011-05-06 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-688: --- Attachment: MAHOUT-688.patch Reorgs the code a little bit to move std. dev. calculation to a

[jira] [Updated] (MAHOUT-688) High Document Frequency pruning for seq2sparse

2011-05-06 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated MAHOUT-688: --- Fix Version/s: 0.6 > High Document Frequency pruning for seq2sparse > ---

[jira] [Resolved] (MAHOUT-686) Upgrade to Lucene/Solr 3.1.0

2011-05-05 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved MAHOUT-686. Resolution: Fixed Fix Version/s: 0.5 Assignee: Grant Ingersoll > Upgrade to

[jira] [Assigned] (MAHOUT-688) High Document Frequency pruning for seq2sparse

2011-05-05 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned MAHOUT-688: -- Assignee: Grant Ingersoll > High Document Frequency pruning for seq2sparse > --
