[jira] [Commented] (MAHOUT-944) LuceneIndexToSequenceFiles (lucene2seq) utility

2012-02-27 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13217150#comment-13217150 ] Grant Ingersoll commented on MAHOUT-944: Frank, can you put up a patch, please?

[jira] [Commented] (MAHOUT-944) LuceneIndexToSequenceFiles (lucene2seq) utility

2012-02-16 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209337#comment-13209337 ] Grant Ingersoll commented on MAHOUT-944: I've got that need soon, too, Jake. So,

[jira] [Commented] (MAHOUT-968) Classifier based on restricted boltzmann machines

2012-02-15 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208344#comment-13208344 ] Grant Ingersoll commented on MAHOUT-968: Dirk, Thanks for updating this, perhaps

[jira] [Commented] (MAHOUT-944) LuceneIndexToSequenceFiles (lucene2seq) utility

2012-02-15 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208376#comment-13208376 ] Grant Ingersoll commented on MAHOUT-944: Looks reasonable at first blush, with a

[jira] [Commented] (MAHOUT-944) LuceneIndexToSequenceFiles (lucene2seq) utility

2012-02-15 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208380#comment-13208380 ] Grant Ingersoll commented on MAHOUT-944: I'll take care of the pom - 3.5 issue.

[jira] [Commented] (MAHOUT-947) Improvements to seqdumper

2012-02-10 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205392#comment-13205392 ] Grant Ingersoll commented on MAHOUT-947: Hmm, should be a getOptions in there, but

[jira] [Commented] (MAHOUT-947) Improvements to seqdumper

2012-02-10 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205397#comment-13205397 ] Grant Ingersoll commented on MAHOUT-947: bq. I wasn't suggesting supporting

[jira] [Commented] (MAHOUT-947) Improvements to seqdumper

2012-02-08 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13204152#comment-13204152 ] Grant Ingersoll commented on MAHOUT-947: I'm close to committing

[jira] [Commented] (MAHOUT-865) Refactor Sequential Clustering algorithms

2012-02-04 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13200496#comment-13200496 ] Grant Ingersoll commented on MAHOUT-865: Definitely. FWIW, beginners can supply

[jira] [Commented] (MAHOUT-958) NullPointerException in RepresentativePointsMapper when running cluster-reuters.sh example with kmeans

2012-01-31 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13197173#comment-13197173 ] Grant Ingersoll commented on MAHOUT-958: ehgjr: do you have a patch, by chance?

[jira] [Commented] (MAHOUT-962) minDF and maxDFPercent filtering doesnt get applied when output weight is tf in SpareVecorsFromSequenceFile

2012-01-28 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195564#comment-13195564 ] Grant Ingersoll commented on MAHOUT-962: Valid points. I think, however, I'm

[jira] [Commented] (MAHOUT-957) term vectors not created in SparseVectorsFromSequenceFiles using tf weighting and maxDFSigma filtering

2012-01-28 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195565#comment-13195565 ] Grant Ingersoll commented on MAHOUT-957: I committed my patch. John, does that

[jira] [Commented] (MAHOUT-964) RowSimilarityJob should exit immediately if an invalid similarity measure specified and it would be nice to have an --overwrite option for the RowSimilarityJob CLI

2012-01-28 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195566#comment-13195566 ] Grant Ingersoll commented on MAHOUT-964: Hey Suneel, Content of patch is fine,

[jira] [Commented] (MAHOUT-962) minDF and maxDFPercent filtering doesnt get applied when output weight is tf in SpareVecorsFromSequenceFile

2012-01-27 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195117#comment-13195117 ] Grant Ingersoll commented on MAHOUT-962: John, I think my fix on MAHOUT-957 should

[jira] [Commented] (MAHOUT-957) term vectors not created in SparseVectorsFromSequenceFiles using tf weighting and maxDFSigma filtering

2012-01-26 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194377#comment-13194377 ] Grant Ingersoll commented on MAHOUT-957: OK, I can reproduce the bug: {quote}

[jira] [Commented] (MAHOUT-957) term vectors not created in SparseVectorsFromSequenceFiles using tf weighting and maxDFSigma filtering

2012-01-26 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194379#comment-13194379 ] Grant Ingersoll commented on MAHOUT-957: Whoa, backing up here a second. If you

[jira] [Commented] (MAHOUT-957) term vectors not created in SparseVectorsFromSequenceFiles using tf weighting and maxDFSigma filtering

2012-01-26 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194395#comment-13194395 ] Grant Ingersoll commented on MAHOUT-957: then again, perhaps it is still all

[jira] [Commented] (MAHOUT-958) NullPointerException in RepresentativePointsMapper when running cluster-reuters.sh example with kmeans

2012-01-26 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194422#comment-13194422 ] Grant Ingersoll commented on MAHOUT-958: Hmm, works for me. Did you try doing a

[jira] [Commented] (MAHOUT-958) NullPointerException in RepresentativePointsMapper when running cluster-reuters.sh example with kmeans

2012-01-26 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194429#comment-13194429 ] Grant Ingersoll commented on MAHOUT-958: works for me on mac. I wonder if it is

[jira] [Commented] (MAHOUT-375) [GSOC] Restricted Boltzmann Machines in Apache Mahout

2012-01-11 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184101#comment-13184101 ] Grant Ingersoll commented on MAHOUT-375: Hi Dirk, I think it makes sense to have

[jira] [Commented] (MAHOUT-863) Add DisplayMinhash clustering example

2012-01-11 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184278#comment-13184278 ] Grant Ingersoll commented on MAHOUT-863: Jeff, nothing's been committed here in

[jira] [Commented] (MAHOUT-939) ASF Email Classification Examples don't always produce good results

2012-01-09 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182494#comment-13182494 ] Grant Ingersoll commented on MAHOUT-939: bq. Are these results with held-out data?

[jira] [Commented] (MAHOUT-863) Add DisplayMinhash clustering example

2012-01-08 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182236#comment-13182236 ] Grant Ingersoll commented on MAHOUT-863: I was just invoking the main() method via

[jira] [Commented] (MAHOUT-768) Duplicated DoubleFunction in mahout and mahout-collections (mahout.math package).

2012-01-08 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182259#comment-13182259 ] Grant Ingersoll commented on MAHOUT-768: bq. some require re-release of Mahout

[jira] [Commented] (MAHOUT-939) ASF Email Classification Examples don't always produce good results

2012-01-08 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182303#comment-13182303 ] Grant Ingersoll commented on MAHOUT-939: On what dataset? It works fine w/ just

[jira] [Commented] (MAHOUT-941) Strip quoted text from emails and add statistics to ConfusionMatrix

2012-01-07 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181916#comment-13181916 ] Grant Ingersoll commented on MAHOUT-941: Lance, can you separate out the stats

[jira] [Commented] (MAHOUT-941) Strip quoted text from emails and add statistics to ConfusionMatrix

2012-01-07 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181960#comment-13181960 ] Grant Ingersoll commented on MAHOUT-941: Or, just rename this issue to just be the

[jira] [Commented] (MAHOUT-863) Add DisplayMinhash clustering example

2012-01-07 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182174#comment-13182174 ] Grant Ingersoll commented on MAHOUT-863: FWIW, it still displays after that,

[jira] [Commented] (MAHOUT-939) ASF Email SGD Examples don't produce good results

2012-01-04 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179462#comment-13179462 ] Grant Ingersoll commented on MAHOUT-939: It's the asf-email-examples.sh in

[jira] [Commented] (MAHOUT-399) LDA on Mahout 0.3 does not converge to correct solution for overlapping pyramids toy problem.

2012-01-04 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179529#comment-13179529 ] Grant Ingersoll commented on MAHOUT-399: Jake, what's the status on this?

[jira] [Commented] (MAHOUT-845) Make cluster top terms code more reusable

2012-01-04 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179607#comment-13179607 ] Grant Ingersoll commented on MAHOUT-845: I've got some refactorings in this area

[jira] [Commented] (MAHOUT-890) Performance issue in FPGrowth

2011-12-22 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174819#comment-13174819 ] Grant Ingersoll commented on MAHOUT-890: Jeff, I haven't seen a new patch since

[jira] [Commented] (MAHOUT-627) Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.

2011-12-19 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172658#comment-13172658 ] Grant Ingersoll commented on MAHOUT-627: Once I get past the example issue, I

[jira] [Commented] (MAHOUT-833) Make conversion to sequence files map-reduce

2011-12-18 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171943#comment-13171943 ] Grant Ingersoll commented on MAHOUT-833: I think Josh was working on something,

[jira] [Commented] (MAHOUT-627) Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.

2011-12-18 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171950#comment-13171950 ] Grant Ingersoll commented on MAHOUT-627: I'm getting {quote} 1/12/18 17:21:02 WARN

[jira] [Commented] (MAHOUT-904) SplitInput should support randomizing the input

2011-12-18 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172007#comment-13172007 ] Grant Ingersoll commented on MAHOUT-904: OK. I still just like patches ;-).

[jira] [Commented] (MAHOUT-916) Make Mahout's tests run in parallel

2011-12-09 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166229#comment-13166229 ] Grant Ingersoll commented on MAHOUT-916: Yeah, not sure it's much savings. One

[jira] [Commented] (MAHOUT-916) Make Mahout's tests run in parallel

2011-12-09 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166247#comment-13166247 ] Grant Ingersoll commented on MAHOUT-916: The other weird thing is even when I run

[jira] [Commented] (MAHOUT-916) Make Mahout's tests run in parallel

2011-12-09 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166253#comment-13166253 ] Grant Ingersoll commented on MAHOUT-916: On that note, I suppose it's due to the

[jira] [Commented] (MAHOUT-904) SplitInput should support randomizing the input

2011-12-08 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165128#comment-13165128 ] Grant Ingersoll commented on MAHOUT-904: Go for it! SplitInput

[jira] [Commented] (MAHOUT-917) Build takes too long

2011-12-08 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165207#comment-13165207 ] Grant Ingersoll commented on MAHOUT-917:

[jira] [Commented] (MAHOUT-917) Build takes too long

2011-12-08 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165208#comment-13165208 ] Grant Ingersoll commented on MAHOUT-917: We really should solve MAHOUT-916 and see

[jira] [Commented] (MAHOUT-917) Build takes too long

2011-12-08 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165217#comment-13165217 ] Grant Ingersoll commented on MAHOUT-917: Seems a lot of the overhead is simply due

[jira] [Commented] (MAHOUT-917) Build takes too long

2011-12-07 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164380#comment-13164380 ] Grant Ingersoll commented on MAHOUT-917: I think we should make MahoutTestCase

[jira] [Commented] (MAHOUT-917) Build takes too long

2011-12-07 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164382#comment-13164382 ] Grant Ingersoll commented on MAHOUT-917: But, also agree w/ Sean, we should look

[jira] [Commented] (MAHOUT-837) Make build-asf-email.sh HDFS aware

2011-12-06 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163525#comment-13163525 ] Grant Ingersoll commented on MAHOUT-837: It is, with the caveat that the file in

[jira] [Commented] (MAHOUT-909) Make it so you can pass in all the answers to the questions asked in the example shell scripts

2011-12-05 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13162769#comment-13162769 ] Grant Ingersoll commented on MAHOUT-909: I think that's pretty close. It is a bit

[jira] [Commented] (MAHOUT-909) Make it so you can pass in all the answers to the questions asked in the example shell scripts

2011-12-05 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163244#comment-13163244 ] Grant Ingersoll commented on MAHOUT-909: I don't think we really need to worry

[jira] [Commented] (MAHOUT-688) High Document Frequency pruning for seq2sparse

2011-12-05 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163285#comment-13163285 ] Grant Ingersoll commented on MAHOUT-688: Working on an update to trunk for this.

[jira] [Commented] (MAHOUT-863) Add DisplayMinhash clustering example

2011-12-01 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13161173#comment-13161173 ] Grant Ingersoll commented on MAHOUT-863: Hi Miroslav, By all means! We'd love to

[jira] [Commented] (MAHOUT-344) Minhash based clustering

2011-11-29 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13159249#comment-13159249 ] Grant Ingersoll commented on MAHOUT-344: Ankur, do you have a reference for this

[jira] [Commented] (MAHOUT-899) Add Point Sampling, Color coding to ClusterDumper

2011-11-28 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13158354#comment-13158354 ] Grant Ingersoll commented on MAHOUT-899: The last one is implemented already, you

[jira] [Commented] (MAHOUT-891) LoadEvaluationRunner and Recommender stats

2011-11-21 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154513#comment-13154513 ] Grant Ingersoll commented on MAHOUT-891: Looks fine, other than it seems like a

[jira] [Commented] (MAHOUT-868) Rename build*.sh examples to be more indicative of what they actually do, i.e. classify-20newsgroups.sh

2011-11-21 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154688#comment-13154688 ] Grant Ingersoll commented on MAHOUT-868: Thanks, Joe. I committed most of your

[jira] [Commented] (MAHOUT-881) Refactor TopItems to use Lucene's PriorityQueue and remove excessive sorting

2011-11-20 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13153784#comment-13153784 ] Grant Ingersoll commented on MAHOUT-881: bq. I think the tests should at least be

[jira] [Commented] (MAHOUT-890) Performance issue in FPGrowth

2011-11-20 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13153800#comment-13153800 ] Grant Ingersoll commented on MAHOUT-890: Thinking about how to test this, it sure

[jira] [Commented] (MAHOUT-881) Refactor TopItems to use Lucene's PriorityQueue and remove excessive sorting

2011-11-13 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149380#comment-13149380 ] Grant Ingersoll commented on MAHOUT-881: Yeah, I did some profiling too and came

[jira] [Commented] (MAHOUT-881) Refactor TopItems to use Lucene's PriorityQueue and remove excessive sorting

2011-11-13 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149387#comment-13149387 ] Grant Ingersoll commented on MAHOUT-881: I see some other things we can do, too.

[jira] [Commented] (MAHOUT-881) Refactor TopItems to use Lucene's PriorityQueue and remove excessive sorting

2011-11-12 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149127#comment-13149127 ] Grant Ingersoll commented on MAHOUT-881: bq. Grant's patch should probably also

[jira] [Commented] (MAHOUT-881) Refactor TopItems to use Lucene's PriorityQueue and remove excessive sorting

2011-11-12 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149128#comment-13149128 ] Grant Ingersoll commented on MAHOUT-881: bq. replacing very standard

[jira] [Commented] (MAHOUT-878) Provide better examples for the parallel ALS recommender code

2011-11-11 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13148541#comment-13148541 ] Grant Ingersoll commented on MAHOUT-878: You might also do one for the Amazon

[jira] [Commented] (MAHOUT-878) Provide better examples for the parallel ALS recommender code

2011-11-11 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13148572#comment-13148572 ] Grant Ingersoll commented on MAHOUT-878: Sure, but most of our examples are meant

[jira] [Commented] (MAHOUT-878) Provide better examples for the parallel ALS recommender code

2011-11-09 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147103#comment-13147103 ] Grant Ingersoll commented on MAHOUT-878: See also the stuff I did for

[jira] [Commented] (MAHOUT-612) Simplify configuring and running Mahout MapReduce jobs from Java using Java bean configuration

2011-11-08 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146246#comment-13146246 ] Grant Ingersoll commented on MAHOUT-612: It seems like we shouldn't have to wait

[jira] [Commented] (MAHOUT-865) Refactor Sequential Clustering algorithms

2011-11-08 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146251#comment-13146251 ] Grant Ingersoll commented on MAHOUT-865: We likely need something similar to

[jira] [Commented] (MAHOUT-874) Extract Writables into a separate module to allow smaller dependencies

2011-11-06 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144980#comment-13144980 ] Grant Ingersoll commented on MAHOUT-874: Why is Cluster even dependent on

[jira] [Commented] (MAHOUT-873) Provide MapReduce job for creating Encoded Vectors from sequence files

2011-11-04 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144481#comment-13144481 ] Grant Ingersoll commented on MAHOUT-873: I've checked in some baseline

[jira] [Commented] (MAHOUT-403) Regex to Various Output Formats

2011-11-04 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144531#comment-13144531 ] Grant Ingersoll commented on MAHOUT-403: It may be something like this is better

[jira] [Commented] (MAHOUT-403) Regex to Various Output Formats

2011-11-04 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144542#comment-13144542 ] Grant Ingersoll commented on MAHOUT-403: The main issue it has is passing in the

[jira] [Commented] (MAHOUT-403) Regex to Various Output Formats

2011-11-04 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144543#comment-13144543 ] Grant Ingersoll commented on MAHOUT-403: Here's an example of running it against

[jira] [Commented] (MAHOUT-868) Rename build*.sh examples to be more indicative of what they actually do, i.e. classify-20newsgroups.sh

2011-11-04 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144554#comment-13144554 ] Grant Ingersoll commented on MAHOUT-868: bq. Shud I proceed modifying the scripts

[jira] [Commented] (MAHOUT-859) Move Decision Forests to classifier package

2011-11-03 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13143063#comment-13143063 ] Grant Ingersoll commented on MAHOUT-859: bq. I did a fresh checkout and the old df

[jira] [Commented] (MAHOUT-155) ARFF VectorIterable

2011-11-03 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13143095#comment-13143095 ] Grant Ingersoll commented on MAHOUT-155: Joe, bq. 1. TODO: create a map so we

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-11-03 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13143512#comment-13143512 ] Grant Ingersoll commented on MAHOUT-524: bq. If at all possible, my suggestion

[jira] [Commented] (MAHOUT-862) MurmurHash 3.0

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142199#comment-13142199 ] Grant Ingersoll commented on MAHOUT-862: I accidentally committed this when making

[jira] [Commented] (MAHOUT-862) MurmurHash 3.0

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142200#comment-13142200 ] Grant Ingersoll commented on MAHOUT-862: Committed revision 1196616. I'll leave

[jira] [Commented] (MAHOUT-864) DisplayCanopy doesn't show any clusters

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142212#comment-13142212 ] Grant Ingersoll commented on MAHOUT-864: Appears to be due to the fact that

[jira] [Commented] (MAHOUT-862) MurmurHash 3.0

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142324#comment-13142324 ] Grant Ingersoll commented on MAHOUT-862: I committed the test.

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142461#comment-13142461 ] Grant Ingersoll commented on MAHOUT-524: bq. Is there any way we could simplify

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142477#comment-13142477 ] Grant Ingersoll commented on MAHOUT-524: Tracing into the Hadoop code, this data

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142510#comment-13142510 ] Grant Ingersoll commented on MAHOUT-524: Realizing now that Jeff already said that

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142557#comment-13142557 ] Grant Ingersoll commented on MAHOUT-524: The NPE is from one of the rowJ values

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142586#comment-13142586 ] Grant Ingersoll commented on MAHOUT-524: I guess the 1100 comes from how we are

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142583#comment-13142583 ] Grant Ingersoll commented on MAHOUT-524: in this particular case, the state has 4

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142588#comment-13142588 ] Grant Ingersoll commented on MAHOUT-524: Seems the numDims == 1100 there is

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142633#comment-13142633 ] Grant Ingersoll commented on MAHOUT-524: bq. I applied your patch but I'm having

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-11-02 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142635#comment-13142635 ] Grant Ingersoll commented on MAHOUT-524: bq. I applied your patch but I'm having

[jira] [Commented] (MAHOUT-155) ARFF VectorIterable

2011-11-01 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13141099#comment-13141099 ] Grant Ingersoll commented on MAHOUT-155: Hey Joe, Since these are categorical

[jira] [Commented] (MAHOUT-857) Rework 20 NewsGroup shell script example to include SGD Example

2011-11-01 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13141321#comment-13141321 ] Grant Ingersoll commented on MAHOUT-857: Here's the conf. matrix I'm getting,

[jira] [Commented] (MAHOUT-857) Rework 20 NewsGroup shell script example to include SGD Example

2011-11-01 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13141464#comment-13141464 ] Grant Ingersoll commented on MAHOUT-857: I committed the last patch, plus some

[jira] [Commented] (MAHOUT-344) Minhash based clustering

2011-11-01 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13141892#comment-13141892 ] Grant Ingersoll commented on MAHOUT-344: Ankur, any luck on documenting this

[jira] [Commented] (MAHOUT-854) Add MinHash to build-reuters.sh example

2011-11-01 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13141894#comment-13141894 ] Grant Ingersoll commented on MAHOUT-854: bq. 1. Is it just me or when I try

[jira] [Commented] (MAHOUT-854) Add MinHash to build-reuters.sh example

2011-11-01 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13141900#comment-13141900 ] Grant Ingersoll commented on MAHOUT-854: I've committed this, but will leave the

[jira] [Commented] (MAHOUT-627) Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.

2011-10-31 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13140149#comment-13140149 ] Grant Ingersoll commented on MAHOUT-627: I'm going to look to commit this soon

[jira] [Commented] (MAHOUT-855) LuceneTextValueEncoder doesn't properly set internal buffers, causing BufferUnderflowException

2011-10-31 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13140234#comment-13140234 ] Grant Ingersoll commented on MAHOUT-855: At least two issues here: 1. The

[jira] [Commented] (MAHOUT-588) Benchmark Mahout's clustering performance on EC2 and publish the results

2011-10-13 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126482#comment-13126482 ] Grant Ingersoll commented on MAHOUT-588: I've turned off access to mine. You

[jira] [Commented] (MAHOUT-839) rowid job failing (when parsing options)

2011-10-11 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125107#comment-13125107 ] Grant Ingersoll commented on MAHOUT-839: Hey Dan, I think the addInputOption,

[jira] [Commented] (MAHOUT-839) rowid job failing (when parsing options)

2011-10-11 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125108#comment-13125108 ] Grant Ingersoll commented on MAHOUT-839: Also, for future reference, no need to

[jira] [Commented] (MAHOUT-839) rowid job failing (when parsing options)

2011-10-11 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125128#comment-13125128 ] Grant Ingersoll commented on MAHOUT-839: I didn't run the code, but looking at it,

[jira] [Commented] (MAHOUT-839) rowid job failing (when parsing options)

2011-10-11 Thread Grant Ingersoll (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125132#comment-13125132 ] Grant Ingersoll commented on MAHOUT-839: {quote} Map<String,String> parsedArgs
