[
https://issues.apache.org/jira/browse/MAHOUT-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143095#comment-13143095
]
Grant Ingersoll commented on MAHOUT-155:
Joe,
bq. 1. TODO: create a map so we
at 4:11 PM, Grant Ingersoll gsing...@apache.org wrote:
I was looking at some of the DistanceMeasure stuff (Mahalanobis at the
moment) and it strikes me as a bit odd that our core distance() functions
would use Preconditions to check things that are part of
construction/configuration
On Nov 2, 2011, at 5:31 PM, Ted Dunning wrote:
For some kinds of models, notably all of the ones from the
exponential class, there exist sufficient statistics and the combination of
models really is a lot like addition. Most of the uses of DP clustering
involve exponential models like
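To make the "combination is like addition" remark concrete, here is a minimal Java sketch for a one-dimensional Gaussian. The class `GaussianStats` and its methods are hypothetical names chosen for illustration and are not part of Mahout; the sketch only assumes that the model is summarized by a count, a sum, and a sum of squares.

```java
/** Hypothetical sufficient statistics for a 1-D Gaussian (not a Mahout class). */
final class GaussianStats {
    private long count;
    private double sum;
    private double sumOfSquares;

    /** Fold a single observation into the statistics. */
    void observe(double x) {
        count++;
        sum += x;
        sumOfSquares += x * x;
    }

    /** Combine two partial models; with sufficient statistics this is plain addition. */
    GaussianStats merge(GaussianStats other) {
        GaussianStats merged = new GaussianStats();
        merged.count = count + other.count;
        merged.sum = sum + other.sum;
        merged.sumOfSquares = sumOfSquares + other.sumOfSquares;
        return merged;
    }

    long count() {
        return count;
    }

    double mean() {
        return sum / count;
    }

    /** Population variance, computed as E[x^2] - E[x]^2. */
    double variance() {
        double m = mean();
        return sumOfSquares / count - m * m;
    }
}
```

Because `merge` is associative and commutative, partial statistics built on separate chunks of data can be combined in any order, which is the property a Combiner-style step would rely on.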
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143512#comment-13143512
]
Grant Ingersoll commented on MAHOUT-524:
bq. If at all possible, my suggestion
/browse/MAHOUT-868
Project: Mahout
Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
Fix For: 0.6
The build*.sh scripts in examples/bin are a bit weird naming wise. We should
deprecate
driver.classes.props is getting unwieldy
Key: MAHOUT-869
URL: https://issues.apache.org/jira/browse/MAHOUT-869
Project: Mahout
Issue Type: Improvement
Reporter: Grant Ingersoll
, Grant Ingersoll gsing...@apache.org wrote:
On Nov 2, 2011, at 5:31 PM, Ted Dunning wrote:
For some kinds of models, notably all of the ones from the
exponential class, there exist sufficient statistics and the combination of
models really is a lot like addition. Most of the uses of DP
Well, maybe not dead...
What's our goal for the two implementations of Naive Bayes (and Complementary)?
It seems to me like the old one, o.a.m.classifier.bayes, is intended to be
deprecated due to the fact that it is tied to a word based representation.
However, it seems to still have a few
Driver or Job? Let's pick one and be consistent.
-
Key: MAHOUT-870
URL: https://issues.apache.org/jira/browse/MAHOUT-870
Project: Mahout
Issue Type: Improvement
Reporter: Grant
--
Grant Ingersoll
http://www.lucidimagination.com
On Nov 2, 2011, at 5:13 AM, Jake Mannix wrote:
So in the process of getting the LDA improvements I've got brewing over on
GitHub, and I'm doing my good due diligence and making more unit tests and
so forth, and I'm trying to figure out the best way to unit test something
like this, and I
MurmurHash 3.0
--
Key: MAHOUT-862
URL: https://issues.apache.org/jira/browse/MAHOUT-862
Project: Mahout
Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority
with various forms of
graph-shaped data, but isn't a general-purpose graph processing
environment?
Dan
Grant Ingersoll
http://www.lucidimagination.com
[
https://issues.apache.org/jira/browse/MAHOUT-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll resolved MAHOUT-859.
Resolution: Fixed
Committed revision 1196578.
Move Decision Forests
[
https://issues.apache.org/jira/browse/MAHOUT-862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-862:
---
Attachment: MAHOUT-862.patch
Here's a patch that adds MurmurHash3. Tests pass, but I'm
[
https://issues.apache.org/jira/browse/MAHOUT-862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142199#comment-13142199
]
Grant Ingersoll commented on MAHOUT-862:
I accidentally committed this when making
[
https://issues.apache.org/jira/browse/MAHOUT-862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142200#comment-13142200
]
Grant Ingersoll commented on MAHOUT-862:
Committed revision 1196616.
I'll leave
Add DisplayMinhash clustering example
-
Key: MAHOUT-863
URL: https://issues.apache.org/jira/browse/MAHOUT-863
Project: Mahout
Issue Type: Improvement
Reporter: Grant Ingersoll
[
https://issues.apache.org/jira/browse/MAHOUT-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-864:
---
Component/s: Examples
Clustering
DisplayCanopy doesn't show any
DisplayCanopy doesn't show any clusters
---
Key: MAHOUT-864
URL: https://issues.apache.org/jira/browse/MAHOUT-864
Project: Mahout
Issue Type: Bug
Reporter: Grant Ingersoll
Priority
[
https://issues.apache.org/jira/browse/MAHOUT-864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142212#comment-13142212
]
Grant Ingersoll commented on MAHOUT-864:
Appears to be due to the fact
[
https://issues.apache.org/jira/browse/MAHOUT-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll resolved MAHOUT-864.
Resolution: Fixed
Fix Version/s: 0.6
Assignee: Grant Ingersoll
, Grant Ingersoll wrote:
In reviewing clustering for upcoming training, I'm wondering about something
w/ Canopy clustering that we claim, but wanted to check here first. In the
lectures, etc. I've seen on it, the idea is to run Canopy first and then
some other more expensive algorithm, such as k
WFM - works for me.
On Nov 2, 2011, at 11:30 AM, Sebastian Schelter wrote:
On 02.11.2011 16:04, Jake Mannix wrote:
On Wed, Nov 2, 2011 at 6:38 AM, Grant Ingersoll gsing...@apache.org wrote:
Perhaps it would make sense to move them to a branch? I know we never
released them, but it seems
[
https://issues.apache.org/jira/browse/MAHOUT-862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142324#comment-13142324
]
Grant Ingersoll commented on MAHOUT-862:
I committed the test
On Nov 2, 2011, at 1:01 PM, Jake Mannix wrote:
On Wed, Nov 2, 2011 at 5:36 AM, Grant Ingersoll gsing...@apache.org wrote:
Alternatively, the ASF email data is license free. We could take and use
a chunk of that. You can pretty much have as much or as little as you
want. Since it's
[
https://issues.apache.org/jira/browse/MAHOUT-863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-863:
---
Attachment: MAHOUT-863.patch
Here's a start. It doesn't display the items yet, namely
Refactor Sequential Clustering algorithms
-
Key: MAHOUT-865
URL: https://issues.apache.org/jira/browse/MAHOUT-865
Project: Mahout
Issue Type: Improvement
Reporter: Grant Ingersoll
[
https://issues.apache.org/jira/browse/MAHOUT-862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-862:
---
Fix Version/s: 0.6
MurmurHash 3.0
--
Key: MAHOUT-862
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142461#comment-13142461
]
Grant Ingersoll commented on MAHOUT-524:
bq. Is there any way we could simplify
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142477#comment-13142477
]
Grant Ingersoll commented on MAHOUT-524:
Tracing into the Hadoop code, this data
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142510#comment-13142510
]
Grant Ingersoll commented on MAHOUT-524:
Realizing now that Jeff already said
Tim Potter and I have tried running Dirichlet in the past on the ASF email set
on EC2 and it didn't seem to scale all that well, so I was wondering if people
had ideas on improving its speed. One question I had is whether we could
inject a Combiner into the process? Ted also mentioned that
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-524:
---
Attachment: MAHOUT-524.patch
patch so far, never mind the DisplayMinHash stuff, as I forgot
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142557#comment-13142557
]
Grant Ingersoll commented on MAHOUT-524:
The NPE is from one of the rowJ values
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142586#comment-13142586
]
Grant Ingersoll commented on MAHOUT-524:
I guess the 1100 comes from how we
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142583#comment-13142583
]
Grant Ingersoll commented on MAHOUT-524:
In this particular case, the state has 4
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142588#comment-13142588
]
Grant Ingersoll commented on MAHOUT-524:
Seems the numDims == 1100
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-524:
---
Attachment: MAHOUT-524.patch
This gets past the Lanczos issue by checking the size. I
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142633#comment-13142633
]
Grant Ingersoll commented on MAHOUT-524:
bq. I applied your patch but I'm having
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142635#comment-13142635
]
Grant Ingersoll commented on MAHOUT-524:
bq. I applied your patch but I'm having
I was looking at some of the DistanceMeasure stuff (Mahalanobis at the moment)
and it strikes me as a bit odd that our core distance() functions would use
Preconditions to check things that are part of construction/configuration of
the instance. A call to distance() is likely executed a lot of
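The concern above can be sketched in plain Java: validate configuration once when the measure is built, and keep the per-call path free of repeated checks. `WeightedDistance` is a hypothetical class written for this illustration; it is not Mahout's Mahalanobis implementation, and it uses ordinary exceptions rather than Guava's Preconditions to stay self-contained.

```java
/** Hypothetical weighted Euclidean measure (illustration only, not a Mahout class). */
final class WeightedDistance {
    private final double[] weights;

    /** Configuration is validated once, at construction time. */
    WeightedDistance(double[] weights) {
        if (weights == null || weights.length == 0) {
            throw new IllegalArgumentException("weights must be non-empty");
        }
        for (double w : weights) {
            if (w < 0.0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
        }
        this.weights = weights.clone();
    }

    /** Hot path: no configuration checks are repeated per call. */
    double distance(double[] a, double[] b) {
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            double diff = a[i] - b[i];
            total += weights[i] * diff * diff;
        }
        return Math.sqrt(total);
    }
}
```

The trade-off in this sketch is that a misconfigured instance fails early at construction instead of on every call to `distance()`.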
On Nov 2, 2011, at 6:05 PM, Ted Dunning wrote:
I have done some testing and have been unable to demonstrate a big
difference in allocating versus re-using. Re-using is, however, *really*
error prone.
I've been bitten by that one at least once. It's a pain to debug.
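One way re-use can go wrong is aliasing: a single mutable buffer is stored several times and later writes silently change the earlier entries. The Java sketch below is a hypothetical illustration of that failure mode (the class and method names are invented here); it says nothing about the relative performance of the two approaches.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the aliasing bug that buffer re-use can cause. */
final class ReuseVersusAllocate {
    /** Re-uses one buffer: every stored element refers to the same array. */
    static List<double[]> collectReusing(int n) {
        List<double[]> out = new ArrayList<>();
        double[] buffer = new double[1];
        for (int i = 0; i < n; i++) {
            buffer[0] = i;
            out.add(buffer); // bug: the same reference is added each time
        }
        return out;
    }

    /** Allocates per iteration: each element keeps its own value. */
    static List<double[]> collectAllocating(int n) {
        List<double[]> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add(new double[] {i});
        }
        return out;
    }
}
```

With the re-using variant every list entry ends up holding the last value written, which is the kind of bug that only shows up far from where it was introduced.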
On Nov 2, 2011, at 5:29 PM, Jeff Eastman wrote:
I think the scalability problems you are seeing are a consequence of using
the default GaussianCluster models. These models perform especially poorly
for large text clustering problems such as email. The pdf() calculation over
wide topic
[
https://issues.apache.org/jira/browse/MAHOUT-866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll reassigned MAHOUT-866:
--
Assignee: Grant Ingersoll
Move Precondition checks out of Mahalanobis.distance
[
https://issues.apache.org/jira/browse/MAHOUT-866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll resolved MAHOUT-866.
Resolution: Fixed
Fix Version/s: 0.6
Move Precondition checks out
2, 2011 at 10:13 PM, Grant Ingersoll gsing...@apache.org
wrote:
Tim Potter and I have tried running Dirichlet in the past on the ASF
email set on EC2 and it didn't seem to scale all that well, so I was
wondering if people had ideas on improving its speed. One question I had
is whether we
Add ClusterEvaluator capabilities to ClusterDumper
--
Key: MAHOUT-867
URL: https://issues.apache.org/jira/browse/MAHOUT-867
Project: Mahout
Issue Type: Improvement
Reporter: Grant
[
https://issues.apache.org/jira/browse/MAHOUT-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll reassigned MAHOUT-867:
--
Assignee: Grant Ingersoll
Add ClusterEvaluator capabilities to ClusterDumper
[
https://issues.apache.org/jira/browse/MAHOUT-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-867:
---
Attachment: MAHOUT-867.patch
Adds --evaluate option to ClusterDumper, which then uses
[
https://issues.apache.org/jira/browse/MAHOUT-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll resolved MAHOUT-867.
Resolution: Fixed
Hooked it into ClusterDumper, also hooked it into build-reuters.sh for k
?
Most likely I've forgotten about other vital pieces - just wanted to kick off
that discussion.
Isabel
* though not the only one - others include but are not limited to the time
frame for which we offer support for any given release.
Grant
[
https://issues.apache.org/jira/browse/MAHOUT-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141099#comment-13141099
]
Grant Ingersoll commented on MAHOUT-155:
Hey Joe,
Since these are categorical
On Nov 1, 2011, at 8:09 AM, Grant Ingersoll wrote:
FWIW, in Lucene, we do the following:
1. All minor versions within a major release can read a prior version's index
within the same major release. That is, 3.4 can read a 3.3 index. However,
3.3 cannot read a 3.4 index. When a user reads
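The first rule quoted above can be written down as a small check. This is a hedged sketch only: `IndexCompatibility.canRead` is a hypothetical helper, not a Lucene API, and it models just rule 1 (same major release, reader no older than the index). Anything involving different major versions is simply reported as unreadable here because the quoted text is cut off before it covers that case.

```java
/** Hypothetical helper expressing rule 1 only; this is not a Lucene API. */
final class IndexCompatibility {
    /**
     * Returns true when reader and index share a major version and the
     * reader's minor version is at least the index's minor version.
     */
    static boolean canRead(int readerMajor, int readerMinor, int indexMajor, int indexMinor) {
        return readerMajor == indexMajor && readerMinor >= indexMinor;
    }
}
```

For example, `canRead(3, 4, 3, 3)` corresponds to "3.4 can read a 3.3 index" and `canRead(3, 3, 3, 4)` to "3.3 cannot read a 3.4 index".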
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
The build-20news-bayes.sh script doesn't work when downloading the content for
the first time. The issue is that it changes the directory to the temp
directory and then later tries to do cd ../.. to get back
[
https://issues.apache.org/jira/browse/MAHOUT-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll resolved MAHOUT-856.
Resolution: Fixed
Fix Version/s: 0.6
build-20news-bayes.sh doesn't work when
Reporter: Grant Ingersoll
We have build-20news-bayes.sh that runs our NB stuff on 20 news groups. We
also have an SGD example that works on 20 news groups, but no script to run it.
I'm going to rename build-20news-bayes.sh to classify-20news.sh and
incorporate the two
On Nov 1, 2011, at 12:15 PM, Ted Dunning wrote:
I think the trend is away from an explicit version in serialized data and
toward systems like protobufs or avro which allow much more flexibility.
+1
Sent from my iPhone
On Nov 1, 2011, at 5:09, Grant Ingersoll gsing...@apache.org wrote
[
https://issues.apache.org/jira/browse/MAHOUT-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141321#comment-13141321
]
Grant Ingersoll commented on MAHOUT-857:
Here's the conf. matrix I'm getting
I'm working on https://issues.apache.org/jira/browse/MAHOUT-857. Each time I
run it, I get different answers for SGD for the confusion matrix, which is
presumably due to the randomness built in. However, is there a way to set the
seed so one can reproduce results for actually testing the
[
https://issues.apache.org/jira/browse/MAHOUT-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-857:
---
Attachment: MAHOUT-857.patch
Much better looking patch. Cleaned up the code, dropped
[
https://issues.apache.org/jira/browse/MAHOUT-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141464#comment-13141464
]
Grant Ingersoll commented on MAHOUT-857:
I committed the last patch, plus some
On Nov 1, 2011, at 2:45 PM, Sean Owen wrote:
RandomUtils.setTestSeed() (or something like that) makes all the RNGs
deterministic -- well if they are using RandomUtils.
I see it in use in at least one place.
On Tue, Nov 1, 2011 at 6:20 PM, Grant Ingersoll gsing...@apache.org wrote:
I'm
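The general idea behind fixing a seed can be shown with `java.util.Random` alone. This is a sketch of the principle, not of Mahout's `RandomUtils`, whose exact method names and behavior are not confirmed in the thread; `SeededDraws` is a hypothetical class.

```java
import java.util.Random;

/** Hypothetical demo: the same seed yields the same pseudo-random sequence. */
final class SeededDraws {
    /** Draws n pseudo-random ints in [0, 100) from a generator built with the given seed. */
    static int[] draw(long seed, int n) {
        Random random = new Random(seed);
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = random.nextInt(100);
        }
        return values;
    }
}
```

Two calls such as `SeededDraws.draw(42L, 5)` return identical arrays, which is what makes a randomized result like a confusion matrix reproducible in a test, provided every source of randomness in the code path is seeded the same way.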
Anyone object to me moving the Decision/Random Forest stuff into the
classifiers package? Seems like that is where it rightfully belongs.
-Grant
Move Decision Forests to classifier package
---
Key: MAHOUT-859
URL: https://issues.apache.org/jira/browse/MAHOUT-859
Project: Mahout
Issue Type: Improvement
Reporter: Grant Ingersoll
In reviewing clustering for upcoming training, I'm wondering about something w/
Canopy clustering that we claim, but wanted to check here first. In the
lectures, etc. I've seen on it, the idea is to run Canopy first and then some
other more expensive algorithm, such as k-means, etc. with the
[
https://issues.apache.org/jira/browse/MAHOUT-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll reassigned MAHOUT-854:
--
Assignee: Grant Ingersoll
Add MinHash to build-reuters.sh example
[
https://issues.apache.org/jira/browse/MAHOUT-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141892#comment-13141892
]
Grant Ingersoll commented on MAHOUT-344:
Ankur, any luck on documenting this stuff
[
https://issues.apache.org/jira/browse/MAHOUT-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141894#comment-13141894
]
Grant Ingersoll commented on MAHOUT-854:
bq. 1. Is it just me or when I try
On Nov 2, 2011, at 12:00 AM, Jake Mannix wrote:
Anyone with mad maven skills know how to churn that out in a short
evenings-worth of work? :)
Is there such a thing? :-)
As an alternative, we could simply generate a Jar that contains just the
necessary files and no re-org is necessary.
[
https://issues.apache.org/jira/browse/MAHOUT-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141900#comment-13141900
]
Grant Ingersoll commented on MAHOUT-854:
I've committed this, but will leave
[
https://issues.apache.org/jira/browse/MAHOUT-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140149#comment-13140149
]
Grant Ingersoll commented on MAHOUT-627:
I'm going to look to commit this soon
Project: Mahout
Issue Type: Bug
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
Fix For: 0.6
The LuceneTextValueEncoder throws a BufferUnderflowException when used. See
the code below. The problem appears
[
https://issues.apache.org/jira/browse/MAHOUT-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140234#comment-13140234
]
Grant Ingersoll commented on MAHOUT-855:
At least two issues here:
1
[
https://issues.apache.org/jira/browse/MAHOUT-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-855:
---
Attachment: MAHOUT-855.patch
Here's a fix, going to commit shortly
[
https://issues.apache.org/jira/browse/MAHOUT-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll resolved MAHOUT-855.
Resolution: Fixed
Committed revision 1195549.
LuceneTextValueEncoder
,
Paritosh
/Manuel
Grant Ingersoll
http://www.lucidimagination.com
Upgrade Lucene dependency to 3.4
Key: MAHOUT-852
URL: https://issues.apache.org/jira/browse/MAHOUT-852
Project: Mahout
Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee
[
https://issues.apache.org/jira/browse/MAHOUT-852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll resolved MAHOUT-852.
Resolution: Fixed
Upgrade Lucene dependency to 3.4
Add SGD to build-asf-email.sh example
-
Key: MAHOUT-851
URL: https://issues.apache.org/jira/browse/MAHOUT-851
Project: Mahout
Issue Type: Improvement
Reporter: Grant Ingersoll
changes *that really should happen in a next release*, 0.6. Then file
some JIRAs for additional things that can and should be done in the
next month or so.
+1
On Mon, Oct 24, 2011 at 7:40 PM, Grant Ingersoll gsing...@apache.org wrote:
My first thought was what's the difference between open
On Oct 25, 2011, at 11:04 AM, Isabel Drost wrote:
On 25.10.2011 Dan Brickley wrote:
These make clear the urgency; the auto exporter is unmaintained, and breaks
with Confluence updates.
Ok - so the bottom line is: Auto export will go away. Confluence will remain.
As
linking to dynamic
--
Grant Ingersoll
http://www.lucidimagination.com
On Oct 23, 2011, at 6:29 AM, Dan Brickley wrote:
[snip]
Interesting discussion, and maybe a good time for those of us making
use of all this code to remember to say 'thanks'. So, er yeah, thanks.
One thing I would like to bring up, as you talk this stuff through, is
that there are a few
' target and
let it live there?
On Mon, Oct 24, 2011 at 9:59 AM, Jake Mannix jake.man...@gmail.com wrote:
On Mon, Oct 24, 2011 at 5:25 AM, Grant Ingersoll gsing...@apache.org wrote:
- Anything that isn't fixed by December is WontFix and we release 0.6.
I realize it's drastic, but it's
The only issue I am really concerned about w/ provenance is pull requests from
non-ASF people that are brought in. Sometimes hard to track
On Oct 23, 2011, at 7:56 AM, Benson Margulies bimargul...@gmail.com wrote:
I just want to focus on the provenance question, but, really, you can
ignore me.
On Oct 22, 2011, at 2:19 PM, Sean Owen wrote:
Bringing this to dev@, mid-thread, per Grant's suggestion. There was a
brief and fruitful thread on private@ to discuss project governance,
but the topic has shifted such that it's useful to just talk on dev@.
If I may paraphrase: I expressed
On Oct 22, 2011, at 6:41 PM, Sean Owen wrote:
Thanks! good thread.
On Sat, Oct 22, 2011 at 3:30 PM, Grant Ingersoll gsing...@apache.org wrote:
1. We aim for releases every 6 months or so
2. We make a best guess up front about what bug fixes will be in that
release, but we also
On Oct 22, 2011, at 7:34 PM, Benson Margulies wrote:
When the board looks at the health of a community, one of the
questions it asks (or so I am told) is, 'Is the community responsive
to requests for assistance?'
I think we are, but of course we could be better.
Now, the board's bar here
[
https://issues.apache.org/jira/browse/MAHOUT-698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll reopened MAHOUT-698:
I'd say we leave this one open. When done right, it can help people get
feedback right away
+1 to that, or, alternatively, we should simply say Mahout is in a number of
bundles at this point and we believe all players are properly following ASF
branding guidelines. We will continue to monitor.
On Oct 14, 2011, at 12:51 PM, Ted Dunning wrote:
It might, for equity, be reasonable to
On Oct 14, 2011, at 1:38 PM, Ted Dunning wrote:
Which others are there? Maybe we should mention them all in this report.
2 is a number of bundles to me :-)
[
https://issues.apache.org/jira/browse/MAHOUT-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126482#comment-13126482
]
Grant Ingersoll commented on MAHOUT-588:
I've turned off access to mine. You
- kmeans
step #4
clusterDump
I found that the vector is an org.apache.mahout.math.RandomAccessSparseVector,
but where can I find the sequenceFile key?
Thanks in advance.
Grant Ingersoll
http://www.lucidimagination.com
Lucene Eurocon 2011
, // the convergence delta value
10, // the maximum number of iterations
true, // run clustering
false // execute map reduce
);
no exception thrown and thx in advance
At 2011-10-12 20:27:19,Grant Ingersoll gsing...@apache.org wrote:
Can you share your actual commands?
On Oct 12
.
On Mon, Oct 10, 2011 at 11:20 PM, Grant Ingersoll gsing...@apache.org wrote:
I was trying the Naive Bayes classifier via the build-asf-email.sh file I
committed the other day on a data set that had a fairly significant
variation in the number of messages per training label and am noticing
On Oct 10, 2011, at 10:26 PM, Dmitriy Lyubimov wrote:
As well as lda improvements. Gosh,
Nudge, nudge, Jake!
, or during
ApacheCon (Mon and Tue are Hackathon days there.)
Grant Ingersoll
http://www.lucidimagination.com
Lucene Eurocon 2011: http://www.lucene-eurocon.com
[
https://issues.apache.org/jira/browse/MAHOUT-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125107#comment-13125107
]
Grant Ingersoll commented on MAHOUT-839:
Hey Dan,
I think the addInputOption
[
https://issues.apache.org/jira/browse/MAHOUT-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll reassigned MAHOUT-839:
--
Assignee: Grant Ingersoll
rowid job failing (when parsing options