On Mon Robin Anil robin.a...@gmail.com wrote:
2. UIMA Integration with Mahout? (Maybe a good project if UIMA folks
are taking in GSOC students)
I guess one could easily split this one in two:
a) Using UIMA (whole pipeline or just the analysers if that is possible)
for data pre-processing
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robin Anil updated MAHOUT-237:
--
Attachment: MAHOUT-237-tfidf.patch
Added IDF job which takes a sequence file of doc-id=Vector.
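
For readers unfamiliar with the IDF step, here is a minimal sketch of the idea, not the code in the attached patch. It assumes each document is a plain dict of term counts standing in for the doc-id=Vector pairs, and it uses the textbook log(N / df) weighting; the function names and the exact formula are illustrative assumptions, and the actual job may weight terms differently.

```python
import math
from collections import Counter


def document_frequencies(doc_vectors):
    """Count, for each term, how many documents contain it."""
    df = Counter()
    for tf in doc_vectors.values():
        df.update(term for term, count in tf.items() if count > 0)
    return df


def tfidf(doc_vectors):
    """Re-weight each term-frequency vector by log(N / df)."""
    n_docs = len(doc_vectors)
    df = document_frequencies(doc_vectors)
    return {
        doc_id: {
            term: count * math.log(n_docs / df[term])
            for term, count in tf.items()
            if count > 0
        }
        for doc_id, tf in doc_vectors.items()
    }


# Hypothetical input: doc-id => term-frequency vector.
docs = {
    "doc1": {"mahout": 2, "vector": 1},
    "doc2": {"vector": 3},
}
weights = tfidf(docs)
```

A term that appears in every document ("vector" here) gets weight zero, which is the point of the IDF pass.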
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828763#action_12828763
]
Ted Dunning commented on MAHOUT-237:
{quote}
Seems like the Text field Vector Class
You volunteering to port to avro, Ted? Awesome! :)
-jake
On Feb 2, 2010 1:10 PM, Ted Dunning (JIRA) j...@apache.org wrote:
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828763#action_12828763
]
I'm going to get back to it eventually, honest!
On Tue, Feb 2, 2010 at 4:13 PM, Jake Mannix jake.man...@gmail.com wrote:
You volunteering to port to avro, Ted? Awesome! :)
-jake
Add licences for 3rd party jars to mahout binary release and remove additional
unused dependencies.
---
Key: MAHOUT-272
URL:
:-)
On Tue, Feb 2, 2010 at 1:13 PM, Jake Mannix jake.man...@gmail.com wrote:
You volunteering to port to avro, Ted? Awesome! :)
--
Ted Dunning, CTO
DeepDyve
[
https://issues.apache.org/jira/browse/MAHOUT-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Drew Farris updated MAHOUT-272:
---
Attachment: MAHOUT-272.patch
* Added exclusion for eclipse core to hadoop dependency in
[
https://issues.apache.org/jira/browse/MAHOUT-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Drew Farris updated MAHOUT-272:
---
Status: Patch Available (was: Open)
Add licences for 3rd party jars to mahout binary release and
[
https://issues.apache.org/jira/browse/MAHOUT-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Drew Farris updated MAHOUT-272:
---
Summary: Add licenses for 3rd party jars to mahout binary release and
remove additional unused
Just noticed this didn't go to the list.
---BeginMessage---
Hi Jerry,
I'm not sure why Dirichlet is doing that with this dataset and have not
been able to get better results than you. I have gotten excellent
results using it with other models on other datasets, so I'm pretty
confident in the
[
https://issues.apache.org/jira/browse/MAHOUT-242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Drew Farris updated MAHOUT-242:
---
Attachment: MAHOUT-242.patch
Updated patch, removed pom modifications checked in as a part of
This could also be caused if the prior is very diffuse. This makes the
probability that a point will go to any new cluster quite low. You can
compensate somewhat for this with different values of alpha.
I have had some half thoughts about how to improve the mixing and currently
think that
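
The remark about a diffuse prior and alpha can be illustrated with a small sketch. It assumes the standard collapsed Gibbs conditional for a Dirichlet process mixture: an existing cluster is weighted by its size times the likelihood of the point, and a new cluster by alpha times the prior predictive density of the point. The function name and the numbers are hypothetical, and Mahout's Dirichlet implementation may compute this differently.

```python
def assignment_probs(cluster_sizes, likelihoods, prior_predictive, alpha):
    """Normalized assignment probabilities for one point.

    The last entry is the probability of opening a new cluster.
    """
    # Existing clusters: size times likelihood of the point under that cluster.
    weights = [n * lik for n, lik in zip(cluster_sizes, likelihoods)]
    # New cluster: alpha times the density of the point under the prior.
    weights.append(alpha * prior_predictive)
    total = sum(weights)
    return [w / total for w in weights]


# Made-up values: a diffuse prior spreads its mass thinly, so the prior
# predictive density at any single point is tiny.
sizes = [50, 50]
likelihoods = [0.2, 0.05]
diffuse_prior_density = 0.001

for alpha in (1.0, 10.0, 100.0):
    p_new = assignment_probs(sizes, likelihoods, diffuse_prior_density, alpha)[-1]
    print(f"alpha={alpha}: P(new cluster) = {p_new:.5f}")
```

Raising alpha scales up the new-cluster weight directly, which is why it can partly offset a small prior predictive density.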