[jira] Commented: (MAHOUT-18) Embrace interoperability with other softwares

2008-03-18 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580254#action_12580254 ] Isabel Drost commented on MAHOUT-18: > How does this relate to MAHOUT-8? Seems like t

Re: [jira] Commented: (MAHOUT-18) Embrace interoperability with other softwares

2008-03-18 Thread shunkai.fu
Mahout-18 is about the description of physical data instances. Here I refer to the model representation. For example, with Naive Bayes, we may need to store the number of instances of (Class = 1) and (Class = 0), as well as the cases of (Class = 1 | X_i = 0). With this information, we can re-store t
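
A minimal sketch of the idea in the message above: a Naive Bayes model over binary features can be represented purely by counts, written out in a plain-text format, and read back into a working classifier. JSON is used here only as a stand-in for an interchange format (it is not PMML), and the names fit_counts and predict, the key layout, and the toy data are all hypothetical, not Mahout code.

```python
import json


def fit_counts(rows, labels):
    """Count class frequencies and per-class feature values (binary features)."""
    model = {"class_counts": {}, "feature_counts": {}}
    for x, c in zip(rows, labels):
        c = str(c)
        model["class_counts"][c] = model["class_counts"].get(c, 0) + 1
        for i, v in enumerate(x):
            # Hypothetical key layout: "1|0=0" means Class = 1 and X_0 = 0.
            key = f"{c}|{i}={v}"
            model["feature_counts"][key] = model["feature_counts"].get(key, 0) + 1
    return model


def predict(model, x):
    """Pick the class with the highest add-one-smoothed Naive Bayes score."""
    total = sum(model["class_counts"].values())
    best_class, best_score = None, -1.0
    for c, n_c in model["class_counts"].items():
        score = n_c / total  # class prior
        for i, v in enumerate(x):
            n_cv = model["feature_counts"].get(f"{c}|{i}={v}", 0)
            score *= (n_cv + 1) / (n_c + 2)  # two possible values per feature
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Toy data (made up for illustration): two binary features, two classes.
rows = [(0, 1), (0, 0), (1, 1), (1, 0)]
labels = [1, 1, 0, 0]
model = fit_counts(rows, labels)

# Round-trip through text; the counts alone are enough to rebuild the classifier.
restored = json.loads(json.dumps(model))
```

Calling predict(restored, x) on the reloaded counts gives the same answer as on the original model, which is the property a shared model format would need to preserve.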

[jira] Commented: (MAHOUT-18) Embrace interoperability with other softwares

2008-03-18 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580212#action_12580212 ] Grant Ingersoll commented on MAHOUT-18: --- How does this relate to MAHOUT-8? Seems lik

Re: [jira] Commented: (MAHOUT-18) Embrace interoperability with other softwares

2008-03-18 Thread shunkai.fu
The input and output formats, in my view, have nothing to do with parallel execution. It is the nature of the model and the design of the learning algorithm that determine how it is parallelized. Criteria may include: (1) Maturity; (2) Reasonableness; (3) Wide acceptance; (4) Ease of extension; (5) Suitable for

Re: [jira] Commented: (MAHOUT-18) Embrace interoperability with other softwares

2008-03-18 Thread shunkai.fu
One known format is PMML (http://www.dmg.org/products.html). -Original Message- From: Ted Dunning (JIRA) [mailto:[EMAIL PROTECTED] Sent: March 19, 2008 8:56 To: mahout-dev@lucene.apache.org Subject: [jira] Commented: (MAHOUT-18) Embrace interoperability with other softwares [ https://issues.

[jira] Commented: (MAHOUT-18) Embrace interoperability with other softwares

2008-03-18 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580199#action_12580199 ] Ted Dunning commented on MAHOUT-18: --- What are the possible formats? Do any of the format

[jira] Created: (MAHOUT-18) Embrace interoperability with other softwares

2008-03-18 Thread Shunkai Fu (JIRA)
Embrace interoperability with other softwares - Key: MAHOUT-18 URL: https://issues.apache.org/jira/browse/MAHOUT-18 Project: Mahout Issue Type: New JIRA Project Reporter: Shunkai Fu

RE: Re: input and output format following PMML specification

2008-03-18 Thread Jeff Eastman
+1 on the benefits of interoperability How would you go about identifying and selecting from the alternatives? What decision criteria would you suggest? What other alternatives should be considered? Go ahead and open a Jira if you want to lead this discussion. Jeff > -Original Message-

RE: Demos/Tutorials

2008-03-18 Thread Jeff Eastman
I've been using the canopy clustering to cluster Apache log time slices by URL frequency. Typical results indicate several big clusters with the "business as usual" access patterns in them and then several small clusters with the unusual patterns. It's a little difficult to interpret beyond that bu

RE: Welcome Jeff Eastman as Mahout Committer

2008-03-18 Thread Goel, Ankur
Welcome aboard Jeff! Hope to see more stuff coming from you. I too plan to start on this full time sooner than later (in a week's time) and get more engaged in discussions. -Ankur -Original Message- From: Jeff Eastman [mailto:[EMAIL PROTECTED] Sent: Monday, March 17, 2008 9:34 PM To: ma

Re: Demos/Tutorials

2008-03-18 Thread Grant Ingersoll
Yeah, I hear you there. I have a project I am working on that will require me to generate examples, but it is a couple of weeks away. The gene expression stuff is great. Text based ones would be really cool too. I haven't done too much clustering work (other than using Dawid's excellent

Re: Demos/Tutorials

2008-03-18 Thread Dawid Weiss
This is absolutely necessary, if not for just showing off with the project, then certainly for verification of correctness of algorithms inside it. I will certainly hop in to such a subtask to the extent of my current available time resources (not much, sadly). D. Grant Ingersoll wrote: No

Re: Demos/Tutorials

2008-03-18 Thread Isabel Drost
On Monday 17 March 2008, Grant Ingersoll wrote: > Now that we have some code in place for clustering, I think it would > be cool to put together some examples/demos of real world problems. One idea I had while reading Allen's proposal: I think it might also be great if people using - or tr

Re: Demos/Tutorials

2008-03-18 Thread Isabel Drost
On Monday 17 March 2008, Allen Day wrote: > I'll be trying out Mahout to do some microarray gene expression > clustering pretty soon. I would be happy to do a small write-up. That sounds really great. Would be a great demo for applications apart from obvious tasks in the area of clustering texts

Re: Welcome Jeff Eastman as Mahout Committer

2008-03-18 Thread Isabel Drost
On Monday 17 March 2008, Jeff Eastman wrote: > Thanks to you and the others for such a warm welcome. Unlike most of you I > have been doing this full-time for the last few weeks. I think it is really great to have someone on board so early who can work full-time on the project. Gave us quite a qu

Re: [jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-18 Thread Isabel Drost
On Monday 17 March 2008, Grant Ingersoll wrote: > Yeah, +1 on the wrapper idea. +1 on the wrapper as well. Especially as there might be matrix computations that really don't have labels attached to the matrix. Isabel -- Once a word has been allowed to escape, it cannot be recalled. --