[
https://issues.apache.org/jira/browse/MAHOUT-205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799638#action_12799638
]
Jake Mannix commented on MAHOUT-205:
and since tests pass for me with this, I'll commit
[
https://issues.apache.org/jira/browse/MAHOUT-205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix updated MAHOUT-205:
---
Attachment: MAHOUT-205.patch
Up-to-date patch, with Robin's most recent Vectorizer commits merged in.
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799631#action_12799631
]
Robin Anil commented on MAHOUT-237:
---
Ok. Done
> Map/Reduce Implementation of Document Ve
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799625#action_12799625
]
Jake Mannix commented on MAHOUT-237:
Looking at this a little:
is there a reason why
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robin Anil reopened MAHOUT-237:
---
Reopening this to allow further review.
> Map/Reduce Implementation of Document Vectorizer
> ---
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799560#action_12799560
]
Jake Mannix commented on MAHOUT-237:
It appears that there is just a missing line above
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799557#action_12799557
]
Jake Mannix commented on MAHOUT-237:
Given the following code in PartialVectorGenerator
Hi Robin,
I'm seeing some strangeness from this. I've got a directory with 100k
documents. I build a sequence file using SequenceFilesFromDirectory, which
emits 4 chunks for this particular dataset. I then dump each of the chunks
using SequenceFileDumper. I only see 75,964 documents in the resulti
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved MAHOUT-237.
--
Resolution: Fixed
> Map/Reduce Implementation of Document Vectorizer
>
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799508#action_12799508
]
Sean Owen commented on MAHOUT-237:
--
I'll commit -- still seeing some code inspection warni
[
https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robin Anil updated MAHOUT-237:
--
Attachment: DictionaryVectorizer.patch
Uses StringReader. Removes unused imports and adds license hea
[
https://issues.apache.org/jira/browse/MAHOUT-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799295#action_12799295
]
Jake Mannix commented on MAHOUT-180:
This is waiting on my MAHOUT-205 and MAHOUT-206 pa
[
https://issues.apache.org/jira/browse/MAHOUT-205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799293#action_12799293
]
Jake Mannix commented on MAHOUT-205:
It's not yet committed, because I was hoping at l
OK. The code throws up a number of warnings for me, like unused
declarations and variables, missing copyrights, etc. Mind if I accept
those before committing?
On Tue, Jan 12, 2010 at 2:09 PM, Robin Anil wrote:
> No, that's fixed. The StringReader change was to prevent encoding errors. This
> patch wo
Until this is reasonably stable code and getting traction I wouldn't
think much about it. I doubt there will be a reason to seriously
consider porting to something other than Hadoop anytime soon --
certainly we should get the Hadoop side of things in order first.
On Tue, Jan 12, 2010 at 3:29 PM, G
Thoughts on http://wiki.apache.org/incubator/JppfProposal?
Seems like it might be useful. At some point, we may need APIs that go beyond
M/R and provide more generalized distributed capabilities.
-Grant
Dawid,
Like I said, I'm not sure we're disagreeing. My focal goal is
primitive collections, and I'm prepared to take my lumps with
compatibility. Sun has made such a mess of the collections API that we
seem forced to choose.
--benson
On Tue, Jan 12, 2010 at 9:28 AM, Dawid Weiss wrote:
> Thanks
Thanks for the clarification and understanding of my motives, Benson.
I know Trove and I know other libraries of this type -- PCJ has been
our favorite so far, but it's LGPL and our persistent attempts to ask
Soren Bak to distribute that code under a different license have
failed.
Adapters are a
No, that's fixed. The StringReader change was to prevent encoding errors. This
patch works just fine. In fact, I will post some numbers from running it
against Wikipedia tonight.
Robin
On Tue, Jan 12, 2010 at 7:25 PM, Sean Owen wrote:
> Sorry but isn't that the very problem you are trying to solve
Sorry, but isn't that the very problem you are trying to solve on this
thread? Why do you want to commit this if it has this big memory
problem?
On Tue, Jan 12, 2010 at 12:49 PM, Robin Anil wrote:
> https://issues.apache.org/jira/secure/attachment/12429906/DictionaryVectorizer.patch
>
> Havent cha
[
https://issues.apache.org/jira/browse/MAHOUT-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robin Anil resolved MAHOUT-156.
---
Resolution: Duplicate
> Documentation and Code cleanup for all Bayesian Classes
> ---
[
https://issues.apache.org/jira/browse/MAHOUT-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799189#action_12799189
]
Robin Anil commented on MAHOUT-156:
---
Oops. I guess I made a duplicate JIRA issue here. Ma
https://issues.apache.org/jira/secure/attachment/12429906/DictionaryVectorizer.patch
Haven't changed the StringReader portion. The rest is OK to review.
On Tue, Jan 12, 2010 at 4:47 PM, Sean Owen wrote:
>
> https://issues.apache.org/jira/secure/attachment/12429846/DictionaryVectorizer.patch
>
> Thi
Dawid,
I find that I didn't quite answer all of your questions, and then
again maybe I'm not in a position to.
I started this by looking for some way to get the functionality of
Trove without the GPL. When I discovered that Mahout had already
absorbed Colt, I decided that the shortest path was to
[
https://issues.apache.org/jira/browse/MAHOUT-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved MAHOUT-173.
--
Resolution: Won't Fix
> Implement clustering of massive-domain attributes
> ---
Dawid,
There is a model compromise out there: the Trove 'decorator' approach.
I'm perfectly happy to follow that model to give people whatever value
you can get from Java collection compatibility. I confess that I've
been considering using it as an excuse to learn the CGM library and
generate the
Hi guys,
I see Benson working really hard on converting Colt primitive
collections to Mahout -- this is a great effort, really, since no such
library currently exists with an Apache or BSD license.
I wanted to ask you if compatibility with Java Collections is
something you consider crucial for a se
[
https://issues.apache.org/jira/browse/MAHOUT-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799163#action_12799163
]
Vaijanath N. Rao commented on MAHOUT-173:
-
Hi Sean,
This can be subsumed by Mahout
[
https://issues.apache.org/jira/browse/MAHOUT-205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799157#action_12799157
]
Sean Owen commented on MAHOUT-205:
--
Just checking, did this get committed? I know I had pr
[
https://issues.apache.org/jira/browse/MAHOUT-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799155#action_12799155
]
Sean Owen commented on MAHOUT-180:
--
Is this unblocked now that much of the Math stuff has
[
https://issues.apache.org/jira/browse/MAHOUT-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799154#action_12799154
]
Sean Owen commented on MAHOUT-173:
--
Just clarifying the status -- Vaijanath, are you workin
https://issues.apache.org/jira/secure/attachment/12429846/DictionaryVectorizer.patch
This one? It still seems to have the issues described in this thread.
Where's the latest one?
On Tue, Jan 12, 2010 at 9:08 AM, Robin Anil wrote:
> Hi Sean, could you take a look at the patch and comment?
>
> Robin
[
https://issues.apache.org/jira/browse/MAHOUT-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved MAHOUT-163.
--
Resolution: Fixed
It sounds like this was committed, so resolving (?)
> Get (better) cluster labels us
[
https://issues.apache.org/jira/browse/MAHOUT-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799151#action_12799151
]
Sean Owen commented on MAHOUT-156:
--
Is this actually done, subsumed in other changes? or i
Hi Sean, could you take a look at the patch and comment?
Robin
On Mon, Jan 11, 2010 at 10:39 PM, Sean Owen wrote:
> If one needs a Reader based on the contents of a String, the
> StringReader is a far better way of doing this. This also has
> potential character set issues if the platform's def
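To illustrate the point quoted above about StringReader and character sets, here is a minimal Java sketch. It is not the code from the DictionaryVectorizer patch, which is not shown in this listing; the class name ReaderFromString and its helper methods are hypothetical, and the byte round trip is only an assumed example of the kind of pattern being criticized.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; names are not taken from the Mahout patch.
public class ReaderFromString {

    // Direct route: the String's characters are read as-is, with no
    // encoding or decoding step that could go wrong.
    public static Reader viaStringReader(String text) {
        return new StringReader(text);
    }

    // Round trip through bytes with a deliberately mismatched charset,
    // mimicking what can happen when one side silently relies on a
    // platform default that differs from the other side.
    public static Reader viaMismatchedBytes(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new InputStreamReader(
            new ByteArrayInputStream(bytes), StandardCharsets.ISO_8859_1);
    }

    // Drains a Reader into a String so the two routes can be compared.
    public static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00e9"; // contains one non-ASCII character
        System.out.println(readAll(viaStringReader(text)).equals(text));
        System.out.println(readAll(viaMismatchedBytes(text)).equals(text));
    }
}
```

With StringReader there is no charset to get wrong; the byte route only round-trips correctly when both sides agree on the encoding explicitly.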