I'm writing up an appendix on Vector and Matrix. In the course of
this, I noticed a big problem with VectorWritable. It is pretty
glaringly un-thread-safe. It caches, in a static member, the class of
the vector to be read. The read method is not synchronized. Oops.
Synchronization fixes this, but
[
https://issues.apache.org/jira/browse/MAHOUT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832615#action_12832615
]
Grant Ingersoll commented on MAHOUT-236:
I don't have any, but should be pretty eas
This code was copied right out of AbstractVector, I never really understood
why we had to do that caching.
On Thu, Feb 11, 2010 at 9:10 AM, Sean Owen wrote:
> I'm writing up an appendix on Vector and Matrix. In the course of
> this, I noticed a big problem with VectorWritable. It is pretty
> gla
As a side note: for this appendix, we've got lots more stuff coming
down the pipe regarding distributed / HDFS-backed matrices too, which
is going to be pretty critical to be covered in this appendix (see latest
patches for MAHOUT-180).
On Thu, Feb 11, 2010 at 9:10 AM, Sean Owen wrote:
> I'm wri
[
https://issues.apache.org/jira/browse/MAHOUT-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll reassigned MAHOUT-185:
--
Assignee: Grant Ingersoll
> Add mahout shell script for easy launching of various algor
[
https://issues.apache.org/jira/browse/MAHOUT-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832626#action_12832626
]
Grant Ingersoll commented on MAHOUT-185:
Looks like a good start. Longer term, we
[
https://issues.apache.org/jira/browse/MAHOUT-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-185:
---
Affects Version/s: (was: 0.2)
Fix Version/s: (was: 0.4)
[
https://issues.apache.org/jira/browse/MAHOUT-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832661#action_12832661
]
Grant Ingersoll commented on MAHOUT-185:
Committed revision 909120.
> Add mahout s
On Thu, Feb 11, 2010 at 6:37 PM, Jake Mannix wrote:
> Why would the sparse representation be the only way to represent it
> on disk? It's nearly twice as big as the dense form for dense vectors
> (ok, 50% bigger).
On disk (well, in any serialized form) you just have key-value,
key-value pairs in
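
The "50% bigger" figure quoted above can be checked with a little arithmetic. The sketch below is only an illustration and assumes each sparse entry is stored as a 4-byte int index plus an 8-byte double value, while a dense vector stores only the 8-byte values; the class and method names are hypothetical and not taken from Mahout, and any header bytes are ignored.

```java
// Hypothetical sketch: compares serialized payload sizes under assumed field widths.
public class VectorSizeSketch {

    // Assumed widths: int index (4 bytes) and double value (8 bytes).
    static final int INDEX_BYTES = Integer.BYTES;
    static final int VALUE_BYTES = Double.BYTES;

    /** Dense form: one value per position, no indices. */
    static long denseBytes(int cardinality) {
        return (long) cardinality * VALUE_BYTES;
    }

    /** Sparse form: one (index, value) pair per non-zero entry. */
    static long sparseBytes(int nonZeros) {
        return (long) nonZeros * (INDEX_BYTES + VALUE_BYTES);
    }

    public static void main(String[] args) {
        int n = 1000; // a fully dense vector: every entry is non-zero
        System.out.println("dense:  " + denseBytes(n) + " bytes");
        System.out.println("sparse: " + sparseBytes(n) + " bytes");
    }
}
```

Under these assumptions a fully dense vector costs 12 bytes per entry in sparse form against 8 in dense form, and the sparse form only becomes smaller once fewer than two thirds of the entries are non-zero.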
On Thu, Feb 11, 2010 at 11:51 AM, Sean Owen wrote:
> On Thu, Feb 11, 2010 at 6:37 PM, Jake Mannix
> wrote:
> > Why would the sparse representation be the only way to represent it
> > on disk? It's nearly twice as big as the dense form for dense vectors
> > (ok, 50% bigger).
>
> On disk (well, i
On Thu, Feb 11, 2010 at 11:51 AM, Sean Owen wrote:
> On Thu, Feb 11, 2010 at 6:37 PM, Jake Mannix
> wrote:
> > Where do we actually use the VectorWritable.readVector() static
> > method?
>
> Looks like it's used in about 16 places across the code.
>
We should remove them, I think. I'm pretty s
+1 to eliminating the statics, they are indeed evil. The type to read
should be stored in the thing doing/facilitating the reading not the
vector itself and definitely not in a static field. Pretty sure vector
shouldn't be facilitating the reading of itself. No need for
synchronization then. The st
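
A minimal sketch of the idea in the message above: the object doing the reading holds the type to instantiate in an instance field, so nothing is shared through a static and no synchronization is needed as long as each thread uses its own reader. All names here (SketchVector, DenseSketchVector, VectorReader, encodeDense) are hypothetical and are not Mahout's API; the wire format (an int length followed by doubles) is also an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.function.Supplier;

// Hypothetical sketch: none of these types come from the Mahout code base.
public class PerInstanceReaderSketch {

    /** Stand-in for a vector that can populate itself from a stream. */
    interface SketchVector {
        void readFields(DataInput in) throws IOException;
        double get(int index);
        int size();
    }

    /** Assumed format: an int length followed by that many doubles. */
    static final class DenseSketchVector implements SketchVector {
        private double[] values = new double[0];

        @Override
        public void readFields(DataInput in) throws IOException {
            int size = in.readInt();
            double[] read = new double[size];
            for (int i = 0; i < size; i++) {
                read[i] = in.readDouble();
            }
            values = read;
        }

        @Override
        public double get(int index) { return values[index]; }

        @Override
        public int size() { return values.length; }
    }

    /** The reader, not a static field, knows which vector type to create. */
    static final class VectorReader<T extends SketchVector> {
        private final Supplier<T> factory; // per-instance state, never static

        VectorReader(Supplier<T> factory) {
            this.factory = factory;
        }

        T read(DataInput in) throws IOException {
            T vector = factory.get();
            vector.readFields(in);
            return vector;
        }
    }

    /** Helper that writes the assumed dense format into a byte array. */
    static byte[] encodeDense(double... values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (double value : values) {
                out.writeDouble(value);
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encodeDense(1.0, 0.0, 2.5);
        VectorReader<DenseSketchVector> reader = new VectorReader<>(DenseSketchVector::new);
        DenseSketchVector v = reader.read(new DataInputStream(new ByteArrayInputStream(data)));
        System.out.println(v.size() + " entries, last = " + v.get(2));
    }
}
```

Because the factory is a final instance field, two threads with separate readers cannot overwrite each other's choice of vector type, which is the failure mode a static cache invites.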
Robin,
Any chance you could add a page on FPM on
http://cwiki.apache.org/MAHOUT/algorithms.html? I'm trying to find out more
about it, but don't see much for documentation.
Thanks,
Grant
Seems like Avro is a great way to manage this enum (as in, we don't have to
think about it).
On Thu, Feb 11, 2010 at 12:31 PM, Drew Farris wrote:
> +1 to eliminating class names in serializations (this is especially
> bad when an efficiently managed enum can do the job)
>
--
Ted Dunning, CTO
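
For illustration only, here is roughly what a hand-managed enum tag could look like in place of a serialized class name. This is a sketch under assumptions: VectorKind, its tag values, and the example class name com.example.DenseVector are all hypothetical, and it says nothing about how Avro itself would encode the type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch: a one-byte type tag instead of a class-name string.
public class TypeTagSketch {

    /** Explicit tag values so that reordering the constants cannot change the format. */
    enum VectorKind {
        DENSE((byte) 0),
        SPARSE((byte) 1);

        private final byte tag;

        VectorKind(byte tag) {
            this.tag = tag;
        }

        byte tag() {
            return tag;
        }

        static VectorKind fromTag(byte tag) {
            for (VectorKind kind : values()) {
                if (kind.tag == tag) {
                    return kind;
                }
            }
            throw new IllegalArgumentException("Unknown vector tag: " + tag);
        }
    }

    /** Writes only the one-byte tag as the type header. */
    static byte[] header(VectorKind kind) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(kind.tag());
        }
        return bytes.toByteArray();
    }

    /** Reads the tag back and maps it to the enum constant. */
    static VectorKind readHeader(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return VectorKind.fromTag(in.readByte());
        }
    }

    /** Size of a header that stores a class name with DataOutput.writeUTF, for comparison. */
    static int classNameHeaderSize(String className) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(className);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("tag header bytes:        " + header(VectorKind.SPARSE).length);
        System.out.println("class-name header bytes: "
            + classNameHeaderSize("com.example.DenseVector"));
    }
}
```

The tag costs one byte per record, while a writeUTF class name costs two length bytes plus the name itself, and the reader no longer needs reflection to find out which type to build.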
On Thu, Feb 11, 2010 at 3:37 PM, Ted Dunning wrote:
> Seems like Avro is a great way to manage this enum (as in, we don't have to
> think about it).
Yes, I hope so. Now that the dictionary vectorizer/n-gram integration
is complete I will be getting back to that.
Drew
Sure, but we're not doing Avro for 0.3, so we should probably at least fix
this in some minimal way before another release.
-jake
On Thu, Feb 11, 2010 at 12:37 PM, Ted Dunning wrote:
> Seems like Avro is a great way to manage this enum (as in, we don't have to
> think about it).
>
> On Thu, F
On Thu, Feb 11, 2010 at 3:40 PM, Jake Mannix wrote:
> Sure, but we're not doing Avro for 0.3, so we should probably at least fix
> this in some minimal way before another release.
Agreed.
I'm wondering -- and this would probably be pretty obvious if I just
looked at the code (sorry!) -- are the
[
https://issues.apache.org/jira/browse/MAHOUT-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832813#action_12832813
]
Edward J. Yoon commented on MAHOUT-180:
---
Hi, Quick question.
It works using M/R iter
[
https://issues.apache.org/jira/browse/MAHOUT-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832824#action_12832824
]
Jake Mannix commented on MAHOUT-180:
Yes. Multiplication of a matrix (or the square of
Hi All,
java.util.logging is really getting me down - I never really paid much
attention to it because I've always used log4j in the past, but it
looks like it can't do things like change the format of the logs using
a config file, do mapped diagnostic contexts, etc.
Does anyone have any issue