Re: Average distance between two points in unit hypercube?

2011-10-20 Thread Ted Dunning
Sort of. I may be misunderstanding the question. If you take a random orthogonal projection, then distances will be preserved within a reasonably small epsilon to reasonably high probability. Mathematically, if you take a random matrix \Omega which is tall and skinny and do a QR decomposition:
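A minimal sketch of the idea in the snippet above, not necessarily the construction the truncated message goes on to give. It builds the Q factor of a tall, skinny Gaussian matrix \Omega by modified Gram-Schmidt (a stand-in for a library QR routine), projects two random points from the unit hypercube, and compares distances before and after. The sizes n and k, the sqrt(n/k) rescaling, and all function names are assumptions made for illustration.

```python
import math
import random


def random_orthonormal_basis(n, k, rng):
    """Return k orthonormal vectors in R^n.

    Equivalent to taking the Q factor of a QR decomposition of an
    n x k Gaussian matrix Omega, computed by modified Gram-Schmidt.
    """
    basis = []
    for _ in range(k):
        # Next column of Omega: i.i.d. standard normal entries.
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the vectors found so far.
        for q in basis:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        basis.append([a / norm for a in v])
    return basis


def project(x, basis):
    """Project x onto the k basis vectors, rescaled by sqrt(n / k).

    The rescaling is an assumption of this sketch: it makes the expected
    squared length of the projection match that of the original vector.
    """
    scale = math.sqrt(len(x) / len(basis))
    return [scale * sum(a * b for a, b in zip(x, q)) for q in basis]


def distance(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


rng = random.Random(42)
n, k = 300, 60  # hypothetical original and projected dimensions
basis = random_orthonormal_basis(n, k, rng)

# Two random points in the unit hypercube [0, 1]^n.
x = [rng.random() for _ in range(n)]
y = [rng.random() for _ in range(n)]

ratio = distance(project(x, basis), project(y, basis)) / distance(x, y)
print("projected / original distance:", ratio)
```

The printed ratio should be close to 1, and how close depends on k: a larger k gives a tighter epsilon at the cost of a bigger projected vector. Projecting all the way down to a single dimension (k = 1) leaves the ratio correct only on average, with very large variance for any given pair.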

[jira] [Issue Comment Edited] (MAHOUT-847) Improve Euclidean distance similarity calculation

2011-10-20 Thread Lance Norskog (Issue Comment Edited) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132191#comment-13132191 ] Lance Norskog edited comment on MAHOUT-847 at 10/20/11 11:24 PM: ---

[jira] [Commented] (MAHOUT-847) Improve Euclidean distance similarity calculation

2011-10-20 Thread Lance Norskog (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132191#comment-13132191 ] Lance Norskog commented on MAHOUT-847: -- As a perpetual beginner, it is daunting to le

[jira] [Commented] (MAHOUT-847) Improve Euclidean distance similarity calculation

2011-10-20 Thread Lance Norskog (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132186#comment-13132186 ] Lance Norskog commented on MAHOUT-847: -- An instance of this class will probably be ca

Re: Average distance between two points in unit hypercube?

2011-10-20 Thread Lance Norskog
Does this all translate to doing high-dimensional distance with random projection? Project each vector to one dimension and subtract? This sounds like a really useful distance measure. On Wed, Oct 19, 2011 at 7:32 PM, Ted Dunning wrote: > The distribution of the dot product of two randomly chose

[jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails

2011-10-20 Thread Jeff Eastman (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132104#comment-13132104 ] Jeff Eastman commented on MAHOUT-524: - I've found where the /data is being added to th

Re: is in mahout any classification schema (pattern)

2011-10-20 Thread Ted Dunning
Wojciech, I think that your meaning is roughly clear. You are right that we don't have a very universal framework for this. It would be good to have such a thing. Would you be interested in helping to construct such a framework? On Thu, Oct 20, 2011 at 11:37 AM, Wojciech Indyk wrote: > Hi! > I w

[jira] [Created] (MAHOUT-848) M/R job launching code should add Oozie's action.xml as a configuration resource of the Hadoop Configuration object

2011-10-20 Thread Timothy Potter (Created) (JIRA)
M/R job launching code should add Oozie's action.xml as a configuration resource of the Hadoop Configuration object --- Key: MAHOUT-848 URL: https://is

[jira] [Updated] (MAHOUT-848) M/R job launching code should add Oozie's action.xml as a configuration resource of the Hadoop Configuration object

2011-10-20 Thread Timothy Potter (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter updated MAHOUT-848: -- Priority: Minor (was: Major) > M/R job launching code should add Oozie's action.xml as a c

is in mahout any classification schema (pattern)

2011-10-20 Thread Wojciech Indyk
Hi! I want to develop a new classifier in Mahout, but I noticed that there is probably no common pattern (schema) giving an overall view of classifiers. This means I probably have to write my own data loader; I didn't find any abstract training class, any test class, and so on. I think it could be great if we

[jira] [Commented] (MAHOUT-672) Implementation of Conjugate Gradient for solving large linear systems

2011-10-20 Thread Jonathan Traupman (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131731#comment-13131731 ] Jonathan Traupman commented on MAHOUT-672: -- The currently attached patch was good

[jira] [Updated] (MAHOUT-847) Improve Euclidean distance similarity calculation

2011-10-20 Thread Sean Owen (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-847: - Status: Patch Available (was: Open) > Improve Euclidean distance similarity calculation > --

[jira] [Updated] (MAHOUT-847) Improve Euclidean distance similarity calculation

2011-10-20 Thread Sean Owen (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-847: - Attachment: MAHOUT-847.patch > Improve Euclidean distance similarity calculation > --

[jira] [Created] (MAHOUT-847) Improve Euclidean distance similarity calculation

2011-10-20 Thread Sean Owen (Created) (JIRA)
Improve Euclidean distance similarity calculation - Key: MAHOUT-847 URL: https://issues.apache.org/jira/browse/MAHOUT-847 Project: Mahout Issue Type: Improvement Components: Collabora

[jira] [Commented] (MAHOUT-828) bin/mahout should only print classpath on request, not all the time

2011-10-20 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131595#comment-13131595 ] Hudson commented on MAHOUT-828: --- Integrated in Mahout-Quality #1106 (See [https://builds.ap

[jira] [Commented] (MAHOUT-829) bin/mahout doesn't match the way the packaged forms of Mahout are arranged

2011-10-20 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131596#comment-13131596 ] Hudson commented on MAHOUT-829: --- Integrated in Mahout-Quality #1106 (See [https://builds.ap

[jira] [Resolved] (MAHOUT-518) Implement Affinity Preprocessing for Eigencuts and Spectral KMeans

2011-10-20 Thread Sean Owen (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved MAHOUT-518. -- Resolution: Won't Fix After 14 months, and some related progress, I think this is as done as it will b

[jira] [Resolved] (MAHOUT-830) Distribution should create .deb and .rpm packages

2011-10-20 Thread Sean Owen (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved MAHOUT-830. -- Resolution: Fixed Fix Version/s: 0.6 Assignee: Ted Dunning And then sounds like this is

[jira] [Commented] (MAHOUT-828) bin/mahout should only print classpath on request, not all the time

2011-10-20 Thread Sean Owen (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131506#comment-13131506 ] Sean Owen commented on MAHOUT-828: -- Ted I don't think you actually committed this to SVN.

[jira] [Commented] (MAHOUT-829) bin/mahout doesn't match the way the packaged forms of Mahout are arranged

2011-10-20 Thread Sean Owen (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131507#comment-13131507 ] Sean Owen commented on MAHOUT-829: -- Same here it did not seem to be in trunk.

[jira] [Commented] (MAHOUT-672) Implementation of Conjugate Gradient for solving large linear systems

2011-10-20 Thread Sean Owen (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131505#comment-13131505 ] Sean Owen commented on MAHOUT-672: -- Folks -- what's the status on this? It's been sitting