[ https://issues.apache.org/jira/browse/MAHOUT-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600264#comment-14600264 ]
ASF GitHub Bot commented on MAHOUT-1746:
----------------------------------------
Github user dlyubimov commented on a diff in the pull request:
https://github.com/apache/mahout/pull/145#discussion_r33203413
--- Diff: math-scala/src/main/scala/org/apache/mahout/math/drm/package.scala ---
@@ -160,6 +160,165 @@ package object drm {
def dsqrt[K: ClassTag](drmA: DrmLike[K]): DrmLike[K] = new OpAewUnaryFunc[K](drmA, math.sqrt)
def dsignum[K: ClassTag](drmA: DrmLike[K]): DrmLike[K] = new OpAewUnaryFunc[K](drmA, math.signum)
+
+ ///////////////////////////////////////////////////////////
+ // Misc. math utilities.
+
+ /**
+ * Compute column wise means and variances -- distributed version.
+ *
+ * @param drmA Note: will pin input to cache if not yet pinned.
+ * @tparam K
+ * @return colMeans → colVariances
+ */
+ def dcolMeanVars[K: ClassTag](drmA: DrmLike[K]): (Vector, Vector) = {
+
+ import RLikeDrmOps._
+
+ val drmAcp = drmA.checkpoint()
+
+ val mu = drmAcp colMeans
+
+ // Compute variance using mean(x^2) - mean(x)^2
+ val variances = (drmAcp ^ 2 colMeans) -=: mu * mu
+
+ mu → variances
--- End diff --
Other than that it is valid Scala style? No.
By the way, every tool I have displays it correctly, including less, web
pages, LaTeX/LyX, etc. Use IntelliJ. Really.
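
For readers following the quoted diff: it computes column variances through the identity Var(x) = mean(x^2) - mean(x)^2. The sketch below illustrates that identity on a plain in-memory array. It is not the Mahout implementation; the class and method names are hypothetical, and a `double[][]` stands in for the distributed matrix.

```java
// Hypothetical illustration of the identity used in the diff:
// variance = mean(x^2) - mean(x)^2, computed per column.
final class ColMeanVarsSketch {

    // Assumes a non-empty, rectangular matrix.
    // Returns { columnMeans, columnVariances } (population variance).
    static double[][] colMeanVars(double[][] a) {
        int m = a.length;
        int n = a[0].length;
        double[] mean = new double[n];
        double[] meanSq = new double[n];

        // Accumulate column sums of x and of x * x in one pass.
        for (double[] row : a) {
            for (int j = 0; j < n; j++) {
                mean[j] += row[j];
                meanSq[j] += row[j] * row[j];
            }
        }

        double[] variance = new double[n];
        for (int j = 0; j < n; j++) {
            mean[j] /= m;
            meanSq[j] /= m;
            variance[j] = meanSq[j] - mean[j] * mean[j];
        }
        return new double[][] { mean, variance };
    }
}
```

For the 2x2 input {{1, 2}, {3, 6}} this returns means (2, 4) and variances (1, 4). Because the result is a difference of two nearly equal quantities when the spread is small, it is sensitive to how the squares are rounded.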
> Fix: mxA ^ 2, mxA ^ 0.5 to mean the same thing as mxA * mxA and mxA ::= sqrt _
> ------------------------------------------------------------------------------
>
> Key: MAHOUT-1746
> URL: https://issues.apache.org/jira/browse/MAHOUT-1746
> Project: Mahout
> Issue Type: Blog - New Blog Request
> Reporter: Dmitriy Lyubimov
> Assignee: Dmitriy Lyubimov
> Fix For: 0.10.2
>
>
> It so happens that in Java, if x is of type double, Math.pow(x, 2.0) and
> x * x produce different values for approximately one in a million random
> inputs.
> This is extremely annoying because it introduces rounding discrepancies,
> especially in things like Euclidean distance computations, which may
> eventually produce occasional NaNs.
> This issue proposes special treatment in the vector and matrix DSL to make
> sure identical FPU algorithms are run, as follows:
> x ^ 2 <=> x * x
> x ^ 0.5 <=> sqrt(x)
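
A minimal sketch of the special-casing described in the issue above, written in plain Java rather than the Mahout DSL. The class and method names are hypothetical and this is not the actual patch. `pow` routes the exponents 2 and 0.5 to the same operations that `x * x` and `sqrt(x)` use, and `countMismatches` is one way to check the reported discrepancy on a given JVM. The count it reports depends on the JVM and platform and may well be zero.

```java
import java.util.Random;

// Hypothetical helper illustrating the proposed dispatch; not Mahout code.
final class PowSketch {

    // Special-case the two exponents so the result is bit-identical
    // to x * x and Math.sqrt(x); defer to Math.pow for everything else.
    static double pow(double x, double p) {
        if (p == 2.0) return x * x;
        if (p == 0.5) return Math.sqrt(x);
        return Math.pow(x, p);
    }

    // Count how many of n random doubles in [0, 1) give a
    // Math.pow(x, 2.0) whose bits differ from those of x * x.
    static long countMismatches(long seed, int n) {
        Random rnd = new Random(seed);
        long mismatches = 0;
        for (int i = 0; i < n; i++) {
            double x = rnd.nextDouble();
            long viaPow = Double.doubleToLongBits(Math.pow(x, 2.0));
            long viaMul = Double.doubleToLongBits(x * x);
            if (viaPow != viaMul) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(42L, 10_000_000));
    }
}
```

Comparing raw bits rather than using a tolerance is deliberate: the point of the issue is that the two paths should agree exactly, not just approximately.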