Hi Ignacio,
Please create a JIRA and send a PR for the information gain
computation, so it is easy to track the progress.
The sparse vector support for NaiveBayes is already implemented in
branch-1.0 and master. You only need to provide an RDD of sparse
vectors (created from Vectors.sparse).
MLU
In the following PR, there are related discussions.
https://github.com/apache/spark/pull/332
-- Original --
From: "Marcelo Vanzin"
Date: Fri, Apr 11, 2014 08:16 AM
To: "dev"
Subject: RFC: varargs in Logging.scala?
Hey there,
While going through the
Are there scripts to build the AMI used by the spark-ec2 script?
Alternatively, is there a place to download the AMI? I'm interested in
using it to deploy into an internal OpenStack cloud.
Thanks,
Jim
BTW...
You can do calculations in string interpolation:
s"Time: ${timeMillis / 1000}s"
Or use format strings.
f"Float with two decimal places: $floatValue%.2f"
More info:
http://docs.scala-lang.org/overviews/core/string-interpolation.html
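A minimal, self-contained sketch of the two interpolators mentioned above. The object name and the sample values for timeMillis and floatValue are hypothetical, chosen only for illustration.

```scala
object InterpolationSketch {
  // Hypothetical sample values, chosen only for illustration.
  val timeMillis: Long = 12345L
  val floatValue: Double = 3.14159

  // s-interpolator: arbitrary expressions go inside ${...}.
  // Long / Int is integer division, so 12345 / 1000 is 12.
  val elapsed: String = s"Time: ${timeMillis / 1000}s"

  // f-interpolator: a printf-style format specifier follows the variable.
  // The decimal separator in the result depends on the default locale.
  val rounded: String = f"Float with two decimal places: $floatValue%.2f"

  def main(args: Array[String]): Unit = {
    println(elapsed)
    println(rounded)
  }
}
```

Note that the interpolated expression is still evaluated at the point where the string literal appears, so interpolation by itself does not defer the cost of building a message.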
On Thu, Apr 10, 2014 at 5:46 PM, Michael Armbrust wrote:
Hi Marcelo,
Thanks for bringing this up here, as this has been a topic of debate
recently. Some thoughts below.
> ... all of them suffer from the fact that the log message needs to be built
> even though it might not be used.
This is not true of the current implementation (and this is actually
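The reply above is cut off, so the following is only a sketch of one common Scala technique for avoiding the cost of building unused log messages: a by-name parameter. It is not taken from Logging.scala; the object, the debugEnabled flag, and the helper names are all hypothetical.

```scala
object LazyLoggingSketch {
  // Hypothetical toggle standing in for a real logger's level check.
  var debugEnabled: Boolean = false

  // Counts how many times the message expression was actually evaluated.
  var messagesBuilt: Int = 0

  // `msg: => String` is a by-name parameter: the argument expression is
  // evaluated only if and when `msg` is referenced inside the method.
  def logDebug(msg: => String): Unit = {
    if (debugEnabled) println(msg)
  }

  // Stand-in for an expensive string construction.
  def expensiveMessage(): String = {
    messagesBuilt += 1
    s"state dump: ${(1 to 5).mkString(",")}"
  }

  def main(args: Array[String]): Unit = {
    logDebug(expensiveMessage()) // disabled: expression is never evaluated
    debugEnabled = true
    logDebug(expensiveMessage()) // enabled: evaluated once, then printed
    println(s"messages built: $messagesBuilt")
  }
}
```

With this shape the call site stays a plain method call, and the string is only constructed when the level check passes.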
Hey there,
While going through the code to try to get the hang of things, I've noticed
several different styles of logging. They all have some downside
(readability being one of them in certain cases), but all of them
suffer from the fact that the log message needs to be built even
though it might not be u
You are welcome, thanks again for contributing =)
- Henry
On Thu, Apr 10, 2014 at 3:17 PM, Ignacio Zendejas
wrote:
> I don't think there's a noticeable performance hit by the use of reverse in
> those cases. It was a quick set of changes and it helped understand what
> you look for. I didn't int
I don't think there's a noticeable performance hit by the use of reverse in
those cases. It was a quick set of changes and it helped understand what
you look for. I didn't intend to nitpick, so I'll leave as is. I could have
used a scala.Ordering implicitly/explicitly also, but seems overkill and
d
Hi Ignacio,
Thank you for your contribution.
Just a friendly reminder: in case you have not contributed to Apache
Software Foundation projects before, please submit the ASF ICLA form [1] or,
if you are sponsored by your company, also ask the company to send a CCLA
[2] to clear the intellectual property fo
Hi, again -
As part of the next step, I'd like to make a more substantive contribution
and propose some initial work on feature selection, primarily as it relates
to text classification.
Specifically, I'd like to contribute very straightforward code to perform
information gain feature evaluation.
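For readers unfamiliar with the measure, here is a minimal plain-Scala sketch of information gain for a single binary feature, computed as H(label) - H(label | feature). It is not the proposed contribution and does not use Spark; the object and method names, and the (featurePresent, label) input shape, are assumptions made for illustration.

```scala
object InfoGainSketch {
  // Shannon entropy, in bits, of a sequence of class labels.
  def entropy(labels: Seq[String]): Double = {
    val n = labels.size.toDouble
    if (n == 0) 0.0
    else labels.groupBy(identity).values.map { group =>
      val p = group.size / n
      -p * math.log(p) / math.log(2.0)
    }.sum
  }

  // Information gain of a binary feature: H(label) - H(label | feature).
  // Each example is a pair (featurePresent, label).
  def informationGain(examples: Seq[(Boolean, String)]): Double = {
    val n = examples.size.toDouble
    if (n == 0) 0.0
    else {
      // Weighted entropy of the labels within each feature value.
      val conditional = examples.groupBy(_._1).values.map { part =>
        (part.size / n) * entropy(part.map(_._2))
      }.sum
      entropy(examples.map(_._2)) - conditional
    }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical toy data: the feature separates the two classes exactly.
    val examples = Seq((true, "spam"), (true, "spam"), (false, "ham"), (false, "ham"))
    println(informationGain(examples))
  }
}
```

For text classification, a feature would typically be the presence of a term in a document, and terms would be ranked by this score.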
You can construct the Breeze vector by
val breezeVector = breeze.linalg.DenseVector.zeros[Double](length)
If you want to convert to mllib vector, you can do
val mllibVector = Vectors.fromBreeze(breezeVector)
If you want to convert back to breeze vector,
val newBreezeVector = mllibV
Thanks for contributing!
I think that, unless the feature is gigantic, you can often send a pull request
directly for discussion. One rule of thumb in the Spark code base is that
we typically prefer readability over conciseness, and thus we tend to avoid
using too much Scala magic or operator overloadin
Hi, all -
First off, I want to say that I love spark and am very excited about
MLBase. I'd love to contribute now that I have some time, but before I do
that I'd like to familiarize myself with the process.
In looking for a few projects and settling on one which I'll discuss in
another thread, I
You'll need to use the associated functionality in Breeze and then create a
dense vector from a Breeze vector. I have a JIRA for us to update the
examples for 1.0... I'm hoping Xiangrui can take a look at it.
https://issues.apache.org/jira/browse/SPARK-1464
https://github.com/scalanlp/breeze/wik
org.apache.spark.util.Vector is deprecated, so what should be used instead?
For example, how do I create a vector of zeros, as with def zeros(length: Int)
in util.Vector, using the new mllib.linalg.Vector?
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/org-apache-spark-u