Hi Sandy Ryza,
I believe it was you who originally added SPARK_CLASSPATH in core/pom.xml,
in the org.scalatest section. Is this still needed in 1.1?
I noticed this setting because when I looked into unit-tests.log, it
shows something like the following:
14/09/24 23:57:19.246 WARN SparkConf:
Hi Spark developers,
I'm trying to implement a framework with Spark and MLlib to do duplicate
detection. I'm not familiar with Spark and Scala, so please be patient
with me. In order to enrich the LabeledPoint class with some information,
I tried to extend it and add some properties.
But the ML
That is correct. Aliases in the SELECT clause can only be referenced in the
ORDER BY and HAVING clauses. Otherwise, you'll have to just repeat the
expression, like concat() in this case.
A more elegant alternative, which is probably not available in Spark SQL
yet, is to use Common Table Expressions (CTEs).
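To make the workaround concrete, here is a minimal sketch against a Spark
SQLContext. The table and column names are made up for illustration, the
"people" table is assumed to be registered already, and concat() is assumed
to be available in your SQL dialect (e.g. via a HiveContext). Note how the
concat() expression is repeated in GROUP BY, because the full_name alias
cannot be referenced there:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(
      new SparkConf().setAppName("alias-sketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)

    // The alias full_name is legal in ORDER BY, but in GROUP BY the
    // concat() expression has to be repeated verbatim.
    val result = sqlContext.sql("""
      SELECT concat(first_name, ' ', last_name) AS full_name, COUNT(*) AS cnt
      FROM people
      GROUP BY concat(first_name, ' ', last_name)
      ORDER BY full_name
    """)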
Hi Niklas Wilcke,
As you said, it is difficult to extend the LabeledPoint class in
mllib.regression.
Do you want to extend the LabeledPoint class in order to use a type other
than Double for the label?
If you have your code on GitHub, could you show it to us? I'd like to know
what you want to do.
@Yu Ishikawa,
I think the right place for such a discussion is
https://issues.apache.org/jira/browse/SPARK-3573
2014-09-25 18:02 GMT+04:00 Yu Ishikawa yuu.ishikawa+sp...@gmail.com:
Hi Niklas Wilcke,
As you said, it is difficult to extend
Hi Egor Pahomov,
Thank you for your comment!
-- Yu Ishikawa
Hi Yu Ishikawa,
I'm sorry, but I can't share my code via GitHub at the moment. Hopefully
I can in a few months.
I don't want to change the type of the label, but that would also be a
very nice improvement.
Making LabeledPoint abstract is exactly what I need. That would enable me
to create a class like the sketch below.
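The original message is truncated here. A minimal hypothetical sketch of
what such a subclass might look like if LabeledPoint were abstract; the
class and field names are invented, and since the real MLlib LabeledPoint
is a concrete case class, this does not compile against the actual API:

    import org.apache.spark.mllib.linalg.Vector

    // Hypothetical: pretends LabeledPoint were an abstract class,
    // which is what the author is asking for.
    abstract class LabeledPoint(val label: Double, val features: Vector)

    // A subclass carrying extra information for duplicate detection;
    // recordPairId is an invented example field.
    class DuplicationPoint(
        label: Double,
        features: Vector,
        val recordPairId: Long)
      extends LabeledPoint(label, features)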
Hi Egor Pahomov,
thanks for your suggestions. I think I will go with the dirty workaround,
because I don't want to maintain my own version of Spark for now. Maybe
I will later, when I feel ready to contribute to the project.
Kind regards,
Niklas Wilcke
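The "dirty workaround" isn't spelled out in the thread; one common pattern
is composition instead of inheritance, i.e. pairing each LabeledPoint with
the extra fields and unwrapping it before handing data to MLlib. A minimal
sketch under that assumption (EnrichedPoint and recordPairId are invented
names):

    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.rdd.RDD

    // Wrap the MLlib point instead of extending it.
    case class EnrichedPoint(point: LabeledPoint, recordPairId: Long)

    // Strip the extra information off before calling into MLlib,
    // which only understands plain LabeledPoints.
    def toMLlib(data: RDD[EnrichedPoint]): RDD[LabeledPoint] =
      data.map(_.point)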
On 25.09.2014 16:27, Egor Pahomov wrote:
I
Thanks, Yanbo and Nicholas. Now it makes more sense — query optimization is the
answer. /Du
From: Nicholas Chammas nicholas.cham...@gmail.com
Date: Thursday, September 25, 2014 at 6:43 AM
To: Yanbo Liang yanboha...@gmail.com
Cc: Du Li
Hi all,
VertexRDD is partitioned with HashPartitioner, and it exhibits some task
imbalance.
For example, Connected Components with the Edge2D partition strategy:
Aggregated Metrics by Executor
Executor ID | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Input | Shuffle Read
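For context, a minimal sketch of how such a job might be set up; the app
name and edge-list path are hypothetical, and PartitionStrategy.EdgePartition2D
is the GraphX strategy the "Edge2D" shorthand above presumably refers to:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{GraphLoader, PartitionStrategy}

    val sc = new SparkContext(new SparkConf().setAppName("cc-sketch"))

    // Load an edge-list graph and repartition its edges with the
    // 2-D strategy before running Connected Components.
    val graph = GraphLoader
      .edgeListFile(sc, "hdfs:///path/to/edges.txt")
      .partitionBy(PartitionStrategy.EdgePartition2D)

    val cc = graph.connectedComponents()
    cc.vertices.take(10).foreach(println)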
Yeah we can also move it first. Wouldn't hurt.
On Thu, Sep 25, 2014 at 6:39 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
It might still make sense to make this change if MIMA checks are always
relatively quick, for the same reason we do style checks first.
On Thu, Sep 25, 2014 at
Hi Sandy,
Sorry for the bother.
The tests run OK even with the SPARK_CLASSPATH setting there now, but it
gives a config warning and will potentially interfere with other settings,
as Marcelo said. The warning goes away if I remove it.
And Marcelo, I believe the setting in core/pom should not be
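For reference, a sketch of the replacement Spark 1.x points users at when
it warns about SPARK_CLASSPATH: setting the per-role extra classpath on the
SparkConf instead of the environment variable. The app name and jar path
here are hypothetical:

    import org.apache.spark.{SparkConf, SparkContext}

    // Instead of exporting SPARK_CLASSPATH, set the classpath
    // entries explicitly per role on the conf.
    val conf = new SparkConf()
      .setAppName("classpath-sketch")
      .set("spark.executor.extraClassPath", "/opt/libs/extra.jar")
      .set("spark.driver.extraClassPath", "/opt/libs/extra.jar")

    val sc = new SparkContext(conf)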