Maybe you can create your own subclass of RDD and override
getPreferredLocations() to implement logic that changes the locations
dynamically.
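A rough sketch of that idea, not a tested recipe: the class name RelocatableRDD and the helper setPreferredHosts are made up for illustration, while RDD, Partition, TaskContext and getPreferredLocations are the real Spark API. It assumes the override is only ever consulted on the driver, and that a change only matters for jobs submitted after it, since preferred locations are scheduling hints rather than guarantees.

```scala
import scala.collection.concurrent.TrieMap
import scala.reflect.ClassTag

import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD

// Hypothetical wrapper RDD: same partitions and data as `prev`,
// but with preferred locations that can be changed after construction.
class RelocatableRDD[T: ClassTag](prev: RDD[T]) extends RDD[T](prev) {

  // Driver-side table: partition index -> preferred hosts.
  // Marked @transient because executors never need it (assumption).
  @transient private val hosts = TrieMap.empty[Int, Seq[String]]

  // Hypothetical helper: record new preferred hosts for one partition.
  def setPreferredHosts(partitionIndex: Int, newHosts: Seq[String]): Unit =
    hosts.put(partitionIndex, newHosts)

  // Reuse the parent's partitions unchanged.
  override protected def getPartitions: Array[Partition] = prev.partitions

  // Delegate the actual computation to the parent RDD.
  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    prev.iterator(split, context)

  // Return the overridden hosts if set, else fall back to the parent's.
  override protected def getPreferredLocations(split: Partition): Seq[String] =
    hosts.getOrElse(split.index, prev.preferredLocations(split))
}
```

Usage would look like `val r = new RelocatableRDD(rdd); r.setPreferredHosts(0, Seq("hostA"))` before running the next action on `r`.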
> On Dec 30, 2016, at 12:06, Fei Hu wrote:
>
> Dear all,
>
> Is there any way to change the host location for a certain
Hi ML/MLLib developers,
1. I'm trying to add a weights column to the Spark ML evaluators
(RegressionEvaluator, BinaryClassificationEvaluator,
MulticlassClassificationEvaluator) that use mllib metrics and I have a few
questions (JIRA
2.
Dear all,
Is there any way to change the host location for a certain partition of RDD?
"protected def getPreferredLocations(split: Partition)" can be used to
initialize the location, but how to change it after the initialization?
Thanks,
Fei
we also use spray for webservices that execute on spark, and spray depends
on even older (and incompatible) shapeless 1.x
to get rid of the old shapeless i would have to upgrade from spray to
akka-http, which means going to java 8
this might also affect spark-job-server, which it seems uses
Hi,
I understand that doing a union creates a nested structure; however, why isn’t
it O(N)? Looking at the code, it seems this should be a tree merge of two
plans, which should take O(N), not O(N^2).
Even running the plan should be O(N*log(N)) rather than O(N^2) or
worse.
Assaf.
Hello Jacek,
Actually, Reynold is still the release manager and I am just sending this
message for him :) Sorry. I should have made it clear in my original email.
Thanks,
Yin
On Thu, Dec 29, 2016 at 10:58 AM, Jacek Laskowski wrote:
> Hi Yan,
>
> I've been surprised the first
Hi Yan,
I was surprised when I first noticed that rxin had stepped back and a
new release manager had stepped in. Congrats on your first ANNOUNCE!
I can only expect even more great stuff coming in to Spark from the dev
team after Reynold spared some time
Can't wait to read the changes...
Breeze 0.13 (RC-1 right now) bumps shapeless to 2.2.0 and 2.2.5 for
Scala 2.10 and 2.11 respectively:
https://github.com/scalanlp/breeze/pull/509
On 12/29/2016 07:13 PM, Ryan Williams wrote:
>
> Other option would presumably be for someone to make a release of
> breeze with old-shapeless
Other option would presumably be for someone to make a release of breeze
with old-shapeless shaded... unless shapeless classes are exposed in
breeze's public API, in which case you'd have to copy the relevant
shapeless classes into breeze and then publish that?
On Thu, Dec 29, 2016, 1:05 PM Sean
Don't do that. Union them all at once with SparkContext.union
On Thu, Dec 29, 2016, 17:21 assaf.mendelson wrote:
> Hi,
>
>
>
> I have been playing around with doing union between a large number of
> dataframes and saw that the performance of the actual union (not the
>
It is breeze, but, what's the option? It can't be excluded. I think this
falls in the category of things an app would need to shade in this
situation.
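For what it's worth, app-side shading might look roughly like the following in an sbt build. This is only a sketch: it assumes the app is packaged with the sbt-assembly plugin and that Spark is a "provided" dependency, and the target package name shaded.shapeless is made up.

```scala
// build.sbt (assumes sbt-assembly is enabled in project/plugins.sbt)
assemblyShadeRules in assembly := Seq(
  // Rename the app's own (newer) shapeless classes inside the fat jar so they
  // cannot clash with the shapeless 2.0.0 that Spark brings via breeze.
  // "shaded.shapeless" is a hypothetical package name.
  ShadeRule.rename("shapeless.**" -> "shaded.shapeless.@1").inAll
)
```

Whether this works cleanly for a given app would need checking against the built jar.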
On Thu, Dec 29, 2016, 16:49 Koert Kuipers wrote:
> i just noticed that spark 2.1.0 brings in a new transitive dependency on
>
`mvn dependency:tree -Dverbose -Dincludes=:shapeless_2.11` shows:
[INFO] \- org.apache.spark:spark-mllib_2.11:jar:2.1.0:provided
[INFO]    \- org.scalanlp:breeze_2.11:jar:0.12:provided
[INFO]       \- com.chuusai:shapeless_2.11:jar:2.0.0:provided
On Thu, Dec 29, 2016 at 12:11 PM Herman van
Iterative union like this creates a deeply nested recursive structure, in
a manner similar to the one described here http://stackoverflow.com/q/34461804
You can try something like this http://stackoverflow.com/a/37612978 but
there is of course an overhead of conversion between Dataset and RDD.
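A minimal sketch of that kind of workaround, assuming all the DataFrames share one schema. The helper name unionAll is made up; SparkContext.union and SparkSession.createDataFrame are the real API. The union is done once at the RDD level, so the plan does not grow with every step, at the cost of the Dataset-to-RDD conversion mentioned above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper: union many DataFrames in one step instead of
// folding df.union(...) over the list.
def unionAll(spark: SparkSession, dfs: Seq[DataFrame]): DataFrame = {
  require(dfs.nonEmpty, "need at least one DataFrame")
  // Single flat union over the underlying RDD[Row]s.
  val rows = spark.sparkContext.union(dfs.map(_.rdd))
  // Assumes every input has the same schema as the first one.
  spark.createDataFrame(rows, dfs.head.schema)
}
```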
On
Hi,
I have been playing around with doing union between a large number of
dataframes and saw that the performance of the actual union (not the action) is
worse than O(N^2). Since a union basically defines a lineage (i.e. current +
union with the other as a child) this should be almost
Which dependency pulls in shapeless?
On Thu, Dec 29, 2016 at 5:49 PM, Koert Kuipers wrote:
> i just noticed that spark 2.1.0 brings in a new transitive dependency on
> shapeless 2.0.0
>
> shapeless is a popular library for scala users, and shapeless 2.0.0 is old
> (2014) and
i just noticed that spark 2.1.0 brings in a new transitive dependency on
shapeless 2.0.0
shapeless is a popular library for scala users, and shapeless 2.0.0 is old
(2014) and not compatible with more current versions.
so this means a spark user that uses shapeless in his own development
cannot
Hi all,
Apache Spark 2.1.0 is the second release of the Spark 2.x line. This release
makes significant strides in the production readiness of Structured
Streaming, with added support for event time watermarks