- Original Message -
At last, I worked around this issue by updating my local SBT to 0.13.2-RC1.
If any of you are experiencing a similar problem, I suggest you upgrade your
local SBT version.
If this issue is causing grief for anyone on Fedora 20, know that you can
install sbt via yum.
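(For anyone who would rather pin the version per-project instead of upgrading the system sbt: recent sbt launchers read the desired build version from project/build.properties, so a one-line file is enough. The version below is the one mentioned above.)

```properties
# project/build.properties
sbt.version=0.13.2-RC1
```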
RC3 works with the applications I'm working on now, and MLlib performance is
indeed perceptibly improved over 0.9.0 (although I haven't done a real
evaluation). Also, from the downstream perspective, I've been tracking the
0.9.1 RCs in Fedora and have no issues to report there either:
+1
I made the necessary interface changes to my apps that use MLlib and tested all
of my code against rc11 on Fedora 20 and OS X 10.9.3. (The Fedora Rawhide
package remains at 0.9.1 pending some additional dependency packaging work.)
best,
wb
- Original Message -
From: Tathagata
Friends,
For context (so to speak), I did some work in the 0.9 timeframe to fix
SPARK-897 (provide immediate feedback when closures aren't serializable) and
SPARK-729 (make sure that free variables in closures are captured when the RDD
transformations are declared).
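The essence of the SPARK-897 change can be sketched in a few lines of plain Scala (an illustrative stand-in, not Spark's actual ClosureCleaner code): eagerly attempt Java serialization of a closure, so that a NotSerializableException surfaces where the transformation is declared rather than when the job runs.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Eagerly serialize a value (e.g. a closure) so that serialization failures
// surface at the point of declaration rather than at job execution time.
def ensureSerializable[T](closure: T): T = {
  val out = new ObjectOutputStream(new ByteArrayOutputStream())
  out.writeObject(closure) // throws NotSerializableException on failure
  closure
}

class Unserializable // deliberately does not extend Serializable

val captured = 42
// Scala closures are serializable as long as everything they capture is:
val ok = ensureSerializable((x: Int) => x + captured)
```

Serializing the closure above succeeds because it captures only an Int; a closure that captured an Unserializable instance would throw NotSerializableException at the ensureSerializable call.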
I currently have a branch
This is an interesting approach, Nilesh!
Someone will correct me if I'm wrong, but I don't think this could go into
ClosureCleaner as a default behavior (since Kryo apparently breaks on some
classes that depend on custom Java serializers, as has come up on the list
recently). But it does seem
Hi all,
Does a "Failed to generate golden answer for query" message from
HiveComparisonTests indicate that it isn't possible to run the query in
question under Hive from Spark's test suite, rather than anything about Spark's
implementation of HiveQL? The stack trace I'm getting implicates Hive
I assume you are adding tests, since that is the only time you should
see that message.
Yes, I had added the HAVING test to the whitelist.
That error could mean a couple of things:
1) The query is invalid and Hive threw an exception.
2) Your Hive setup is bad.
Regarding #2, you need
Hey, sorry to reanimate this thread, but just a quick question: why do the
examples (on http://spark.apache.org/examples.html) use spark for the
SparkContext reference? This is minor, but it seems like it could be a little
confusing for people who want to run them in the shell and need to
Hi all,
I was testing an addition to Catalyst today (reimplementing a Hive UDF) and ran
into some odd failures in the test suite. In particular, it seems that what
most of these have in common is that an array is spuriously reversed somewhere.
For example, the stddev tests in the
Hi all,
I've been evaluating YourKit and would like to profile the heap and CPU usage
of certain tests from the Spark test suite. In particular, I'm very interested
in tracking heap usage by allocation site. Unfortunately, I get a lot of
crashes running Spark tests with profiling (and thus
Maybe they are very close to full and
profiling pushes them over the edge.
Matei
On Jul 14, 2014, at 9:51 AM, Will Benton wi...@redhat.com wrote:
Hi all,
I've been evaluating YourKit and would like to profile the heap and CPU
usage of certain tests from the Spark test suite
- Original Message -
From: Aaron Davidson ilike...@gmail.com
To: dev@spark.apache.org
Sent: Monday, July 14, 2014 5:21:10 PM
Subject: Re: Profiling Spark tests with YourKit (or something else)
Out of curiosity, what problems are you seeing with Utils.getCallSite?
Aaron, if I enable
Would you mind filing a JIRA for this? That does sound like something bogus
happening on the JVM/YourKit level, but this sort of diagnosis is
sufficiently important that we should be resilient against it.
On Mon, Jul 14, 2014 at 6:01 PM, Will Benton wi
Hi all,
What's the preferred environment for generating golden test outputs for new
Hive tests? In particular:
* what Hadoop version and Hive version should I be using,
* are there particular distributions people have run successfully, and
* are there any system properties or environment
- Original Message -
dev/run-tests fails two tests (1 Hive, 1 Kafka Streaming) for me
locally on 1.1.0-rc3. Does anyone else see that? It may be my env.
Although I still see the Hive failure on Debian too:
[info] - SET commands semantics for a HiveContext *** FAILED ***
[info]
but will take another look later this
week.
best,
wb
- Original Message -
From: Sean Owen so...@cloudera.com
To: Will Benton wi...@redhat.com
Cc: Patrick Wendell pwend...@gmail.com, dev@spark.apache.org
Sent: Sunday, August 31, 2014 12:18:42 PM
Subject: Re: [VOTE] Release Apache Spark 1.1.0
+1
Tested Scala/MLlib apps on Fedora 20 (OpenJDK 7) and OS X 10.9 (Oracle JDK 8).
best,
wb
- Original Message -
From: Patrick Wendell pwend...@gmail.com
To: dev@spark.apache.org
Sent: Saturday, August 30, 2014 5:07:52 PM
Subject: [VOTE] Release Apache Spark 1.1.0 (RC3)
Please
Hi Yi,
I've had some interest in implementing windowing and rollup in particular for
some of my applications but haven't had them on the front of my plate yet. If
you need them as well, I'm happy to start taking a look this week.
best,
wb
- Original Message -
From: Yi Tian
/pull/1567
As far as windowing, I'll be developing my own test cases but would appreciate
it if you could also share some kinds of queries you're interested in so that I
can incorporate them as well.
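For what it's worth, here is the flavor of windowed query I have in mind (over a hypothetical emp table, using standard HiveQL window-function syntax):

```sql
-- rank employees by salary within each department
SELECT dept, name, salary,
       rank() OVER (PARTITION BY dept ORDER BY salary DESC) AS salary_rank
FROM emp;
```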
best,
wb
- Original Message -
From: Yi Tian tianyi.asiai...@gmail.com
To: Will Benton
I'll chime in as yet another user who is extremely happy with sbt and a text
editor. (In my experience, running ack from the command line is usually just
as easy and fast as using an IDE's find-in-project facility.) You can, of
course, extend editors with Scala-specific IDE-like functionality
It's declared here:
https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/LocalSparkContext.scala
I assume you're already importing LocalSparkContext, but since the test classes
aren't included in Spark packages, you'll also need to package them up in order
to use
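For reference, one way to produce such a jar from a Spark checkout (assuming the sbt 0.13-era build, where test:package builds a jar of a project's test classes) is:

```shell
# from the root of a Spark checkout: build a jar containing core's
# test classes (LocalSparkContext among them)
sbt "project core" test:package
```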
Hey Nick,
I did something similar with a Docker image last summer; I haven't updated the
images to cache the dependencies for the current Spark master, but it would be
trivial to do so:
http://chapeau.freevariable.com/2014/08/jvm-test-docker.html
best,
wb
- Original Message -
This might not be the easiest way, but it's pretty easy: you can use
Row(field_1, ..., field_n) as a pattern in a case match. So if you have a data
frame with foo as an Int column and bar as a String column and you want to
construct instances of a case class that wraps these up, you can do
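something along these lines. The sketch below is self-contained plain Scala: Row here is a stand-in object whose unapplySeq mimics the extractor that org.apache.spark.sql.Row's companion provides, so the case Row(...) syntax works outside Spark; with Spark on the classpath you would import the real Row and match on df's rows instead.

```scala
// Stand-in for org.apache.spark.sql.Row's extractor: any Seq of values
// can be matched positionally with case Row(...).
object Row {
  def unapplySeq(values: Seq[Any]): Option[Seq[Any]] = Some(values)
}

case class Record(foo: Int, bar: String)

val rows: Seq[Seq[Any]] = Seq(Seq(1, "a"), Seq(2, "b"))

// the same shape you'd use in df.map { case Row(...) => ... }
val records = rows.map { case Row(foo: Int, bar: String) => Record(foo, bar) }
```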
Hi all,
Does anyone happen to know what tests Databricks uses for the Spark
distribution certification suite? Is it simply the tests that run as CI on
Spark pull requests, or is there something more involved?
The web site (
+1 (non-binding)
On Tue, Aug 15, 2017 at 10:32 AM, Anirudh Ramanathan <
fox...@google.com.invalid> wrote:
> Spark on Kubernetes effort has been developed separately in a fork, and
> linked back from the Apache Spark project as an experimental backend
>
What are you interested in accomplishing?
The spark.ml package has provided a machine learning API based on
DataFrames for quite some time. If you are interested in mixing query
processing and machine learning, this is certainly the best place to start.
See here:
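(For a flavor of that API, here is a minimal sketch; it assumes a Spark 2.x distribution on the classpath, and the data and column names are made up.)

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.master("local[*]").appName("sketch").getOrCreate()
import spark.implicits._

// hypothetical training data: two numeric features and a binary label
val df = Seq((0.0, 1.1, 0.1), (1.0, 2.0, 1.3), (0.0, 1.3, 0.2), (1.0, 1.9, 1.1))
  .toDF("label", "f1", "f2")

// assemble feature columns into a vector, then fit a classifier
val assembler = new VectorAssembler()
  .setInputCols(Array("f1", "f2"))
  .setOutputCol("features")
val lr = new LogisticRegression().setMaxIter(10)
val model = new Pipeline().setStages(Array(assembler, lr)).fit(df)
```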