Re: shapeless in spark 2.1.0

2016-12-29 Thread Ryan Williams
Other option would presumably be for someone to make a release of breeze with old-shapeless shaded... unless shapeless classes are exposed in breeze's public API, in which case you'd have to copy the relevant shapeless classes into breeze and then publish that? On Thu, Dec 29, 2016, 1:05 PM Sean

Re: shapeless in spark 2.1.0

2016-12-29 Thread Ryan Williams
`mvn dependency:tree -Dverbose -Dincludes=:shapeless_2.11` shows:

[INFO] \- org.apache.spark:spark-mllib_2.11:jar:2.1.0:provided
[INFO]    \- org.scalanlp:breeze_2.11:jar:0.12:provided
[INFO]       \- com.chuusai:shapeless_2.11:jar:2.0.0:provided

On Thu, Dec 29, 2016 at 12:11 PM Herman van

Re: spark-core "compile"-scope transitive-dependency on scalatest

2016-12-15 Thread Ryan Williams
. > > I'll re-open that bug, if you want to send a PR. (I think it's just a > matter of making the scalatest dependency "provided" in spark-tags, if > I remember the discussion.) > > On Thu, Dec 15, 2016 at 4:15 PM, Ryan Williams > <ryan.blake.willi...@gmail.com> w

spark-core "compile"-scope transitive-dependency on scalatest

2016-12-15 Thread Ryan Williams
spark-core depends on spark-tags (compile scope) which depends on scalatest (compile scope), so spark-core leaks test-deps into downstream libraries' "compile"-scope classpath. The cause is that spark-core has logical "test->test" and "compile->compile" dependencies on spark-tags, but spark-tags
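A quick way to see this leak from a downstream project is to ask Maven for every path by which scalatest reaches the classpath (a sketch; assumes you run it inside a project that depends on spark-core, and the output shape is illustrative):

```shell
# Trace how scalatest arrives on the compile classpath via spark-tags.
# -Dverbose keeps paths that Maven's conflict resolution would otherwise hide.
mvn dependency:tree -Dverbose -Dincludes=org.scalatest
```

If scalatest shows up under a `compile`-scope chain rather than `test`, the downstream project is inheriting it exactly as described above.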

Re: Compatibility of 1.6 spark.eventLog with a 2.0 History Server

2016-09-15 Thread Ryan Williams
What is meant by: """ (This is because clicking the refresh button in browser, updates the UI with latest events, where-as in the 1.6 code base, this does not happen) """ Hasn't refreshing the page updated all the information in the UI through the 1.x line?

Re: Setting YARN executors' JAVA_HOME

2016-08-18 Thread Ryan Williams
ntation > <http://spark.apache.org/docs/latest/configuration.html>. > > The page addresses what you need. You can look for > spark.executorEnv.[EnvironmentVariableName] > and set your java home as > spark.executorEnv.JAVA_HOME= > > Regards, > Dhruve >

Setting YARN executors' JAVA_HOME

2016-08-18 Thread Ryan Williams
I need to tell YARN a JAVA_HOME to use when spawning containers (to run a Java 8 app on Java 7 YARN). The only way I've found that works is setting SPARK_YARN_USER_ENV="JAVA_HOME=/path/to/java8". The code
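A sketch of the two approaches discussed in this thread, the `SPARK_YARN_USER_ENV` workaround and the documented `spark.executorEnv` setting (`/path/to/java8` is a placeholder for your JDK 8 install):

```shell
# Option 1: the SPARK_YARN_USER_ENV workaround that the thread reports working.
export SPARK_YARN_USER_ENV="JAVA_HOME=/path/to/java8"

# Option 2: the documented per-executor environment-variable setting.
spark-submit \
  --master yarn \
  --conf spark.executorEnv.JAVA_HOME=/path/to/java8 \
  ...
```

Both end up exporting `JAVA_HOME` into the environment of the spawned YARN containers; the `--conf` form is the one the configuration docs point to.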

Re: Latency due to driver fetching sizes of output statuses

2016-01-23 Thread Ryan Williams
avior: > https://issues.apache.org/jira/browse/SPARK-10193 > https://github.com/apache/spark/pull/8427 > > On Sat, Jan 23, 2016 at 1:40 PM, Ryan Williams < > ryan.blake.willi...@gmail.com> wrote: > >> I have a recursive algorithm that performs a few jobs on successively

Latency due to driver fetching sizes of output statuses

2016-01-23 Thread Ryan Williams
it computes it, no executors have joined or left the cluster. In this gist <https://gist.github.com/ryan-williams/445ef8736a688bd78edb#file-job-108> you can see two jobs stalling for almost a minute each between "Starting job:" and "Got job"; with larger input datasets my RDD linea

Re: Live UI

2015-10-12 Thread Ryan Williams
Yea, definitely check out Spree! It functions as "live" UI, history server, and archival storage of event log data. There are pros and cons to building something like it in Spark trunk (and running it in the Spark driver, presumably) that I've spent a lot of

Re: An alternate UI for Spark.

2015-09-14 Thread Ryan Williams
You can check out Spree for one data point about how this can be done; it is a near-clone of the Spark web UI that updates in real-time. It uses JsonRelay, a SparkListener that sends events as JSON over the

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-11 Thread Ryan Williams
Any idea why 1.5.0 is not in Maven central yet? Is that a separate release process? On Wed, Sep 9, 2015 at 12:40 PM andy petrella wrote: > You can try it out really quickly by "building" a Spark

Spree: Live-updating web UI for Spark

2015-07-27 Thread Ryan Williams
Hi dev@spark, I wanted to quickly ping about Spree http://www.hammerlab.org/2015/07/25/spree-58-a-live-updating-web-ui-for-spark/, a live-updating web UI for Spark that I released on Friday (along with some supporting infrastructure), and mention a couple things that came up while I worked on it

Re: Resource usage of a spark application

2015-05-21 Thread Ryan Williams
with that or that doesn't make sense! 2015-05-19 21:43 GMT+02:00 Ryan Williams ryan.blake.willi...@gmail.com: Hi Peter, a few months ago I was using MetricsSystem to export to Graphite and then view in Grafana; relevant scripts and some instructions are here https://github.com/hammerlab/grafana-spark

Re: Resource usage of a spark application

2015-05-19 Thread Ryan Williams
Hi Peter, a few months ago I was using MetricsSystem to export to Graphite and then view in Grafana; relevant scripts and some instructions are here https://github.com/hammerlab/grafana-spark-dashboards/ if you want to take a look. On Sun, May 17, 2015 at 8:48 AM Peter Prettenhofer

Monitoring Spark with Graphite and Grafana

2015-02-26 Thread Ryan Williams
If anyone is curious to try exporting Spark metrics to Graphite, I just published a post about my experience doing that, building dashboards in Grafana http://grafana.org/, and using them to monitor Spark jobs: http://www.hammerlab.org/2015/02/27/monitoring-spark-with-graphite-and-grafana/ Code
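The setup described in that post boils down to a Spark metrics config plus one submit-time flag. A minimal sketch (the Graphite host and port are placeholders for your own instance):

```shell
# Write a metrics.properties enabling Spark's built-in Graphite sink
# for all instances (driver, executors, etc.).
cat > metrics.properties <<'EOF'
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=graphite.example.com
*.sink.graphite.port=2003
*.sink.graphite.period=10
*.sink.graphite.unit=seconds
EOF

# Then point Spark at it when submitting:
# spark-submit --conf spark.metrics.conf=metrics.properties ...
```

From there, Grafana dashboards read the resulting Graphite series; the grafana-spark-dashboards repo linked above automates that part.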

Re: Building Spark with Pants

2015-02-16 Thread Ryan Williams
I worked on Pants at Foursquare for a while and when coming up to speed on Spark was interested in the possibility of building it with Pants, particularly because allowing developers to share/reuse each others' compilation artifacts seems like it would be a boon to productivity; that was/is Pants'

Present/Future of monitoring spark jobs, MetricsSystem vs. Web UI, etc.

2015-01-09 Thread Ryan Williams
I've long wished the web UI gave me a better sense of how the metrics it reports are changing over time, so I was intrigued to stumble across the MetricsSystem

Re: zinc invocation examples

2014-12-05 Thread Ryan Williams
fwiw I've been using `zinc -scala-home $SCALA_HOME -nailed -start` which: - starts a nailgun server as well, - uses my installed scala 2.{10,11}, as opposed to zinc's default 2.9.2 https://github.com/typesafehub/zinc#scala: If no options are passed to locate a version of Scala then Scala 2.9.2 is
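Spelled out as a session, the invocation above looks like this (a sketch; assumes `zinc` is on your `PATH` and `SCALA_HOME` points at a local 2.10/2.11 install):

```shell
# Start a zinc nailgun server backed by the locally-installed Scala,
# instead of zinc's bundled default (Scala 2.9.2).
zinc -scala-home "$SCALA_HOME" -nailed -start

# ...run incremental compiles against the running server...

# Stop the server when done.
zinc -shutdown
```
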

Re: Spurious test failures, testing best practices

2014-12-04 Thread Ryan Williams
...@cloudera.com wrote: On Tue, Dec 2, 2014 at 4:40 PM, Ryan Williams ryan.blake.willi...@gmail.com wrote: But you only need to compile the others once. once... every time I rebase off master, or am obliged to `mvn clean` by some other build-correctness bug, as I said before. In my

Re: Spurious test failures, testing best practices

2014-12-02 Thread Ryan Williams
://gist.github.com/ryan-williams/1711189e7d0af558738d a sample full output from running `mvn install -X -U -DskipTests -pl network/shuffle` from such a state (the -U was to get around a previous failure based on having cached a failed lookup of network-common-1.3.0-SNAPSHOT). - Thinking maven might be special
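For single-module builds like the one in that gist, Maven's `-am` ("also make") flag is worth knowing: it builds the module's in-reactor dependencies locally instead of resolving them from a repository, which sidesteps stale-SNAPSHOT lookups like the `network-common-1.3.0-SNAPSHOT` failure above (a general Maven note, not a fix proposed in the thread):

```shell
# Build network/shuffle plus everything it depends on within the reactor,
# so network-common is compiled locally rather than fetched as a SNAPSHOT.
mvn install -DskipTests -pl network/shuffle -am
```
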

Re: Spurious test failures, testing best practices

2014-12-02 Thread Ryan Williams
` and `mvn install` on the parent project do. On Tue Dec 02 2014 at 3:45:48 PM Marcelo Vanzin van...@cloudera.com wrote: On Tue, Dec 2, 2014 at 2:40 PM, Ryan Williams ryan.blake.willi...@gmail.com wrote: Following on Mark's Maven examples, here is another related issue I'm having: I'd like

Re: Spurious test failures, testing best practices

2014-12-02 Thread Ryan Williams
On Tue Dec 02 2014 at 4:46:20 PM Marcelo Vanzin van...@cloudera.com wrote: On Tue, Dec 2, 2014 at 3:39 PM, Ryan Williams ryan.blake.willi...@gmail.com wrote: Marcelo: by my count, there are 19 maven modules in the codebase. I am typically only concerned with core (and therefore its two

Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
to not incorporate any HTML in the body? It seems like all of the archives I've seen strip it out, but other people have used it and gmail displays it. [1] https://gist.githubusercontent.com/ryan-williams/8a162367c4dc157d2479/raw/484c2fb8bc0efa0e39d142087eefa9c3d5292ea3/dev%20run-tests:%20fail (57 mins) [2

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
. Matei On Nov 30, 2014, at 2:39 PM, Ryan Williams ryan.blake.willi...@gmail.com wrote: In the course of trying to make contributions to Spark, I have had a lot of trouble running Spark's tests successfully. The main pain points I've experienced are: 1) frequent, spurious test

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
abound, there's no good way to run only the things that a given change actually could have broken, etc. Anyway, hopefully zinc brings me to the world of ~minute iteration times that have been reported on this thread. On Sun Nov 30 2014 at 6:53:57 PM Ryan Williams ryan.blake.willi...@gmail.com

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
, Patrick Wendell pwend...@gmail.com wrote: Hey Ryan, The existing JIRA also covers publishing nightly docs: https://issues.apache.org/jira/browse/SPARK-1517 - Patrick On Sun, Nov 30, 2014 at 5:53 PM, Ryan Williams ryan.blake.willi...@gmail.com wrote: Thanks Nicholas, glad