Re: Spark Improvement Proposals

2016-10-06 Thread Xiao Li
Let us continue to improve Apache Spark! I volunteer to go through all the SQL-related open JIRAs. Xiao Li 2016-10-06 21:14 GMT-07:00 Matei Zaharia : > Hey Cody, > > Thanks for bringing these things up. You're talking about quite a few > different things here, but let

Re: Spark Improvement Proposals

2016-10-06 Thread Matei Zaharia
Hey Cody, Thanks for bringing these things up. You're talking about quite a few different things here, but let me get to them each in turn. 1) About technical / design discussion -- I fully agree that everything big should go through a lot of review, and I like the idea of a more formal way to

Spark Improvement Proposals

2016-10-06 Thread Cody Koeninger
I love Spark. 3 or 4 years ago it was the first distributed computing environment that felt usable, and the community was welcoming. But I just got back from the Reactive Summit, and this is what I observed: - Industry leaders on stage making fun of Spark's streaming model - Open source project

Re: [ANNOUNCE] Announcing Spark 2.0.1

2016-10-06 Thread Luciano Resende
I have created an Infra JIRA to track the issue with the Maven artifacts for Spark 2.0.1. On Wed, Oct 5, 2016 at 10:18 PM, Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > Yeah I see the apache maven repos have the 2.0.1 artifacts at >

Re: StructuredStreaming Custom Sinks (motivated by Structured Streaming Machine Learning)

2016-10-06 Thread Michael Armbrust
Fred, I think that's a pretty good summary of my thoughts. Thanks for condensing them :) Right now, my focus is to get more people using Structured Streaming so that we can get some real-world feedback on what is missing. Right now this means: - SPARK-15406

Submit job with driver options in Mesos Cluster mode

2016-10-06 Thread vonnagy
I am trying to submit a job to Spark running in a Mesos cluster. We need to pass custom Java options to the driver and executor for configuration, but the driver task never includes the options. Here is an example submit. GC_OPTS="-XX:+UseConcMarkSweepGC -verbose:gc

Monitoring system extensibility

2016-10-06 Thread Alexander Oleynikov
Hi. As of v2.0.1, the traits `org.apache.spark.metrics.source.Source` and `org.apache.spark.metrics.sink.Sink` are defined as private to the `spark` package, so it becomes troublesome to create a new implementation in the user's code (but still possible in a hacky way). This seems to be the only

Re: Apache Spark chat channel

2016-10-06 Thread Dean Wampler
Since I'm a Scala Spark advocate, I'll try to get a Scala Spark Gitter channel created, one way or another. Dean Wampler, Ph.D. Author: Programming Scala, 2nd Edition (O'Reilly) Lightbend @deanwampler

Re: Apache Spark chat channel

2016-10-06 Thread Sean Owen
Yes, this comes up once in a while. There's no need or way to stop people forming groups to chat, though blessing a new channel as 'official' is tough because it means, in theory, everyone has to follow another channel to see 100% of the discussion. I think that's why the couple of mailing lists,

Apache Spark chat channel

2016-10-06 Thread Jan-Hendrik Zab
Hello! There was a request on scala-debate [0] to create a Spark-centric chat room under the scala namespace on Gitter with a focus on Scala-related questions. This is just a heads-up to the Apache Spark "management" to give them a chance to get involved. It might be better to create a dedicated