Re: Block Transfer Service encryption support

2015-11-10 Thread Tim Preece
So it appears the tests fail because of an SSLHandshakeException. Tracing the failure I see:
3,0001,Using SSLEngineImpl.
3,0001,Is initial handshake: true
3,0001,Ignoring unsupported cipher suite: SSL_RSA_WITH_DES_CBC_SHA for TLSv1.2
3,0001,No available cipher suite for TLSv1.2
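A minimal sketch of one possible workaround, using the standard spark.ssl.* configuration properties and a JSSE cipher suite name; the keystore path and password handling are placeholders and are not taken from the pull request under discussion:

    // Pin the SSL protocol and cipher suite to values the JVM accepts for
    // TLSv1.2, instead of the DES-based suite rejected in the trace above.
    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.ssl.enabled", "true")
      .set("spark.ssl.protocol", "TLSv1.2")
      .set("spark.ssl.enabledAlgorithms", "TLS_RSA_WITH_AES_128_CBC_SHA")
      .set("spark.ssl.keyStore", "/path/to/keystore.jks") // placeholder path
      .set("spark.ssl.keyStorePassword", sys.env.getOrElse("SPARK_SSL_KEYSTORE_PASSWORD", ""))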

[ANNOUNCE] Announcing Spark 1.5.2

2015-11-10 Thread Reynold Xin
Hi All, Spark 1.5.2 is a maintenance release containing stability fixes. This release is based on the branch-1.5 maintenance branch of Spark. We *strongly recommend* all 1.5.x users to upgrade to this release. The full list of bug fixes is here: http://s.apache.org/spark-1.5.2

Re: Support for views/ virtual tables in SparkSQL

2015-11-10 Thread Michael Armbrust
We do support Hive-style views, though all tables have to be visible to Hive. You can also turn on the experimental native view support (but it does not canonicalize the query): set spark.sql.nativeView = true. On Mon, Nov 9, 2015 at 10:24 PM, Zhan Zhang wrote: > I
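A minimal sketch of what this looks like in practice, assuming a HiveContext and hypothetical table/view names (none of the identifiers below come from the thread itself):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("native-view-demo"))
    val sqlContext = new HiveContext(sc)

    // Enable the experimental native view support; the underlying tables
    // still have to be visible to Hive.
    sqlContext.sql("SET spark.sql.nativeView=true")
    sqlContext.sql("CREATE VIEW recent_events AS SELECT * FROM events WHERE ts > '2015-11-01'")
    sqlContext.sql("SELECT count(*) FROM recent_events").show()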

Re: Support for views/ virtual tables in SparkSQL

2015-11-10 Thread Sudhir Menon
Thanks Zhan, thanks Michael. I was already going down the temp table path, will check out the experimental native view support. Suds On Tue, Nov 10, 2015 at 11:22 AM, Michael Armbrust wrote: > We do support hive style views, though all tables have to be visible to >

Re: A proposal for Spark 2.0

2015-11-10 Thread Reynold Xin
On Tue, Nov 10, 2015 at 3:35 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > > > 3. Assembly-free distribution of Spark: don’t require building an > enormous assembly jar in order to run Spark. > > Could you elaborate a bit on this? I'm not sure what an assembly-free > distribution

SPARK-11638: Run Spark on Mesos, in Docker with Bridge networking

2015-11-10 Thread Rad Gruchalski
Dear Team, We, Virdata, would like to present the result of the last few months of our work with Mesos and Spark. Our requirement was to run Spark on Mesos in Docker for a multi-tenant setup. This required adapting Spark to run in Docker with bridge networking. The result (and patches) of our work
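The patches themselves are not reproduced here; as a rough sketch of the problem being addressed, using only stock configuration properties (the host address and port numbers are placeholders), a driver behind a Docker bridge has to advertise an address and fixed ports that the Mesos agents can actually reach:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("bridge-networking-sketch")
      // An address reachable from the Mesos agents, e.g. the Docker host's IP.
      .set("spark.driver.host", sys.env.getOrElse("HOST_IP", "10.0.0.1"))
      .set("spark.driver.port", "7001")       // fixed so it can be mapped with -p 7001:7001
      .set("spark.blockManager.port", "7002") // likewise exposed through the bridge

    val sc = new SparkContext(conf)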

Re: A proposal for Spark 2.0

2015-11-10 Thread Nicholas Chammas
> For this reason, I would *not* propose doing major releases to break substantial APIs or perform large re-architecting that prevents users from upgrading. Spark has always had a culture of evolving architecture incrementally and making changes - and I don't think we want to change this model.

Re: A proposal for Spark 2.0

2015-11-10 Thread Shivaram Venkataraman
+1. On a related note, I think making it lightweight will ensure that we stay on the current release schedule and don't unnecessarily delay 2.0 to wait for new features / big architectural changes. In terms of fixes to 1.x, I think our current policy of back-porting fixes to older releases would

Re: A proposal for Spark 2.0

2015-11-10 Thread Josh Rosen
There's a proposal / discussion of the assembly-less distributions at https://github.com/vanzin/spark/pull/2/files / https://issues.apache.org/jira/browse/SPARK-11157. On Tue, Nov 10, 2015 at 3:53 PM, Reynold Xin wrote: > > On Tue, Nov 10, 2015 at 3:35 PM, Nicholas Chammas

Re: A proposal for Spark 2.0

2015-11-10 Thread Kostas Sakellis
+1 on a lightweight 2.0. What is the thinking around the 1.x line after Spark 2.0 is released? If not terminated, how will we determine what goes into each major version line? Will 1.x only be for stability fixes? Thanks, Kostas On Tue, Nov 10, 2015 at 3:41 PM, Patrick Wendell

Re: A proposal for Spark 2.0

2015-11-10 Thread Mridul Muralidharan
It would also be good to fix the API breakages introduced as part of 1.0 (where there is missing functionality now), overhaul and remove all deprecated config/features/combinations, and make the public API changes that we have been deferring during minor releases. Regards, Mridul On Tue, Nov 10,

Re: A proposal for Spark 2.0

2015-11-10 Thread Reynold Xin
Echoing Shivaram here. I don't think it makes a lot of sense to add more features to the 1.x line. We should still do critical bug fixes though. On Tue, Nov 10, 2015 at 4:23 PM, Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > +1 > > On a related note I think making it lightweight

Re: PMML version in MLLib

2015-11-10 Thread selvinsource
Thank you Fazlan, looks good!

Re: A proposal for Spark 2.0

2015-11-10 Thread Sandy Ryza
Another +1 to Reynold's proposal. Maybe this is obvious, but I'd like to advocate against a blanket removal of deprecated / developer APIs. Many APIs can likely be removed without material impact (e.g. the SparkContext constructor that takes preferred node location data), while others likely see

Re: A proposal for Spark 2.0

2015-11-10 Thread Mark Hamstra
Really, Sandy? "Extra consideration" even for already-deprecated API? If we're not going to remove these with a major version change, then just when will we remove them? On Tue, Nov 10, 2015 at 4:53 PM, Sandy Ryza wrote: > Another +1 to Reynold's proposal. > > Maybe

Re: A proposal for Spark 2.0

2015-11-10 Thread Sudhir Menon
Agree. If it is deprecated, get rid of it in 2.0. If the deprecation was a mistake, let's fix that. Suds Sent from my iPhone On Nov 10, 2015, at 5:04 PM, Reynold Xin wrote: Maybe a better idea is to un-deprecate an API if it is too important to not be removed. I don't

Re: A proposal for Spark 2.0

2015-11-10 Thread Reynold Xin
Maybe a better idea is to un-deprecate an API if it is too important to not be removed. I don't think we can drop Java 7 support. It's way too soon. On Tue, Nov 10, 2015 at 4:59 PM, Mark Hamstra wrote: > Really, Sandy? "Extra consideration" even for

Re: Why LibSVMRelation and CsvRelation don't extends HadoopFsRelation ?

2015-11-10 Thread Sasaki Kai
Did you indicate CsvRelation in the spark-csv package? LibSVMRelation is included in the Spark core package, but CsvRelation (spark-csv) is not. Is it necessary for us to also modify spark-csv, as you proposed in SPARK-11622? Regards Kai > On Nov 5, 2015, at 11:30 AM, Jeff Zhang
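For context, a minimal sketch of how LibSVMRelation surfaces to users; the paths are placeholders, and the short "libsvm" format name assumes a Spark build that registers LibSVMRelation as a data source:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.util.MLUtils
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("libsvm-sketch"))
    val sqlContext = new SQLContext(sc)

    // RDD-based loader from MLlib
    val points = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")

    // DataFrame-based loader backed by LibSVMRelation
    val df = sqlContext.read.format("libsvm").load("data/sample_libsvm_data.txt")
    df.printSchema()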

Re: A proposal for Spark 2.0

2015-11-10 Thread Reynold Xin
Mark, I think we are in agreement, although I wouldn't go to the extreme and say "a release with no new features might even be best." Can you elaborate "anticipatory changes"? A concrete example or so would be helpful. On Tue, Nov 10, 2015 at 5:19 PM, Mark Hamstra

Re: A proposal for Spark 2.0

2015-11-10 Thread Sandy Ryza
Oh and another question - should Spark 2.0 support Java 7? On Tue, Nov 10, 2015 at 4:53 PM, Sandy Ryza wrote: > Another +1 to Reynold's proposal. > > Maybe this is obvious, but I'd like to advocate against a blanket removal > of deprecated / developer APIs. Many APIs

Re: [ANNOUNCE] Announcing Spark 1.5.2

2015-11-10 Thread Fengdong Yu
This is the simplest announcement I have seen. > On Nov 11, 2015, at 12:49 AM, Reynold Xin wrote: > > Hi All, > > Spark 1.5.2 is a maintenance release containing stability fixes. This release > is based on the branch-1.5 maintenance branch of Spark. We *strongly >

Re: Why LibSVMRelation and CsvRelation don't extends HadoopFsRelation ?

2015-11-10 Thread Jeff Zhang
Yes Kai, I also plan to do this for CsvRelation and will create a PR for spark-csv. On Wed, Nov 11, 2015 at 9:10 AM, Sasaki Kai wrote: > Did you indicate CsvRelation in spark-csv package? LibSVMRelation is > included in spark core package, but CsvRelation(spark-csv) is not. > Is it
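For comparison, a minimal sketch of the spark-csv side (backed by CsvRelation) that the planned PR would touch; the option values and path are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("spark-csv-sketch"))
    val sqlContext = new SQLContext(sc)

    // The com.databricks.spark.csv format resolves to CsvRelation in spark-csv.
    val csv = sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .option("inferSchema", "true")
      .load("data/people.csv")
    csv.show()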

Re: OLAP query using spark dataframe with cassandra

2015-11-10 Thread danielcsant
You can also evaluate Stratio Sparkta. It is a real-time aggregation tool based on Spark Streaming. It is able to write to Cassandra and to other databases like MongoDB, Elasticsearch,... It is prepared to deploy these aggregations on Mesos, so maybe it fits your needs. There is no query
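Sparkta's own API is not shown in the thread; purely as a generic sketch of the underlying pattern it builds on (a Spark Streaming aggregation written to Cassandra through the spark-cassandra-connector), with the keyspace, table, host, and socket source all assumed for illustration:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import com.datastax.spark.connector.SomeColumns
    import com.datastax.spark.connector.streaming._

    val conf = new SparkConf()
      .setAppName("streaming-agg-to-cassandra")
      .set("spark.cassandra.connection.host", "127.0.0.1")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Count words per batch and persist the aggregates to a Cassandra table.
    ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)
      .saveToCassandra("metrics", "word_counts", SomeColumns("word", "total"))

    ssc.start()
    ssc.awaitTermination()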

Re: A proposal for Spark 2.0

2015-11-10 Thread Mark Hamstra
I'm liking the way this is shaping up, and I'd summarize it this way (let me know if I'm misunderstanding or misrepresenting anything): - New features are not at all the focus of Spark 2.0 -- in fact, a release with no new features might even be best. - Remove deprecated API that we

Re: A proposal for Spark 2.0

2015-11-10 Thread Patrick Wendell
I also feel the same as Reynold. I agree we should minimize API breaks and focus on fixing things around the edge that were mistakes (e.g. exposing Guava and Akka) rather than any overhaul that could fragment the community. Ideally a major release is a lightweight process we can do every couple of

Re: A proposal for Spark 2.0

2015-11-10 Thread Jean-Baptiste Onofré
Hi, I fully agree with that. Actually, I'm working on a PR to add "client" and "exploded" profiles to the Maven build. The client profile creates a spark-client-assembly jar, which is much more lightweight than the spark-assembly. In our case, we construct jobs that don't require all of the Spark server side.

Re: Block Transfer Service encryption support

2015-11-10 Thread Tim Preece
N.B. I did notice some test failures when I ran a quick test on the pull request (not sure if it is related - I haven't looked in any detail at the cause). Failed tests: SslChunkFetchIntegrationSuite>ChunkFetchIntegrationSuite.fetchBothChunks:201 expected:<[]> but was:<[0, 1]>

Re: A proposal for Spark 2.0

2015-11-10 Thread Marcelo Vanzin
On Tue, Nov 10, 2015 at 6:51 PM, Reynold Xin wrote: > I think we are in agreement, although I wouldn't go to the extreme and say > "a release with no new features might even be best." > > Can you elaborate "anticipatory changes"? A concrete example or so would be > helpful.

Re: Why LibSVMRelation and CsvRelation don't extends HadoopFsRelation ?

2015-11-10 Thread Sasaki Kai
Great, thank you! > On Nov 11, 2015, at 11:41 AM, Jeff Zhang wrote: > > Yes Kai, I also to plan to do for CsvRelation, will create PR for spark-csv > > On Wed, Nov 11, 2015 at 9:10 AM, Sasaki Kai > wrote: > Did you indicate

Re: A proposal for Spark 2.0

2015-11-10 Thread Mark Hamstra
Heh... ok, I was intentionally pushing those bullet points to be extreme to find where people would start pushing back, and I'll agree that we do probably want some new features in 2.0 -- but I think we've got good agreement that new features aren't really the main point of doing a 2.0 release. I

Re: A proposal for Spark 2.0

2015-11-10 Thread Mark Hamstra
To take a stab at an example of something concrete and anticipatory, I can go back to something I mentioned previously. It's not really a good example, because I don't mean to imply that I believe that its premises are true, but try to go with it... If we were to decide that real-time, event-based

Re: A proposal for Spark 2.0

2015-11-10 Thread Jean-Baptiste Onofré
Agree, it makes sense. Regards JB On 11/11/2015 01:28 AM, Reynold Xin wrote: Echoing Shivaram here. I don't think it makes a lot of sense to add more features to the 1.x line. We should still do critical bug fixes though. On Tue, Nov 10, 2015 at 4:23 PM, Shivaram Venkataraman