Re: A proposal for Spark 2.0

2015-12-03 Thread Mridul Muralidharan
There was a proposal to make schedulers pluggable in the context of adding one which leverages Apache Tez: IIRC it was abandoned, but the JIRA might be a good starting point. Regards Mridul On Dec 3, 2015 2:59 PM, "Rad Gruchalski" wrote: > There was a talk in this thread

Re: [VOTE] Release Apache Spark 1.6.0 (RC1)

2015-12-03 Thread taishi takahashi
Excuse me, I'm working on SPARK-10259 (its parent issue is SPARK-7751). https://issues.apache.org/jira/browse/SPARK-10259 This issue's purpose is to add the @Since annotation to stable and experimental methods in MLlib. In SPARK-7751, the target version of this issue and its child issues is 1.6.0, but

Re: A proposal for Spark 2.0

2015-12-03 Thread Sean Owen
Reynold, did you (or someone else) delete version 1.7.0 in JIRA? I think that's premature. If there's a 1.7.0 then we've lost info about what it would contain. It's trivial at any later point to merge the versions. And, since things change and there's not a pressing need to decide one way or the

Re: [VOTE] Release Apache Spark 1.6.0 (RC1)

2015-12-03 Thread robineast
+1 OSX 10.10.5, java version "1.8.0_40", scala 2.10 mvn clean package -DskipTests [INFO] Spark Project External Kafka ... SUCCESS [ 18.161 s] [INFO] Spark Project Examples . SUCCESS [01:18 min] [INFO] Spark Project External Kafka Assembly

Re: [VOTE] Release Apache Spark 1.6.0 (RC1)

2015-12-03 Thread mkhaitman
I reported this in the 1.6 preview thread, but wouldn't mind if someone can confirm that ctrl-c is not keyboard interrupting / clearing the current line of input anymore in the pyspark shell. I saw the change that would kill the currently running job when using ctrl+c, but now the only way to

Re: [VOTE] Release Apache Spark 1.6.0 (RC1)

2015-12-03 Thread dodobidu
+1 (non-binding) Tested our pipelines on a Spark 1.6.0 standalone cluster (Python only): - PySpark package - Spark SQL - DataFrames - Spark MLlib No major issues, good performance. Just a minor difference in behavior from version 1.4.1 when using a SQLContext: "select case myColumn when null then 'Y'

Spark Streaming Kafka - DirectKafkaInputDStream: Using the new Kafka Consumer API

2015-12-03 Thread Mario Ds Briggs
Hi, I wanted to pick Cody's mind on what he thinks about DirectKafkaInputDStream/KafkaRDD internally using the new Kafka consumer API. I know the latter is documented as beta-quality, but I still wanted to know if he sees any blockers as to why we shouldn't go there shortly. On my side the consideration

Re: A proposal for Spark 2.0

2015-12-03 Thread Sean Owen
Pardon for tacking on one more message to this thread, but I'm reminded of one more issue when building the RC today: Scala 2.10 does not in general try to work with Java 8, and indeed I can never fully compile it with Java 8 on Ubuntu or OS X, due to scalac assertion errors. 2.11 is the first

Re: A proposal for Spark 2.0

2015-12-03 Thread Koert Kuipers
spark 1.x has been supporting scala 2.11 for 3 or 4 releases now. seems to me you already provide a clear upgrade path: get on scala 2.11 before upgrading to spark 2.x. from the scala team when scala 2.10.6 came out: "We strongly encourage you to upgrade to the latest stable version of Scala 2.11.x, as

Re: [VOTE] Release Apache Spark 1.6.0 (RC1)

2015-12-03 Thread Davies Liu
Is this https://github.com/apache/spark/pull/10134 a valid fix? (still worse than 1.5) On Thu, Dec 3, 2015 at 8:45 AM, mkhaitman wrote: > I reported this in the 1.6 preview thread, but wouldn't mind if someone can > confirm that ctrl-c is not keyboard interrupting /

Re: [VOTE] Release Apache Spark 1.6.0 (RC1)

2015-12-03 Thread Sean Owen
Licenses and signature are all fine. Docker integration tests consistently fail for me with Java 7 / Ubuntu and "-Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver" *** RUN ABORTED *** java.lang.NoSuchMethodError:

Re: A proposal for Spark 2.0

2015-12-03 Thread Rad Gruchalski
There was a talk in this thread about removing the fine-grained Mesos scheduler. I think it would be a loss to lose it completely; however, I understand that it might be a burden to keep it under development for Mesos only. Having been thinking about it for a while, it would be great if the

Re: Quick question regarding Maven and Spark Assembly jar

2015-12-03 Thread Mark Hamstra
Try to read this before Marcelo gets to you. https://issues.apache.org/jira/browse/SPARK-11157 On Thu, Dec 3, 2015 at 5:27 PM, Matt Cheah wrote: > Hi everyone, > > A very brief question out of curiosity – is there any particular reason > why we don’t publish the Spark

Re: Bringing up JDBC Tests to trunk

2015-12-03 Thread Luciano Resende
On Mon, Nov 30, 2015 at 1:53 PM, Josh Rosen wrote: > The JDBC drivers are currently being pulled in as test-scope dependencies > of the `sql/core` module: > https://github.com/apache/spark/blob/f2fbfa444f6e8d27953ec2d1c0b3abd603c963f9/sql/core/pom.xml#L91 > > In SBT,

Quick question regarding Maven and Spark Assembly jar

2015-12-03 Thread Matt Cheah
Hi everyone, A very brief question out of curiosity – is there any particular reason why we don't publish the Spark assembly jar on the Maven repository? Thanks, -Matt Cheah

Spark doesn't unset HADOOP_CONF_DIR when testing ?

2015-12-03 Thread Jeff Zhang
I tried to run HiveSparkSubmitSuite on my local box, but it fails. The cause is that Spark is still using my local single-node Hadoop cluster when running the unit test. I don't think it makes sense to do that. These environment variables should be unset before testing. And I suspect dev/run-tests

Re: Spark Streaming Kafka - DirectKafkaInputDStream: Using the new Kafka Consumer API

2015-12-03 Thread Cody Koeninger
Honestly my feeling on any new API is to wait for a point release before taking it seriously :) Auth and encryption seem like the only compelling reason to move, but forcing people on kafka 8.x to upgrade their brokers is questionable. On Thu, Dec 3, 2015 at 11:30 AM, Mario Ds Briggs