Re: Any plans to migrate Transformer API to Spark SQL (closer to DataFrames)?

2016-03-25 Thread Joseph Bradley
There have been some comments about using Pipelines outside of ML, but I have not yet seen a real need for it. If a user does want to use Pipelines for non-ML tasks, they can still use Transformers + PipelineModels. Will that work? On Fri, Mar 25, 2016 at 8:05 AM, Jacek Laskowski

Re: [discuss] ending support for Java 7 in Spark 2.0

2016-03-25 Thread Koert Kuipers
I asked around a little, and the general trend at our clients seems to be that they plan to upgrade their clusters to Java 8 within the year. With that in mind, I wish this were a little later (I would have preferred a Java-8-only Spark at the end of the year). But since a major Spark version only

Re: SPARK-13843 and future of streaming backends

2016-03-25 Thread David Nalley
> As far as group / artifact name compatibility, at least in the case of
> Kafka we need different artifact names anyway, and people are going to
> have to make changes to their build files for spark 2.0 anyway. As
> far as keeping the actual classes in org.apache.spark to not break
> code

[spark.ml] Why is private class ColumnPruner?

2016-03-25 Thread Jacek Laskowski
Hi, I came across `private class ColumnPruner` with "TODO(ekl) make this a public transformer" in the scaladoc, cf. https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala#L317. Why is this private, and is there a JIRA for the TODO(ekl)? Regards,

Any plans to migrate Transformer API to Spark SQL (closer to DataFrames)?

2016-03-25 Thread Jacek Laskowski
Hi, After a few weeks with spark.ml, I have come to the conclusion that the Transformer concept from the Pipeline API (spark.ml/MLlib) should be part of DataFrame (SQL), where it fits better. Are there any plans to migrate the Transformer API (ML) to DataFrame (SQL)? Regards, Jacek Laskowski

Re: [discuss] ending support for Java 7 in Spark 2.0

2016-03-25 Thread Andrew Ray
+1 on removing Java 7 and Scala 2.10 support. It looks to be entirely possible to support Java 8 containers in a YARN cluster otherwise running Java 7 (example code for alt JAVA_HOME https://issues.apache.org/jira/secure/attachment/12671739/YARN-1964.patch) so really there should be no big
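For readers wondering what "alt JAVA_HOME" could look like from the Spark side, here is a minimal sketch. It is not the approach in the linked YARN-1964 patch; it assumes that Java 8 is already installed at the same path on every node and relies on Spark's `spark.yarn.appMasterEnv.*` and `spark.executorEnv.*` settings. The helper name `java_home_conf` and the install path are hypothetical.

```python
def java_home_conf(java_home):
    """Return spark-submit --conf flags that point YARN containers at java_home.

    Assumption: java_home exists on every node of the cluster.
    """
    settings = {
        # Environment of the YARN application master
        "spark.yarn.appMasterEnv.JAVA_HOME": java_home,
        # Environment of each executor
        "spark.executorEnv.JAVA_HOME": java_home,
    }
    flags = []
    for key, value in sorted(settings.items()):
        flags += ["--conf", f"{key}={value}"]
    return flags


if __name__ == "__main__":
    # Hypothetical Java 8 location; adjust to the real install path.
    print(" ".join(["spark-submit"] + java_home_conf("/opt/jdk1.8.0")))
```

The resulting flags go in front of the usual `spark-submit` arguments; the cluster's own default Java stays untouched.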

Re: Does SparkSql has official jdbc/odbc driver ?

2016-03-25 Thread Daniel Darabos
I haven't tried this, but I thought you could run the Thriftserver in Spark and then connect with the HiveServer2 JDBC driver: http://spark.apache.org/docs/1.6.1/sql-programming-guide.html#running-the-thrift-jdbcodbc-server On Fri, Mar 25, 2016 at 7:57 AM, Reynold Xin wrote:
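As a rough illustration of that suggestion: the linked guide starts the server with `./sbin/start-thriftserver.sh`, which listens on `localhost:10000` by default, and clients then connect with a `jdbc:hive2://` URL. The sketch below only assembles such a URL; the helper name `thrift_jdbc_url` and the `default` database are assumptions, and it does not open a connection.

```python
def thrift_jdbc_url(host="localhost", port=10000, database="default"):
    """Build a HiveServer2-style JDBC URL for the Spark Thrift server.

    Defaults mirror the documented localhost:10000 listener; the
    database name is an assumption for illustration.
    """
    return f"jdbc:hive2://{host}:{port}/{database}"


if __name__ == "__main__":
    # Pass this URL to a HiveServer2 JDBC client such as beeline.
    print(thrift_jdbc_url())
```

The same URL works with any client that ships the HiveServer2 JDBC driver, which is why no Spark-specific driver is needed for this route.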

Re: [discuss] ending support for Java 7 in Spark 2.0

2016-03-25 Thread Mridul Muralidharan
I do agree w.r.t. Scala 2.10 as well; similar arguments apply (though there is a nuanced difference: source compatibility for Scala vs. binary compatibility for Java). Was there a proposal which did not go through? Not sure if I missed it. Regards, Mridul On Thursday, March 24, 2016, Koert Kuipers

Re: Does SparkSql has official jdbc/odbc driver ?

2016-03-25 Thread Reynold Xin
No - it is too painful to develop a jdbc/odbc driver.

On Thu, Mar 24, 2016 at 11:56 PM, sage wrote:
> Hi all,
> Does SparkSql has official jdbc/odbc driver?
> I only found third-party's odbc/jdbc driver, like simba, and most of
> third-party's odbc/jdbc driver are not

Does SparkSql has official jdbc/odbc driver ?

2016-03-25 Thread sage
Hi all, Does Spark SQL have an official JDBC/ODBC driver? I only found third-party ODBC/JDBC drivers, like Simba, and most of them are not free to use.