Re: SPIP: Spark on Kubernetes

2017-08-15 Thread lucas.g...@gmail.com
From our perspective, we have invested heavily in Kubernetes as our cluster manager of choice. We also make quite heavy use of Spark. We've been experimenting with these builds (2.1 with PySpark enabled) quite heavily. Given that we've already 'paid the price' to operate Kubernetes in

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread 李书明
+1 On 2017-08-16 at 04:53, Jiri Kremser wrote: +1 (non-binding) On Tue, Aug 15, 2017 at 10:19 PM, Shubham Chopra wrote: +1 (non-binding) ~Shubham. On Tue, Aug 15, 2017 at 2:11 PM, Erik Erlandson wrote: Kubernetes has

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Andrew Ash
+1 (non-binding) We're moving large amounts of infrastructure from a combination of open source and homegrown cluster management systems to unify on Kubernetes and want to bring Spark workloads along with us. On Tue, Aug 15, 2017 at 2:29 PM, liyinan926 wrote: > +1

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread liyinan926
+1 (non-binding)

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Shubham Chopra
+1 (non-binding) ~Shubham. On Tue, Aug 15, 2017 at 2:11 PM, Erik Erlandson wrote: > > Kubernetes has evolved into an important container orchestration platform; > it has a large and growing user base and an active ecosystem. Users of > Apache Spark who are also deploying

Re: Run a specific PySpark test or group of tests

2017-08-15 Thread Bryan Cutler
This generally works for me to run tests within a class, or even a single test. Not as flexible as pytest -k, which would be nice. $ SPARK_TESTING=1 bin/pyspark pyspark.sql.tests ArrowTests On Tue, Aug 15, 2017 at 5:49 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote: > Pytest
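
For context, this appears to work because pyspark.sql.tests calls unittest.main() when run as a module (true in Spark 2.x), and the extra argument is presumably forwarded to it; unittest.main() reads test selectors from argv. A minimal standalone sketch of that selection mechanism (class and method names here are hypothetical):

    import unittest

    class ArrowTests(unittest.TestCase):
        def test_example(self):
            # placeholder assertion standing in for a real PySpark test
            self.assertEqual(2 + 2, 4)

    if __name__ == "__main__":
        # "python this_file.py ArrowTests.test_example" runs only that one
        # method -- the same narrowing the bin/pyspark command above uses.
        unittest.main()

Assuming the same argv handling, a single PySpark test should then be runnable with something like SPARK_TESTING=1 bin/pyspark pyspark.sql.tests ArrowTests.test_example.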

Re: spark pypy support?

2017-08-15 Thread Tom Graves
Just curious, is this using the portable version of PyPy or the standard version (Ubuntu?)? Tom On Monday, August 14, 2017, 5:27:11 PM CDT, Holden Karau wrote: Ah interesting, looking at our latest docs we imply that it should work with PyPy 2.3+ -- we might want to update

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Erik Erlandson
Kubernetes has evolved into an important container orchestration platform; it has a large and growing user base and an active ecosystem. Users of Apache Spark who are also deploying applications on Kubernetes (or are planning to) will have convergence-related motivations for migrating their Spark

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Daniel Imberman
+1 (non-binding) Glad to see this moving forward :D On Tue, Aug 15, 2017 at 10:10 AM Holden Karau wrote: > +1 (non-binding) > > I (personally) think that Kubernetes as a scheduler backend should > eventually get merged in and there is clearly a community interested in the

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Holden Karau
+1 (non-binding) I (personally) think that Kubernetes as a scheduler backend should eventually get merged in and there is clearly a community interested in the work required to maintain it. On Tue, Aug 15, 2017 at 9:51 AM William Benton wrote: > +1 (non-binding) > > On Tue,

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread William Benton
+1 (non-binding) On Tue, Aug 15, 2017 at 10:32 AM, Anirudh Ramanathan <fox...@google.com.invalid> wrote: > Spark on Kubernetes effort has been developed separately in a fork, and > linked back from the Apache Spark project as an experimental backend >

Questions about the future of UDTs and Encoders

2017-08-15 Thread Katherine Prevost
Hi, all! I'm a developer who works to support data scientists at CERT. We've been having some great success working with Spark for data analysis, and I have some questions about how we could contribute to work on Spark in support of our goals. Specifically, we have some interest in user-defined

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Timothy Chen
+1 (non-binding) Tim On Tue, Aug 15, 2017 at 9:20 AM, Kimoon Kim wrote: > +1 (non-binding) > > Thanks, > Kimoon > > On Tue, Aug 15, 2017 at 9:19 AM, Sean Suchter > wrote: >> >> +1 (non-binding)

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Kimoon Kim
+1 (non-binding) Thanks, Kimoon On Tue, Aug 15, 2017 at 9:19 AM, Sean Suchter wrote: > +1 (non-binding)

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Sean Suchter
+1 (non-binding)

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Erik Erlandson
+1 (non-binding) On Tue, Aug 15, 2017 at 8:32 AM, Anirudh Ramanathan wrote: > Spark on Kubernetes effort has been developed separately in a fork, and > linked back from the Apache Spark project as an experimental backend >

Re: Run a specific PySpark test or group of tests

2017-08-15 Thread Nicholas Chammas
Pytest does support unittest-based tests, allowing for incremental adoption. I'll see how convenient it is to use with our current test layout. On Tue, Aug 15, 2017 at 1:03 AM Hyukjin Kwon wrote: > For me, I would like this
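
As a minimal sketch of what that support looks like (file and test names are hypothetical), pytest collects a plain unittest.TestCase with no pytest-specific changes, and -k narrows the run by substring match:

    # test_arrow.py
    import unittest

    class ArrowTests(unittest.TestCase):
        def test_roundtrip(self):
            self.assertEqual(list(range(3)), [0, 1, 2])

        def test_other(self):
            self.assertTrue(True)

    # Usage: "pytest -k roundtrip test_arrow.py" runs only test_roundtrip --
    # the flexibility mentioned earlier in this thread.

This is why adoption can be incremental: existing unittest suites run under pytest as-is, and pytest-only features can be layered in later.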

Re: Possible bug: inconsistent timestamp behavior

2017-08-15 Thread Maciej Szymkiewicz
These two are just not equivalent. Spark SQL interprets a long as seconds when casting between timestamps and numerics, therefore lit(1485503350000L).cast(org.apache.spark.sql.types.TimestampType) represents 49043-09-23 21:26:40.0. This behavior is intended - see for example
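
A short sketch of the two directions described above, shown in PySpark for brevity (the thread's example is Scala; displayed values depend on the session time zone):

    import datetime
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import lit

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.range(1)

    # timestamp -> long: the cast yields *seconds* since the epoch
    # (1485503350), even though java.sql.Timestamp and datetime values
    # are constructed from milliseconds.
    ts = datetime.datetime(2017, 1, 27, 7, 49, 10)
    df.withColumn("a", lit(ts).cast("long")).show()

    # long -> timestamp: 1485503350000 is read as seconds, landing in the
    # year 49043 rather than 2017; divide by 1000 first to get 2017.
    df.withColumn("a", lit(1485503350000).cast("timestamp")).show(truncate=False)
    df.withColumn("a", (lit(1485503350000) / 1000).cast("timestamp")).show()

So the two casts are consistent with each other; the apparent mismatch is between second-based casts and millisecond-based constructors.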

Possible bug: inconsistent timestamp behavior

2017-08-15 Thread assaf.mendelson
Hi all, I encountered weird behavior for timestamps. It seems that when using lit to add one to a column, the timestamp goes from a milliseconds representation to a seconds representation: scala> spark.range(1).withColumn("a", lit(new java.sql.Timestamp(1485503350000L)).cast("long")).show()