[ANNOUNCE] Announcing Apache Spark 2.2.0

2017-07-11 Thread Michael Armbrust
Hi all, Apache Spark 2.2.0 is the third release of the Spark 2.x line. This release removes the experimental tag from Structured Streaming. In addition, this release focuses on usability, stability, and polish, resolving over 1100 tickets. We'd like to thank our contributors and users for their

Re: Faster Spark on ORC with Apache ORC

2017-07-11 Thread Dong Joon Hyun
Hi, All. Since the Apache Spark 2.2 vote passed successfully last week, I think it’s a good time for me to ask for your opinions again about the following PR. https://github.com/apache/spark/pull/17980 (+3,887, −86) It addresses the following issues. * SPARK-20728: Make ORCFileFormat configurable

Slowness of Spark Thrift Server

2017-07-11 Thread Maciej Bryński
Hi, I have the following issue. I'm trying to use Spark as a proxy to Cassandra. The problem is the Thrift server overhead. I'm using the following query: select * from table where primary_key = 123 The job time (from the Jobs tab) is around 50 ms, which is similar to the query time shown on the SQL tab. Unfortunately query