Re: Faster Spark on ORC with Apache ORC

2017-05-14 Thread Dong Joon Hyun
Hi, All. As a continuation of SPARK-20682 (Support a new faster ORC data source based on Apache ORC), I would like to suggest making the default ORCFileFormat configurable between sql/hive and sql/core for the following: spark.read.orc(...), spark.write.orc(...), CREATE TABLE t

[PYTHON] PySpark typing hints

2017-05-14 Thread Maciej Szymkiewicz
Hi everyone, For the last few months I've been working on static type annotations for PySpark. For those of you who are not familiar with the idea, typing hints were introduced by PEP 484 (https://www.python.org/dev/peps/pep-0484/) and further extended with PEP 526
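
For readers who have not seen the two PEPs mentioned in this message, here is a minimal sketch of what the annotations look like in plain Python. It does not use PySpark and is not taken from the annotations discussed in the thread; the names `default_partitions`, `word_lengths`, and `first_or_none` are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional

# PEP 526: a variable annotation (hypothetical name, illustration only).
default_partitions: int = 4


# PEP 484: annotations on function parameters and the return value.
def word_lengths(words: List[str]) -> Dict[str, int]:
    """Map each word to its length."""
    return {word: len(word) for word in words}


def first_or_none(items: List[int]) -> Optional[int]:
    """Return the first element, or None when the list is empty."""
    return items[0] if items else None
```

The annotations do not change runtime behaviour. A static checker such as mypy reads them and can flag a call like `word_lengths(42)` before the code runs.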