Hi,

Thanks for this work.
Will this affect both of the following?

1) spark.read.format("orc").load("...")
2) spark.sql("select ... from my_orc_table_in_hive")

On 10 Jan 2018 at 20:14, Dongjoon Hyun wrote:
> Hi, All.
>
> Vectorized ORC Reader is now supported in Apache Spark 2.3.
>
> https://issues.apache.org/jira/browse/SPARK-16060
>
> It has been a long journey. From now, Spark can read ORC files faster
> without feature penalty.
>
> Thank you for all your support, especially Wenchen Fan.
>
> It's done by two commits.
>
> [SPARK-16060][SQL] Support Vectorized ORC Reader
> https://github.com/apache/spark/commit/f44ba910f58083458e1133502e193a9d6f2bf766
>
> [SPARK-16060][SQL][FOLLOW-UP] add a wrapper solution for vectorized orc reader
> https://github.com/apache/spark/commit/eaac60a1e20e29084b7151ffca964cfaa5ba99d1
>
> Please check OrcReadBenchmark for the final speed-up from `Hive built-in ORC`
> to `Native ORC Vectorized`.
>
> https://github.com/apache/spark/blob/master/sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcReadBenchmark.scala
>
> Thank you.
>
> Bests,
> Dongjoon.