What is "ParquetFileReader: reading summary file"?

2015-07-16 Thread shshann
Hi all, our scenario generates lots of folders containing Parquet files and then uses ADD PARTITION to add these folder locations to a Hive table; when reading the Hive table with Spark, the following logs show up and reading them takes a lot of time; but this won't happen after
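
For context, a minimal PySpark sketch of the workflow described above (paths, table, and column names are hypothetical, and this uses the Spark 1.3-era API):

    # Minimal sketch of the scenario (illustrative names, Spark 1.3-era API).
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="add-parquet-partitions")
    hiveCtx = HiveContext(sc)

    # Write one Parquet folder per partition value (hypothetical layout).
    df = hiveCtx.parquetFile("/staging/events")  # assumed source data
    df.filter(df.dt == "2015-07-15").saveAsParquetFile("/warehouse/events/dt=2015-07-15")

    # Register the folder as a partition of an existing Hive table. On the
    # next read, Spark's ParquetFileReader walks each partition's summary
    # metadata, which is where the slow "reading summary file" logs come from.
    hiveCtx.sql("ALTER TABLE events ADD IF NOT EXISTS PARTITION (dt='2015-07-15') "
                "LOCATION '/warehouse/events/dt=2015-07-15'")

One possible mitigation (an assumption on my part, not something stated in the thread) is to stop the writing jobs from emitting _metadata/_common_metadata summary files at all, by setting parquet.enable.summary-metadata to false in the writer's Hadoop configuration.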

SparkSQL OOM issue

2015-07-07 Thread shshann
Dear all, we've tried to use Spark SQL to insert from table A into table B. With the exact same SQL script, Hive is able to finish the job, but Spark 1.3.1 always ends with an OOM error; we tried several configurations, including: --executor-cores 2 --num-executors 300
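
The failing job presumably had roughly this shape (a hedged reconstruction; the table names and query are placeholders, and only the submit flags come from the thread):

    # Hypothetical reconstruction of the failing job; table names are placeholders.
    # Submitted roughly as (flags from the thread):
    #   spark-submit --executor-cores 2 --num-executors 300 insert_a_to_b.py
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="insert-a-to-b")
    hiveCtx = HiveContext(sc)

    # A pure SQL INSERT ... SELECT: Hive's execution path completes it, while
    # Spark 1.3.1 dies with an OutOfMemoryError regardless of executor tuning.
    hiveCtx.sql("INSERT OVERWRITE TABLE b SELECT * FROM a")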

Re: SparkSQL OOM issue

2015-07-07 Thread shshann
Hi Shawn, thanks a lot, that's actually the last parameter we overlooked! I'm able to run the same SQL on Spark now if I set spark.driver.memory larger. Thanks again! -- Best Regards, Felicia Shann 單師涵 +886-3-5636688 Ext. 7124300
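
For reference, spark.driver.memory has to be in place before the driver JVM launches, so it is normally passed at submit time rather than set inside the job; the 8g value below is illustrative, not from the thread:

    # Fix reported in this reply: raise spark.driver.memory. It must be set
    # before the driver JVM starts, e.g. (illustrative value):
    #   spark-submit --driver-memory 8g --executor-cores 2 --num-executors 300 job.py
    # or in spark-defaults.conf:
    #   spark.driver.memory  8g
    from pyspark import SparkContext

    sc = SparkContext(appName="insert-a-to-b")
    # Sanity-check that the setting actually took effect:
    print(sc.getConf().get("spark.driver.memory", "not set"))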

Caching parquet table (with GZIP) on Spark 1.3.1

2015-05-26 Thread shshann
We tried to cache a table via hiveCtx = HiveContext(sc); hiveCtx.cacheTable("table name") as described in the Spark 1.3.1 documentation; we're on CDH 5.3.0 with Spark 1.3.1 built against Hadoop 2.6. The following error message occurs if we try to cache a table stored as Parquet with GZIP compression, though we're not
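
Spelled out, the attempt looks like this (the table name is a placeholder; cacheTable is lazy, so the error would surface on the first action that scans the table):

    # The caching attempt from the thread, spelled out (placeholder table name).
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="cache-parquet-table")
    hiveCtx = HiveContext(sc)

    # cacheTable only marks the table for in-memory columnar caching; the
    # cache (and, per the thread, the GZIP-Parquet error) materializes on
    # the first action against the table.
    hiveCtx.cacheTable("gzip_parquet_table")
    hiveCtx.sql("SELECT COUNT(*) FROM gzip_parquet_table").collect()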