Hi all,
Our scenario is to generate many folders containing Parquet files and then use ADD PARTITION to add these folder locations to a Hive table. When we try to read the Hive table using Spark, the following logs show up and Spark spends a lot of time reading them; but this won't happen after
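For context, the partition-registration step described above can be sketched like this; the table name, partition column, and paths are illustrative, not taken from the thread:

```shell
# Hypothetical sketch: register generated Parquet folders as
# partitions of an existing Hive table, one LOCATION per folder.
hive -e "
  ALTER TABLE events ADD IF NOT EXISTS
    PARTITION (dt='2015-07-01') LOCATION '/data/events/2015-07-01'
    PARTITION (dt='2015-07-02') LOCATION '/data/events/2015-07-02';
"
```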
Dear all,
We've tried to use Spark SQL to insert from table A into table B using the exact same SQL script; Hive is able to finish it, but Spark 1.3.1 always ends with an OOM issue. We tried several configurations, including:
--executor-cores 2
--num-executors 300
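For reference, a spark-submit invocation carrying these flags might look like the following; the script name and memory values are assumptions for illustration only:

```shell
# Illustrative spark-submit invocation; only --executor-cores and
# --num-executors come from the thread, the rest are placeholders.
spark-submit \
  --master yarn \
  --executor-cores 2 \
  --num-executors 300 \
  --executor-memory 4g \
  --driver-memory 8g \
  insert_a_to_b.py
```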
Hi Shawn,
Thanks a lot, that's actually the last parameter we overlooked!
I'm able to run the same SQL on Spark now after setting spark.driver.memory larger.
Thanks again!
--
Best Regards,
Felicia Shann
單師涵
+886-3-5636688 Ext. 7124300
Xiaoyu,
We tried to cache a table through:
hiveCtx = HiveContext(sc)
hiveCtx.cacheTable("tableName")
as described in Spark 1.3.1's documentation. We're on CDH 5.3.0 with Spark 1.3.1 built against Hadoop 2.6. The following error message occurs if we try to cache a table stored as Parquet with GZIP compression,
though we're not