Is it possible to set the number of cores per executor on standalone
cluster?
We ask because we find that core distribution can be quite skewed across
executors at times; the workload then becomes skewed as well, which makes
our jobs slow.
Thanks!
--
郑旭东
Zheng, Xudong
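
One way to cap cores per executor on a standalone cluster is the
`spark.executor.cores` setting combined with `spark.cores.max`. This is a
hedged sketch: whether `spark.executor.cores` is honored in standalone mode
depends on the Spark release in use, and the master URL and jar name below
are placeholders.

```shell
# Sketch: limit each executor to 2 cores and the whole application to 8 cores
# on a standalone master. With both caps set, the master can launch several
# smaller executors per worker instead of one executor taking all of a
# worker's cores, which evens out core distribution.
spark-submit \
  --master spark://master:7077 \
  --conf spark.executor.cores=2 \
  --conf spark.cores.max=8 \
  my-app.jar
```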
ema, or the schema provided by user via data source
> DDL), we don't need to do schema merging on driver side, but defer it to
> executor side and each task only needs to reconcile those part-files it
> needs to touch. This is also what the Parquet developers did recently for
>
s a configuration
> to disable schema merging by default when doing Hive metastore Parquet
> table conversion.
>
> Another workaround is to fall back to the old Parquet code by setting
> spark.sql.parquet.useDataSourceApi to false.
>
> Cheng
>
>
> On 3/31/15 2:47 PM, Zh
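
The fallback Cheng mentions can also be flipped from inside spark-shell; a
minimal sketch, assuming a Spark 1.x `sqlContext` is already in scope (as it
is in spark-shell), where `setConf` takes string key/value pairs:

```scala
// Sketch: disable the data source API so Spark SQL uses the old Parquet
// code path, avoiding the schema-merging behavior discussed above.
sqlContext.setConf("spark.sql.parquet.useDataSourceApi", "false")

// Equivalently, the flag can be passed at launch time:
//   spark-shell --conf spark.sql.parquet.useDataSourceApi=false
```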
print the detailed
LocatedBlocks info.
Another finding: if I read the Parquet file via Scala code from spark-shell
as below, it looks fine, and the computation returns the result as quickly
as before.
sqlContext.parquetFile("data/myparquettable")
Any idea about it? Thank you!
--
郑旭东
Zheng, Xudong