Hi,
The best part about Spark is that it also shows you which configuration to
tweak. If you are using EMR, check that "spark.local.dir" points to the
right location on the cluster. If a disk is
mounted across all the systems with a common path (you can do that
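As a minimal sketch of the suggestion above (the mount path /mnt/spark-scratch is an assumed example, not something from the original message), spark.local.dir can be pointed at a large shared mount either in spark-defaults.conf or per job:

```properties
# spark-defaults.conf -- /mnt/spark-scratch is a hypothetical path that
# must exist on every node; Spark spills shuffle and scratch data here
spark.local.dir /mnt/spark-scratch
```

The same setting can be passed at submit time with
`spark-submit --conf spark.local.dir=/mnt/spark-scratch ...`.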
Hi All,
I have an external table in Spark whose underlying data files are in
Parquet format.
The table is partitioned. When I try to compute the statistics for a query
where the partition column is in the WHERE clause, the returned statistics
contain only sizeInBytes and not the row count.
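For what it's worth, Spark generally reports a row count only after statistics have been gathered explicitly. A hedged sketch, assuming a hypothetical table `events` partitioned by `dt` (names are placeholders, not from the original message):

```sql
-- Gather stats for one partition; rowCount should then appear
ANALYZE TABLE events PARTITION (dt='2018-08-20') COMPUTE STATISTICS;

-- Inspect what the catalog now holds for that partition
DESCRIBE EXTENDED events PARTITION (dt='2018-08-20');
```

Without the ANALYZE step, the optimizer typically falls back to file sizes alone, which matches the sizeInBytes-only behaviour described above.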
Hello everyone, here is a case that I am facing:
I have a PySpark application whose last step is to create a PySpark
dataframe with two columns
(column1, column2). This dataframe has only one row, and I want this row to
be inserted into a Postgres DB table. In every run this line in the datafra
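A common way to do that single-row insert is the JDBC writer in append mode. A minimal sketch, assuming a live SparkSession `spark`, the PostgreSQL JDBC driver on the classpath, and placeholder connection details (host, database, table, and credentials are all hypothetical):

```python
# Build the one-row, two-column dataframe described above
df = spark.createDataFrame([("a", 1)], ["column1", "column2"])

(df.write
   .format("jdbc")
   .option("url", "jdbc:postgresql://db-host:5432/mydb")  # assumed host/db
   .option("dbtable", "results")                          # assumed table name
   .option("user", "spark")                               # assumed credentials
   .option("password", "secret")
   .mode("append")   # append the single row on every run
   .save())
```

Note that `mode("append")` adds a new row on each run; if the intent is to overwrite or upsert instead, the mode or the target table design would need to change.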
Thanks for putting together a comprehensive observation about Spark on
Kubernetes. A Mesos Spark deployment has a property called
spark.mesos.extra.cores.
The property means:
*
Set the extra number of cores for an executor to advertise. This does not
result in more cores allocated. It instead means tha
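To make the property concrete, here is how it would typically be set (the value 2 is an arbitrary example):

```properties
# spark-defaults.conf -- advertise 2 extra cores per executor on Mesos.
# This does not allocate more CPU; it only changes the core count the
# executor reports, so more tasks may be scheduled onto it.
spark.mesos.extra.cores 2
```

It can equally be passed as `--conf spark.mesos.extra.cores=2` on spark-submit.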
The other time I encountered this, I solved it by throwing more
resources at it (a stronger cluster).
I was not able to understand the root cause, though. I'll be happy to hear
deeper insight as well.
On Mon, Aug 20, 2018 at 7:08 PM, Steve Lewis wrote:
>
> We are trying to run a job that has pr