Hi all
My team has the same issue. It looks like Spark 1.3's Spark SQL cannot read
Parquet files generated by Spark 1.1. This will cost us a lot of migration
work when we want to upgrade to Spark 1.3.
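For concreteness, a minimal sketch of what we do (on the 1.3 spark-shell,
where sqlContext is predefined; the path is hypothetical):

  // file was written by Spark 1.1 via saveAsParquetFile
  val df = sqlContext.parquetFile("/data/events_from_1.1.parquet")
  df.count()  // fails for us on files written by 1.1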
Can anyone help me?
Thanks
Wisely Chen
Hi,
I am working on artificial neural networks for Spark. The model is trained
with gradient descent, so at each step the data is read, the sum of gradients
is computed for each data partition (on each worker), aggregated (on the
driver), and broadcast back. I noticed that the gradient computation time is
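For reference, a rough sketch of one such step (not my actual code: a plain
least-squares gradient stands in for the ANN gradient, and all names here are
illustrative):

  import org.apache.spark.SparkContext
  import org.apache.spark.rdd.RDD

  // One gradient-descent step: broadcast the current weights, sum the
  // per-partition gradients with treeAggregate, update on the driver.
  def gradientStep(sc: SparkContext,
                   data: RDD[(Double, Array[Double])],  // (label, features)
                   weights: Array[Double],
                   stepSize: Double): Array[Double] = {
    val bcW = sc.broadcast(weights)
    val n = weights.length
    val (gradSum, count) = data.treeAggregate((new Array[Double](n), 0L))(
      { case ((acc, c), (y, x)) =>
          // least-squares gradient of one point: (w.x - y) * x
          val err = (0 until n).map(i => bcW.value(i) * x(i)).sum - y
          var i = 0
          while (i < n) { acc(i) += err * x(i); i += 1 }
          (acc, c + 1)
      },
      { case ((a, ca), (b, cb)) =>
          var i = 0
          while (i < n) { a(i) += b(i); i += 1 }
          (a, ca + cb)
      }
    )
    // driver-side update; the caller broadcasts the result for the next step
    Array.tabulate(n)(i => weights(i) - stepSize * gradSum(i) / count)
  }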
Hi all,
I'm running the TeraSort benchmark with a relatively small input set: 5GB.
During profiling I can see it using a total of 68GB. I have a terabyte of
memory in my system, and set
spark.executor.memory 900g
spark.driver.memory 900g
I use the default for
spark.shuffle.memoryFraction
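For reference, this is how it looks in conf/spark-defaults.conf (0.2 is my
understanding of the memoryFraction default in this release):

  # conf/spark-defaults.conf
  spark.executor.memory          900g
  spark.driver.memory            900g
  # spark.shuffle.memoryFraction left at its default (0.2), i.e. roughly
  # 0.2 * 900g = 180g of each heap is available for shuffle aggregation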
The checks against maxCategories are not for statistical purposes; they are
there to make sure communication does not blow up. There are currently no
checks to ensure that there are enough entries for statistically significant
results. That is up to the user.
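For context, assuming this is about VectorIndexer's maxCategories, a sketch of
the usage (column names made up; dataset is an existing DataFrame with a
vector column named "features"):

  import org.apache.spark.ml.feature.VectorIndexer

  // Columns with more than maxCategories distinct values are treated as
  // continuous, which bounds the size of the per-column category maps
  // that have to be collected and shipped around.
  val indexer = new VectorIndexer()
    .setInputCol("features")
    .setOutputCol("indexedFeatures")
    .setMaxCategories(10)
  val model = indexer.fit(dataset)
  val indexed = model.transform(dataset)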
I do like the idea of adding a