[ANNOUNCE] Apache Spark 2.1.3

2018-07-01 Thread Holden Karau
We are happy to announce the availability of Spark 2.1.3! Apache Spark 2.1.3 is a maintenance release, based on the branch-2.1 maintenance branch of Spark. We strongly recommend that all 2.1.x users upgrade to this stable release. The release notes are available at

Unable to acquire N bytes of memory, got 0

2018-07-01 Thread 吴晓菊
Is it normal to get an exception like: "Previous exception in task: Unable to acquire 65536 bytes of memory, got 0"? In my understanding, under the current memory management, insufficient memory will trigger a spill anyway, so this kind of exception should not be thrown. Unless some operators are not implemented

Re: Repartition not working on a csv file

2018-07-01 Thread Abdeali Kothari
I prefer not to do a .cache() due to memory limits. But I did try a persist() with DISK_ONLY. I did the repartition(), followed by a .count(), followed by a persist() of DISK_ONLY. That didn't change the number of tasks either. On Sun, Jul 1, 2018, 15:50 Alexander Czech wrote: > You could try to

Re: Repartition not working on a csv file

2018-07-01 Thread Alexander Czech
You could try to force a repartition right at that point by producing a cached version of the DF with .cache(), if memory allows it. On Sun, Jul 1, 2018 at 5:04 AM, Abdeali Kothari wrote: > I've tried that too - it doesn't work. It does a repartition, but not right > after the broadcast join - it