Re: JDK version support policy?

2023-06-13 Thread David Li
we're
>>>> going to see enough folks moving to JRE17 by the Spark 4 release unless we
>>>> have a strong benefit from dropping 11 support I'd be inclined to keep it.
>>>>
>>>> On Tue, Jun 6, 2023 at 9:08 PM Dongjoon Hyun wrote:
>>>>> I'm al

JDK version support policy?

2023-06-06 Thread David Li
Hello Spark developers, I'm from the Apache Arrow project. We've discussed Java version support [1], and crucially, whether to continue supporting Java 8 or not. As Spark is a big user of Arrow in Java, I was curious what Spark's policy here was. If Spark intends to stay on Java 8, for

[DISCUSS] Reducing memory usage of toPandas with Arrow "self_destruct" option

2020-09-10 Thread David Li
Hello all, We've been working with PySpark and Pandas, and have found that to convert a dataset using N bytes of memory to Pandas, we need to have 2N bytes free, even with the Arrow optimization enabled. The fundamental reason is ARROW-3789[1]: Arrow does not free the Arrow table until conversion