Re: cannot access class sun.nio.ch.DirectBuffer

2022-04-13 Thread Sean Owen
It is not officially supported, yes. Try Spark 3.3 from the branch if you want to try Java 17.
On Wed, Apr 13, 2022, 9:36 PM Arunachalam Sibisakkaravarthi <arunacha...@mcruncher.com> wrote:
> Thanks everyone for giving your feedback.
> Jvm option "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"

Re: cannot access class sun.nio.ch.DirectBuffer

2022-04-13 Thread Arunachalam Sibisakkaravarthi
Thanks everyone for giving your feedback. The JVM option "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED" resolved the issue "cannot access class sun.nio.ch.DirectBuffer". But Spark still throws another exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0
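For readers wondering where such a flag goes: the sketch below is not from the thread. It shows one plausible way to hand the quoted option to both the driver and the executors through the standard `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions` settings on a `spark-submit` command line. The helper name `spark_submit_command` and the jar name `my_app.jar` are hypothetical, and the poster's actual launch method is not stated.

```python
import shlex

# The flag quoted in the thread: it opens the sun.nio.ch package of
# java.base to code running on the class path (the unnamed module).
ADD_OPENS = "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED"


def spark_submit_command(app, extra_args=()):
    """Build a spark-submit argument list (hypothetical helper).

    The flag is passed to the driver JVM and to the executor JVMs,
    since both may touch the internal class.
    """
    return [
        "spark-submit",
        "--conf", f"spark.driver.extraJavaOptions={ADD_OPENS}",
        "--conf", f"spark.executor.extraJavaOptions={ADD_OPENS}",
        app,
        *extra_args,
    ]


if __name__ == "__main__":
    # "my_app.jar" is a placeholder, not a file mentioned in the thread.
    print(shlex.join(spark_submit_command("my_app.jar")))
```

The list form avoids shell-quoting mistakes if the command is later run with `subprocess.run`; `shlex.join` is only used to print a copy-pasteable line.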

Problems with DataFrameReader in Structured Streaming

2022-04-13 Thread Artemis User
We have a single file directory that's being used by both the file generator/publisher and the Spark job consumer. When using microbatch files in structured streaming, we encountered the following problems:
1. We would like to have a Spark streaming job consume only data files after a

Re: cannot access class sun.nio.ch.DirectBuffer

2022-04-13 Thread Sean Owen
Yes, I think that change has caused difficulties, but these internal APIs were always discouraged. Hey, one is even called 'unsafe'. There is an escape hatch: the JVM arg below.
On Wed, Apr 13, 2022, 9:09 AM Andrew Melo wrote:
> Gotcha. Seeing as there's a lot of large projects who

Re: Grabbing the current MemoryManager in a plugin

2022-04-13 Thread Andrew Melo
Hello,
Any wisdom on the question below?
Thanks
Andrew
On Fri, Apr 8, 2022 at 16:04 Andrew Melo wrote:
> Hello,
>
> I've implemented support for my DSv2 plugin to back its storage with
> ArrowColumnVectors, which necessarily means using off-heap memory. Is
> it possible to somehow grab either

Re: cannot access class sun.nio.ch.DirectBuffer

2022-04-13 Thread Andrew Melo
Gotcha. Seeing as there's a lot of large projects who used the unsafe API either directly or indirectly (via netty, etc.), it's a bit surprising that it was so thoroughly closed off without an escape hatch, but I'm sure there was a lively discussion around it...
Cheers
Andrew
On Wed, Apr 13,

Re: cannot access class sun.nio.ch.DirectBuffer

2022-04-13 Thread Sean Owen
It is intentionally closed by the JVM going forward, as direct access is discouraged. But it's still necessary for Spark. In some cases, like direct memory access, there is a new API, but it's in Java 17, I think, and we can't assume Java 17 any time soon.
On Wed, Apr 13, 2022 at 9:05 AM Andrew Melo

Re: cannot access class sun.nio.ch.DirectBuffer

2022-04-13 Thread Andrew Melo
Hi Sean,
Out of curiosity, will Java 11+ always require special flags to access the unsafe direct memory interfaces, or is this something that will either be addressed by the spec (by making an "approved" interface) or by libraries (with some other workaround)?
Thanks
Andrew
On Tue, Apr 12,

[Spark Streaming]: Why planInputPartitions is called multiple times for each micro-batch in Spark 3?

2022-04-13 Thread Hussain, Saghir
Hi All,
While upgrading our custom streaming data source from Spark 2.4.5 to Spark 3.2.1, we observed that the planInputPartitions() method in MicroBatchStream is called multiple times (4 in our case) for each micro-batch in Spark 3. The Apache Spark documentation also says that: The

Streaming partition-by data locality for state lookup on executor

2022-04-13 Thread Sandip Khanzode
Hello,
If I have a Kinesis stream split into multiple shards (say 10), can I have, say, 3 executors subscribed to those shards? I assume that automatic re-balancing will be enabled as we add/remove executors for scale up/down or simply failures … If so, can I specify a partition key? If I