java_method udf is not visible in the API documentation

2019-06-27 Thread kant kodali
Hi All, I see it here: https://spark.apache.org/docs/2.3.1/api/sql/index.html#java_method but I don't see it here: https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html

Re: Change parallelism number in Spark Streaming

2019-06-27 Thread Jungtaek Lim
Great, thanks! Even better if you could share the slides as well (and, if possible, the video too), since they would help other users understand the details. Thanks again, Jungtaek Lim (HeartSaVioR) On Thu, Jun 27, 2019 at 7:33 PM Jacek Laskowski wrote: > Hi, > > I've got a talk "The

Checkpointing and accessing the checkpoint data

2019-06-27 Thread Jean-Georges Perrin
Hi Sparkians, A few questions around checkpointing. 1. Checkpointing “dump” file / persisting to disk: Is the file encrypted or is it a standard Parquet file? 2. If the file is not encrypted, can I use it with another app? (I know it’s kind of a weird stretch case.) 3. Have you/do you know of

Re: Change parallelism number in Spark Streaming

2019-06-27 Thread Jacek Laskowski
Hi, I've got a talk "The internals of stateful stream processing in Spark Structured Streaming" at https://dataxday.fr/ today and am going to include the tool on the slides to thank you for the work. Thanks. Regards, Jacek Laskowski https://about.me/JacekLaskowski The Internals of Spark

Java Generic T makes ClassNotFoundException

2019-06-27 Thread big data
Hi, I use Spark to deserialize some files and restore them into objects of my own class. The Spark code and the class deserialization code (using Apache Commons Lang) look like this: val fis = spark.sparkContext.binaryFiles("/folder/abc*.file") val RDD = fis.map(x => { val content = x._2.toArray() val b =