java.lang.NullPointerException

2018-05-10 Thread Mina Aslani
Hi, I get java.lang.NullPointerException at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:128) when I try to createDataFrame using the SparkSession; see below: SparkConf conf = new SparkConf().setMaster().setAppName("test");

Re: Spark 2.3.0 Structured Streaming Kafka Timestamp

2018-05-10 Thread Yuta Morisawa
The problem is solved. The actual schema of the Kafka message is different from the documentation. https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html The documentation says the format of the "timestamp" column is Long type, but the actual format is timestamp. The
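The distinction above can be made concrete: a Kafka record carries its timestamp as epoch milliseconds (a Long on the wire), while Spark's Kafka source surfaces the "timestamp" column as TimestampType. A minimal plain-Python sketch of that conversion (the function name is illustrative, not a Spark API):

```python
from datetime import datetime, timezone

def kafka_epoch_millis_to_datetime(ms: int) -> datetime:
    """Convert a Kafka record timestamp (epoch milliseconds, i.e. a Long)
    into a timezone-aware datetime, mirroring how the Kafka source exposes
    the "timestamp" column as TimestampType rather than LongType."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# 2018-05-10 00:00:00 UTC expressed as epoch milliseconds
print(kafka_epoch_millis_to_datetime(1525910400000))
```

So code that reads the column expecting a raw Long must instead handle a timestamp value, or cast it explicitly.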

Re: Accumulator guarantees

2018-05-10 Thread Sergey Zhemzhitsky
As far as I understand, updates of custom accumulators on the driver side happen during task completion [1]. The documentation states [2] that the very last stage in a job consists of multiple ResultTasks, which execute the task and send its output back to the driver application. Also sources
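The hazard being discussed can be shown with a toy model (plain Python, not Spark's API): if an accumulator is bumped inside a transformation and the task is re-executed after a failure, the bump happens again, whereas updates made in actions are credited by the driver exactly once per successful task.

```python
# Toy model of accumulator semantics under task retry. `run_job` is a
# hypothetical helper, not a Spark function: each chunk plays the role of
# one task's partition, and a "failed" task is simply run twice.

def run_job(task_inputs, fail_first_attempt_of=None):
    acc = 0
    for i, chunk in enumerate(task_inputs):
        attempts = 2 if i == fail_first_attempt_of else 1
        for _ in range(attempts):
            acc += len(chunk)  # accumulator update inside the "transformation"
    return acc

data = [[1, 2], [3, 4, 5]]
print(run_job(data))                           # no retries: counts 5 records
print(run_job(data, fail_first_attempt_of=0))  # task 0 retried: over-counts to 7
```

This is why Spark only guarantees exactly-once accumulator updates for actions; transformation-side updates may be applied more than once.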

Accumulator guarantees

2018-05-10 Thread Sergey Zhemzhitsky
Hi there, although Spark's docs state that there is a guarantee that accumulators in actions will only be updated once, while accumulators in transformations may be updated multiple times, I'm wondering whether the same is true for transformations in the last stage of the job or there is a

Re: [Structured-Streaming][Beginner] Out of order messages with Spark kafka readstream from a specific partition

2018-05-10 Thread Cody Koeninger
As long as you aren't doing any Spark operations that involve a shuffle, the order you see in Spark should be the same as the order in the partition. Can you link to a minimal code example that reproduces the issue? On Wed, May 9, 2018 at 7:05 PM, karthikjay wrote: > On the
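Cody's point can be illustrated with a small plain-Python model (not Spark code): a narrow, map-like operation preserves per-partition record order, while a shuffle regroups records from multiple partitions and gives no ordering guarantee on the result.

```python
from collections import defaultdict

# One partition of (key, value) records: order within it is well defined.
partition = [("k", 1), ("k", 2), ("k", 3)]

# Narrow transformation: per-partition order survives.
mapped = [(k, v * 10) for k, v in partition]
assert [v for _, v in mapped] == [10, 20, 30]

# Shuffle-like regrouping across two partitions: blocks from different
# partitions can arrive in any order, so downstream code must not rely
# on the interleaving it happens to observe.
p1 = [("k", 1), ("k", 3)]
p2 = [("k", 2), ("k", 4)]
grouped = defaultdict(list)
for part in (p2, p1):  # arrival order of shuffle blocks is arbitrary
    for k, v in part:
        grouped[k].append(v)
print(grouped["k"])  # e.g. [2, 4, 1, 3] -- not the original per-partition order
```

If a consumer needs a global order after a shuffle, it has to impose one explicitly (e.g. sort by an offset or timestamp column).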

[Spark] Supporting python 3.5?

2018-05-10 Thread Irving Duran
Does Spark now support Python 3.5, or is it just 3.4.x? https://spark.apache.org/docs/latest/rdd-programming-guide.html Thank You, Irving Duran

Re: Spark 2.3.0 --files vs. addFile()

2018-05-10 Thread Lalwani, Jayesh
This is a long-standing bug in Spark: --jars and --files don't work in standalone mode. https://issues.apache.org/jira/browse/SPARK-4160 From: Marius Date: Wednesday, May 9, 2018 at 3:51 AM To: "user@spark.apache.org" Subject: Spark 2.3.0 --files vs.
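For context, a sketch of the submission flag the bug affects (config fragment only; the master URL, file path, and application name are placeholders, and per SPARK-4160 this may not distribute the file to executors under the standalone master):

```shell
# Attempt to ship a side file with the application via --files.
# In standalone mode this is the path affected by SPARK-4160; the
# workaround discussed on the list is to call SparkContext.addFile
# from application code and resolve the path with SparkFiles.get.
spark-submit \
  --master spark://master:7077 \
  --files /local/path/config.json \
  my_app.py
```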