Hi Sir,
Could you please advise how to fix the issue below with withColumn when joining Datasets in Spark 2.2 / Scala 2.11?
def processing(spark: SparkSession,
               dataset1: Dataset[Reference],
               dataset2: Dataset[DataCore],
               dataset3: Dataset[ThirdPartyData],
               dataset4: Dataset[OtherData],
               date: String): Dataset[DataMerge]
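In case it helps to frame the question, here is a minimal sketch of the kind of join-plus-withColumn pipeline described above; the case classes, the join key and the column names are placeholders I made up, not the actual schema:

import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.lit

// Hypothetical schemas and join key, only so the sketch compiles.
case class Reference(id: Long, refValue: String)
case class DataCore(id: Long, coreValue: String)
case class DataMerge(id: Long, refValue: String, coreValue: String, processingDate: String)

def processing(spark: SparkSession,
               dataset1: Dataset[Reference],
               dataset2: Dataset[DataCore],
               date: String): Dataset[DataMerge] = {
  import spark.implicits._
  dataset1
    .join(dataset2, Seq("id"))                // join on the shared key column
    .withColumn("processingDate", lit(date))  // add the run date as a literal column
    .as[DataMerge]                            // back to a typed Dataset
}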
Dear Felix, Richikesh, and list,
Thank you very much for your previous help. So far I have tried two ways to trigger Spark SQL: one is to use R with the sparklyr and SparkR libraries; the other is to use the SparkR shell that ships with Spark. I am not connecting to a remote Spark cluster, but to a local one
Also, does anyone have an idea how to resolve this issue?
https://stackoverflow.com/questions/56390492/spark-metadata-0-doesnt-exist-while-compacting-batch-9-structured-streaming-er
On Fri, Jun 7, 2019 at 5:59 PM Chetan Khatri wrote:
Hello Dear Spark Users,
I am trying to write data from a Kafka topic to Parquet on HDFS with Structured Streaming, but I am getting failures. Please help.
val spark: SparkSession =
  SparkSession.builder().appName("DemoSparkKafka").getOrCreate()
import spark.implicits._
val dataFromTopicDF = spark
  .readStream
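In case it helps, a self-contained sketch of a Kafka-to-Parquet Structured Streaming job might look like the following; the broker address, topic name, HDFS paths and trigger interval are placeholders I chose, not values from the mail above:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

// Requires the spark-sql-kafka-0-10 package on the classpath.
val spark: SparkSession =
  SparkSession.builder().appName("DemoSparkKafka").getOrCreate()

// Subscribe to the topic as a streaming DataFrame (broker and topic are placeholders).
val dataFromTopicDF = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "demo-topic")
  .option("startingOffsets", "earliest")
  .load()
  .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")

// Write to Parquet on HDFS; the file sink requires a checkpoint location.
val query = dataFromTopicDF.writeStream
  .format("parquet")
  .option("path", "hdfs:///tmp/demo/parquet")
  .option("checkpointLocation", "hdfs:///tmp/demo/checkpoint")
  .trigger(Trigger.ProcessingTime("30 seconds"))
  .start()

query.awaitTermination()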
This seems to be more a question about the spark-sql shell. May I suggest you change the email title to get more attention?
From: ya
Sent: Wednesday, June 5, 2019 11:48:17 PM
To: user@spark.apache.org
Subject: sparksql in sparkR?
Dear list,
I am trying to use sparksql
Hi,
Why does casting a string column to timestamp not give the same result as casting to long first and then to timestamp? I'm tempted to consider it a bug.
scala> spark.version
res4: String = 2.4.3
scala> Seq("1", "2").toDF("ts").select($"ts" cast "timestamp").show
+----+
|  ts|
+----+
|null|
|null|
+----+
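For comparison, a minimal sketch of the two cast paths (the session setup is only there to make it runnable on its own); in Spark 2.4 a numeric string cast directly to timestamp comes back as null, while casting to long first interprets the number as seconds since the epoch:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("CastDemo").master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq("1", "2").toDF("ts")

// Direct cast: "1" and "2" are not valid timestamp literals, so the result is null.
df.select($"ts".cast("timestamp")).show()

// Two-step cast: string -> long -> timestamp treats the long as seconds since the epoch.
df.select($"ts".cast("long").cast("timestamp")).show()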
Hey Guys,
I am wondering what the best way is to get driver logs in cluster mode on a standalone cluster. Normally I ran in client mode, so I could capture the logs from the console.
Now I've started running jobs in cluster mode and, obviously, the driver runs on a worker, so I can't see the
Hello,
How can we dump the Spark driver and executor thread information in the Spark application logs?
PS: submitting the Spark job using spark-submit.
Regards
Rohit
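Not an authoritative answer, but one way to get this would be a plain JVM thread dump logged from inside the application itself; the helper below is a hypothetical sketch (the Executors tab of the Spark UI also exposes per-executor thread dumps):

import scala.collection.JavaConverters._

// Build a plain-text thread dump of the current JVM (driver or executor).
def threadDump(): String =
  Thread.getAllStackTraces.asScala.map { case (thread, frames) =>
    s"Thread ${thread.getName} (${thread.getState})\n" +
      frames.map(frame => s"    at $frame").mkString("\n")
  }.mkString("\n\n")

// On the driver: write it through whatever logger the application uses.
println(threadDump())

// On the executors: run it inside a task so it executes in the executor JVMs, e.g.
// sc.parallelize(1 to 4, 4).foreachPartition(_ => println(threadDump()))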
Hi Bruno, that's really interesting...
So, to use explode, I would have to group by country, do a collect_list on the cities, and then explode the cities, right? Am I understanding the idea correctly?
I think this could produce the results I want. But what would be the
behaviour under the hood? Does
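If I'm reading the idea right, a minimal sketch of that groupBy / collect_list / explode approach would be the following (the country/city columns and sample rows are made up for illustration):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{collect_list, explode}

val spark = SparkSession.builder().appName("ExplodeDemo").master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq(
  ("France", "Paris"),
  ("France", "Lyon"),
  ("Spain", "Madrid")
).toDF("country", "city")

// Group the cities per country into an array, then flatten the array back into one row per city.
val grouped  = df.groupBy("country").agg(collect_list("city").as("cities"))
val exploded = grouped.select($"country", explode($"cities").as("city"))

exploded.show()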