In a PySpark Structured Streaming job, using SQL instead of the SQL
functions inside a foreachBatch sink throws the exception
AttributeError: 'JavaMember' object has no attribute 'format'.
However, the same thing works in the Scala API.
Please note that I tested on Spark 2.4.5/2.4.6 and 3.0.0 and got the
same exception.
Is it a
You do not need one Spark session per cluster.
Spark SQL with DataSource V1:
http://www.russellspitzer.com/2016/02/16/Multiple-Clusters-SparkSql-Cassandra/
DataSource V2:
This would require creating two catalog references and then copying
between them:
https://github.com/datastax/spark-cassandra-connector/bl
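The DataSource V2 route mentioned above could look roughly like the
sketch below. It assumes Spark 3.x with a 3.x Spark Cassandra Connector
on the classpath, and that a catalog is registered through
spark.sql.catalog.<name> with per-catalog connection settings, as I
understand the connector documentation. The catalog names, hosts,
keyspace and table names are hypothetical placeholders.

```python
# Catalog implementation class shipped with Spark Cassandra Connector 3.x.
CASSANDRA_CATALOG = "com.datastax.spark.connector.datasource.CassandraCatalog"


def two_cluster_conf(hosts):
    """Build Spark conf entries that register one Cassandra catalog per cluster.

    `hosts` maps a catalog name (hypothetical, e.g. "cluster1") to a
    contact point of that cluster.
    """
    conf = {}
    for name, host in hosts.items():
        key = f"spark.sql.catalog.{name}"
        conf[key] = CASSANDRA_CATALOG
        # Connection settings are scoped to the catalog by prefixing them.
        conf[f"{key}.spark.cassandra.connection.host"] = host
    return conf


def copy_sql(src, dst):
    """SQL that copies all rows from `src` to `dst` (catalog.keyspace.table)."""
    return f"INSERT INTO {dst} SELECT * FROM {src}"


def build_session(conf):
    """Create a single SparkSession that knows about both catalogs."""
    from pyspark.sql import SparkSession  # imported lazily; needs pyspark

    builder = SparkSession.builder.appName("copy-between-clusters")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def copy_table(spark, src, dst):
    """Run the copy inside the one session; both catalogs are visible to it."""
    spark.sql(copy_sql(src, dst))
```

Usage would be something like
copy_table(build_session(two_cluster_conf({"cluster1": "10.0.0.1",
"cluster2": "10.0.1.1"})), "cluster1.ks.table_a", "cluster2.ks.table_b"),
with the target table already created in the second cluster.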
Does anybody know how Spark collects the non-matching results after
performing a broadcast hash left outer join?
Suppose we have 4 nodes: 1 driver and 3 executors. We broadcast the left
table. After the left outer join is performed on each executor, how does
Spark recognize which records have not been matched,
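On the broadcast join question above, here is a rough pure-Python
illustration of the general technique, not Spark's actual code. It
assumes the right table is the broadcast (build) side and the left table
is streamed partition by partition, which differs from the setup
described in the question; as far as I know, that is the only side
Spark chooses to build for a left outer broadcast hash join. Under that
assumption each executor can decide locally whether a left row matched,
because it holds a full copy of the right table. If the preserved side
itself were broadcast, a per-partition probe alone could not tell
whether a row matched on another executor. All function and column
names are made up for the example.

```python
from collections import defaultdict


def build_hash_table(build_rows, key):
    """Index the broadcast (build) side by join key; copied to every executor."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)
    return table


def probe_left_outer(stream_partition, table, key, right_cols):
    """Join one partition of the streamed left side against the hash table.

    A left row with no match is emitted once, with None for the right
    columns, so no coordination between executors is needed.
    """
    out = []
    for left in stream_partition:
        matches = table.get(left[key])  # .get avoids creating empty entries
        if matches:
            for right in matches:
                out.append({**left, **{c: right[c] for c in right_cols}})
        else:
            out.append({**left, **{c: None for c in right_cols}})
    return out
```

Each executor would call probe_left_outer on its own partitions, and the
union of the outputs is the join result.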
Hi, I have table A in Cassandra cluster-1 in one data center and table B
in cluster-2 in another data center. I want to copy the data from one
cluster to the other using Spark. The problem I faced is that I cannot
create two Spark sessions, since we need one Spark session per cluster.
Plea
Is it possible to pass options when reading semi-structured files using
SQL syntax, as in the example below?
"SELECT * FROM csv.`file.csv`"
For example, if I want header=true, is that possible?
Thanks
--
Daniel Mantovani
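
For the CSV question above, one approach that I believe works in Spark
SQL is to declare a temporary view with USING csv OPTIONS (...) and
query the view, since the csv.`file.csv` shorthand has no obvious place
for options in the versions discussed here. A minimal sketch follows;
the helper names and the view name are hypothetical, and option values
are assumed to contain no single quotes.

```python
def csv_view_ddl(view_name, path, options):
    """Render a CREATE TEMPORARY VIEW statement for a CSV file with options.

    Option values are written as SQL string literals; this sketch does
    not escape quotes, so values must not contain a single quote.
    """
    opts = {"path": path, **options}
    rendered = ", ".join(f"{k} '{v}'" for k, v in opts.items())
    return f"CREATE TEMPORARY VIEW {view_name} USING csv OPTIONS ({rendered})"


def read_csv_with_sql(spark, view_name, path, options):
    """Register the view on an existing SparkSession, then query it with SQL."""
    spark.sql(csv_view_ddl(view_name, path, options))
    return spark.sql(f"SELECT * FROM {view_name}")
```

With a running session, read_csv_with_sql(spark, "people", "file.csv",
{"header": "true"}) would return a DataFrame whose column names come
from the header row.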