Messages by Thread
-
Re: Build SPARK from source with SBT failed
Sean Owen
-
Pandas UDFs vs Inbuilt pyspark functions
neha garde
-
[Spark Structured Streaming] Does Spark Structured Streaming currently support sinking to AWS Kinesis, and how to handle reaching Kinesis quotas?
hueiyuan su
-
Data duplication and loss occur after executing 'insert overwrite...' in Spark 3.1.1
周锋
-
How to pass variables across functions in spark structured streaming (PySpark)
Mich Talebzadeh
-
[ANNOUNCE] Apache Celeborn(incubating) 0.2.0 available
Ethan Feng
-
Fwd: [New Project] sparksql-ml : Distributed Machine Learning using SparkSQL.
Chitral Verma
-
Fwd: Auto-reply: Re: [DISCUSS] Show Python code examples first in Spark documentation
Mich Talebzadeh
-
[JDBC] [PySpark] Possible bug when comparing incoming data frame from mssql and empty delta table
lennart
-
Late arriving updates to fact tables
rajat kumar
-
Re: SPIP architecture diagrams
Mich Talebzadeh
-
Unable to handle bignumeric datatype in spark/pyspark
nidhi kher
-
[PySpark SQL] New column with the maximum of multiple terms?
Oliver Ruebenacker
-
Spark with bigquery : Data type issue
nidhi kher
-
SPIP: Adding work load identity to Spark on Kubernetes documents (supersedes Secret Management)
Mich Talebzadeh
-
SPIP: Shutting down spark structured streaming when the streaming process completed current process
Mich Talebzadeh
-
Vote SPIP
Faisal Waris
-
Update nested struct with null fields
Vikas Kumar
-
[Spark Structured Streaming] Does Spark Structured Streaming currently support sinking to AWS Kinesis?
hueiyuan su
-
How can I set a value of Location with CustomDataSource ?
Zhuolin Ji
-
Upgrading from Spark SQL 3.2 to 3.3 failed
lk_spark
-
[Spark Structured Streaming] Could we apply new options of readStream/writeStream without stopping spark application (zero downtime)?
hueiyuan su
-
ADLS Gen2 adfs sample yaml configuration
Kondala Ponnaboina (US)
-
How to explode array columns of a dataframe having the same length
sam smith
-
Executor tab missing information
Prem Sahoo
-
Running Spark on Kubernetes (GKE) - failing on spark-submit
karan alang
-
[Spark Core] Spark data loss/data duplication when executors die
Erik Eklund
-
How to improve efficiency of this piece of code (returning distinct column values)
sam smith
-
Re:
Sunil Prabhakara
-
Executor metrics are missing on prometheus sink
Qian Sun
-
Jira Account for Contributions
Jack Goodson
-
[Spark SQL]: Spark 3.2 generates different results to query when columns name have mixed casing vs when they have same casing
Amit Singh Rathore
-
Is sparkSession.sql now an action in Spark 3 and later?
Sayeh Roshan
-
Fwd: Graceful shutdown SPARK Structured Streaming
Mich Talebzadeh
-
[Spark SQL] : Delete is only supported on V2 tables.
Jeevan Chhajed
-
How to upgrade a Spark Structured Streaming application
Yoel Benharrous
-
big data products
LinuxGuy
-
Create table before inserting in SQL
Harut Martirosyan
-
Spark Thrift Server issue with external HDFS table
Kalhara Gurugamage
-
What is DataFilters and while joining why is the filter isnotnull[joinKey] applied twice
Nitin Siwach
-
[Spark/deeplyR] how come spark is caching tables read through jdbc connection from oracle, even when memory=false is chosen
Joris Billen
-
Help needed regarding error with 5 node Spark cluster (shuffle error)- Comcast
Jain, Sanchi
-
Fwd: Spark-submit doesn't load all app classes in the classpath
Soheil Pourbafrani
-
spark+kafka+dynamic resource allocation
Lingzhe Sun
-
Spark SQL question
Kohki Nishio
-
Question regarding Spark 3.X performance
Athanasios Kordelas
-
Duplicates in Collaborative Filtering Output
Kartik Ohri
-
Any advantages of using sql.adaptive.autoBroadcastJoinThreshold over sql.autoBroadcastJoinThreshold?
Soumyadeep Mukhopadhyay
-
Table created with saveAsTable behaves differently than a table created with spark.sql("CREATE TABLE....)
krexos
-
Writing protobuf RDD to parquet
David Diebold
-
[Spark Standalone Mode] How to read from kerberised HDFS in spark standalone mode
Bansal, Jaimita
-
How to check the liveness of a SparkSession
Yeachan Park
-
[PySpark] How to check if value of one column is in array of another column
Oliver Ruebenacker
-
Is there any Job/Career channel
Chetan Khatri
-
[Spark SQL] Data duplicate or data lost with non-deterministic function
李建伟
-
pyspark.sql.dataframe.DataFrame versus pyspark.pandas.frame.DataFrame
second_co...@yahoo.com.INVALID
-
[pyspark/pandas] Pandas UDF accepting more than 2 pandas dataframe when cogroup + applyInPandas?
pzm6...@hotmail.com
-
Re: Hive 3 has big performance improvement from my test
Mich Talebzadeh
-
[pyspark/sparksql]: How to overcome redundant/repetitive code? Is a for loop over an sql statement with a variable a bad idea?
Joris Billen
-
[PySpark] Error using SciPy: ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
Oliver Ruebenacker
-
Re: [PySpark] Error using SciPy: ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
Bjørn Jørgensen
-
Re: [PySpark] Error using SciPy: ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
Oliver Ruebenacker
-
Re: [PySpark] Error using SciPy: ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
Mich Talebzadeh
-
Re: [PySpark] Error using SciPy: ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
Bjørn Jørgensen
-
Re: [PySpark] Error using SciPy: ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
Oliver Ruebenacker
-
GPU Support
K B M Kaala Subhikshan
-
Spark reading from HBase using hbase-connectors - any benefit from localization?
Aaron Grubb
-
Got Error Creating permanent view in Postgresql through Pyspark code
Vajiha Begum S A
-
[BUG?] How to handle special characters or escape them on Spark version 3.3.0?
Vieira, Thiago
-
How to set a config for a single query?
Felipe Pessoto
-
[SparkR] Compare datetime with Sys.time() throws error in R (>= 4.2.0)
Vivek Atal
-
Incorrect csv parsing when delimiter used within the data
Saurabh Gulati
-
Spark migration from 2.3 to 3.0.1
Shrikant Prasad