Spark 3.3.0 with Structure Streaming from Kafka Issue on commons-pools2

2022-08-26 Thread Raymond Tang
Hi all, I encountered an issue when reading from Kafka as a stream and then sinking into HDFS (using the Delta Lake format). java.lang.NoSuchMethodError: org.apache.spark.sql.kafka010.consumer.InternalKafkaConsumerPool$PoolConfig.setMinEvictableIdleTime(Ljava/time/Duration;)V I looked into the details

Spark SQL Predict Pushdown for Hive Bucketed Table

2022-08-26 Thread Raymond Tang
Hi all, Does anyone know why Spark SQL is not using Hive bucket pruning when reading from a bucketed Hive table? [SPARK-40206] Spark SQL Predict Pushdown for Hive Bucketed Table - ASF JIRA (apache.org) Details are also provided at the end of this mail.

Structured Streaming - data not being read (offsets not getting committed ?)

2022-08-26 Thread karan alang
Hello all, I have a long-running Apache Spark Structured Streaming job running in GCP Dataproc, which reads data from Kafka every 10 minutes and does some processing. The Kafka topic has 3 partitions and a retention period of 3 days. The issue I'm facing is that after a few hours, the program stops

Re: Profiling PySpark Pandas UDF

2022-08-26 Thread Abdeali Kothari
Hi Luca, I see you pushed some code to the PR 3 hrs ago. That's awesome. If I can help out in any way, do let me know. I think that's an amazing feature, and it would be great if it can get into Spark. On Fri, 26 Aug 2022, 12:41 Luca Canali, wrote: > @Abdeali as for “lightweight profiling”, there is

Reply: Re: Spark got incorrect scala version while using spark 3.2.1 and spark 3.2.2

2022-08-26 Thread ckgppl_yan
Oh, I got it. I thought Spark could get the local Scala version. - Original Message - From: Sean Owen To: ckgppl_...@sina.cn Cc: user Subject: Re: Spark got incorrect scala version while using spark 3.2.1 and spark 3.2.2 Date: 2022-08-26 21:08 Spark is built with and ships with a copy of Scala. It doesn't use

Re: Spark got incorrect scala version while using spark 3.2.1 and spark 3.2.2

2022-08-26 Thread pengyh
Good answer. Nice to know, too. Sean Owen wrote: Spark is built with and ships with a copy of Scala. It doesn't use your local version. - To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Re: Spark got incorrect scala version while using spark 3.2.1 and spark 3.2.2

2022-08-26 Thread Sean Owen
Spark is built with and ships with a copy of Scala. It doesn't use your local version. On Fri, Aug 26, 2022 at 2:55 AM wrote: > Hi all, > > I found a strange thing. I have run SPARK 3.2.1 prebuilt in local mode. My > OS scala version is 2.13.7. > But when I run spark-submit then check the

Spark got incorrect scala version while using spark 3.2.1 and spark 3.2.2

2022-08-26 Thread ckgppl_yan
Hi all, I found a strange thing. I have run the prebuilt Spark 3.2.1 in local mode. My OS Scala version is 2.13.7. But when I run spark-submit and then check the Spark UI, the web page shows that my Scala version is 2.13.5. I used spark-shell, and it also showed that my Scala version is 2.13.5. Then I tried

RE: Profiling PySpark Pandas UDF

2022-08-26 Thread Luca Canali
@Abdeali as for “lightweight profiling”, there is some work in progress on instrumenting Python UDFs with Spark metrics; see https://issues.apache.org/jira/browse/SPARK-34265 However, it is a bit stuck at the moment and needs to be revived, I believe. Best, Luca From: Abdeali