Request for Assistance: Adding User Authentication to Apache Spark Application

2024-05-16 Thread NIKHIL RAJ SHRIVASTAVA
Dear Team, I hope this email finds you well. My name is Nikhil Raj, and I am currently working with Apache Spark on one of my projects, where we are creating an external table in Spark from a Parquet file. I am reaching out to seek assistance regarding user authentication

[ANNOUNCE] Apache Spark 3.4.3 released

2024-04-18 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.4.3! Spark 3.4.3 is a maintenance release containing many fixes including security and correctness domains. This release is based on the branch-3.4 maintenance branch of Spark. We strongly recommend all 3.4 users to upgrade

Apache Spark integration with Spring Boot 3.0.0+

2024-03-28 Thread Szymon Kasperkiewicz
Hello, I've got a project which has to use the newest versions of both Apache Spark and Spring Boot due to vulnerability issues. I build my project using Gradle. When I try to run it I get an unsatisfied dependency exception about javax/servlet/Servlet. I've tried to add jakarta servlet

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-23 Thread Winston Lai
+1 -- Thank You & Best Regards Winston Lai From: Jay Han Date: Sunday, 24 March 2024 at 08:39 To: Kiran Kumar Dusi Cc: Farshid Ashouri , Matei Zaharia , Mich Talebzadeh , Spark dev list , user @spark Subject: Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Communit

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-23 Thread Jay Han
> Some of you may be aware that Databricks community Home | Databricks >>> have just launched a knowledge sharing hub. I thought it would be a >>> good idea for the Apache Spark user group to have the same, especially >>> for repeat questions on Spark core, Spark SQL, Spa

Re: Feature article: Leveraging Generative AI with Apache Spark: Transforming Data Engineering

2024-03-22 Thread Mich Talebzadeh
Sorry from this link Leveraging Generative AI with Apache Spark: Transforming Data Engineering | LinkedIn <https://www.linkedin.com/pulse/leveraging-generative-ai-apache-spark-transforming-mich-lxbte/?trackingId=aqZMBOg4O1KYRB4Una7NEg%3D%3D> Mich Talebzadeh, Technologist | Data | Generat

Feature article: Leveraging Generative AI with Apache Spark: Transforming Data Engineering

2024-03-22 Thread Mich Talebzadeh
You can find the said article at this LinkedIn link of mine. We can use LinkedIn for now. Leveraging Generative AI with Apache Spark: Transforming Data Engineering | LinkedIn Mich Talebzadeh, Technologist | Data | Generative AI | Financial Fraud London United Kingdom view my Linkedin

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-20 Thread Kiran Kumar Dusi
>> good idea for the Apache Spark user group to have the same, especially >> for repeat questions on Spark core, Spark SQL, Spark Structured >> Streaming, Spark Mlib and so forth. >> >> Apache Spark user and dev groups have been around for a good while. >> Th

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-20 Thread Farshid Ashouri
+1 On Mon, 18 Mar 2024, 11:00 Mich Talebzadeh, wrote: > Some of you may be aware that Databricks community Home | Databricks > have just launched a knowledge sharing hub. I thought it would be a > good idea for the Apache Spark user group to have the same, especially > for repe

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-19 Thread Mich Talebzadeh
d idea. Will be useful >>> >>> >>> >>> +1 >>> >>> >>> >>> >>> >>> >>> >>> *From: *ashok34...@yahoo.com.INVALID >>> *Date: *Monday, March 18, 2024 at 6:36 AM >>> *To: *user @s

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-19 Thread Joris Billen
com.INVALID Date: Monday, March 18, 2024 at 6:36 AM To: user @spark <user@spark.apache.org>, Spark dev list <d...@spark.apache.org>, Mich Talebzadeh <mich.talebza...@gmail.com> Cc: Matei Zaharia <matei.zaha...@gmail.com> Subject: Re: A pro

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-19 Thread Varun Shah
+1 Great initiative. QQ: Stack Overflow has a similar feature called "Collectives", but I am not sure of the cost of creating one for Apache Spark. With SO being used (at least before ChatGPT became quite the norm for searching questions), it already has a lot of questions asked an

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Deepak Sharma
>> >> >> >> >> >> >> *From: *ashok34...@yahoo.com.INVALID >> *Date: *Monday, March 18, 2024 at 6:36 AM >> *To: *user @spark , Spark dev list < >> d...@spark.apache.org>, Mich Talebzadeh >> *Cc: *Matei Zaharia >> *Subject: *R

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Hyukjin Kwon
D >> *Date: *Monday, March 18, 2024 at 6:36 AM >> *To: *user @spark , Spark dev list < >> d...@spark.apache.org>, Mich Talebzadeh >> *Cc: *Matei Zaharia >> *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for >> Apache Spark Community >

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Mich Talebzadeh
>> >>> On Mon, 18 Mar 2024 at 17:26, Parsian, Mahmoud < >>> mpars...@illumina.com.invalid> wrote: >>> >>>> Good idea. Will be useful >>>> >>>> >>>> >>>> +1 >>>> >>

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Reynold Xin
4...@yahoo.com.INVALID ) < >>> ashok34668@ yahoo. com. INVALID ( ashok34...@yahoo.com.INVALID ) > >>> *Date:* Monday, March 18 , 2024 at 6:36 AM >>> *To:* user @spark < user@ spark. apache. org ( user@spark.ap

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Mich Talebzadeh
> *To: *user @spark , Spark dev list < >> d...@spark.apache.org>, Mich Talebzadeh >> *Cc: *Matei Zaharia >> *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for >> Apache Spark Community >> >> External message, be mindful when clicking links or att

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Bjørn Jørgensen
y, March 18, 2024 at 6:36 AM > *To: *user @spark , Spark dev list < > d...@spark.apache.org>, Mich Talebzadeh > *Cc: *Matei Zaharia > *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for Apache > Spark Community > > External message, be mindful when

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Code Tutelage
ark dev list < > d...@spark.apache.org>, Mich Talebzadeh > *Cc: *Matei Zaharia > *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for Apache > Spark Community > > External message, be mindful when clicking links or attachments > > > > Good idea. Will be us

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Mich Talebzadeh
v list < > d...@spark.apache.org>, Mich Talebzadeh > *Cc: *Matei Zaharia > *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for Apache > Spark Community > > External message, be mindful when clicking links or attachments > > > > Good idea. Will be useful >

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Parsian, Mahmoud
Good idea. Will be useful +1 From: ashok34...@yahoo.com.INVALID Date: Monday, March 18, 2024 at 6:36 AM To: user @spark , Spark dev list , Mich Talebzadeh Cc: Matei Zaharia Subject: Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community External message, be mindful

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread ashok34...@yahoo.com.INVALID
Good idea. Will be useful +1 On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh wrote: Some of you may be aware that Databricks community Home | Databricks have just launched a knowledge sharing hub. I thought it would be a good idea for the Apache Spark user group to have

A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Mich Talebzadeh
Some of you may be aware that Databricks community Home | Databricks have just launched a knowledge sharing hub. I thought it would be a good idea for the Apache Spark user group to have the same, especially for repeat questions on Spark core, Spark SQL, Spark Structured Streaming, Spark MLlib

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
Okay, let me double-check it carefully. Thank you very much for your help! From: Jungtaek Lim Sent: 2024-03-05 21:56:41 To: Pan,Bingkun Cc: Dongjoon Hyun; dev; user Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released Yeah the approach seems OK to me - please double

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
024-03-05 17:09:07 > *To:* Pan,Bingkun > *Cc:* Dongjoon Hyun; dev; user > *Subject:* Re: [ANNOUNCE] Apache Spark 3.5.1 released > > Let me be more specific. > > We have two active release version lines, 3.4.x and 3.5.x. We just > released Spark 3.5.1, having a dropdown as 3.5.1 and

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
:07 To: Pan,Bingkun Cc: Dongjoon Hyun; dev; user Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released Let me be more specific. We have two active release version lines, 3.4.x and 3.5.x. We just released Spark 3.5.1, having a dropdown as 3.5.1 and 3.4.2 given the fact the last version of 3.4.x is 3.4.2

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
possible. > > Only by sharing the same version.json file in each version. > -- > *From:* Jungtaek Lim > *Sent:* 2024-03-05 16:47:30 > *To:* Pan,Bingkun > *Cc:* Dongjoon Hyun; dev; user > *Subject:* Re: [ANNOUNCE] Apache Spark 3.5.1 released

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
; dev; user Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released But this does not answer my question about updating the dropdown for the doc of "already released versions", right? Let's say we just released version D, and the dropdown has version A, B, C. We have another release tomorrow as

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
user has entered the pyspark document, if he finds that the > version he is currently in is not the version he wants, he can easily jump > to the version he wants by clicking on the drop-down box. Additionally, in > this PR, the current automatic mechanism for PRs did not merge in. > >

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
, in this PR, the current automatic mechanism for PRs did not merge in. https://github.com/apache/spark/pull/42881 So, we need to manually update this file. I can manually submit an update first to get this feature working. From: Jungtaek Lim Sent: 2024-03-04 6:34:42

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-04 Thread yangjie01
That sounds like a great suggestion. From: Jungtaek Lim Date: Tuesday, 5 March 2024 10:46 To: Hyukjin Kwon Cc: yangjie01 , Dongjoon Hyun , dev , user Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released Yes, it's relevant to that PR. I wonder, if we want to expose version switcher, it should

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-04 Thread Jungtaek Lim
Yes, it's relevant to that PR. I wonder, if we want to expose version switcher, it should be in versionless doc (spark-website) rather than the doc being pinned to a specific version. On Tue, Mar 5, 2024 at 11:18 AM Hyukjin Kwon wrote: > Is this related to https://github.com/apache/spark/p

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-04 Thread Hyukjin Kwon
Is this related to https://github.com/apache/spark/pull/42428? cc @Yang,Jie(INF) On Mon, 4 Mar 2024 at 22:21, Jungtaek Lim wrote: > Shall we revisit this functionality? The API doc is built with individual > versions, and for each individual version we depend on other released >

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-03 Thread Jungtaek Lim
tion how to bump the version manually, nor having the automatic bump. > > * PR for addition of dropdown: https://github.com/apache/spark/pull/42428 > * PR for automatically bumping version: > https://github.com/apache/spark/pull/42881 > > We will probably need to add an instruction in th

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Peter Toth
Congratulations and thanks Jungtaek for driving this! Xinrong Meng wrote (on Fri, 1 Mar 2024, 5:24): > Congratulations! > > Thanks, > Xinrong > > On Thu, Feb 29, 2024 at 11:16 AM Dongjoon Hyun > wrote: > >> Congratulations! >> >> Bests, >> Dongjoon. >> >> On Wed, Feb 28, 2024 at

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Jungtaek Lim
uction how to bump the version manually, nor having the automatic bump. * PR for addition of dropdown: https://github.com/apache/spark/pull/42428 * PR for automatically bumping version: https://github.com/apache/spark/pull/42881 We will probably need to add an instruction in the release process

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Dongjoon Hyun
BTW, Jungtaek. PySpark document seems to show a wrong branch. At this time, `master`. https://spark.apache.org/docs/3.5.1/api/python/index.html PySpark Overview Date: Feb 24, 2024 Version: master

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread John Zhuge
Excellent work, congratulations! On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun wrote: > Congratulations! > > Bests, > Dongjoon. > > On Wed, Feb 28, 2024 at 11:43 AM beliefer wrote: > >> Congratulations! >> >> >> >> At 2024-02-28 17:43:25, "Jungtaek Lim" >> wrote: >> >> Hi everyone, >> >> We

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Prem Sahoo
Congratulations Sent from my iPhoneOn Feb 29, 2024, at 4:54 PM, Xinrong Meng wrote:Congratulations!Thanks,XinrongOn Thu, Feb 29, 2024 at 11:16 AM Dongjoon Hyun wrote:Congratulations!Bests,Dongjoon.On Wed, Feb 28, 2024 at 11:43 AM beliefer

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Xinrong Meng
Congratulations! Thanks, Xinrong On Thu, Feb 29, 2024 at 11:16 AM Dongjoon Hyun wrote: > Congratulations! > > Bests, > Dongjoon. > > On Wed, Feb 28, 2024 at 11:43 AM beliefer wrote: > >> Congratulations! >> >> >> >> At 2024-02-28 17:43:25, "Jungtaek Lim" >> wrote: >> >> Hi everyone, >> >> We

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-28 Thread Dongjoon Hyun
Congratulations! Bests, Dongjoon. On Wed, Feb 28, 2024 at 11:43 AM beliefer wrote: > Congratulations! > > > > At 2024-02-28 17:43:25, "Jungtaek Lim" > wrote: > > Hi everyone, > > We are happy to announce the availability of Spark 3.5.1! > > Spark 3.5.1 is a maintenance release containing

Re:[ANNOUNCE] Apache Spark 3.5.1 released

2024-02-28 Thread beliefer
Congratulations! At 2024-02-28 17:43:25, "Jungtaek Lim" wrote: Hi everyone, We are happy to announce the availability of Spark 3.5.1! Spark 3.5.1 is a maintenance release containing stability fixes. This release is based on the branch-3.5 maintenance branch of Spark. We strongly

[ANNOUNCE] Apache Spark 3.5.1 released

2024-02-28 Thread Jungtaek Lim
Hi everyone, We are happy to announce the availability of Spark 3.5.1! Spark 3.5.1 is a maintenance release containing stability fixes. This release is based on the branch-3.5 maintenance branch of Spark. We strongly recommend all 3.5 users to upgrade to this stable release. To download Spark

[apache-spark] documentation on File Metadata _metadata struct

2024-01-10 Thread Jason Horner
All, the only documentation about the File Metadata (hidden _metadata struct) I can seem to find is on the Databricks website https://docs.databricks.com/en/ingestion/file-metadata-column.html#file-metadata-column For reference, here is the struct: _metadata: struct (nullable = false) |-- file_path:

[ANNOUNCE] Apache Spark 3.3.4 released

2023-12-16 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.3.4! Spark 3.3.4 is the last maintenance release based on the branch-3.3 maintenance branch of Spark. It contains many fixes including security and correctness domains. We strongly recommend all 3.3 users to upgrade to this or higher

Re: SSH Tunneling issue with Apache Spark

2023-12-06 Thread Venkatesan Muniappan
hout Spark complicating the picture for you. >> >> >> On Dec 5, 2023, at 8:12 PM, Venkatesan Muniappan < >> venkatesa...@noonacademy.com> wrote: >> >> Hi Team, >> >> I am facing an issue with SSH Tunneling in Apache Spark. The behavior is >> same as th

Re: SSH Tunneling issue with Apache Spark

2023-12-06 Thread Nicholas Chammas
connection to work without Spark complicating the picture for you. >> >> >>> On Dec 5, 2023, at 8:12 PM, Venkatesan Muniappan >>> <venkatesa...@noonacademy.com> wrote: >>> >>> Hi Team, >>> >>> I am facing an issue with S

Re: SSH Tunneling issue with Apache Spark

2023-12-06 Thread Venkatesan Muniappan
PM, Venkatesan Muniappan < > venkatesa...@noonacademy.com> wrote: > > Hi Team, > > I am facing an issue with SSH Tunneling in Apache Spark. The behavior is > same as the one in this Stackoverflow question > <https://stackoverflow.com/questions/68278369/how-to-u

SSH Tunneling issue with Apache Spark

2023-12-05 Thread Venkatesan Muniappan
Hi Team, I am facing an issue with SSH Tunneling in Apache Spark. The behavior is the same as the one in this Stack Overflow question <https://stackoverflow.com/questions/68278369/how-to-use-pyspark-to-read-a-mysql-database-using-a-ssh-tunnel> but there are no answers there. This is what I am

Re:[ANNOUNCE] Apache Spark 3.4.2 released

2023-11-30 Thread beliefer
Congratulations! At 2023-12-01 01:23:55, "Dongjoon Hyun" wrote: We are happy to announce the availability of Apache Spark 3.4.2! Spark 3.4.2 is a maintenance release containing many fixes including security and correctness domains. This release is based on the branch-3.4 m

[ANNOUNCE] Apache Spark 3.4.2 released

2023-11-30 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.4.2! Spark 3.4.2 is a maintenance release containing many fixes including security and correctness domains. This release is based on the branch-3.4 maintenance branch of Spark. We strongly recommend all 3.4 users to upgrade

Re: [ SPARK SQL ]: UPPER in WHERE condition is not working in Apache Spark 3.5.0 for Mysql ENUM Column

2023-11-07 Thread Suyash Ajmera
Any update on this? On Fri, 13 Oct, 2023, 12:56 pm Suyash Ajmera, wrote: > This issue is related to CharVarcharCodegenUtils readSidePadding method . > > Appending white spaces while reading ENUM data from mysql > > Causing issue in querying , writing the same data to Cassandra. > > On Thu, 12

Re: [ SPARK SQL ]: UPPER in WHERE condition is not working in Apache Spark 3.5.0 for Mysql ENUM Column

2023-10-13 Thread Suyash Ajmera
This issue is related to the CharVarcharCodegenUtils readSidePadding method. It appends white spaces while reading ENUM data from MySQL, causing issues in querying and in writing the same data to Cassandra. On Thu, 12 Oct, 2023, 7:46 pm Suyash Ajmera, wrote: > I have upgraded my spark job from spark
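
The effect described in this message can be reproduced in plain Python. This is a rough sketch, not Spark's actual code path: it assumes the ENUM value comes back right-padded with spaces to a fixed width. The helper name `read_side_pad` and the width of 12 are hypothetical, chosen only for illustration.

```python
def read_side_pad(value: str, width: int) -> str:
    """Hypothetical stand-in for read-side padding: right-pad to a fixed width."""
    return value.ljust(width)


raw = "Ericsson"                   # value as stored in the database (assumed)
padded = read_side_pad(raw, 12)    # value as it would arrive after padding

# Upper-casing does not remove the trailing spaces, so equality fails.
matches_padded = padded.upper() == "ERICSSON"

# Trimming the trailing spaces before comparing restores the match.
matches_trimmed = padded.rstrip().upper() == "ERICSSON"

print(repr(padded), matches_padded, matches_trimmed)
```

If padding is indeed the cause, trimming the column before the comparison (for example with RTRIM in the SQL query) would be one thing to try; whether that fits the job discussed in this thread is not confirmed here.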

[ SPARK SQL ]: UPPER in WHERE condition is not working in Apache Spark 3.5.0 for Mysql ENUM Column

2023-10-12 Thread Suyash Ajmera
I have upgraded my Spark job from Spark 3.3.1 to Spark 3.5.0. I am querying a MySQL database and applying `UPPER(col) = UPPER(value)` in the subsequent SQL query. It works as expected in Spark 3.3.1, but not in 3.5.0. Where Condition :: `UPPER(vn) = 'ERICSSON' AND (upper(st)

APACHE Spark adoption/growth chart

2023-09-12 Thread Andrew Petersen
Hello Spark community, Can anyone direct me to a simple graph/chart that shows Apache Spark adoption, preferably one that includes recent years? Of less importance, a similar Databricks plot? An internet search gave me plots only up to 2015. I also searched spark.apache.org and databricks.com

Re: Seeking Professional Advice on Career and Personal Growth in the Apache Spark Community

2023-09-07 Thread Mich Talebzadeh
> > *Disclaimer:* Use it at your own risk. Any and all responsibility for any > loss, damage or destruction of data or any other property which may arise > from relying on this email's technical content is explicitly disclaimed. > The author will in no case be liable for any mon

Re: Seeking Professional Advice on Career and Personal Growth in the Apache Spark Community

2023-09-06 Thread ashok34...@yahoo.com.INVALID
may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction. On Tue, 5 Sept 2023 at 22:17, Varun Shah wrote: Dear Apache Spark Community, I hope this email fin

Re: Seeking Professional Advice on Career and Personal Growth in the Apache Spark Community

2023-09-06 Thread Mich Talebzadeh
ich may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction. On Tue, 5 Sept 2023 at 22:17, Varun Shah wrote: > Dear Apache Spark Community, > > I hope this

Seeking Professional Advice on Career and Personal Growth in the Apache Spark Community

2023-09-05 Thread Varun Shah
Dear Apache Spark Community, I hope this email finds you well. I am writing to seek your valuable insights and advice on some challenges I've been facing in my career and personal development journey, particularly in the context of Apache Spark and the broader big data ecosystem. A little

[ANNOUNCE] Apache Spark 3.3.3 released

2023-08-22 Thread Yuming Wang
We are happy to announce the availability of Apache Spark 3.3.3! Spark 3.3.3 is a maintenance release containing stability fixes. This release is based on the branch-3.3 maintenance branch of Spark. We strongly recommend all 3.3 users to upgrade to this stable release. To download Spark 3.3.3

Re: dockerhub does not contain apache/spark-py 3.4.1

2023-08-10 Thread Mich Talebzadeh
01:01Z Revision 6b1ff22dde1ead51cbf370be6e48a802daae58b6 Url https://github.com/apache/spark Built on java 11 185@b031a15c6730:/opt/spark/work-dir$ java --version openjdk 11.0.11 2021-04-20 OpenJDK Runtime Environment 18.9 (build 11.0.11+9) OpenJDK 64-Bit Server VM 18.9 (build 11.0.11+9, mixe

Re: dockerhub does not contain apache/spark-py 3.4.1

2023-08-09 Thread Mich Talebzadeh
23 at 16:43, Mark Elliot wrote: > Hello, > > I noticed that the apache/spark-py image for Spark's 3.4.1 release is not > available (apache/spark@3.4.1 is available). Would it be possible to get > the 3.4.1 release build for the apache/spark-py image published? > > Thanks, > >

dockerhub does not contain apache/spark-py 3.4.1

2023-08-09 Thread Mark Elliot
Hello, I noticed that the apache/spark-py image for Spark's 3.4.1 release is not available (apache/spark@3.4.1 is available). Would it be possible to get the 3.4.1 release build for the apache/spark-py image published? Thanks, Mark -- This communication, together with any

Re: The performance difference when running Apache Spark on K8s and traditional server

2023-07-27 Thread Mich Talebzadeh
Spark on tin boxes like Google Dataproc or AWS EC2 often utilise YARN resource manager. YARN is the most widely used resource manager not just for Spark but for other artefacts as well. On-premise YARN is used extensively. In Cloud it is also used widely in Infrastructure as a Service such as

The performance difference when running Apache Spark on K8s and traditional server

2023-07-27 Thread Trường Trần Phan An
Hi all, I am studying the performance difference of Spark when performing a JOIN on Serverless (K8S) and Serverful (Traditional server) environments. In my experiments, Spark on K8s tends to run slower than on Serverful. From studying the architecture, I know that Spark runs

Re: Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Gavin Ray
Wow, really neat -- thanks for sharing! On Mon, Jul 3, 2023 at 8:12 PM Gengliang Wang wrote: > Dear Apache Spark community, > > We are delighted to announce the launch of a groundbreaking tool that aims > to make Apache Spark more user-friendly and accessible - the English

Re: Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Hyukjin Kwon
The demo was really amazing. On Tue, 4 Jul 2023 at 09:17, Farshid Ashouri wrote: > This is wonderful news! > > On Tue, 4 Jul 2023 at 01:14, Gengliang Wang wrote: > >> Dear Apache Spark community, >> >> We are delighted to announce the launch of a groundbreaking to

Re: Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Farshid Ashouri
This is wonderful news! On Tue, 4 Jul 2023 at 01:14, Gengliang Wang wrote: > Dear Apache Spark community, > > We are delighted to announce the launch of a groundbreaking tool that aims > to make Apache Spark more user-friendly and accessible - the English SDK > <

Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Gengliang Wang
Dear Apache Spark community, We are delighted to announce the launch of a groundbreaking tool that aims to make Apache Spark more user-friendly and accessible - the English SDK <https://github.com/databrickslabs/pyspark-ai/>. Powered by the application of Generative AI, the English SDK

Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-24 Thread yangjie01
to:mri...@gmail.com>> wrote: >> >> >> Thanks Dongjoon ! >> >> Regards, >> Mridul >> >> On Fri, Jun 23, 2023 at 6:58 PM Dongjoon Hyun > <mailto:dongj...@apache.org>> wrote: >>> >>> We are happy to announce the availability o

Re:[ANNOUNCE] Apache Spark 3.4.1 released

2023-06-24 Thread beliefer
Thanks, Dongjoon Hyun! Congratulations too! At 2023-06-24 07:57:05, "Dongjoon Hyun" wrote: We are happy to announce the availability of Apache Spark 3.4.1! Spark 3.4.1 is a maintenance release containing stability fixes. This release is based on the branch-3.4 maintenance branc

Apache Spark with watermark - processing data different LogTypes in same kafka topic

2023-06-24 Thread karan alang
Hello All - I'm using Apache Spark Structured Streaming to read data from Kafka topic, and do some processing. I'm using watermark to account for late-coming records and the code works fine. Here is the working (sample) code: ``` from pyspark.sql import SparkSession from pyspark.sql.functions

Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread L. C. Hsieh
58 PM Dongjoon Hyun wrote: >>> >>> We are happy to announce the availability of Apache Spark 3.4.1! >>> >>> Spark 3.4.1 is a maintenance release containing stability fixes. This >>> release is based on the branch-3.4 maintenance branch of Spark. We strongly

Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread Hyukjin Kwon
Thanks! On Sat, Jun 24, 2023 at 11:01 AM Mridul Muralidharan wrote: > > Thanks Dongjoon ! > > Regards, > Mridul > > On Fri, Jun 23, 2023 at 6:58 PM Dongjoon Hyun wrote: > >> We are happy to announce the availability of Apache Spark 3.4.1! >> >> Spark

Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread Mridul Muralidharan
Thanks Dongjoon ! Regards, Mridul On Fri, Jun 23, 2023 at 6:58 PM Dongjoon Hyun wrote: > We are happy to announce the availability of Apache Spark 3.4.1! > > Spark 3.4.1 is a maintenance release containing stability fixes. This > release is based on the branch-3.4 maintenance bra

[ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.4.1! Spark 3.4.1 is a maintenance release containing stability fixes. This release is based on the branch-3.4 maintenance branch of Spark. We strongly recommend all 3.4 users to upgrade to this stable release. To download Spark 3.4.1

Re: Apache Spark not reading UTC timestamp from MongoDB correctly

2023-06-08 Thread Enrico Minack
wrote: ref : https://stackoverflow.com/questions/76436159/apache-spark-not-reading-utc-timestamp-from-mongodb-correctly Hello All, I've data stored in MongoDB collection and the timestamp column is not being read by Apache Spark correctly. I'm running Apache Spark on GCP

Re: Apache Spark not reading UTC timestamp from MongoDB correctly

2023-06-08 Thread Sean Owen
You sure it is not just that it's displaying in your local TZ? Check the actual value as a long for example. That is likely the same time. On Thu, Jun 8, 2023, 5:50 PM karan alang wrote: > ref : > https://stackoverflow.com/questions/76436159/apache-spark-not-reading-utc-timestamp-from-m
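
The check suggested in this reply can be illustrated without Spark. This is a minimal sketch under stated assumptions: the sample instant and the fixed UTC-7 offset are hypothetical and not taken from the thread. It shows that the same instant renders differently in UTC and in a local zone, while the underlying epoch value (the "long") is unchanged.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sample instant, stored as UTC (not taken from the thread).
stored_utc = datetime(2023, 6, 8, 12, 0, 0, tzinfo=timezone.utc)

# Assumed local zone for illustration: a fixed UTC-7 offset.
local_zone = timezone(timedelta(hours=-7))
shown_local = stored_utc.astimezone(local_zone)

# The rendered strings differ because each uses its own zone offset.
utc_text = stored_utc.isoformat()
local_text = shown_local.isoformat()

# The epoch seconds are identical: both objects denote the same instant.
utc_epoch = int(stored_utc.timestamp())
local_epoch = int(shown_local.timestamp())

print(utc_text, utc_epoch)
print(local_text, local_epoch)
```

In PySpark, a comparable check would be to cast the timestamp column to long and compare it with the epoch value stored in MongoDB; if the two agree, the difference is only in how the value is displayed.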

Apache Spark not reading UTC timestamp from MongoDB correctly

2023-06-08 Thread karan alang
ref : https://stackoverflow.com/questions/76436159/apache-spark-not-reading-utc-timestamp-from-mongodb-correctly Hello All, I've data stored in MongoDB collection and the timestamp column is not being read by Apache Spark correctly. I'm running Apache Spark on GCP Dataproc. Here is sample data

Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-05-02 Thread Trường Trần Phan An
SparkListenerSQLExecutionStart [2] and SparkListenerSQLExecutionEnd >> [3], and correlate infos. >> >> You may want to look at how web UI works under the covers to collect all >> the information. Start from SQLTab that should give you what is displayed >> (that should give you

CVE-2023-32007: Apache Spark: Shell command injection via Spark UI

2023-05-02 Thread Arnout Engelen
Severity: important Affected versions: - Apache Spark 3.1.1 before 3.2.2 Description: ** UNSUPPORTED WHEN ASSIGNED ** The Apache Spark UI offers the possibility to enable ACLs via the configuration option spark.acls.enable. With an authentication filter, this checks whether a user has access

CVE-2023-22946: Apache Spark proxy-user privilege escalation from malicious configuration class

2023-04-15 Thread Sean R. Owen
Description: In Apache Spark versions prior to 3.4.0, applications using spark-submit can specify a 'proxy-user' to run as, limiting privileges. The application can execute code with the privileges of the submitting user, however, by providing malicious configuration-related classes

Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-04-14 Thread Jacek Laskowski
web UI works under the covers to collect all the information. Start from SQLTab that should give you what is displayed (that should give you then what's needed and how it's collected). [1] https://github.com/apache/spark/blob/8cceb3946bdfa5ceac0f2b4fe6a7c43eafb76d59/core/src/main/scala/org/apache

[ANNOUNCE] Apache Spark 3.2.4 released

2023-04-13 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.2.4! Spark 3.2.4 is a maintenance release containing stability fixes. This release is based on the branch-3.2 maintenance branch of Spark. We strongly recommend all 3.2 users to upgrade to this stable release. To download Spark 3.2.4

Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-04-13 Thread Trường Trần Phan An
com/jaceklaskowski > > <https://twitter.com/jaceklaskowski> > > > On Tue, Apr 11, 2023 at 6:53 PM Trường Trần Phan An < > truong...@vlute.edu.vn> wrote: > >> Hi all, >> >> I am conducting a study comparing the execution time of Bloom Filter Joi

Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-04-12 Thread Maytas Monsereenusorn
<https://twitter.com/jaceklaskowski> > > > On Tue, Apr 11, 2023 at 6:53 PM Trường Trần Phan An < > truong...@vlute.edu.vn> wrote: > >> Hi all, >> >> I am conducting a study comparing the execution time of Bloom Filter Join >> operation on two environments

Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-04-12 Thread Jacek Laskowski
er.com/jaceklaskowski> On Tue, Apr 11, 2023 at 6:53 PM Trường Trần Phan An wrote: > Hi all, > > I am conducting a study comparing the execution time of Bloom Filter Join > operation on two environments: Apache Spark Cluster and Apache Spark. I > have compared the overall time

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-04-01 Thread Mich Talebzadeh
Good stuff Khalid. I have created a section in Apache Spark Community Slack called spark foundation. spark-foundation - Apache Spark Community - Slack <https://app.slack.com/client/T04URTRBZ1R/C051CL5T1KL/thread/C0501NBTNQG-1680132989.091199> I invite you to add your weblink to that s

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-04-01 Thread Khalid Mammadov
no case be liable for any monetary damages >> arising from such loss, damage or destruction. >> >> >> >> >> On Fri, 31 Mar 2023 at 15:15, AN-TRUONG Tran Phan < >> tr.phan.tru...@gmail.com> wrote: >> >>> Hi, >>> >>> I am

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-03-31 Thread Mich Talebzadeh
>> any loss, damage or destruction of data or any other property which may >> arise from relying on this email's technical content is explicitly >> disclaimed. The author will in no case be liable for any monetary damages >> arising from such loss, damage or destruction.

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-03-31 Thread AN-TRUONG Tran Phan
r 2023 at 15:15, AN-TRUONG Tran Phan < > tr.phan.tru...@gmail.com> wrote: > >> Hi, >> >> I am learning about Apache Spark and want to know the meaning of each >> Task created on the Jobs recorded on Spark history. >> >> For example, the application I wr

Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-03-31 Thread Mich Talebzadeh
g from such loss, damage or destruction. On Fri, 31 Mar 2023 at 15:15, AN-TRUONG Tran Phan wrote: > Hi, > > I am learning about Apache Spark and want to know the meaning of each Task > created on the Jobs recorded on Spark history. > > For example, the application I write c

Cannot build Apache Spark 3.3.1 with Apache Hive 3.1.2 and Apache Hadoop 3.1.1

2022-12-28 Thread שוהם יהודה
Hi Team I have a problem with building Apache Spark compatible with Apache Hive 3.1.2. I believe Apache Spark supports Hive 3.1.2 as I saw it in the docs. https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html I saw in the docs the following guide to build spark: https

Re: Query regarding Apache spark version 3.0.1

2022-12-15 Thread Sean Owen
us to know when version 3.0.1 for Apache spark is > going to be EOS? Till when we are going to get fixes for the version 3.0.1. > > > > Regards, > > Pranav > > >

Query regarding Apache spark version 3.0.1

2022-12-15 Thread Pranav Kumar (EXT)
Hi Team, Could you please help us to know when version 3.0.1 for Apache spark is going to be EOS? Till when we are going to get fixes for the version 3.0.1. Regards, Pranav

Re: [ANNOUNCE] Apache Spark 3.2.3 released

2022-11-30 Thread L. C. Hsieh
>>> Thanks, Chao! >>> >>> >>> >>> From: Maxim Gekk >>> Date: Wednesday, 30 November 2022 at 19:40 >>> To: Jungtaek Lim >>> Cc: Wenchen Fan , Chao Sun , dev >>> , user >>> Subject: Re: [ANNOUNCE] Apache Spark 3.2.3 released >

Re: [ANNOUNCE] Apache Spark 3.2.3 released

2022-11-30 Thread huaxin gao
1月30日 Wednesday 19:40 >> *To**: *Jungtaek Lim >> *Cc**: *Wenchen Fan , Chao Sun , >> dev , user >> *Subject**: *Re: [ANNOUNCE] Apache Spark 3.2.3 released >> >> >> >> Thank you, Chao! >> >> >> >> On Wed, Nov 30, 2022 at 12:42 PM Jungtaek Lim <

Re: [ANNOUNCE] Apache Spark 3.2.3 released

2022-11-30 Thread Dongjoon Hyun
Thank you, Chao! On Wed, Nov 30, 2022 at 8:16 AM Yang,Jie(INF) wrote: > Thanks, Chao! > > > > *From**: *Maxim Gekk > *Date**: *Wednesday, 30 November 2022 at 19:40 > *To**: *Jungtaek Lim > *Cc**: *Wenchen Fan , Chao Sun , > dev , user > *Subject**: *Re: [ANNOUNCE] Apache Spark 3.

Re: [ANNOUNCE] Apache Spark 3.2.3 released

2022-11-30 Thread Yang,Jie(INF)
Thanks, Chao! From: Maxim Gekk Date: Wednesday, 30 November 2022 at 19:40 To: Jungtaek Lim Cc: Wenchen Fan , Chao Sun , dev , user Subject: Re: [ANNOUNCE] Apache Spark 3.2.3 released Thank you, Chao! On Wed, Nov 30, 2022 at 12:42 PM Jungtaek Lim mailto:kabhwan.opensou...@gmail.com>> wrote: Thank

Re: [ANNOUNCE] Apache Spark 3.2.3 released

2022-11-30 Thread Maxim Gekk
> We are happy to announce the availability of Apache Spark 3.2.3! >>> >>> Spark 3.2.3 is a maintenance release containing stability fixes. This >>> release is based on the branch-3.2 maintenance branch of Spark. We >>> strongly >>> recommend all 3.2 users to
