Re: [ANNOUNCE] Apache Spark 3.5.2 released

2024-08-12 Thread Xiao Li
Thank you, Kent!

Kent Yao  于2024年8月12日周一 08:03写道:

> We are happy to announce the availability of Apache Spark 3.5.2!
>
> Spark 3.5.2 is the second maintenance release containing security
> and correctness fixes. This release is based on the branch-3.5
> maintenance branch of Spark. We strongly recommend all 3.5 users
> to upgrade to this stable release.
>
> To download Spark 3.5.2, head over to the download page:
> https://spark.apache.org/downloads.html
>
> To view the release notes:
> https://spark.apache.org/releases/spark-release-3-5-2.html
>
> We would like to acknowledge all community members for contributing to this
> release. This release would not have been possible without you.
>
> Kent Yao
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


[ANNOUNCE] Apache Spark 3.5.2 released

2024-08-12 Thread Kent Yao
We are happy to announce the availability of Apache Spark 3.5.2!

Spark 3.5.2 is the second maintenance release containing security
and correctness fixes. This release is based on the branch-3.5
maintenance branch of Spark. We strongly recommend that all 3.5 users
upgrade to this stable release.

To download Spark 3.5.2, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-5-2.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Kent Yao

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Question about installing Apache Spark [PySpark] computer requirements

2024-07-29 Thread Meena Rajani
k.scheduler.ResultTask.runTask(ResultTask.scala:93)
>   at 
> org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
>   at org.apache.spark.scheduler.Task.run(Task.scala:141)
>   at 
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
>   at 
> org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
>   at 
> org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   ... 1 more
>
>
> On Mon, Jul 29, 2024 at 4:34 PM Sadha Chilukoori 
> wrote:
>
>> Hi Mike,
>>
>> I'm not sure about the minimum requirements of a machine for running
>> Spark. But to run some Pyspark scripts (and Jupiter notbebooks) on a local
>> machine, I found the following steps are the easiest.
>>
>>
>> I installed Amazon corretto and updated the java_home variable as
>> instructed here
>> https://docs.aws.amazon.com/corretto/latest/corretto-11-ug/downloads-list.html
>> (Any other java works too, I'm used to corretto from work).
>>
>> Then installed the Pyspark module using pip, which enabled me run Pyspark
>> on my machine.
>>
>> -Sadha
>>
>> On Mon, Jul 29, 2024, 12:51 PM mike Jadoo  wrote:
>>
>>> Hello,
>>>
>>> I am trying to run Pyspark on my computer without success.  I follow
>>> several different directions from online sources and it appears that I need
>>> to get a faster computer.
>>>
>>> I wanted to ask what are some recommendations for computer
>>> specifications to run PySpark (Apache Spark).
>>>
>>> Any help would be greatly appreciated.
>>>
>>> Thank you,
>>>
>>> Mike
>>>
>>


Re: Question about installing Apache Spark [PySpark] computer requirements

2024-07-29 Thread Sadha Chilukoori
7)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
>   at 
> org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
>   at org.apache.spark.scheduler.Task.run(Task.scala:141)
>   at 
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
>   at 
> org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
>   at 
> org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   ... 1 more
>
>
> On Mon, Jul 29, 2024 at 4:34 PM Sadha Chilukoori 
> wrote:
>
>> Hi Mike,
>>
>> I'm not sure about the minimum requirements of a machine for running
>> Spark. But to run some Pyspark scripts (and Jupiter notbebooks) on a local
>> machine, I found the following steps are the easiest.
>>
>>
>> I installed Amazon corretto and updated the java_home variable as
>> instructed here
>> https://docs.aws.amazon.com/corretto/latest/corretto-11-ug/downloads-list.html
>> (Any other java works too, I'm used to corretto from work).
>>
>> Then installed the Pyspark module using pip, which enabled me run Pyspark
>> on my machine.
>>
>> -Sadha
>>
>> On Mon, Jul 29, 2024, 12:51 PM mike Jadoo  wrote:
>>
>>> Hello,
>>>
>>> I am trying to run Pyspark on my computer without success.  I follow
>>> several different directions from online sources and it appears that I need
>>> to get a faster computer.
>>>
>>> I wanted to ask what are some recommendations for computer
>>> specifications to run PySpark (Apache Spark).
>>>
>>> Any help would be greatly appreciated.
>>>
>>> Thank you,
>>>
>>> Mike
>>>
>>


Re: Question about installing Apache Spark [PySpark] computer requirements

2024-07-29 Thread mike Jadoo
On Mon, Jul 29, 2024 at 4:34 PM Sadha Chilukoori 
wrote:

> Hi Mike,
>
> I'm not sure about the minimum requirements of a machine for running
> Spark. But to run some Pyspark scripts (and Jupiter notbebooks) on a local
> machine, I found the following steps are the easiest.
>
>
> I installed Amazon corretto and updated the java_home variable as
> instructed here
> https://docs.aws.amazon.com/corretto/latest/corretto-11-ug/downloads-list.html
> (Any other java works too, I'm used to corretto from work).
>
> Then installed the Pyspark module using pip, which enabled me run Pyspark
> on my machine.
>
> -Sadha
>
> On Mon, Jul 29, 2024, 12:51 PM mike Jadoo  wrote:
>
>> Hello,
>>
>> I am trying to run Pyspark on my computer without success.  I follow
>> several different directions from online sources and it appears that I need
>> to get a faster computer.
>>
>> I wanted to ask what are some recommendations for computer specifications
>> to run PySpark (Apache Spark).
>>
>> Any help would be greatly appreciated.
>>
>> Thank you,
>>
>> Mike
>>
>


Re: Question about installing Apache Spark [PySpark] computer requirements

2024-07-29 Thread Sadha Chilukoori
Hi Mike,

I'm not sure about the minimum requirements of a machine for running Spark,
but to run some PySpark scripts (and Jupyter notebooks) on a local
machine, I found the following steps to be the easiest.

I installed Amazon Corretto and updated the JAVA_HOME variable as
instructed here:
https://docs.aws.amazon.com/corretto/latest/corretto-11-ug/downloads-list.html
(any other Java works too; I'm used to Corretto from work).

Then I installed the PySpark module using pip, which enabled me to run PySpark
on my machine.
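
A minimal sketch of a quick check after those steps, assuming Java/Corretto is
already installed, JAVA_HOME points at it, and PySpark came from pip (the app
name below is arbitrary):

# Smoke test for a local PySpark install (sketch only).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")           # run everything inside the local JVM
    .appName("local-smoke-test")
    .getOrCreate()
)
print(spark.version)              # prints the Spark version that pip installed
spark.range(5).show()             # runs a trivial job end to end
spark.stop()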

-Sadha

On Mon, Jul 29, 2024, 12:51 PM mike Jadoo  wrote:

> Hello,
>
> I am trying to run Pyspark on my computer without success.  I follow
> several different directions from online sources and it appears that I need
> to get a faster computer.
>
> I wanted to ask what are some recommendations for computer specifications
> to run PySpark (Apache Spark).
>
> Any help would be greatly appreciated.
>
> Thank you,
>
> Mike
>


Question about installing Apache Spark [PySpark] computer requirements

2024-07-29 Thread mike Jadoo
Hello,

I am trying to run PySpark on my computer without success. I have followed
several different sets of directions from online sources, and it appears that
I need a faster computer.

I wanted to ask what the recommended computer specifications are for running
PySpark (Apache Spark).

Any help would be greatly appreciated.

Thank you,

Mike


Re: 7368396 - Apache Spark 3.5.1 (Support)

2024-06-07 Thread Sadha Chilukoori
Hi Alex,

Spark is open-source software available under the Apache License 2.0 (
https://www.apache.org/licenses/); further details can be found on the
FAQ page (https://spark.apache.org/faq.html).

Hope this helps.


Thanks,

Sadha

On Thu, Jun 6, 2024, 1:32 PM SANTOS SOUZA, ALEX 
wrote:

> Hey guys!
>
>
>
> I am part of the team responsible for software approval at EMBRAER S.A.
> We are currently in the process of approving the Apache Spark 3.5.1
> software and are verifying the licensing of the application.
> Therefore, I would like to kindly request you to answer the questions
> below.
>
> -What type of software? (Commercial, Freeware, Component, etc...)
>  A:
>
> -What is the licensing model for commercial use? (Subscription, Perpetual,
> GPL, etc...)
> A:
>
> -What type of license? (By user, Competitor, Device, Server or others)?
> A:
>
> -Number of installations allowed per license/subscription?
> A:
>
> Can it be used in the defense and aerospace industry? (Company that
> manufactures products for national defense)
> A:
>
> -Does the license allow use in any location regardless of the origin of
> the purchase (tax restriction)?
> A:
>
> -Where can I find the End User License Agreement (EULA) for the version in
> question?
> A:
>
>
>
> Desde já, muito obrigado e qualquer dúvida estou à disposição. / Thank you
> very much in advance and I am at your disposal if you have any questions.
>
>
> Att,
>
>
> Alex Santos Souza
>
> Software Asset Management - Embraer
>
> WhatsApp: +55 12 99731-7579
>
> E-mail: alex.santosso...@dxc.com
>
> DXC Technology
>
> São José dos Campos, SP - Brazil
>
>


7368396 - Apache Spark 3.5.1 (Support)

2024-06-06 Thread SANTOS SOUZA, ALEX
Hey guys!



I am part of the team responsible for software approval at EMBRAER S.A.
We are currently in the process of approving the Apache Spark 3.5.1 software 
and are verifying the licensing of the application.
Therefore, I would like to kindly request you to answer the questions below.

-What type of software? (Commercial, Freeware, Component, etc...)
 A:

-What is the licensing model for commercial use? (Subscription, Perpetual, GPL, 
etc...)
A:

-What type of license? (By user, Competitor, Device, Server or others)?
A:

-Number of installations allowed per license/subscription?
A:

-Can it be used in the defense and aerospace industry? (Company that 
manufactures products for national defense)
A:

-Does the license allow use in any location regardless of the origin of the 
purchase (tax restriction)?
A:

-Where can I find the End User License Agreement (EULA) for the version in 
question?
A:



Desde já, muito obrigado e qualquer dúvida estou à disposição. / Thank you very 
much in advance and I am at your disposal if you have any questions.


Att,


Alex Santos Souza

Software Asset Management - Embraer

WhatsApp: +55 12 99731-7579

E-mail: alex.santosso...@dxc.com

DXC Technology

São José dos Campos, SP - Brazil



Inquiry Regarding Security Compliance of Apache Spark Docker Image

2024-06-05 Thread Tonmoy Sagar
Dear Apache Team,

I hope this email finds you well.

We are a team from Ernst and Young LLP - India, dedicated to providing 
innovative supply chain solutions for a diverse range of clients. Our team 
recently encountered a pivotal use case necessitating the utilization of 
PySpark for a project aimed at handling substantial volumes of data. As part of 
our deployment strategy, we are endeavouring to implement a Spark-based 
application on our Azure Kubernetes service.

Regrettably, we have encountered challenges from a security perspective with 
the latest Apache Spark Docker image, specifically apache/spark-py:latest. Our 
security team has meticulously conducted an assessment and has generated a 
comprehensive vulnerability report highlighting areas of concern.

Given the non-compliance of the Docker image with our organization's stringent 
security protocols, we find ourselves unable to proceed with its integration 
into our applications. We attach the vulnerability report herewith for your 
perusal.

Considering these circumstances, we kindly request your esteemed team to 
provide any resolutions or guidance that may assist us in mitigating the 
identified security vulnerabilities. Your prompt attention to this matter would 
be greatly appreciated, as it is crucial for the successful deployment and 
operation of our Spark-based application within our infrastructure.

Thank you for your attention to this inquiry, and we look forward to your 
valued support and assistance.



Please find the vulnerability report attached.
Best Regards,
Tonmoy Sagar | Sr. Consultant | Advisory | Asterisk
Ernst & Young LLP
C-401, Panchshil Tech Park One, Yerawada, Pune, Maharashtra 411006, India
Mobile: +91 8724918230 | tonmoy.sa...@in.ey.com<mailto:tonmoy.sa...@in.ey.com>
Thrive in the Transformative Age with the better-connected consultants - 
ey.com/consulting<http://ey.com/consulting>



The information contained in this communication is intended solely for the use 
of the individual or entity to whom it is addressed and others authorized to 
receive it. It may contain confidential or legally privileged information. If 
you are not the intended recipient you are hereby notified that any disclosure, 
copying, distribution or taking any action in reliance on the contents of this 
information is strictly prohibited and may be unlawful. If you have received 
this communication in error, please notify us immediately by responding to this 
email and then delete it from your system. The firm is neither liable for the 
proper and complete transmission of the information contained in this 
communication nor for any delay in its receipt.


spark_vulnerability_report.xlsx
Description: spark_vulnerability_report.xlsx

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org

[ANNOUNCE] Announcing Apache Spark 4.0.0-preview1

2024-06-03 Thread Wenchen Fan
Hi all,

To enable wide-scale community testing of the upcoming Spark 4.0 release,
the Apache Spark community has posted a preview release of Spark 4.0. This
preview is not a stable release in terms of either API or functionality,
but it is meant to give the community early access to try the code that
will become Spark 4.0. If you would like to test the release, please
download it, and send feedback using either the mailing lists or JIRA.

There are a lot of exciting new features added to Spark 4.0, including ANSI
mode by default, Python data source, polymorphic Python UDTF, string
collation support, new VARIANT data type, streaming state store data
source, structured logging, Java 17 by default, and many more.

We'd like to thank our contributors and users for their contributions and
early feedback to this release. This release would not have been possible
without you.

To download Spark 4.0.0-preview1, head over to the download page:
https://archive.apache.org/dist/spark/spark-4.0.0-preview1 . It's also
available in PyPI, with version name "4.0.0.dev1".
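
For those testing the PyPI artifact, a minimal sketch of confirming which build
ended up in a local environment, assuming it was installed with
pip install pyspark==4.0.0.dev1:

# Sketch: confirm the preview build is the one on the local environment.
import pyspark
print(pyspark.__version__)   # expected to report 4.0.0.dev1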

Thanks,

Wenchen


[apache-spark][spark-dataframe] DataFrameWriter.partitionBy does not guarantee previous sort result

2024-05-31 Thread leeyc0
I have a dataset that has the following schema:
(timestamp, partitionKey, logValue)

I want the dataset to be sorted by timestamp, but written to files in the
following directory layout:
outputDir/partitionKey/files
The output files contain only logValue; that is, timestamp is used for
sorting only and does not appear in the output.
(FYI, logValue contains a textual representation of the timestamp, which is
not sortable.)

My first attempt is to use DataFrameWriter.partitionBy:
dataset
    .sort("timestamp")
    .select("partitionKey", "logValue")
    .write()
    .partitionBy("partitionKey")
    .text("output");

However, as mentioned in SPARK-44512 (
https://issues.apache.org/jira/browse/SPARK-44512), this does not guarantee
the output is globally sorted.
(Note: I found that even setting
spark.sql.optimizer.plannedWrite.enabled=false still does not guarantee
a sorted result in a low-memory environment.)

And the developers say DataFrameWriter.partitionBy does not guarantee
sorted results:
"Although I understand Apache Spark 3.4.0 changes the behavior like the
above, I don't think there is a contract that Apache Spark's `partitionBy`
operation preserves the previous ordering."

To work around this problem, I had to resort to creating a Hadoop output
format by extending org.apache.hadoop.mapred.lib.MultipleTextOutputFormat
and writing the files with saveAsHadoopFile:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

public final class PartitionedMultipleTextOutputFormat<V>
        extends MultipleTextOutputFormat<Object, V> {
    @SuppressWarnings("MissingJavadocMethod")
    public PartitionedMultipleTextOutputFormat() {
        super();
    }

    @Override
    protected Object generateActualKey(final Object key, final V value) {
        // The key is only used for routing; emit no key in the output records.
        return NullWritable.get();
    }

    @Override
    protected String generateFileNameForKeyValue(final Object key, final V value,
            final String leaf) {
        // Write each record under a subdirectory named after its key (partitionKey).
        return new Path(key.toString(), leaf).toString();
    }
}

private static Tuple2<String, Text> mapRDDToDomainLogPair(final Row row) {
    final String domain = row.getAs("partitionKey");
    final var log = (String) row.getAs("logValue");
    final var logTextClass = new Text(log);
    return new Tuple2<>(domain, logTextClass);
}

dataset
    .sort("timestamp")
    .javaRDD()
    .mapToPair(TheClass::mapRDDToDomainLogPair)
    .saveAsHadoopFile(hdfsTmpPath, String.class, Text.class,
        PartitionedMultipleTextOutputFormat.class, GzipCodec.class);

This seems a little hacky.
Does anyone have a better method?
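
For comparison, one commonly suggested alternative (not from the thread) is
sketched below in PySpark, under the assumption that per-key ordering by
timestamp, rather than one global ordering across all keys, is the actual
requirement; whether the ordering survives spills under memory pressure would
still need to be verified:

# Sketch: route each partitionKey to a single task and sort by timestamp there,
# so the write path's implicit sort on the partition column is already satisfied.
# `dataset` is the DataFrame described above: (timestamp, partitionKey, logValue).
(dataset
    .repartition("partitionKey")
    .sortWithinPartitions("partitionKey", "timestamp")
    .select("partitionKey", "logValue")
    .write
    .partitionBy("partitionKey")
    .text("output"))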


Request for Assistance: Adding User Authentication to Apache Spark Application

2024-05-16 Thread NIKHIL RAJ SHRIVASTAVA
Dear Team,

I hope this email finds you well. My name is Nikhil Raj, and I am currently
working with Apache Spark on one of my projects, where we are creating an
external table in Spark from a Parquet file.

I am reaching out to seek assistance regarding user authentication for our
Apache Spark application. Currently, we can connect to the application
using only the host and port information. However, for security reasons, we
would like to implement user authentication to control access and ensure
data integrity.

After reviewing the available documentation and resources, I found that
adding user authentication to our Spark setup requires additional
configurations or plugins. However, I'm facing challenges in understanding
the exact steps or best practices to implement this.

Could you please provide guidance or point me towards relevant
documentation/resources that detail how to integrate user authentication
into Apache Spark?  Additionally, if there are any recommended practices or
considerations for ensuring the security of our Spark setup, we would
greatly appreciate your insights on that as well.

Your assistance in this matter would be invaluable to us, as we aim to
enhance the security of our Spark application and safeguard our data
effectively.
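
For reference, a minimal sketch of the RPC-level settings described in Spark's
security documentation (spark.authenticate with a shared secret); the secret
below is a placeholder, and whether this layer also covers authenticating
external clients on a host/port endpoint is exactly the kind of guidance being
sought:

# Sketch only: enable Spark's internal RPC authentication with a shared secret.
# These settings protect traffic between Spark processes; external client
# authentication (e.g. on a JDBC/Thrift-style endpoint) may need a separate mechanism.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("auth-sketch")
    .config("spark.authenticate", "true")
    .config("spark.authenticate.secret", "replace-with-a-real-secret")
    .getOrCreate()
)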

Thank you very much for your time and consideration. I look forward to
hearing from you and your suggestions.

Warm regards,

NIKHIL RAJ
Developer
Estuate Software Pvt. Ltd.
Thanks & Regards


[ANNOUNCE] Apache Spark 3.4.3 released

2024-04-18 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.4.3!

Spark 3.4.3 is a maintenance release containing many fixes in the
security and correctness domains. This release is based on the
branch-3.4 maintenance branch of Spark. We strongly
recommend that all 3.4 users upgrade to this stable release.

To download Spark 3.4.3, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-4-3.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Dongjoon Hyun


Apache Spark integration with Spring Boot 3.0.0+

2024-03-28 Thread Szymon Kasperkiewicz
Hello,

I've got a project which has to use the newest versions of both Apache
Spark and Spring Boot due to vulnerability issues. I build my project using
Gradle, and when I try to run it I get an unsatisfied dependency exception
about javax/servlet/Servlet. I've tried adding the Jakarta servlet, an older
javax version, etc. None of them worked. The only solution I saw was to
downgrade Spring Boot, but unfortunately I can't do that. Is there any
known option to use both Apache Spark and Spring Boot in one project?

Best regards
Szymon


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-23 Thread Winston Lai
+1

--
Thank You & Best Regards
Winston Lai

From: Jay Han 
Date: Sunday, 24 March 2024 at 08:39
To: Kiran Kumar Dusi 
Cc: Farshid Ashouri , Matei Zaharia 
, Mich Talebzadeh , Spark 
dev list , user @spark 
Subject: Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark 
Community
+1. It sounds awesome!

Kiran Kumar Dusi mailto:kirankumard...@gmail.com>> 
于2024年3月21日周四 14:16写道:
+1

On Thu, 21 Mar 2024 at 7:46 AM, Farshid Ashouri 
mailto:farsheed.asho...@gmail.com>> wrote:
+1

On Mon, 18 Mar 2024, 11:00 Mich Talebzadeh, 
mailto:mich.talebza...@gmail.com>> wrote:
Some of you may be aware that Databricks community Home | Databricks
have just launched a knowledge sharing hub. I thought it would be a
good idea for the Apache Spark user group to have the same, especially
for repeat questions on Spark core, Spark SQL, Spark Structured
Streaming, Spark Mlib and so forth.

Apache Spark user and dev groups have been around for a good while.
They are serving their purpose . We went through creating a slack
community that managed to create more more heat than light.. This is
what Databricks community came up with and I quote

"Knowledge Sharing Hub
Dive into a collaborative space where members like YOU can exchange
knowledge, tips, and best practices. Join the conversation today and
unlock a wealth of collective wisdom to enhance your experience and
drive success."

I don't know the logistics of setting it up.but I am sure that should
not be that difficult. If anyone is supportive of this proposal, let
the usual +1, 0, -1 decide

HTH

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


   view my Linkedin profile


 https://en.everybodywiki.com/Mich_Talebzadeh



Disclaimer: The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner Von Braun)".

-
To unsubscribe e-mail: 
user-unsubscr...@spark.apache.org<mailto:user-unsubscr...@spark.apache.org>


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-23 Thread Jay Han
+1. It sounds awesome!

Kiran Kumar Dusi  于2024年3月21日周四 14:16写道:

> +1
>
> On Thu, 21 Mar 2024 at 7:46 AM, Farshid Ashouri <
> farsheed.asho...@gmail.com> wrote:
>
>> +1
>>
>> On Mon, 18 Mar 2024, 11:00 Mich Talebzadeh, 
>> wrote:
>>
>>> Some of you may be aware that Databricks community Home | Databricks
>>> have just launched a knowledge sharing hub. I thought it would be a
>>> good idea for the Apache Spark user group to have the same, especially
>>> for repeat questions on Spark core, Spark SQL, Spark Structured
>>> Streaming, Spark Mlib and so forth.
>>>
>>> Apache Spark user and dev groups have been around for a good while.
>>> They are serving their purpose . We went through creating a slack
>>> community that managed to create more more heat than light.. This is
>>> what Databricks community came up with and I quote
>>>
>>> "Knowledge Sharing Hub
>>> Dive into a collaborative space where members like YOU can exchange
>>> knowledge, tips, and best practices. Join the conversation today and
>>> unlock a wealth of collective wisdom to enhance your experience and
>>> drive success."
>>>
>>> I don't know the logistics of setting it up.but I am sure that should
>>> not be that difficult. If anyone is supportive of this proposal, let
>>> the usual +1, 0, -1 decide
>>>
>>> HTH
>>>
>>> Mich Talebzadeh,
>>> Dad | Technologist | Solutions Architect | Engineer
>>> London
>>> United Kingdom
>>>
>>>
>>>view my Linkedin profile
>>>
>>>
>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>
>>>
>>>
>>> Disclaimer: The information provided is correct to the best of my
>>> knowledge but of course cannot be guaranteed . It is essential to note
>>> that, as with any advice, quote "one test result is worth one-thousand
>>> expert opinions (Werner Von Braun)".
>>>
>>> -
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>
>>>


Re: Feature article: Leveraging Generative AI with Apache Spark: Transforming Data Engineering

2024-03-22 Thread Mich Talebzadeh
Sorry, here is the correct link:

Leveraging Generative AI with Apache Spark: Transforming Data Engineering |
LinkedIn
<https://www.linkedin.com/pulse/leveraging-generative-ai-apache-spark-transforming-mich-lxbte/?trackingId=aqZMBOg4O1KYRB4Una7NEg%3D%3D>

Mich Talebzadeh,
Technologist | Data | Generative AI | Financial Fraud
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner  <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von
Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".


On Fri, 22 Mar 2024 at 16:16, Mich Talebzadeh 
wrote:

> You may find this link of mine in Linkedin for the said article. We
> can use Linkedin for now.
>
> Leveraging Generative AI with Apache Spark: Transforming Data
> Engineering | LinkedIn
>
>
> Mich Talebzadeh,
>
> Technologist | Data | Generative AI | Financial Fraud
>
> London
> United Kingdom
>
>
>view my Linkedin profile
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> Disclaimer: The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed . It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Werner Von Braun)".
>


Feature article: Leveraging Generative AI with Apache Spark: Transforming Data Engineering

2024-03-22 Thread Mich Talebzadeh
You may find the link to the said article on my LinkedIn below. We
can use LinkedIn for now.

Leveraging Generative AI with Apache Spark: Transforming Data
Engineering | LinkedIn


Mich Talebzadeh,

Technologist | Data | Generative AI | Financial Fraud

London
United Kingdom


   view my Linkedin profile


 https://en.everybodywiki.com/Mich_Talebzadeh



Disclaimer: The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner Von Braun)".

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-20 Thread Kiran Kumar Dusi
+1

On Thu, 21 Mar 2024 at 7:46 AM, Farshid Ashouri 
wrote:

> +1
>
> On Mon, 18 Mar 2024, 11:00 Mich Talebzadeh, 
> wrote:
>
>> Some of you may be aware that Databricks community Home | Databricks
>> have just launched a knowledge sharing hub. I thought it would be a
>> good idea for the Apache Spark user group to have the same, especially
>> for repeat questions on Spark core, Spark SQL, Spark Structured
>> Streaming, Spark Mlib and so forth.
>>
>> Apache Spark user and dev groups have been around for a good while.
>> They are serving their purpose . We went through creating a slack
>> community that managed to create more more heat than light.. This is
>> what Databricks community came up with and I quote
>>
>> "Knowledge Sharing Hub
>> Dive into a collaborative space where members like YOU can exchange
>> knowledge, tips, and best practices. Join the conversation today and
>> unlock a wealth of collective wisdom to enhance your experience and
>> drive success."
>>
>> I don't know the logistics of setting it up.but I am sure that should
>> not be that difficult. If anyone is supportive of this proposal, let
>> the usual +1, 0, -1 decide
>>
>> HTH
>>
>> Mich Talebzadeh,
>> Dad | Technologist | Solutions Architect | Engineer
>> London
>> United Kingdom
>>
>>
>>view my Linkedin profile
>>
>>
>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>
>>
>>
>> Disclaimer: The information provided is correct to the best of my
>> knowledge but of course cannot be guaranteed . It is essential to note
>> that, as with any advice, quote "one test result is worth one-thousand
>> expert opinions (Werner Von Braun)".
>>
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-20 Thread Farshid Ashouri
+1

On Mon, 18 Mar 2024, 11:00 Mich Talebzadeh, 
wrote:

> Some of you may be aware that Databricks community Home | Databricks
> have just launched a knowledge sharing hub. I thought it would be a
> good idea for the Apache Spark user group to have the same, especially
> for repeat questions on Spark core, Spark SQL, Spark Structured
> Streaming, Spark Mlib and so forth.
>
> Apache Spark user and dev groups have been around for a good while.
> They are serving their purpose . We went through creating a slack
> community that managed to create more more heat than light.. This is
> what Databricks community came up with and I quote
>
> "Knowledge Sharing Hub
> Dive into a collaborative space where members like YOU can exchange
> knowledge, tips, and best practices. Join the conversation today and
> unlock a wealth of collective wisdom to enhance your experience and
> drive success."
>
> I don't know the logistics of setting it up.but I am sure that should
> not be that difficult. If anyone is supportive of this proposal, let
> the usual +1, 0, -1 decide
>
> HTH
>
> Mich Talebzadeh,
> Dad | Technologist | Solutions Architect | Engineer
> London
> United Kingdom
>
>
>view my Linkedin profile
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> Disclaimer: The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed . It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Werner Von Braun)".
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-19 Thread Mich Talebzadeh
One option that comes to my mind is that, given the cyclic nature of these
types of proposals in these two forums, we should be able to use
Databricks' existing knowledge sharing hub, Knowledge Sharing Hub -
Databricks
<https://community.databricks.com/t5/knowledge-sharing-hub/bd-p/Knowledge-Sharing-Hub>,
as well.

The majority of topics will be of interest to their audience as well. In
addition, they seem to invite everyone to contribute. Unless there is an
overriding concern why we should not take this approach, I can enquire with
the Databricks community managers whether they can entertain this idea. They
seem to have a well-defined structure for hosting topics.

Let me know your thoughts

Thanks
Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner  <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von
Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".


On Tue, 19 Mar 2024 at 08:25, Joris Billen 
wrote:

> +1
>
>
> On 18 Mar 2024, at 21:53, Mich Talebzadeh 
> wrote:
>
> Well as long as it works.
>
> Please all check this link from Databricks and let us know your thoughts.
> Will something similar work for us?. Of course Databricks have much deeper
> pockets than our ASF community. Will it require moderation in our side to
> block spams and nutcases.
>
> Knowledge Sharing Hub - Databricks
> <https://community.databricks.com/t5/knowledge-sharing-hub/bd-p/Knowledge-Sharing-Hub>
>
>
> Mich Talebzadeh,
> Dad | Technologist | Solutions Architect | Engineer
> London
> United Kingdom
>
>view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed . It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Werner  <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von
> Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>
>
> On Mon, 18 Mar 2024 at 20:31, Bjørn Jørgensen 
> wrote:
>
>> something like this  Spark community · GitHub
>> <https://github.com/Spark-community>
>>
>>
>> man. 18. mars 2024 kl. 17:26 skrev Parsian, Mahmoud
>> :
>>
>>> Good idea. Will be useful
>>>
>>>
>>>
>>> +1
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> *From: *ashok34...@yahoo.com.INVALID 
>>> *Date: *Monday, March 18, 2024 at 6:36 AM
>>> *To: *user @spark , Spark dev list <
>>> d...@spark.apache.org>, Mich Talebzadeh 
>>> *Cc: *Matei Zaharia 
>>> *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for
>>> Apache Spark Community
>>>
>>> External message, be mindful when clicking links or attachments
>>>
>>>
>>>
>>> Good idea. Will be useful
>>>
>>>
>>>
>>> +1
>>>
>>>
>>>
>>> On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>
>>>
>>>
>>>
>>> Some of you may be aware that Databricks community Home | Databricks
>>>
>>> have just launched a knowledge sharing hub. I thought it would be a
>>>
>>> good idea for the Apache Spark user group to have the same, especially
>>>
>>> for repeat questions on Spark core, Spark SQL, Spark Structured
>>>
>>> Streaming, Spark Mlib and so forth.
>>>
>>>
>>>
>>> Apache Spark user and dev groups have been around for a good while.
>>>
>>> They are serving their purpose . We went through creating a slack
>>>
>>> community that managed to create more more heat than light.. This is
>>>
>>> what Databricks community came up with and I quote
>>>
>>>
>>>
>>> "Knowledge Sharing Hub
>>>
>>> Dive into a collaborative space where members like YOU c

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-19 Thread Joris Billen
+1


On 18 Mar 2024, at 21:53, Mich Talebzadeh  wrote:

Well as long as it works.

Please all check this link from Databricks and let us know your thoughts. Will 
something similar work for us?. Of course Databricks have much deeper pockets 
than our ASF community. Will it require moderation in our side to block spams 
and nutcases.

Knowledge Sharing Hub - 
Databricks<https://community.databricks.com/t5/knowledge-sharing-hub/bd-p/Knowledge-Sharing-Hub>


Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom

 
   view my Linkedin 
profile<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

 https://en.everybodywiki.com/Mich_Talebzadeh



Disclaimer: The information provided is correct to the best of my knowledge but 
of course cannot be guaranteed . It is essential to note that, as with any 
advice, quote "one test result is worth one-thousand expert opinions (Werner 
<https://en.wikipedia.org/wiki/Wernher_von_Braun> Von 
Braun<https://en.wikipedia.org/wiki/Wernher_von_Braun>)".


On Mon, 18 Mar 2024 at 20:31, Bjørn Jørgensen 
mailto:bjornjorgen...@gmail.com>> wrote:
something like this  Spark community · 
GitHub<https://github.com/Spark-community>


man. 18. mars 2024 kl. 17:26 skrev Parsian, Mahmoud 
:
Good idea. Will be useful

+1



From: ashok34...@yahoo.com.INVALID 
Date: Monday, March 18, 2024 at 6:36 AM
To: user @spark mailto:user@spark.apache.org>>, Spark 
dev list mailto:d...@spark.apache.org>>, Mich Talebzadeh 
mailto:mich.talebza...@gmail.com>>
Cc: Matei Zaharia mailto:matei.zaha...@gmail.com>>
Subject: Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark 
Community
External message, be mindful when clicking links or attachments

Good idea. Will be useful

+1

On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh 
mailto:mich.talebza...@gmail.com>> wrote:


Some of you may be aware that Databricks community Home | Databricks
have just launched a knowledge sharing hub. I thought it would be a
good idea for the Apache Spark user group to have the same, especially
for repeat questions on Spark core, Spark SQL, Spark Structured
Streaming, Spark Mlib and so forth.

Apache Spark user and dev groups have been around for a good while.
They are serving their purpose . We went through creating a slack
community that managed to create more more heat than light.. This is
what Databricks community came up with and I quote

"Knowledge Sharing Hub
Dive into a collaborative space where members like YOU can exchange
knowledge, tips, and best practices. Join the conversation today and
unlock a wealth of collective wisdom to enhance your experience and
drive success."

I don't know the logistics of setting it up.but I am sure that should
not be that difficult. If anyone is supportive of this proposal, let
the usual +1, 0, -1 decide

HTH

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


  view my Linkedin profile


https://en.everybodywiki.com/Mich_Talebzadeh<https://urldefense.com/v3/__https:/en.everybodywiki.com/Mich_Talebzadeh__;!!HrbR-XT-OQ!Wu9fFP8RFJW2N_YUvwl9yctGHxtM-CFPe6McqOJDrxGBjIaRoF8vRwpjT9WzHojwI2R09Nbg8YE9ggB4FtocU8cQFw$>



Disclaimer: The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner Von Braun)".

-
To unsubscribe e-mail: 
user-unsubscr...@spark.apache.org<mailto:user-unsubscr...@spark.apache.org>



--
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge

+47 480 94 297



Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Varun Shah
+1  Great initiative.

QQ: Stack Overflow has a similar feature called "Collectives", but I am not
sure of the cost of creating one for Apache Spark. Since SO is widely used
(at least it was before ChatGPT became the norm for searching questions), it
already has a lot of questions asked and answered by the community over a
period of time, so, if possible, we could leverage it as a starting point for
building a community before creating a completely new website from scratch.
Any thoughts on this?

Regards,
Varun Shah


On Mon, Mar 18, 2024, 16:29 Mich Talebzadeh 
wrote:

> Some of you may be aware that Databricks community Home | Databricks
> have just launched a knowledge sharing hub. I thought it would be a
> good idea for the Apache Spark user group to have the same, especially
> for repeat questions on Spark core, Spark SQL, Spark Structured
> Streaming, Spark Mlib and so forth.
>
> Apache Spark user and dev groups have been around for a good while.
> They are serving their purpose . We went through creating a slack
> community that managed to create more more heat than light.. This is
> what Databricks community came up with and I quote
>
> "Knowledge Sharing Hub
> Dive into a collaborative space where members like YOU can exchange
> knowledge, tips, and best practices. Join the conversation today and
> unlock a wealth of collective wisdom to enhance your experience and
> drive success."
>
> I don't know the logistics of setting it up.but I am sure that should
> not be that difficult. If anyone is supportive of this proposal, let
> the usual +1, 0, -1 decide
>
> HTH
>
> Mich Talebzadeh,
> Dad | Technologist | Solutions Architect | Engineer
> London
> United Kingdom
>
>
>view my Linkedin profile
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> Disclaimer: The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed . It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Werner Von Braun)".
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Deepak Sharma
+1.
I can contribute to it as well.

On Tue, 19 Mar 2024 at 9:19 AM, Code Tutelage 
wrote:

> +1
>
> Thanks for proposing
>
> On Mon, Mar 18, 2024 at 9:25 AM Parsian, Mahmoud
>  wrote:
>
>> Good idea. Will be useful
>>
>>
>>
>> +1
>>
>>
>>
>>
>>
>>
>>
>> *From: *ashok34...@yahoo.com.INVALID 
>> *Date: *Monday, March 18, 2024 at 6:36 AM
>> *To: *user @spark , Spark dev list <
>> d...@spark.apache.org>, Mich Talebzadeh 
>> *Cc: *Matei Zaharia 
>> *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for
>> Apache Spark Community
>>
>> External message, be mindful when clicking links or attachments
>>
>>
>>
>> Good idea. Will be useful
>>
>>
>>
>> +1
>>
>>
>>
>> On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>
>>
>>
>>
>> Some of you may be aware that Databricks community Home | Databricks
>>
>> have just launched a knowledge sharing hub. I thought it would be a
>>
>> good idea for the Apache Spark user group to have the same, especially
>>
>> for repeat questions on Spark core, Spark SQL, Spark Structured
>>
>> Streaming, Spark Mlib and so forth.
>>
>>
>>
>> Apache Spark user and dev groups have been around for a good while.
>>
>> They are serving their purpose . We went through creating a slack
>>
>> community that managed to create more more heat than light.. This is
>>
>> what Databricks community came up with and I quote
>>
>>
>>
>> "Knowledge Sharing Hub
>>
>> Dive into a collaborative space where members like YOU can exchange
>>
>> knowledge, tips, and best practices. Join the conversation today and
>>
>> unlock a wealth of collective wisdom to enhance your experience and
>>
>> drive success."
>>
>>
>>
>> I don't know the logistics of setting it up.but I am sure that should
>>
>> not be that difficult. If anyone is supportive of this proposal, let
>>
>> the usual +1, 0, -1 decide
>>
>>
>>
>> HTH
>>
>>
>>
>> Mich Talebzadeh,
>>
>> Dad | Technologist | Solutions Architect | Engineer
>>
>> London
>>
>> United Kingdom
>>
>>
>>
>>
>>
>>   view my Linkedin profile
>>
>>
>>
>>
>>
>> https://en.everybodywiki.com/Mich_Talebzadeh
>> <https://urldefense.com/v3/__https:/en.everybodywiki.com/Mich_Talebzadeh__;!!HrbR-XT-OQ!Wu9fFP8RFJW2N_YUvwl9yctGHxtM-CFPe6McqOJDrxGBjIaRoF8vRwpjT9WzHojwI2R09Nbg8YE9ggB4FtocU8cQFw$>
>>
>>
>>
>>
>>
>>
>>
>> Disclaimer: The information provided is correct to the best of my
>>
>> knowledge but of course cannot be guaranteed . It is essential to note
>>
>> that, as with any advice, quote "one test result is worth one-thousand
>>
>> expert opinions (Werner Von Braun)".
>>
>>
>>
>> -
>>
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>
>>
>


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Hyukjin Kwon
One very good example is the SparkR releases in the Conda channel (
https://github.com/conda-forge/r-sparkr-feedstock).
This is fully run by the community, unofficially.

On Tue, 19 Mar 2024 at 09:54, Mich Talebzadeh 
wrote:

> +1 for me
>
> Mich Talebzadeh,
> Dad | Technologist | Solutions Architect | Engineer
> London
> United Kingdom
>
>
>view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed . It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Werner  <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von
> Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>
>
> On Mon, 18 Mar 2024 at 16:23, Parsian, Mahmoud 
> wrote:
>
>> Good idea. Will be useful
>>
>>
>>
>> +1
>>
>>
>>
>>
>>
>>
>>
>> *From: *ashok34...@yahoo.com.INVALID 
>> *Date: *Monday, March 18, 2024 at 6:36 AM
>> *To: *user @spark , Spark dev list <
>> d...@spark.apache.org>, Mich Talebzadeh 
>> *Cc: *Matei Zaharia 
>> *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for
>> Apache Spark Community
>>
>> External message, be mindful when clicking links or attachments
>>
>>
>>
>> Good idea. Will be useful
>>
>>
>>
>> +1
>>
>>
>>
>> On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>
>>
>>
>>
>> Some of you may be aware that Databricks community Home | Databricks
>>
>> have just launched a knowledge sharing hub. I thought it would be a
>>
>> good idea for the Apache Spark user group to have the same, especially
>>
>> for repeat questions on Spark core, Spark SQL, Spark Structured
>>
>> Streaming, Spark Mlib and so forth.
>>
>>
>>
>> Apache Spark user and dev groups have been around for a good while.
>>
>> They are serving their purpose . We went through creating a slack
>>
>> community that managed to create more more heat than light.. This is
>>
>> what Databricks community came up with and I quote
>>
>>
>>
>> "Knowledge Sharing Hub
>>
>> Dive into a collaborative space where members like YOU can exchange
>>
>> knowledge, tips, and best practices. Join the conversation today and
>>
>> unlock a wealth of collective wisdom to enhance your experience and
>>
>> drive success."
>>
>>
>>
>> I don't know the logistics of setting it up.but I am sure that should
>>
>> not be that difficult. If anyone is supportive of this proposal, let
>>
>> the usual +1, 0, -1 decide
>>
>>
>>
>> HTH
>>
>>
>>
>> Mich Talebzadeh,
>>
>> Dad | Technologist | Solutions Architect | Engineer
>>
>> London
>>
>> United Kingdom
>>
>>
>>
>>
>>
>>   view my Linkedin profile
>>
>>
>>
>>
>>
>> https://en.everybodywiki.com/Mich_Talebzadeh
>> <https://urldefense.com/v3/__https:/en.everybodywiki.com/Mich_Talebzadeh__;!!HrbR-XT-OQ!Wu9fFP8RFJW2N_YUvwl9yctGHxtM-CFPe6McqOJDrxGBjIaRoF8vRwpjT9WzHojwI2R09Nbg8YE9ggB4FtocU8cQFw$>
>>
>>
>>
>>
>>
>>
>>
>> Disclaimer: The information provided is correct to the best of my
>>
>> knowledge but of course cannot be guaranteed . It is essential to note
>>
>> that, as with any advice, quote "one test result is worth one-thousand
>>
>> expert opinions (Werner Von Braun)".
>>
>>
>>
>> -
>>
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>
>>
>


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Mich Talebzadeh
OK thanks for the update.

What does "officially blessed" signify here? Can we have and run it as a
sister site? The reason this comes to my mind is that the interested
parties should have easy access to this site (from ISUG Spark sites) as a
reference repository. I guess the advice would be that the information
(topics) is provided on a best-effort basis and cannot be guaranteed.

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner  <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von
Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".


On Mon, 18 Mar 2024 at 21:04, Reynold Xin  wrote:

> One of the problem in the past when something like this was brought up was
> that the ASF couldn't have officially blessed venues beyond the already
> approved ones. So that's something to look into.
>
> Now of course you are welcome to run unofficial things unblessed as long
> as they follow trademark rules.
>
>
>
> On Mon, Mar 18, 2024 at 1:53 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> Well as long as it works.
>>
>> Please all check this link from Databricks and let us know your thoughts.
>> Will something similar work for us?. Of course Databricks have much deeper
>> pockets than our ASF community. Will it require moderation in our side to
>> block spams and nutcases.
>>
>> Knowledge Sharing Hub - Databricks
>> <https://community.databricks.com/t5/knowledge-sharing-hub/bd-p/Knowledge-Sharing-Hub>
>>
>>
>> Mich Talebzadeh,
>> Dad | Technologist | Solutions Architect | Engineer
>> London
>> United Kingdom
>>
>>
>>view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>
>>
>>
>> *Disclaimer:* The information provided is correct to the best of my
>> knowledge but of course cannot be guaranteed . It is essential to note
>> that, as with any advice, quote "one test result is worth one-thousand
>> expert opinions (Werner
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>
>>
>> On Mon, 18 Mar 2024 at 20:31, Bjørn Jørgensen 
>> wrote:
>>
>>> something like this  Spark community · GitHub
>>> <https://github.com/Spark-community>
>>>
>>>
>>> man. 18. mars 2024 kl. 17:26 skrev Parsian, Mahmoud <
>>> mpars...@illumina.com.invalid>:
>>>
>>>> Good idea. Will be useful
>>>>
>>>>
>>>>
>>>> +1
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From: *ashok34...@yahoo.com.INVALID 
>>>> *Date: *Monday, March 18, 2024 at 6:36 AM
>>>> *To: *user @spark , Spark dev list <
>>>> d...@spark.apache.org>, Mich Talebzadeh 
>>>> *Cc: *Matei Zaharia 
>>>> *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for
>>>> Apache Spark Community
>>>>
>>>> External message, be mindful when clicking links or attachments
>>>>
>>>>
>>>>
>>>> Good idea. Will be useful
>>>>
>>>>
>>>>
>>>> +1
>>>>
>>>>
>>>>
>>>> On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh <
>>>> mich.talebza...@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Some of you may be aware that Databricks community Home | Databricks
>>>>
>>>> have just launched a knowledge sharing hub. I thought it would be a
>>>>
>>>> good idea for the Apache Spark user group to have the same, especially
>>>>
>>>> for repeat questions on Spark core, Spark SQL, Spark Structured
>>>>
>>>> Streaming, Spark Mlib and so forth.
>>>>
>>>>
>>>>
>>>> Apache Spark user and dev groups have been around for a good while.
>>>>
>>>> They are serving their purpose . We went

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Reynold Xin
One of the problems in the past when something like this was brought up was that 
the ASF couldn't have officially blessed venues beyond the already approved 
ones. So that's something to look into.

Now of course you are welcome to run unofficial things unblessed as long as 
they follow trademark rules.

On Mon, Mar 18, 2024 at 1:53 PM, Mich Talebzadeh < mich.talebza...@gmail.com > 
wrote:

> 
> Well as long as it works.
> 
> Please all check this link from Databricks and let us know your thoughts.
> Will something similar work for us?. Of course Databricks have much deeper
> pockets than our ASF community. Will it require moderation in our side to
> block spams and nutcases.
> 
> 
> 
> Knowledge Sharing Hub - Databricks (
> https://community.databricks.com/t5/knowledge-sharing-hub/bd-p/Knowledge-Sharing-Hub
> )
> 
> 
> 
> Mich Talebzadeh,
> Dad | Technologist | Solutions Architect | Engineer
> 
> London
> 
> United Kingdom
> 
> 
> 
> 
> 
> 
> 
> ** view my Linkedin profile (
> https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/ )
> 
> 
> 
> 
> 
> 
> 
> 
> https:/ / en. everybodywiki. com/ Mich_Talebzadeh (
> https://en.everybodywiki.com/Mich_Talebzadeh )
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed . It is essential to note
> that, as with any advice, quote "one test result is worth one - thousand
> expert opinions ( Werner ( https://en.wikipedia.org/wiki/Wernher_von_Braun
> ) Von Braun ( https://en.wikipedia.org/wiki/Wernher_von_Braun ) )".
> 
> 
> 
> 
> 
> On Mon, 18 Mar 2024 at 20:31, Bjørn Jørgensen < bjornjorgensen@ gmail. com
> ( bjornjorgen...@gmail.com ) > wrote:
> 
> 
>> something like this Spark community · GitHub (
>> https://github.com/Spark-community )
>> 
>> 
>> 
>> man. 18. mars 2024 kl. 17:26 skrev Parsian, Mahmoud < mparsian@ illumina. 
>> com.
>> invalid ( mpars...@illumina.com.invalid ) >:
>> 
>> 
>>> 
>>> 
>>> Good idea. Will be useful
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> +1
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> *From:* ashok34668@ yahoo. com. INVALID ( ashok34...@yahoo.com.INVALID ) <
>>> ashok34668@ yahoo. com. INVALID ( ashok34...@yahoo.com.INVALID ) >
>>> *Date:* Monday, March 18 , 2024 at 6:36 AM
>>> *To:* user @spark < user@ spark. apache. org ( user@spark.apache.org ) >,
>>> Spark dev list < dev@ spark. apache. org ( d...@spark.apache.org ) >, Mich
>>> Talebzadeh < mich. talebzadeh@ gmail. com ( mich.talebza...@gmail.com ) >
>>> *Cc:* Matei Zaharia < matei. zaharia@ gmail. com ( matei.zaha...@gmail.com
>>> ) >
>>> *Subject:* Re: A proposal for creating a Knowledge Sharing Hub for Apache
>>> Spark Community
>>> 
>>> 
>>> 
>>> 
>>> External message, be mindful when clicking links or attachments
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Good idea. Will be useful
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> +1
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh < mich. 
>>> talebzadeh@
>>> gmail. com ( mich.talebza...@gmail.com ) > wrote:
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Some of you may be aware that Databricks community Home | Databricks
>>> 
>>> 
>>> 
>>> 
>>> have just launched a knowledge sharing hub. I thought it would be a
>>> 
>>> 
>>> 
>>> 
>>> good idea for the Apache Spark user group to have the same, especially
>>> 
>>> 
>>> 
>>> 
>>> for repeat questions on Spark core, Spark SQL, Spark Structured
>>> 
>>> 
>>> 
>>> 
>>> Streaming, Spark Mlib and so forth.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>>

Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Mich Talebzadeh
Well as long as it works.

Please all check this link from Databricks and let us know your thoughts.
Will something similar work for us? Of course, Databricks has much deeper
pockets than our ASF community. Will it require moderation on our side to
block spam and nutcases?

Knowledge Sharing Hub - Databricks
<https://community.databricks.com/t5/knowledge-sharing-hub/bd-p/Knowledge-Sharing-Hub>


Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner  <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von
Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".


On Mon, 18 Mar 2024 at 20:31, Bjørn Jørgensen 
wrote:

> something like this  Spark community · GitHub
> <https://github.com/Spark-community>
>
>
> On Mon, 18 Mar 2024 at 17:26, Parsian, Mahmoud
> wrote:
>
>> Good idea. Will be useful
>>
>>
>>
>> +1
>>
>>
>>
>>
>>
>>
>>
>> *From: *ashok34...@yahoo.com.INVALID 
>> *Date: *Monday, March 18, 2024 at 6:36 AM
>> *To: *user @spark , Spark dev list <
>> d...@spark.apache.org>, Mich Talebzadeh 
>> *Cc: *Matei Zaharia 
>> *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for
>> Apache Spark Community
>>
>> External message, be mindful when clicking links or attachments
>>
>>
>>
>> Good idea. Will be useful
>>
>>
>>
>> +1
>>
>>
>>
>> On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>
>>
>>
>>
>> Some of you may be aware that Databricks community Home | Databricks
>>
>> have just launched a knowledge sharing hub. I thought it would be a
>>
>> good idea for the Apache Spark user group to have the same, especially
>>
>> for repeat questions on Spark core, Spark SQL, Spark Structured
>>
>> Streaming, Spark Mlib and so forth.
>>
>>
>>
>> Apache Spark user and dev groups have been around for a good while.
>>
>> They are serving their purpose . We went through creating a slack
>>
>> community that managed to create more more heat than light.. This is
>>
>> what Databricks community came up with and I quote
>>
>>
>>
>> "Knowledge Sharing Hub
>>
>> Dive into a collaborative space where members like YOU can exchange
>>
>> knowledge, tips, and best practices. Join the conversation today and
>>
>> unlock a wealth of collective wisdom to enhance your experience and
>>
>> drive success."
>>
>>
>>
>> I don't know the logistics of setting it up.but I am sure that should
>>
>> not be that difficult. If anyone is supportive of this proposal, let
>>
>> the usual +1, 0, -1 decide
>>
>>
>>
>> HTH
>>
>>
>>
>> Mich Talebzadeh,
>>
>> Dad | Technologist | Solutions Architect | Engineer
>>
>> London
>>
>> United Kingdom
>>
>>
>>
>>
>>
>>   view my Linkedin profile
>>
>>
>>
>>
>>
>> https://en.everybodywiki.com/Mich_Talebzadeh
>> <https://urldefense.com/v3/__https:/en.everybodywiki.com/Mich_Talebzadeh__;!!HrbR-XT-OQ!Wu9fFP8RFJW2N_YUvwl9yctGHxtM-CFPe6McqOJDrxGBjIaRoF8vRwpjT9WzHojwI2R09Nbg8YE9ggB4FtocU8cQFw$>
>>
>>
>>
>>
>>
>>
>>
>> Disclaimer: The information provided is correct to the best of my
>>
>> knowledge but of course cannot be guaranteed . It is essential to note
>>
>> that, as with any advice, quote "one test result is worth one-thousand
>>
>> expert opinions (Werner Von Braun)".
>>
>>
>>
>> -
>>
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>
>>
>
>
> --
> Bjørn Jørgensen
> Vestre Aspehaug 4, 6010 Ålesund
> Norge
>
> +47 480 94 297
>


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Bjørn Jørgensen
something like this  Spark community · GitHub
<https://github.com/Spark-community>


On Mon, 18 Mar 2024 at 17:26, Parsian, Mahmoud
wrote:

> Good idea. Will be useful
>
>
>
> +1
>
>
>
>
>
>
>
> *From: *ashok34...@yahoo.com.INVALID 
> *Date: *Monday, March 18, 2024 at 6:36 AM
> *To: *user @spark , Spark dev list <
> d...@spark.apache.org>, Mich Talebzadeh 
> *Cc: *Matei Zaharia 
> *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for Apache
> Spark Community
>
> External message, be mindful when clicking links or attachments
>
>
>
> Good idea. Will be useful
>
>
>
> +1
>
>
>
> On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>
>
>
>
> Some of you may be aware that Databricks community Home | Databricks
>
> have just launched a knowledge sharing hub. I thought it would be a
>
> good idea for the Apache Spark user group to have the same, especially
>
> for repeat questions on Spark core, Spark SQL, Spark Structured
>
> Streaming, Spark Mlib and so forth.
>
>
>
> Apache Spark user and dev groups have been around for a good while.
>
> They are serving their purpose . We went through creating a slack
>
> community that managed to create more more heat than light.. This is
>
> what Databricks community came up with and I quote
>
>
>
> "Knowledge Sharing Hub
>
> Dive into a collaborative space where members like YOU can exchange
>
> knowledge, tips, and best practices. Join the conversation today and
>
> unlock a wealth of collective wisdom to enhance your experience and
>
> drive success."
>
>
>
> I don't know the logistics of setting it up.but I am sure that should
>
> not be that difficult. If anyone is supportive of this proposal, let
>
> the usual +1, 0, -1 decide
>
>
>
> HTH
>
>
>
> Mich Talebzadeh,
>
> Dad | Technologist | Solutions Architect | Engineer
>
> London
>
> United Kingdom
>
>
>
>
>
>   view my Linkedin profile
>
>
>
>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
> <https://urldefense.com/v3/__https:/en.everybodywiki.com/Mich_Talebzadeh__;!!HrbR-XT-OQ!Wu9fFP8RFJW2N_YUvwl9yctGHxtM-CFPe6McqOJDrxGBjIaRoF8vRwpjT9WzHojwI2R09Nbg8YE9ggB4FtocU8cQFw$>
>
>
>
>
>
>
>
> Disclaimer: The information provided is correct to the best of my
>
> knowledge but of course cannot be guaranteed . It is essential to note
>
> that, as with any advice, quote "one test result is worth one-thousand
>
> expert opinions (Werner Von Braun)".
>
>
>
> -
>
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>
>


-- 
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge

+47 480 94 297


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Code Tutelage
+1

Thanks for proposing

On Mon, Mar 18, 2024 at 9:25 AM Parsian, Mahmoud
 wrote:

> Good idea. Will be useful
>
>
>
> +1
>
>
>
>
>
>
>
> *From: *ashok34...@yahoo.com.INVALID 
> *Date: *Monday, March 18, 2024 at 6:36 AM
> *To: *user @spark , Spark dev list <
> d...@spark.apache.org>, Mich Talebzadeh 
> *Cc: *Matei Zaharia 
> *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for Apache
> Spark Community
>
> External message, be mindful when clicking links or attachments
>
>
>
> Good idea. Will be useful
>
>
>
> +1
>
>
>
> On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>
>
>
>
> Some of you may be aware that Databricks community Home | Databricks
>
> have just launched a knowledge sharing hub. I thought it would be a
>
> good idea for the Apache Spark user group to have the same, especially
>
> for repeat questions on Spark core, Spark SQL, Spark Structured
>
> Streaming, Spark Mlib and so forth.
>
>
>
> Apache Spark user and dev groups have been around for a good while.
>
> They are serving their purpose . We went through creating a slack
>
> community that managed to create more more heat than light.. This is
>
> what Databricks community came up with and I quote
>
>
>
> "Knowledge Sharing Hub
>
> Dive into a collaborative space where members like YOU can exchange
>
> knowledge, tips, and best practices. Join the conversation today and
>
> unlock a wealth of collective wisdom to enhance your experience and
>
> drive success."
>
>
>
> I don't know the logistics of setting it up.but I am sure that should
>
> not be that difficult. If anyone is supportive of this proposal, let
>
> the usual +1, 0, -1 decide
>
>
>
> HTH
>
>
>
> Mich Talebzadeh,
>
> Dad | Technologist | Solutions Architect | Engineer
>
> London
>
> United Kingdom
>
>
>
>
>
>   view my Linkedin profile
>
>
>
>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
> <https://urldefense.com/v3/__https:/en.everybodywiki.com/Mich_Talebzadeh__;!!HrbR-XT-OQ!Wu9fFP8RFJW2N_YUvwl9yctGHxtM-CFPe6McqOJDrxGBjIaRoF8vRwpjT9WzHojwI2R09Nbg8YE9ggB4FtocU8cQFw$>
>
>
>
>
>
>
>
> Disclaimer: The information provided is correct to the best of my
>
> knowledge but of course cannot be guaranteed . It is essential to note
>
> that, as with any advice, quote "one test result is worth one-thousand
>
> expert opinions (Werner Von Braun)".
>
>
>
> -
>
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>
>


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Mich Talebzadeh
+1 for me

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner  <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von
Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".


On Mon, 18 Mar 2024 at 16:23, Parsian, Mahmoud 
wrote:

> Good idea. Will be useful
>
>
>
> +1
>
>
>
>
>
>
>
> *From: *ashok34...@yahoo.com.INVALID 
> *Date: *Monday, March 18, 2024 at 6:36 AM
> *To: *user @spark , Spark dev list <
> d...@spark.apache.org>, Mich Talebzadeh 
> *Cc: *Matei Zaharia 
> *Subject: *Re: A proposal for creating a Knowledge Sharing Hub for Apache
> Spark Community
>
> External message, be mindful when clicking links or attachments
>
>
>
> Good idea. Will be useful
>
>
>
> +1
>
>
>
> On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>
>
>
>
> Some of you may be aware that Databricks community Home | Databricks
>
> have just launched a knowledge sharing hub. I thought it would be a
>
> good idea for the Apache Spark user group to have the same, especially
>
> for repeat questions on Spark core, Spark SQL, Spark Structured
>
> Streaming, Spark Mlib and so forth.
>
>
>
> Apache Spark user and dev groups have been around for a good while.
>
> They are serving their purpose . We went through creating a slack
>
> community that managed to create more more heat than light.. This is
>
> what Databricks community came up with and I quote
>
>
>
> "Knowledge Sharing Hub
>
> Dive into a collaborative space where members like YOU can exchange
>
> knowledge, tips, and best practices. Join the conversation today and
>
> unlock a wealth of collective wisdom to enhance your experience and
>
> drive success."
>
>
>
> I don't know the logistics of setting it up.but I am sure that should
>
> not be that difficult. If anyone is supportive of this proposal, let
>
> the usual +1, 0, -1 decide
>
>
>
> HTH
>
>
>
> Mich Talebzadeh,
>
> Dad | Technologist | Solutions Architect | Engineer
>
> London
>
> United Kingdom
>
>
>
>
>
>   view my Linkedin profile
>
>
>
>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
> <https://urldefense.com/v3/__https:/en.everybodywiki.com/Mich_Talebzadeh__;!!HrbR-XT-OQ!Wu9fFP8RFJW2N_YUvwl9yctGHxtM-CFPe6McqOJDrxGBjIaRoF8vRwpjT9WzHojwI2R09Nbg8YE9ggB4FtocU8cQFw$>
>
>
>
>
>
>
>
> Disclaimer: The information provided is correct to the best of my
>
> knowledge but of course cannot be guaranteed . It is essential to note
>
> that, as with any advice, quote "one test result is worth one-thousand
>
> expert opinions (Werner Von Braun)".
>
>
>
> -
>
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>
>


Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Parsian, Mahmoud
Good idea. Will be useful

+1



From: ashok34...@yahoo.com.INVALID 
Date: Monday, March 18, 2024 at 6:36 AM
To: user @spark , Spark dev list 
, Mich Talebzadeh 
Cc: Matei Zaharia 
Subject: Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark 
Community
External message, be mindful when clicking links or attachments

Good idea. Will be useful

+1

On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh 
 wrote:


Some of you may be aware that Databricks community Home | Databricks
have just launched a knowledge sharing hub. I thought it would be a
good idea for the Apache Spark user group to have the same, especially
for repeat questions on Spark core, Spark SQL, Spark Structured
Streaming, Spark Mlib and so forth.

Apache Spark user and dev groups have been around for a good while.
They are serving their purpose . We went through creating a slack
community that managed to create more more heat than light.. This is
what Databricks community came up with and I quote

"Knowledge Sharing Hub
Dive into a collaborative space where members like YOU can exchange
knowledge, tips, and best practices. Join the conversation today and
unlock a wealth of collective wisdom to enhance your experience and
drive success."

I don't know the logistics of setting it up.but I am sure that should
not be that difficult. If anyone is supportive of this proposal, let
the usual +1, 0, -1 decide

HTH

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


  view my Linkedin profile


https://en.everybodywiki.com/Mich_Talebzadeh<https://urldefense.com/v3/__https:/en.everybodywiki.com/Mich_Talebzadeh__;!!HrbR-XT-OQ!Wu9fFP8RFJW2N_YUvwl9yctGHxtM-CFPe6McqOJDrxGBjIaRoF8vRwpjT9WzHojwI2R09Nbg8YE9ggB4FtocU8cQFw$>



Disclaimer: The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner Von Braun)".

-
To unsubscribe e-mail: 
user-unsubscr...@spark.apache.org<mailto:user-unsubscr...@spark.apache.org>



Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread ashok34...@yahoo.com.INVALID
 Good idea. Will be useful
+1
On Monday, 18 March 2024 at 11:00:40 GMT, Mich Talebzadeh 
 wrote:  
 
 Some of you may be aware that Databricks community Home | Databricks
have just launched a knowledge sharing hub. I thought it would be a
good idea for the Apache Spark user group to have the same, especially
for repeat questions on Spark core, Spark SQL, Spark Structured
Streaming, Spark Mlib and so forth.

Apache Spark user and dev groups have been around for a good while.
They are serving their purpose . We went through creating a slack
community that managed to create more more heat than light.. This is
what Databricks community came up with and I quote

"Knowledge Sharing Hub
Dive into a collaborative space where members like YOU can exchange
knowledge, tips, and best practices. Join the conversation today and
unlock a wealth of collective wisdom to enhance your experience and
drive success."

I don't know the logistics of setting it up.but I am sure that should
not be that difficult. If anyone is supportive of this proposal, let
the usual +1, 0, -1 decide

HTH

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


  view my Linkedin profile


 https://en.everybodywiki.com/Mich_Talebzadeh



Disclaimer: The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner Von Braun)".

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org

  

A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-18 Thread Mich Talebzadeh
Some of you may be aware that Databricks community Home | Databricks
have just launched a knowledge sharing hub. I thought it would be a
good idea for the Apache Spark user group to have the same, especially
for repeat questions on Spark core, Spark SQL, Spark Structured
Streaming, Spark MLlib and so forth.

Apache Spark user and dev groups have been around for a good while.
They are serving their purpose. We went through creating a Slack
community that managed to create more heat than light. This is
what Databricks community came up with and I quote

"Knowledge Sharing Hub
Dive into a collaborative space where members like YOU can exchange
knowledge, tips, and best practices. Join the conversation today and
unlock a wealth of collective wisdom to enhance your experience and
drive success."

I don't know the logistics of setting it up, but I am sure that should
not be that difficult. If anyone is supportive of this proposal, let
the usual +1, 0, -1 decide.

HTH

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


   view my Linkedin profile


 https://en.everybodywiki.com/Mich_Talebzadeh



Disclaimer: The information provided is correct to the best of my
knowledge but of course cannot be guaranteed . It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner Von Braun)".

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
Okay, let me double-check it carefully.

Thank you very much for your help!



From: Jungtaek Lim 
Sent: March 5, 2024 21:56:41
To: Pan,Bingkun
Cc: Dongjoon Hyun; dev; user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released

Yeah the approach seems OK to me - please double check that the doc generation 
in Spark repo won't fail after the move of the js file. Other than that, it 
would be probably just a matter of updating the release process.

On Tue, Mar 5, 2024 at 7:24 PM Pan,Bingkun 
mailto:panbing...@baidu.com>> wrote:

Okay, I see.

Perhaps we can solve this confusion by sharing the same file `version.json` 
across `all versions` in the `Spark website repo`? Make each version of the 
document display the `same` data in the dropdown menu.


From: Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>>
Sent: March 5, 2024 17:09:07
To: Pan,Bingkun
Cc: Dongjoon Hyun; dev; user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released

Let me be more specific.

We have two active release version lines, 3.4.x and 3.5.x. We just released 
Spark 3.5.1, having a dropdown as 3.5.1 and 3.4.2 given the fact the last 
version of 3.4.x is 3.4.2. After a month we released Spark 3.4.3. In the 
dropdown of Spark 3.4.3, there will be 3.5.1 and 3.4.3. But if we call this as 
done, 3.5.1 (still latest) won't show 3.4.3 in the dropdown, giving confusion 
that 3.4.3 wasn't ever released.

This is just about two active release version lines with keeping only the 
latest version of version lines. If you expand this to EOLed version lines and 
versions which aren't the latest in their version line, the problem gets much 
more complicated.

On Tue, Mar 5, 2024 at 6:01 PM Pan,Bingkun 
mailto:panbing...@baidu.com>> wrote:

Based on my understanding, we should not update versions that have already been 
released,

such as the situation you mentioned: `But what about dropout of version D? 
Should we add E in the dropdown?` We only need to record the latest `version. 
json` file that has already been published at the time of each new document 
release.

Of course, if we need to keep the latest in every document, I think it's also 
possible.

Only by sharing the same version. json file in each version.


From: Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>>
Sent: March 5, 2024 16:47:30
To: Pan,Bingkun
Cc: Dongjoon Hyun; dev; user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released

But this does not answer my question about updating the dropdown for the doc of 
"already released versions", right?

Let's say we just released version D, and the dropdown has version A, B, C. We 
have another release tomorrow as version E, and it's probably easy to add A, B, 
C, D in the dropdown of E. But what about dropdown of version D? Should we add 
E in the dropdown? How do we maintain it if we will have 10 releases afterwards?

On Tue, Mar 5, 2024 at 5:27 PM Pan,Bingkun 
mailto:panbing...@baidu.com>> wrote:

According to my understanding, the original intention of this feature is that 
when a user has entered the pyspark document, if he finds that the version he 
is currently in is not the version he wants, he can easily jump to the version 
he wants by clicking on the drop-down box. Additionally, in this PR, the 
current automatic mechanism for PRs did not merge in.

https://github.com/apache/spark/pull/42881<https://mailshield.baidu.com/check?q=NXF5O0EN4F6TOoAzxFGzXSJvMnQlPeztKpu%2fBYaKpd2sRl6qEYTx2NGUrTYUrhOk>

So, we need to manually update this file. I can manually submit an update first 
to get this feature working.


From: Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>>
Sent: March 4, 2024 6:34:42
To: Dongjoon Hyun
Cc: dev; user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released

Shall we revisit this functionality? The API doc is built with individual 
versions, and for each individual version we depend on other released versions. 
This does not seem to be right to me. Also, the functionality is only in 
PySpark API doc which does not seem to be consistent as well.

I don't think this is manageable with the current approach (listing versions in 
version-dependent doc). Let's say we release 3.4.3 after 3.5.1. Should we 
update the versions in 3.5.1 to add 3.4.3 in version switcher? How about the 
time we are going to release the new version after releasing 10 versions? 
What's the criteria of pruning the version?

Unless we have a good answer to these questions, I think it's better to revert 
the functionality - it missed various considerations.

On Fri, Mar 1, 2024 at 2:44 PM Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>> wrote:
Thanks for reporting - this is odd - the dropdown did not exist in other recent 
releases.

https://spark.apache.org/docs/3.5.0/api/python/index.html<https://mailshield.baidu.com/check?q=uXELebgeq9S

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
Yeah the approach seems OK to me - please double check that the doc
generation in Spark repo won't fail after the move of the js file. Other
than that, it would probably just be a matter of updating the release
process.

On Tue, Mar 5, 2024 at 7:24 PM Pan,Bingkun  wrote:

> Okay, I see.
>
> Perhaps we can solve this confusion by sharing the same file `version.json`
> across `all versions` in the `Spark website repo`? Make each version of
> the document display the `same` data in the dropdown menu.
> --
> *From:* Jungtaek Lim 
> *Sent:* March 5, 2024 17:09:07
> *To:* Pan,Bingkun
> *Cc:* Dongjoon Hyun; dev; user
> *Subject:* Re: [ANNOUNCE] Apache Spark 3.5.1 released
>
> Let me be more specific.
>
> We have two active release version lines, 3.4.x and 3.5.x. We just
> released Spark 3.5.1, having a dropdown as 3.5.1 and 3.4.2 given the fact
> the last version of 3.4.x is 3.4.2. After a month we released Spark 3.4.3.
> In the dropdown of Spark 3.4.3, there will be 3.5.1 and 3.4.3. But if we
> call this as done, 3.5.1 (still latest) won't show 3.4.3 in the dropdown,
> giving confusion that 3.4.3 wasn't ever released.
>
> This is just about two active release version lines with keeping only the
> latest version of version lines. If you expand this to EOLed version lines
> and versions which aren't the latest in their version line, the problem
> gets much more complicated.
>
> On Tue, Mar 5, 2024 at 6:01 PM Pan,Bingkun  wrote:
>
>> Based on my understanding, we should not update versions that have
>> already been released,
>>
>> such as the situation you mentioned: `But what about dropout of version
>> D? Should we add E in the dropdown?` We only need to record the latest
>> `version. json` file that has already been published at the time of each
>> new document release.
>>
>> Of course, if we need to keep the latest in every document, I think it's
>> also possible.
>>
>> Only by sharing the same version. json file in each version.
>> --
>> *From:* Jungtaek Lim 
>> *Sent:* March 5, 2024 16:47:30
>> *To:* Pan,Bingkun
>> *Cc:* Dongjoon Hyun; dev; user
>> *Subject:* Re: [ANNOUNCE] Apache Spark 3.5.1 released
>>
>> But this does not answer my question about updating the dropdown for the
>> doc of "already released versions", right?
>>
>> Let's say we just released version D, and the dropdown has version A, B,
>> C. We have another release tomorrow as version E, and it's probably easy to
>> add A, B, C, D in the dropdown of E. But what about dropdown of version D?
>> Should we add E in the dropdown? How do we maintain it if we will have 10
>> releases afterwards?
>>
>> On Tue, Mar 5, 2024 at 5:27 PM Pan,Bingkun  wrote:
>>
>>> According to my understanding, the original intention of this feature is
>>> that when a user has entered the pyspark document, if he finds that the
>>> version he is currently in is not the version he wants, he can easily jump
>>> to the version he wants by clicking on the drop-down box. Additionally, in
>>> this PR, the current automatic mechanism for PRs did not merge in.
>>>
>>> https://github.com/apache/spark/pull/42881
>>> <https://mailshield.baidu.com/check?q=NXF5O0EN4F6TOoAzxFGzXSJvMnQlPeztKpu%2fBYaKpd2sRl6qEYTx2NGUrTYUrhOk>
>>>
>>> So, we need to manually update this file. I can manually submit an
>>> update first to get this feature working.
>>> --
>>> *From:* Jungtaek Lim 
>>> *Sent:* March 4, 2024 6:34:42
>>> *To:* Dongjoon Hyun
>>> *Cc:* dev; user
>>> *Subject:* Re: [ANNOUNCE] Apache Spark 3.5.1 released
>>>
>>> Shall we revisit this functionality? The API doc is built with
>>> individual versions, and for each individual version we depend on other
>>> released versions. This does not seem to be right to me. Also, the
>>> functionality is only in PySpark API doc which does not seem to be
>>> consistent as well.
>>>
>>> I don't think this is manageable with the current approach (listing
>>> versions in version-dependent doc). Let's say we release 3.4.3 after 3.5.1.
>>> Should we update the versions in 3.5.1 to add 3.4.3 in version switcher?
>>> How about the time we are going to release the new version after releasing
>>> 10 versions? What's the criteria of pruning the version?
>>>
>>> Unless we have a good answer to these questions, I think it's better to
>>> revert the functionality - it missed various c

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
Okay, I see.

Perhaps we can resolve this confusion by sharing the same `version.json` file 
across all versions in the Spark website repo, so that every version of the 
documentation displays the same data in the dropdown menu.
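
As an illustration of the idea above, here is a minimal sketch of what such a 
shared switcher file could look like and how it might be generated. The file 
name, field names, and output path are assumptions for illustration only; the 
actual schema used by the PySpark docs may differ.

# Hypothetical sketch: generate one switcher file shared by every docs
# version. The schema (name/version/url) and the output path are assumptions,
# not the actual layout of the spark-website repository.
import json

RELEASED_VERSIONS = ["3.3.4", "3.4.2", "3.5.1"]  # example data only
LATEST = "3.5.1"

entries = [
    {
        "name": f"{v} (latest)" if v == LATEST else v,
        "version": v,
        "url": f"https://spark.apache.org/docs/{v}/api/python/",
    }
    for v in RELEASED_VERSIONS
]

# Because every docs version points at this one file, they all render the
# same dropdown regardless of when they were built.
with open("versions.json", "w") as f:
    json.dump(entries, f, indent=2)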


发件人: Jungtaek Lim 
发送时间: 2024年3月5日 17:09:07
收件人: Pan,Bingkun
抄送: Dongjoon Hyun; dev; user
主题: Re: [ANNOUNCE] Apache Spark 3.5.1 released

Let me be more specific.

We have two active release version lines, 3.4.x and 3.5.x. We just released 
Spark 3.5.1, having a dropdown as 3.5.1 and 3.4.2 given the fact the last 
version of 3.4.x is 3.4.2. After a month we released Spark 3.4.3. In the 
dropdown of Spark 3.4.3, there will be 3.5.1 and 3.4.3. But if we call this as 
done, 3.5.1 (still latest) won't show 3.4.3 in the dropdown, giving confusion 
that 3.4.3 wasn't ever released.

This is just about two active release version lines with keeping only the 
latest version of version lines. If you expand this to EOLed version lines and 
versions which aren't the latest in their version line, the problem gets much 
more complicated.

On Tue, Mar 5, 2024 at 6:01 PM Pan,Bingkun 
mailto:panbing...@baidu.com>> wrote:

Based on my understanding, we should not update versions that have already been 
released,

such as the situation you mentioned: `But what about dropout of version D? 
Should we add E in the dropdown?` We only need to record the latest `version. 
json` file that has already been published at the time of each new document 
release.

Of course, if we need to keep the latest in every document, I think it's also 
possible.

Only by sharing the same version. json file in each version.


发件人: Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>>
发送时间: 2024年3月5日 16:47:30
收件人: Pan,Bingkun
抄送: Dongjoon Hyun; dev; user
主题: Re: [ANNOUNCE] Apache Spark 3.5.1 released

But this does not answer my question about updating the dropdown for the doc of 
"already released versions", right?

Let's say we just released version D, and the dropdown has version A, B, C. We 
have another release tomorrow as version E, and it's probably easy to add A, B, 
C, D in the dropdown of E. But what about dropdown of version D? Should we add 
E in the dropdown? How do we maintain it if we will have 10 releases afterwards?

On Tue, Mar 5, 2024 at 5:27 PM Pan,Bingkun 
mailto:panbing...@baidu.com>> wrote:

According to my understanding, the original intention of this feature is that 
when a user has entered the pyspark document, if he finds that the version he 
is currently in is not the version he wants, he can easily jump to the version 
he wants by clicking on the drop-down box. Additionally, in this PR, the 
current automatic mechanism for PRs did not merge in.

https://github.com/apache/spark/pull/42881<https://mailshield.baidu.com/check?q=NXF5O0EN4F6TOoAzxFGzXSJvMnQlPeztKpu%2fBYaKpd2sRl6qEYTx2NGUrTYUrhOk>

So, we need to manually update this file. I can manually submit an update first 
to get this feature working.


发件人: Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>>
发送时间: 2024年3月4日 6:34:42
收件人: Dongjoon Hyun
抄送: dev; user
主题: Re: [ANNOUNCE] Apache Spark 3.5.1 released

Shall we revisit this functionality? The API doc is built with individual 
versions, and for each individual version we depend on other released versions. 
This does not seem to be right to me. Also, the functionality is only in 
PySpark API doc which does not seem to be consistent as well.

I don't think this is manageable with the current approach (listing versions in 
version-dependent doc). Let's say we release 3.4.3 after 3.5.1. Should we 
update the versions in 3.5.1 to add 3.4.3 in version switcher? How about the 
time we are going to release the new version after releasing 10 versions? 
What's the criteria of pruning the version?

Unless we have a good answer to these questions, I think it's better to revert 
the functionality - it missed various considerations.

On Fri, Mar 1, 2024 at 2:44 PM Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>> wrote:
Thanks for reporting - this is odd - the dropdown did not exist in other recent 
releases.

https://spark.apache.org/docs/3.5.0/api/python/index.html<https://mailshield.baidu.com/check?q=uXELebgeq9ShKrQ3HDYtw08xGdWbbrT3FEzFk%2fzTZ%2bVxzlJrJa41y1xJkZ7RbZcLmQNMGzBVvVX6KlpxrtsKRQ%3d%3d>
https://spark.apache.org/docs/3.4.2/api/python/index.html<https://mailshield.baidu.com/check?q=vFHg6IjqXnlPilWEcpu6a0oCJLXpFeNnsL6hZ%2fpZY0nGPd6tnUFbimhVD6zVpMlL8RAgxzN8GQM6cNBFe8hXvA%3d%3d>
https://spark.apache.org/docs/3.3.4/api/python/index.html<https://mailshield.baidu.com/check?q=cfoH89Pu%2fNbZC4s7657SqqfHpT7hoppw7e6%2fZzsz8S7ZoEMm2LijOxwcGgKS5O29HzYUyQoooMRdy%2fd5Y36e2Q%3d%3d>

Looks like the dropdown feature was recently introduced but partially done. The 
addition of a dropdown was done, but th

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
Let me be more specific.

We have two active release version lines, 3.4.x and 3.5.x. We just released
Spark 3.5.1, having a dropdown as 3.5.1 and 3.4.2 given the fact the last
version of 3.4.x is 3.4.2. After a month we released Spark 3.4.3. In the
dropdown of Spark 3.4.3, there will be 3.5.1 and 3.4.3. But if we call this
done, 3.5.1 (still the latest) won't show 3.4.3 in the dropdown, giving the
impression that 3.4.3 was never released.

This is just about two active release version lines with keeping only the
latest version of version lines. If you expand this to EOLed version lines
and versions which aren't the latest in their version line, the problem
gets much more complicated.

On Tue, Mar 5, 2024 at 6:01 PM Pan,Bingkun  wrote:

> Based on my understanding, we should not update versions that have already
> been released,
>
> such as the situation you mentioned: `But what about dropout of version D?
> Should we add E in the dropdown?` We only need to record the latest
> `version. json` file that has already been published at the time of each
> new document release.
>
> Of course, if we need to keep the latest in every document, I think it's
> also possible.
>
> Only by sharing the same version. json file in each version.
> --
> *From:* Jungtaek Lim 
> *Sent:* March 5, 2024 16:47:30
> *To:* Pan,Bingkun
> *Cc:* Dongjoon Hyun; dev; user
> *Subject:* Re: [ANNOUNCE] Apache Spark 3.5.1 released
>
> But this does not answer my question about updating the dropdown for the
> doc of "already released versions", right?
>
> Let's say we just released version D, and the dropdown has version A, B,
> C. We have another release tomorrow as version E, and it's probably easy to
> add A, B, C, D in the dropdown of E. But what about dropdown of version D?
> Should we add E in the dropdown? How do we maintain it if we will have 10
> releases afterwards?
>
> On Tue, Mar 5, 2024 at 5:27 PM Pan,Bingkun  wrote:
>
>> According to my understanding, the original intention of this feature is
>> that when a user has entered the pyspark document, if he finds that the
>> version he is currently in is not the version he wants, he can easily jump
>> to the version he wants by clicking on the drop-down box. Additionally, in
>> this PR, the current automatic mechanism for PRs did not merge in.
>>
>> https://github.com/apache/spark/pull/42881
>> <https://mailshield.baidu.com/check?q=NXF5O0EN4F6TOoAzxFGzXSJvMnQlPeztKpu%2fBYaKpd2sRl6qEYTx2NGUrTYUrhOk>
>>
>> So, we need to manually update this file. I can manually submit an update
>> first to get this feature working.
>> --
>> *From:* Jungtaek Lim 
>> *Sent:* March 4, 2024 6:34:42
>> *To:* Dongjoon Hyun
>> *Cc:* dev; user
>> *Subject:* Re: [ANNOUNCE] Apache Spark 3.5.1 released
>>
>> Shall we revisit this functionality? The API doc is built with individual
>> versions, and for each individual version we depend on other released
>> versions. This does not seem to be right to me. Also, the functionality is
>> only in PySpark API doc which does not seem to be consistent as well.
>>
>> I don't think this is manageable with the current approach (listing
>> versions in version-dependent doc). Let's say we release 3.4.3 after 3.5.1.
>> Should we update the versions in 3.5.1 to add 3.4.3 in version switcher?
>> How about the time we are going to release the new version after releasing
>> 10 versions? What's the criteria of pruning the version?
>>
>> Unless we have a good answer to these questions, I think it's better to
>> revert the functionality - it missed various considerations.
>>
>> On Fri, Mar 1, 2024 at 2:44 PM Jungtaek Lim 
>> wrote:
>>
>>> Thanks for reporting - this is odd - the dropdown did not exist in other
>>> recent releases.
>>>
>>> https://spark.apache.org/docs/3.5.0/api/python/index.html
>>> <https://mailshield.baidu.com/check?q=uXELebgeq9ShKrQ3HDYtw08xGdWbbrT3FEzFk%2fzTZ%2bVxzlJrJa41y1xJkZ7RbZcLmQNMGzBVvVX6KlpxrtsKRQ%3d%3d>
>>> https://spark.apache.org/docs/3.4.2/api/python/index.html
>>> <https://mailshield.baidu.com/check?q=vFHg6IjqXnlPilWEcpu6a0oCJLXpFeNnsL6hZ%2fpZY0nGPd6tnUFbimhVD6zVpMlL8RAgxzN8GQM6cNBFe8hXvA%3d%3d>
>>> https://spark.apache.org/docs/3.3.4/api/python/index.html
>>> <https://mailshield.baidu.com/check?q=cfoH89Pu%2fNbZC4s7657SqqfHpT7hoppw7e6%2fZzsz8S7ZoEMm2LijOxwcGgKS5O29HzYUyQoooMRdy%2fd5Y36e2Q%3d%3d>
>>>
>>> Looks like the dropdown feature was recently introduced but partially
>>> done. The addition of a dropdown was done, but the way how

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
Based on my understanding, we should not update versions that have already been 
released, such as the situation you mentioned: `But what about the dropdown of 
version D? Should we add E in the dropdown?` We only need to record the latest 
`version.json` file that has already been published at the time of each new 
document release.

Of course, if we need to keep every document up to date, I think that is also 
possible: only by sharing the same `version.json` file across each version.
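
A rough sketch of the "record the latest at each release" idea follows: a small 
step in the release process that appends the new release to the shared file 
instead of touching docs that are already published. The file path, schema, and 
function name are hypothetical, for illustration only.

# Hypothetical release-time step: add the new release to a shared versions
# file rather than editing already-published docs. Path and field names are
# assumptions for illustration.
import json

def add_release(path: str, new_version: str) -> None:
    with open(path) as f:
        entries = json.load(f)
    # Drop any stale "(latest)" label from earlier entries.
    for e in entries:
        e["name"] = e["version"]
    entries.append({
        "name": f"{new_version} (latest)",
        "version": new_version,
        "url": f"https://spark.apache.org/docs/{new_version}/api/python/",
    })
    entries.sort(key=lambda e: [int(x) for x in e["version"].split(".")])
    with open(path, "w") as f:
        json.dump(entries, f, indent=2)

# Example usage: add_release("versions.json", "3.4.3")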


From: Jungtaek Lim 
Sent: March 5, 2024 16:47:30
To: Pan,Bingkun
Cc: Dongjoon Hyun; dev; user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released

But this does not answer my question about updating the dropdown for the doc of 
"already released versions", right?

Let's say we just released version D, and the dropdown has version A, B, C. We 
have another release tomorrow as version E, and it's probably easy to add A, B, 
C, D in the dropdown of E. But what about dropdown of version D? Should we add 
E in the dropdown? How do we maintain it if we will have 10 releases afterwards?

On Tue, Mar 5, 2024 at 5:27 PM Pan,Bingkun 
mailto:panbing...@baidu.com>> wrote:

According to my understanding, the original intention of this feature is that 
when a user has entered the pyspark document, if he finds that the version he 
is currently in is not the version he wants, he can easily jump to the version 
he wants by clicking on the drop-down box. Additionally, in this PR, the 
current automatic mechanism for PRs did not merge in.

https://github.com/apache/spark/pull/42881<https://mailshield.baidu.com/check?q=NXF5O0EN4F6TOoAzxFGzXSJvMnQlPeztKpu%2fBYaKpd2sRl6qEYTx2NGUrTYUrhOk>

So, we need to manually update this file. I can manually submit an update first 
to get this feature working.


From: Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>>
Sent: March 4, 2024 6:34:42
To: Dongjoon Hyun
Cc: dev; user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released

Shall we revisit this functionality? The API doc is built with individual 
versions, and for each individual version we depend on other released versions. 
This does not seem to be right to me. Also, the functionality is only in 
PySpark API doc which does not seem to be consistent as well.

I don't think this is manageable with the current approach (listing versions in 
version-dependent doc). Let's say we release 3.4.3 after 3.5.1. Should we 
update the versions in 3.5.1 to add 3.4.3 in version switcher? How about the 
time we are going to release the new version after releasing 10 versions? 
What's the criteria of pruning the version?

Unless we have a good answer to these questions, I think it's better to revert 
the functionality - it missed various considerations.

On Fri, Mar 1, 2024 at 2:44 PM Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>> wrote:
Thanks for reporting - this is odd - the dropdown did not exist in other recent 
releases.

https://spark.apache.org/docs/3.5.0/api/python/index.html<https://mailshield.baidu.com/check?q=uXELebgeq9ShKrQ3HDYtw08xGdWbbrT3FEzFk%2fzTZ%2bVxzlJrJa41y1xJkZ7RbZcLmQNMGzBVvVX6KlpxrtsKRQ%3d%3d>
https://spark.apache.org/docs/3.4.2/api/python/index.html<https://mailshield.baidu.com/check?q=vFHg6IjqXnlPilWEcpu6a0oCJLXpFeNnsL6hZ%2fpZY0nGPd6tnUFbimhVD6zVpMlL8RAgxzN8GQM6cNBFe8hXvA%3d%3d>
https://spark.apache.org/docs/3.3.4/api/python/index.html<https://mailshield.baidu.com/check?q=cfoH89Pu%2fNbZC4s7657SqqfHpT7hoppw7e6%2fZzsz8S7ZoEMm2LijOxwcGgKS5O29HzYUyQoooMRdy%2fd5Y36e2Q%3d%3d>

Looks like the dropdown feature was recently introduced but partially done. The 
addition of a dropdown was done, but the way how to bump the version was missed 
to be documented.
The contributor proposed the way to update the version "automatically", but the 
PR wasn't merged. As a result, we are neither having the instruction how to 
bump the version manually, nor having the automatic bump.

* PR for addition of dropdown: 
https://github.com/apache/spark/pull/42428<https://mailshield.baidu.com/check?q=pSDq2Cdb4aBtjOEg7J1%2fXPtYeSxjVkQfXKV%2fmfX1Y7NeT77hnIS%2bsvMbbXwT3DLm>
* PR for automatically bumping version: 
https://github.com/apache/spark/pull/42881<https://mailshield.baidu.com/check?q=NXF5O0EN4F6TOoAzxFGzXSJvMnQlPeztKpu%2fBYaKpd2sRl6qEYTx2NGUrTYUrhOk>

We will probably need to add an instruction in the release process to update 
the version. (For automatic bumping I don't have a good idea.)
I'll look into it. Please expect some delay during the holiday weekend in S. 
Korea.

Thanks again.
Jungtaek Lim (HeartSaVioR)


On Fri, Mar 1, 2024 at 2:14 PM Dongjoon Hyun 
mailto:dongjoon.h...@gmail.com>> wrote:
BTW, Jungtaek.

PySpark document seems to show a wrong branch. At this time, `master`.


https://spark.apache.org/docs/3.5.1/api/python/index.html<https://mailshield.baidu.com/c

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
But this does not answer my question about updating the dropdown for the
doc of "already released versions", right?

Let's say we just released version D, and the dropdown has version A, B, C.
We have another release tomorrow as version E, and it's probably easy to
add A, B, C, D in the dropdown of E. But what about dropdown of version D?
Should we add E in the dropdown? How do we maintain it if we will have 10
releases afterwards?

On Tue, Mar 5, 2024 at 5:27 PM Pan,Bingkun  wrote:

> According to my understanding, the original intention of this feature is
> that when a user has entered the pyspark document, if he finds that the
> version he is currently in is not the version he wants, he can easily jump
> to the version he wants by clicking on the drop-down box. Additionally, in
> this PR, the current automatic mechanism for PRs did not merge in.
>
> https://github.com/apache/spark/pull/42881
>
> So, we need to manually update this file. I can manually submit an update
> first to get this feature working.
> --
> *From:* Jungtaek Lim 
> *Sent:* March 4, 2024 6:34:42
> *To:* Dongjoon Hyun
> *Cc:* dev; user
> *Subject:* Re: [ANNOUNCE] Apache Spark 3.5.1 released
>
> Shall we revisit this functionality? The API doc is built with individual
> versions, and for each individual version we depend on other released
> versions. This does not seem to be right to me. Also, the functionality is
> only in PySpark API doc which does not seem to be consistent as well.
>
> I don't think this is manageable with the current approach (listing
> versions in version-dependent doc). Let's say we release 3.4.3 after 3.5.1.
> Should we update the versions in 3.5.1 to add 3.4.3 in version switcher?
> How about the time we are going to release the new version after releasing
> 10 versions? What's the criteria of pruning the version?
>
> Unless we have a good answer to these questions, I think it's better to
> revert the functionality - it missed various considerations.
>
> On Fri, Mar 1, 2024 at 2:44 PM Jungtaek Lim 
> wrote:
>
>> Thanks for reporting - this is odd - the dropdown did not exist in other
>> recent releases.
>>
>> https://spark.apache.org/docs/3.5.0/api/python/index.html
>> <https://mailshield.baidu.com/check?q=uXELebgeq9ShKrQ3HDYtw08xGdWbbrT3FEzFk%2fzTZ%2bVxzlJrJa41y1xJkZ7RbZcLmQNMGzBVvVX6KlpxrtsKRQ%3d%3d>
>> https://spark.apache.org/docs/3.4.2/api/python/index.html
>> <https://mailshield.baidu.com/check?q=vFHg6IjqXnlPilWEcpu6a0oCJLXpFeNnsL6hZ%2fpZY0nGPd6tnUFbimhVD6zVpMlL8RAgxzN8GQM6cNBFe8hXvA%3d%3d>
>> https://spark.apache.org/docs/3.3.4/api/python/index.html
>> <https://mailshield.baidu.com/check?q=cfoH89Pu%2fNbZC4s7657SqqfHpT7hoppw7e6%2fZzsz8S7ZoEMm2LijOxwcGgKS5O29HzYUyQoooMRdy%2fd5Y36e2Q%3d%3d>
>>
>> Looks like the dropdown feature was recently introduced but partially
>> done. The addition of a dropdown was done, but the way how to bump the
>> version was missed to be documented.
>> The contributor proposed the way to update the version "automatically",
>> but the PR wasn't merged. As a result, we are neither having the
>> instruction how to bump the version manually, nor having the automatic bump.
>>
>> * PR for addition of dropdown: https://github.com/apache/spark/pull/42428
>> <https://mailshield.baidu.com/check?q=pSDq2Cdb4aBtjOEg7J1%2fXPtYeSxjVkQfXKV%2fmfX1Y7NeT77hnIS%2bsvMbbXwT3DLm>
>> * PR for automatically bumping version:
>> https://github.com/apache/spark/pull/42881
>> <https://mailshield.baidu.com/check?q=NXF5O0EN4F6TOoAzxFGzXSJvMnQlPeztKpu%2fBYaKpd2sRl6qEYTx2NGUrTYUrhOk>
>>
>> We will probably need to add an instruction in the release process to
>> update the version. (For automatic bumping I don't have a good idea.)
>> I'll look into it. Please expect some delay during the holiday weekend
>> in S. Korea.
>>
>> Thanks again.
>> Jungtaek Lim (HeartSaVioR)
>>
>>
>> On Fri, Mar 1, 2024 at 2:14 PM Dongjoon Hyun 
>> wrote:
>>
>>> BTW, Jungtaek.
>>>
>>> PySpark document seems to show a wrong branch. At this time, `master`.
>>>
>>> https://spark.apache.org/docs/3.5.1/api/python/index.html
>>> <https://mailshield.baidu.com/check?q=KwooIjNwx9R5XjkTxvpqs6ApF2YX2ZujKl%2bha1PX%2bf3X4CQowIWtvSFmFPVO1297fFYMkgFMgmFuEBDkuDwpig%3d%3d>
>>>
>>> PySpark Overview
>>> <https://mailshield.baidu.com/check?q=rahGq5g%2bcbjBOU3xXCbESExdvGhXXTpk%2f%2f3BUMatX7zAgGbgcBy3mkuJmlmgtZZIoahnY2Cj2t4uylAFmefkTY1%2bQbN0rqSWYUU6qjrQRqY%3d>
>>>
>>>Date: Feb 24, 2024 Version: master
>>>

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
According to my understanding, the original intention of this feature is that 
when a user lands on the PySpark documentation and finds that the version they 
are currently viewing is not the one they want, they can easily jump to the 
desired version by clicking on the drop-down box. Additionally, the PR that 
would have automated this mechanism was not merged:

https://github.com/apache/spark/pull/42881

So, we need to manually update this file. I can manually submit an update first 
to get this feature working.


From: Jungtaek Lim 
Sent: March 4, 2024 6:34:42
To: Dongjoon Hyun
Cc: dev; user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released

Shall we revisit this functionality? The API doc is built with individual 
versions, and for each individual version we depend on other released versions. 
This does not seem to be right to me. Also, the functionality is only in 
PySpark API doc which does not seem to be consistent as well.

I don't think this is manageable with the current approach (listing versions in 
version-dependent doc). Let's say we release 3.4.3 after 3.5.1. Should we 
update the versions in 3.5.1 to add 3.4.3 in version switcher? How about the 
time we are going to release the new version after releasing 10 versions? 
What's the criteria of pruning the version?

Unless we have a good answer to these questions, I think it's better to revert 
the functionality - it missed various considerations.

On Fri, Mar 1, 2024 at 2:44 PM Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>> wrote:
Thanks for reporting - this is odd - the dropdown did not exist in other recent 
releases.

https://spark.apache.org/docs/3.5.0/api/python/index.html<https://mailshield.baidu.com/check?q=uXELebgeq9ShKrQ3HDYtw08xGdWbbrT3FEzFk%2fzTZ%2bVxzlJrJa41y1xJkZ7RbZcLmQNMGzBVvVX6KlpxrtsKRQ%3d%3d>
https://spark.apache.org/docs/3.4.2/api/python/index.html<https://mailshield.baidu.com/check?q=vFHg6IjqXnlPilWEcpu6a0oCJLXpFeNnsL6hZ%2fpZY0nGPd6tnUFbimhVD6zVpMlL8RAgxzN8GQM6cNBFe8hXvA%3d%3d>
https://spark.apache.org/docs/3.3.4/api/python/index.html<https://mailshield.baidu.com/check?q=cfoH89Pu%2fNbZC4s7657SqqfHpT7hoppw7e6%2fZzsz8S7ZoEMm2LijOxwcGgKS5O29HzYUyQoooMRdy%2fd5Y36e2Q%3d%3d>

Looks like the dropdown feature was recently introduced but partially done. The 
addition of a dropdown was done, but the way how to bump the version was missed 
to be documented.
The contributor proposed the way to update the version "automatically", but the 
PR wasn't merged. As a result, we are neither having the instruction how to 
bump the version manually, nor having the automatic bump.

* PR for addition of dropdown: 
https://github.com/apache/spark/pull/42428<https://mailshield.baidu.com/check?q=pSDq2Cdb4aBtjOEg7J1%2fXPtYeSxjVkQfXKV%2fmfX1Y7NeT77hnIS%2bsvMbbXwT3DLm>
* PR for automatically bumping version: 
https://github.com/apache/spark/pull/42881<https://mailshield.baidu.com/check?q=NXF5O0EN4F6TOoAzxFGzXSJvMnQlPeztKpu%2fBYaKpd2sRl6qEYTx2NGUrTYUrhOk>

We will probably need to add an instruction in the release process to update 
the version. (For automatic bumping I don't have a good idea.)
I'll look into it. Please expect some delay during the holiday weekend in S. 
Korea.

Thanks again.
Jungtaek Lim (HeartSaVioR)


On Fri, Mar 1, 2024 at 2:14 PM Dongjoon Hyun 
mailto:dongjoon.h...@gmail.com>> wrote:
BTW, Jungtaek.

PySpark document seems to show a wrong branch. At this time, `master`.


https://spark.apache.org/docs/3.5.1/api/python/index.html<https://mailshield.baidu.com/check?q=KwooIjNwx9R5XjkTxvpqs6ApF2YX2ZujKl%2bha1PX%2bf3X4CQowIWtvSFmFPVO1297fFYMkgFMgmFuEBDkuDwpig%3d%3d>

PySpark 
Overview<https://mailshield.baidu.com/check?q=rahGq5g%2bcbjBOU3xXCbESExdvGhXXTpk%2f%2f3BUMatX7zAgGbgcBy3mkuJmlmgtZZIoahnY2Cj2t4uylAFmefkTY1%2bQbN0rqSWYUU6qjrQRqY%3d>

   Date: Feb 24, 2024 Version: master

[Screenshot 2024-02-29 at 21.12.24.png]


Could you do the follow-up, please?

Thank you in advance.

Dongjoon.


On Thu, Feb 29, 2024 at 2:48 PM John Zhuge 
mailto:jzh...@apache.org>> wrote:
Excellent work, congratulations!

On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun 
mailto:dongjoon.h...@gmail.com>> wrote:
Congratulations!

Bests,
Dongjoon.

On Wed, Feb 28, 2024 at 11:43 AM beliefer 
mailto:belie...@163.com>> wrote:

Congratulations!



At 2024-02-28 17:43:25, "Jungtaek Lim" 
mailto:kabhwan.opensou...@gmail.com>> wrote:

Hi everyone,

We are happy to announce the availability of Spark 3.5.1!

Spark 3.5.1 is a maintenance release containing stability fixes. This
release is based on the branch-3.5 maintenance branch of Spark. We strongly
recommend all 3.5 users to upgrade to this stable release.

To download Spark 3.5.1, head over to the download page:
https://spark.apache.org/downloads.html<https://mailshield.baidu.com/check?q=aV5QpxMQ4pApHhycByY17SDpg%2fyWowLsFKuT2QIJ%2blgKNmM

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-04 Thread yangjie01
That sounds like a great suggestion.

From: Jungtaek Lim 
Date: Tuesday, March 5, 2024 10:46
To: Hyukjin Kwon 
Cc: yangjie01 , Dongjoon Hyun , 
dev , user 
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released

Yes, it's relevant to that PR. I wonder, if we want to expose version switcher, 
it should be in versionless doc (spark-website) rather than the doc being 
pinned to a specific version.

On Tue, Mar 5, 2024 at 11:18 AM Hyukjin Kwon 
mailto:gurwls...@apache.org>> wrote:
Is this related to 
https://github.com/apache/spark/pull/42428<https://mailshield.baidu.com/check?q=pSDq2Cdb4aBtjOEg7J1%2fXPtYeSxjVkQfXKV%2fmfX1Y7NeT77hnIS%2bsvMbbXwT3DLm>?

cc @Yang,Jie(INF)<mailto:yangji...@baidu.com>

On Mon, 4 Mar 2024 at 22:21, Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>> wrote:
Shall we revisit this functionality? The API doc is built with individual 
versions, and for each individual version we depend on other released versions. 
This does not seem to be right to me. Also, the functionality is only in 
PySpark API doc which does not seem to be consistent as well.

I don't think this is manageable with the current approach (listing versions in 
version-dependent doc). Let's say we release 3.4.3 after 3.5.1. Should we 
update the versions in 3.5.1 to add 3.4.3 in version switcher? How about the 
time we are going to release the new version after releasing 10 versions? 
What's the criteria of pruning the version?

Unless we have a good answer to these questions, I think it's better to revert 
the functionality - it missed various considerations.

On Fri, Mar 1, 2024 at 2:44 PM Jungtaek Lim 
mailto:kabhwan.opensou...@gmail.com>> wrote:
Thanks for reporting - this is odd - the dropdown did not exist in other recent 
releases.

https://spark.apache.org/docs/3.5.0/api/python/index.html<https://mailshield.baidu.com/check?q=uXELebgeq9ShKrQ3HDYtw08xGdWbbrT3FEzFk%2fzTZ%2bVxzlJrJa41y1xJkZ7RbZcLmQNMGzBVvVX6KlpxrtsKRQ%3d%3d>
https://spark.apache.org/docs/3.4.2/api/python/index.html<https://mailshield.baidu.com/check?q=vFHg6IjqXnlPilWEcpu6a0oCJLXpFeNnsL6hZ%2fpZY0nGPd6tnUFbimhVD6zVpMlL8RAgxzN8GQM6cNBFe8hXvA%3d%3d>
https://spark.apache.org/docs/3.3.4/api/python/index.html<https://mailshield.baidu.com/check?q=cfoH89Pu%2fNbZC4s7657SqqfHpT7hoppw7e6%2fZzsz8S7ZoEMm2LijOxwcGgKS5O29HzYUyQoooMRdy%2fd5Y36e2Q%3d%3d>

Looks like the dropdown feature was recently introduced but partially done. The 
addition of a dropdown was done, but the way how to bump the version was missed 
to be documented.
The contributor proposed the way to update the version "automatically", but the 
PR wasn't merged. As a result, we are neither having the instruction how to 
bump the version manually, nor having the automatic bump.

* PR for addition of dropdown: 
https://github.com/apache/spark/pull/42428<https://mailshield.baidu.com/check?q=pSDq2Cdb4aBtjOEg7J1%2fXPtYeSxjVkQfXKV%2fmfX1Y7NeT77hnIS%2bsvMbbXwT3DLm>
* PR for automatically bumping version: 
https://github.com/apache/spark/pull/42881<https://mailshield.baidu.com/check?q=NXF5O0EN4F6TOoAzxFGzXSJvMnQlPeztKpu%2fBYaKpd2sRl6qEYTx2NGUrTYUrhOk>

We will probably need to add an instruction in the release process to update 
the version. (For automatic bumping I don't have a good idea.)
I'll look into it. Please expect some delay during the holiday weekend in S. 
Korea.

Thanks again.
Jungtaek Lim (HeartSaVioR)


On Fri, Mar 1, 2024 at 2:14 PM Dongjoon Hyun 
mailto:dongjoon.h...@gmail.com>> wrote:
BTW, Jungtaek.

PySpark document seems to show a wrong branch. At this time, `master`.


https://spark.apache.org/docs/3.5.1/api/python/index.html<https://mailshield.baidu.com/check?q=KwooIjNwx9R5XjkTxvpqs6ApF2YX2ZujKl%2bha1PX%2bf3X4CQowIWtvSFmFPVO1297fFYMkgFMgmFuEBDkuDwpig%3d%3d>

PySpark Overview

   Date: Feb 24, 2024 Version: master
[cid:image001.png@01DA6F13.CD4B0B00]



Could you do the follow-up, please?

Thank you in advance.

Dongjoon.


On Thu, Feb 29, 2024 at 2:48 PM John Zhuge 
mailto:jzh...@apache.org>> wrote:
Excellent work, congratulations!

On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun 
mailto:dongjoon.h...@gmail.com>> wrote:
Congratulations!

Bests,
Dongjoon.

On Wed, Feb 28, 2024 at 11:43 AM beliefer 
mailto:belie...@163.com>> wrote:

Congratulations!





At 2024-02-28 17:43:25, "Jungtaek Lim" 
mailto:kabhwan.opensou...@gmail.com>> wrote:
Hi everyone,

We are happy to announce the availability of Spark 3.5.1!

Spark 3.5.1 is a maintenance release containing stability fixes. This
release is based on the branch-3.5 maintenance branch of Spark. We strongly
recommend all 3.5 users to upgrade to this stable release.

To download Spark 3.5.1, head over to the download page:
https://spark.apache.org/downloads.html<https://mailshield.baidu.com/check?q=aV5QpxMQ4pApHhycByY17SDpg%2fyWowLsFKuT2QIJ%2blg

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-04 Thread Jungtaek Lim
Yes, it's relevant to that PR. I wonder whether, if we want to expose a version
switcher, it should live in the versionless doc (spark-website) rather than in
docs pinned to a specific version.
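
For context, a minimal sketch of what "versionless" could mean in practice is
shown below, assuming the dropdown is the pydata-sphinx-theme version switcher
and that spark-website serves a single switcher file at a stable URL; the URL
and navbar layout here are assumptions, not the project's actual configuration.

# Sketch of a Sphinx conf.py fragment (assumptions: the docs use the
# pydata-sphinx-theme version switcher, and spark-website hosts one
# switcher file at a stable, versionless URL).
release = "3.5.1"  # the version this particular doc build is pinned to

html_theme_options = {
    "switcher": {
        # One URL outside the versioned docs, so already-published builds
        # pick up new releases without being rebuilt.
        "json_url": "https://spark.apache.org/static/versions.json",  # assumed path
        "version_match": release,
    },
    "navbar_end": ["version-switcher", "navbar-icon-links"],
}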

On Tue, Mar 5, 2024 at 11:18 AM Hyukjin Kwon  wrote:

> Is this related to https://github.com/apache/spark/pull/42428?
>
> cc @Yang,Jie(INF) 
>
> On Mon, 4 Mar 2024 at 22:21, Jungtaek Lim 
> wrote:
>
>> Shall we revisit this functionality? The API doc is built with individual
>> versions, and for each individual version we depend on other released
>> versions. This does not seem to be right to me. Also, the functionality is
>> only in PySpark API doc which does not seem to be consistent as well.
>>
>> I don't think this is manageable with the current approach (listing
>> versions in version-dependent doc). Let's say we release 3.4.3 after 3.5.1.
>> Should we update the versions in 3.5.1 to add 3.4.3 in version switcher?
>> How about the time we are going to release the new version after releasing
>> 10 versions? What's the criteria of pruning the version?
>>
>> Unless we have a good answer to these questions, I think it's better to
>> revert the functionality - it missed various considerations.
>>
>> On Fri, Mar 1, 2024 at 2:44 PM Jungtaek Lim 
>> wrote:
>>
>>> Thanks for reporting - this is odd - the dropdown did not exist in other
>>> recent releases.
>>>
>>> https://spark.apache.org/docs/3.5.0/api/python/index.html
>>> https://spark.apache.org/docs/3.4.2/api/python/index.html
>>> https://spark.apache.org/docs/3.3.4/api/python/index.html
>>>
>>> Looks like the dropdown feature was recently introduced but partially
>>> done. The addition of a dropdown was done, but the way how to bump the
>>> version was missed to be documented.
>>> The contributor proposed the way to update the version "automatically",
>>> but the PR wasn't merged. As a result, we are neither having the
>>> instruction how to bump the version manually, nor having the automatic bump.
>>>
>>> * PR for addition of dropdown:
>>> https://github.com/apache/spark/pull/42428
>>> * PR for automatically bumping version:
>>> https://github.com/apache/spark/pull/42881
>>>
>>> We will probably need to add an instruction in the release process to
>>> update the version. (For automatic bumping I don't have a good idea.)
>>> I'll look into it. Please expect some delay during the holiday weekend
>>> in S. Korea.
>>>
>>> Thanks again.
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>>
>>> On Fri, Mar 1, 2024 at 2:14 PM Dongjoon Hyun 
>>> wrote:
>>>
>>>> BTW, Jungtaek.
>>>>
>>>> PySpark document seems to show a wrong branch. At this time, `master`.
>>>>
>>>> https://spark.apache.org/docs/3.5.1/api/python/index.html
>>>>
>>>> PySpark Overview
>>>> <https://spark.apache.org/docs/3.5.1/api/python/index.html#pyspark-overview>
>>>>
>>>>Date: Feb 24, 2024 Version: master
>>>>
>>>> [image: Screenshot 2024-02-29 at 21.12.24.png]
>>>>
>>>>
>>>> Could you do the follow-up, please?
>>>>
>>>> Thank you in advance.
>>>>
>>>> Dongjoon.
>>>>
>>>>
>>>> On Thu, Feb 29, 2024 at 2:48 PM John Zhuge  wrote:
>>>>
>>>>> Excellent work, congratulations!
>>>>>
>>>>> On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun <
>>>>> dongjoon.h...@gmail.com> wrote:
>>>>>
>>>>>> Congratulations!
>>>>>>
>>>>>> Bests,
>>>>>> Dongjoon.
>>>>>>
>>>>>> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>>>>>>
>>>>>>> Congratulations!
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> We are happy to announce the availability of Spark 3.5.1!
>>>>>>>
>>>>>>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>>>>>>> release is based on the branch-3.5 maintenance branch of Spark. We
>>>>>>> strongly
>>>>>>> recommend all 3.5 users to upgrade to this stable release.
>>>>>>>
>>>>>>> To download Spark 3.5.1, head over to the download page:
>>>>>>> https://spark.apache.org/downloads.html
>>>>>>>
>>>>>>> To view the release notes:
>>>>>>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>>>>>>
>>>>>>> We would like to acknowledge all community members for contributing
>>>>>>> to this
>>>>>>> release. This release would not have been possible without you.
>>>>>>>
>>>>>>> Jungtaek Lim
>>>>>>>
>>>>>>> ps. Yikun is helping us through releasing the official docker image
>>>>>>> for Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally
>>>>>>> available.
>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> John Zhuge
>>>>>
>>>>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-04 Thread Hyukjin Kwon
Is this related to https://github.com/apache/spark/pull/42428?

cc @Yang,Jie(INF) 

On Mon, 4 Mar 2024 at 22:21, Jungtaek Lim 
wrote:

> Shall we revisit this functionality? The API doc is built with individual
> versions, and for each individual version we depend on other released
> versions. This does not seem to be right to me. Also, the functionality is
> only in PySpark API doc which does not seem to be consistent as well.
>
> I don't think this is manageable with the current approach (listing
> versions in version-dependent doc). Let's say we release 3.4.3 after 3.5.1.
> Should we update the versions in 3.5.1 to add 3.4.3 in version switcher?
> How about the time we are going to release the new version after releasing
> 10 versions? What's the criteria of pruning the version?
>
> Unless we have a good answer to these questions, I think it's better to
> revert the functionality - it missed various considerations.
>
> On Fri, Mar 1, 2024 at 2:44 PM Jungtaek Lim 
> wrote:
>
>> Thanks for reporting - this is odd - the dropdown did not exist in other
>> recent releases.
>>
>> https://spark.apache.org/docs/3.5.0/api/python/index.html
>> https://spark.apache.org/docs/3.4.2/api/python/index.html
>> https://spark.apache.org/docs/3.3.4/api/python/index.html
>>
>> Looks like the dropdown feature was recently introduced but partially
>> done. The addition of a dropdown was done, but the way how to bump the
>> version was missed to be documented.
>> The contributor proposed the way to update the version "automatically",
>> but the PR wasn't merged. As a result, we are neither having the
>> instruction how to bump the version manually, nor having the automatic bump.
>>
>> * PR for addition of dropdown: https://github.com/apache/spark/pull/42428
>> * PR for automatically bumping version:
>> https://github.com/apache/spark/pull/42881
>>
>> We will probably need to add an instruction in the release process to
>> update the version. (For automatic bumping I don't have a good idea.)
>> I'll look into it. Please expect some delay during the holiday weekend
>> in S. Korea.
>>
>> Thanks again.
>> Jungtaek Lim (HeartSaVioR)
>>
>>
>> On Fri, Mar 1, 2024 at 2:14 PM Dongjoon Hyun 
>> wrote:
>>
>>> BTW, Jungtaek.
>>>
>>> PySpark document seems to show a wrong branch. At this time, `master`.
>>>
>>> https://spark.apache.org/docs/3.5.1/api/python/index.html
>>>
>>> PySpark Overview
>>> <https://spark.apache.org/docs/3.5.1/api/python/index.html#pyspark-overview>
>>>
>>>Date: Feb 24, 2024 Version: master
>>>
>>> [image: Screenshot 2024-02-29 at 21.12.24.png]
>>>
>>>
>>> Could you do the follow-up, please?
>>>
>>> Thank you in advance.
>>>
>>> Dongjoon.
>>>
>>>
>>> On Thu, Feb 29, 2024 at 2:48 PM John Zhuge  wrote:
>>>
>>>> Excellent work, congratulations!
>>>>
>>>> On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun 
>>>> wrote:
>>>>
>>>>> Congratulations!
>>>>>
>>>>> Bests,
>>>>> Dongjoon.
>>>>>
>>>>> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>>>>>
>>>>>> Congratulations!
>>>>>>
>>>>>>
>>>>>>
>>>>>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>>>>>> wrote:
>>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> We are happy to announce the availability of Spark 3.5.1!
>>>>>>
>>>>>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>>>>>> release is based on the branch-3.5 maintenance branch of Spark. We
>>>>>> strongly
>>>>>> recommend all 3.5 users to upgrade to this stable release.
>>>>>>
>>>>>> To download Spark 3.5.1, head over to the download page:
>>>>>> https://spark.apache.org/downloads.html
>>>>>>
>>>>>> To view the release notes:
>>>>>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>>>>>
>>>>>> We would like to acknowledge all community members for contributing
>>>>>> to this
>>>>>> release. This release would not have been possible without you.
>>>>>>
>>>>>> Jungtaek Lim
>>>>>>
>>>>>> ps. Yikun is helping us through releasing the official docker image
>>>>>> for Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally
>>>>>> available.
>>>>>>
>>>>>>
>>>>
>>>> --
>>>> John Zhuge
>>>>
>>>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-03 Thread Jungtaek Lim
Shall we revisit this functionality? The API doc is built with individual
versions, and for each individual version we depend on other released
versions. This does not seem to be right to me. Also, the functionality is
only in PySpark API doc which does not seem to be consistent as well.

I don't think this is manageable with the current approach (listing
versions in version-dependent doc). Let's say we release 3.4.3 after 3.5.1.
Should we update the versions in 3.5.1 to add 3.4.3 in version switcher?
How about the time we are going to release the new version after releasing
10 versions? What's the criteria of pruning the version?

Unless we have a good answer to these questions, I think it's better to
revert the functionality - it missed various considerations.

On Fri, Mar 1, 2024 at 2:44 PM Jungtaek Lim 
wrote:

> Thanks for reporting - this is odd - the dropdown did not exist in other
> recent releases.
>
> https://spark.apache.org/docs/3.5.0/api/python/index.html
> https://spark.apache.org/docs/3.4.2/api/python/index.html
> https://spark.apache.org/docs/3.3.4/api/python/index.html
>
> Looks like the dropdown feature was recently introduced but partially
> done. The addition of a dropdown was done, but the way how to bump the
> version was missed to be documented.
> The contributor proposed the way to update the version "automatically",
> but the PR wasn't merged. As a result, we are neither having the
> instruction how to bump the version manually, nor having the automatic bump.
>
> * PR for addition of dropdown: https://github.com/apache/spark/pull/42428
> * PR for automatically bumping version:
> https://github.com/apache/spark/pull/42881
>
> We will probably need to add an instruction in the release process to
> update the version. (For automatic bumping I don't have a good idea.)
> I'll look into it. Please expect some delay during the holiday weekend
> in S. Korea.
>
> Thanks again.
> Jungtaek Lim (HeartSaVioR)
>
>
> On Fri, Mar 1, 2024 at 2:14 PM Dongjoon Hyun 
> wrote:
>
>> BTW, Jungtaek.
>>
>> PySpark document seems to show a wrong branch. At this time, `master`.
>>
>> https://spark.apache.org/docs/3.5.1/api/python/index.html
>>
>> PySpark Overview
>> <https://spark.apache.org/docs/3.5.1/api/python/index.html#pyspark-overview>
>>
>>Date: Feb 24, 2024 Version: master
>>
>> [image: Screenshot 2024-02-29 at 21.12.24.png]
>>
>>
>> Could you do the follow-up, please?
>>
>> Thank you in advance.
>>
>> Dongjoon.
>>
>>
>> On Thu, Feb 29, 2024 at 2:48 PM John Zhuge  wrote:
>>
>>> Excellent work, congratulations!
>>>
>>> On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun 
>>> wrote:
>>>
>>>> Congratulations!
>>>>
>>>> Bests,
>>>> Dongjoon.
>>>>
>>>> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>>>>
>>>>> Congratulations!
>>>>>
>>>>>
>>>>>
>>>>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>>>>> wrote:
>>>>>
>>>>> Hi everyone,
>>>>>
>>>>> We are happy to announce the availability of Spark 3.5.1!
>>>>>
>>>>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>>>>> release is based on the branch-3.5 maintenance branch of Spark. We
>>>>> strongly
>>>>> recommend all 3.5 users to upgrade to this stable release.
>>>>>
>>>>> To download Spark 3.5.1, head over to the download page:
>>>>> https://spark.apache.org/downloads.html
>>>>>
>>>>> To view the release notes:
>>>>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>>>>
>>>>> We would like to acknowledge all community members for contributing to
>>>>> this
>>>>> release. This release would not have been possible without you.
>>>>>
>>>>> Jungtaek Lim
>>>>>
>>>>> ps. Yikun is helping us through releasing the official docker image
>>>>> for Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally
>>>>> available.
>>>>>
>>>>>
>>>
>>> --
>>> John Zhuge
>>>
>>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Peter Toth
Congratulations and thanks Jungtaek for driving this!

Xinrong Meng  ezt írta (időpont: 2024. márc. 1.,
P, 5:24):

> Congratulations!
>
> Thanks,
> Xinrong
>
> On Thu, Feb 29, 2024 at 11:16 AM Dongjoon Hyun 
> wrote:
>
>> Congratulations!
>>
>> Bests,
>> Dongjoon.
>>
>> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>>
>>> Congratulations!
>>>
>>>
>>>
>>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>>> wrote:
>>>
>>> Hi everyone,
>>>
>>> We are happy to announce the availability of Spark 3.5.1!
>>>
>>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>>> release is based on the branch-3.5 maintenance branch of Spark. We
>>> strongly
>>> recommend all 3.5 users to upgrade to this stable release.
>>>
>>> To download Spark 3.5.1, head over to the download page:
>>> https://spark.apache.org/downloads.html
>>>
>>> To view the release notes:
>>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>>
>>> We would like to acknowledge all community members for contributing to
>>> this
>>> release. This release would not have been possible without you.
>>>
>>> Jungtaek Lim
>>>
>>> ps. Yikun is helping us through releasing the official docker image for
>>> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
>>>
>>>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Jungtaek Lim
Thanks for reporting - this is odd - the dropdown did not exist in other
recent releases.

https://spark.apache.org/docs/3.5.0/api/python/index.html
https://spark.apache.org/docs/3.4.2/api/python/index.html
https://spark.apache.org/docs/3.3.4/api/python/index.html

Looks like the dropdown feature was recently introduced but partially done.
The addition of a dropdown was done, but the way how to bump the version
was missed to be documented.
The contributor proposed the way to update the version "automatically", but
the PR wasn't merged. As a result, we are neither having the instruction
how to bump the version manually, nor having the automatic bump.

* PR for addition of dropdown: https://github.com/apache/spark/pull/42428
* PR for automatically bumping version:
https://github.com/apache/spark/pull/42881

We will probably need to add an instruction in the release process to
update the version. (For automatic bumping I don't have a good idea.)
I'll look into it. Please expect some delay during the holiday weekend
in S. Korea.

Thanks again.
Jungtaek Lim (HeartSaVioR)


On Fri, Mar 1, 2024 at 2:14 PM Dongjoon Hyun 
wrote:

> BTW, Jungtaek.
>
> PySpark document seems to show a wrong branch. At this time, `master`.
>
> https://spark.apache.org/docs/3.5.1/api/python/index.html
>
> PySpark Overview
> <https://spark.apache.org/docs/3.5.1/api/python/index.html#pyspark-overview>
>
>Date: Feb 24, 2024 Version: master
>
> [image: Screenshot 2024-02-29 at 21.12.24.png]
>
>
> Could you do the follow-up, please?
>
> Thank you in advance.
>
> Dongjoon.
>
>
> On Thu, Feb 29, 2024 at 2:48 PM John Zhuge  wrote:
>
>> Excellent work, congratulations!
>>
>> On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun 
>> wrote:
>>
>>> Congratulations!
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>>>
>>>> Congratulations!
>>>>
>>>>
>>>>
>>>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>>>> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> We are happy to announce the availability of Spark 3.5.1!
>>>>
>>>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>>>> release is based on the branch-3.5 maintenance branch of Spark. We
>>>> strongly
>>>> recommend all 3.5 users to upgrade to this stable release.
>>>>
>>>> To download Spark 3.5.1, head over to the download page:
>>>> https://spark.apache.org/downloads.html
>>>>
>>>> To view the release notes:
>>>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>>>
>>>> We would like to acknowledge all community members for contributing to
>>>> this
>>>> release. This release would not have been possible without you.
>>>>
>>>> Jungtaek Lim
>>>>
>>>> ps. Yikun is helping us through releasing the official docker image for
>>>> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally 
>>>> available.
>>>>
>>>>
>>
>> --
>> John Zhuge
>>
>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Dongjoon Hyun
BTW, Jungtaek.

PySpark document seems to show a wrong branch. At this time, `master`.

https://spark.apache.org/docs/3.5.1/api/python/index.html

PySpark Overview


   Date: Feb 24, 2024 Version: master

[image: Screenshot 2024-02-29 at 21.12.24.png]


Could you do the follow-up, please?

Thank you in advance.

Dongjoon.


On Thu, Feb 29, 2024 at 2:48 PM John Zhuge  wrote:

> Excellent work, congratulations!
>
> On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun 
> wrote:
>
>> Congratulations!
>>
>> Bests,
>> Dongjoon.
>>
>> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>>
>>> Congratulations!
>>>
>>>
>>>
>>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>>> wrote:
>>>
>>> Hi everyone,
>>>
>>> We are happy to announce the availability of Spark 3.5.1!
>>>
>>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>>> release is based on the branch-3.5 maintenance branch of Spark. We
>>> strongly
>>> recommend all 3.5 users to upgrade to this stable release.
>>>
>>> To download Spark 3.5.1, head over to the download page:
>>> https://spark.apache.org/downloads.html
>>>
>>> To view the release notes:
>>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>>
>>> We would like to acknowledge all community members for contributing to
>>> this
>>> release. This release would not have been possible without you.
>>>
>>> Jungtaek Lim
>>>
>>> ps. Yikun is helping us through releasing the official docker image for
>>> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
>>>
>>>
>
> --
> John Zhuge
>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread John Zhuge
Excellent work, congratulations!

On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun 
wrote:

> Congratulations!
>
> Bests,
> Dongjoon.
>
> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>
>> Congratulations!
>>
>>
>>
>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>> wrote:
>>
>> Hi everyone,
>>
>> We are happy to announce the availability of Spark 3.5.1!
>>
>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>> release is based on the branch-3.5 maintenance branch of Spark. We
>> strongly
>> recommend all 3.5 users to upgrade to this stable release.
>>
>> To download Spark 3.5.1, head over to the download page:
>> https://spark.apache.org/downloads.html
>>
>> To view the release notes:
>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>
>> We would like to acknowledge all community members for contributing to
>> this
>> release. This release would not have been possible without you.
>>
>> Jungtaek Lim
>>
>> ps. Yikun is helping us through releasing the official docker image for
>> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
>>
>>

-- 
John Zhuge


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Prem Sahoo
Congratulations 👍

Sent from my iPhone

On Feb 29, 2024, at 4:54 PM, Xinrong Meng  wrote:

Congratulations!

Thanks,
Xinrong

On Thu, Feb 29, 2024 at 11:16 AM Dongjoon Hyun  wrote:

Congratulations!

Bests,
Dongjoon.

On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:

Congratulations!

At 2024-02-28 17:43:25, "Jungtaek Lim"  wrote:

Hi everyone,

We are happy to announce the availability of Spark 3.5.1!

Spark 3.5.1 is a maintenance release containing stability fixes. This
release is based on the branch-3.5 maintenance branch of Spark. We strongly
recommend all 3.5 users to upgrade to this stable release.

To download Spark 3.5.1, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-5-1.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Jungtaek Lim

ps. Yikun is helping us through releasing the official docker image for
Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.




Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Xinrong Meng
Congratulations!

Thanks,
Xinrong

On Thu, Feb 29, 2024 at 11:16 AM Dongjoon Hyun 
wrote:

> Congratulations!
>
> Bests,
> Dongjoon.
>
> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>
>> Congratulations!
>>
>>
>>
>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>> wrote:
>>
>> Hi everyone,
>>
>> We are happy to announce the availability of Spark 3.5.1!
>>
>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>> release is based on the branch-3.5 maintenance branch of Spark. We
>> strongly
>> recommend all 3.5 users to upgrade to this stable release.
>>
>> To download Spark 3.5.1, head over to the download page:
>> https://spark.apache.org/downloads.html
>>
>> To view the release notes:
>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>
>> We would like to acknowledge all community members for contributing to
>> this
>> release. This release would not have been possible without you.
>>
>> Jungtaek Lim
>>
>> ps. Yikun is helping us through releasing the official docker image for
>> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
>>
>>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-28 Thread Dongjoon Hyun
Congratulations!

Bests,
Dongjoon.

On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:

> Congratulations!
>
>
>
> At 2024-02-28 17:43:25, "Jungtaek Lim" 
> wrote:
>
> Hi everyone,
>
> We are happy to announce the availability of Spark 3.5.1!
>
> Spark 3.5.1 is a maintenance release containing stability fixes. This
> release is based on the branch-3.5 maintenance branch of Spark. We strongly
> recommend all 3.5 users to upgrade to this stable release.
>
> To download Spark 3.5.1, head over to the download page:
> https://spark.apache.org/downloads.html
>
> To view the release notes:
> https://spark.apache.org/releases/spark-release-3-5-1.html
>
> We would like to acknowledge all community members for contributing to this
> release. This release would not have been possible without you.
>
> Jungtaek Lim
>
> ps. Yikun is helping us through releasing the official docker image for
> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
>
>


Re:[ANNOUNCE] Apache Spark 3.5.1 released

2024-02-28 Thread beliefer
Congratulations!







At 2024-02-28 17:43:25, "Jungtaek Lim"  wrote:

Hi everyone,


We are happy to announce the availability of Spark 3.5.1!

Spark 3.5.1 is a maintenance release containing stability fixes. This
release is based on the branch-3.5 maintenance branch of Spark. We strongly
recommend all 3.5 users to upgrade to this stable release.

To download Spark 3.5.1, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-5-1.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Jungtaek Lim



ps. Yikun is helping us through releasing the official docker image for Spark 
3.5.1 (Thanks Yikun!) It may take some time to be generally available.



[ANNOUNCE] Apache Spark 3.5.1 released

2024-02-28 Thread Jungtaek Lim
Hi everyone,

We are happy to announce the availability of Spark 3.5.1!

Spark 3.5.1 is a maintenance release containing stability fixes. This
release is based on the branch-3.5 maintenance branch of Spark. We strongly
recommend all 3.5 users to upgrade to this stable release.

To download Spark 3.5.1, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-5-1.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Jungtaek Lim

ps. Yikun is helping us through releasing the official docker image for
Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.


[apache-spark] documentation on File Metadata _metadata struct

2024-01-10 Thread Jason Horner
All, the only documentation about the File Metadata (hidden _metadata struct) I
can seem to find is on the Databricks website:
https://docs.databricks.com/en/ingestion/file-metadata-column.html#file-metadata-column

For reference, here is the struct:

_metadata: struct (nullable = false)
 |-- file_path: string (nullable = false)
 |-- file_name: string (nullable = false)
 |-- file_size: long (nullable = false)
 |-- file_block_start: long (nullable = false)
 |-- file_block_length: long (nullable = false)
 |-- file_modification_time: timestamp (nullable = false)

As far as I can tell this feature was released as part of Spark 3.2.0, based on
this Stack Overflow post:
https://stackoverflow.com/questions/62846669/can-i-get-metadata-of-files-reading-by-spark/77238087#77238087
Unfortunately I wasn't able to locate it in the release notes, though I may
have missed it somehow. So I have the following questions and am seeking
guidance from the list on how best to approach this:

1. Is the documentation "missing" from the Spark 3.2.0 site, or am I just
   unable to find it?
2. While it provides file_modification_time, there doesn't seem to be a
   corresponding file_creation_time. Would both of these be issues that should
   be opened in JIRA?

Both of these seem like simple and useful things to add, but they are above my
ability to submit PRs for without some guidance. I'm happy to help, especially
with a documentation PR, if someone can confirm and get me started in the right
direction. I don't really have the Java / Scala skills needed to implement the
feature. Thanks for any pointers.
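
For anyone who wants to inspect this column quickly, below is a minimal PySpark
sketch of selecting the hidden column described above. The input path is a
placeholder, and the exact set of fields available depends on the Spark version
and file source in use.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("file-metadata-demo").getOrCreate()

# The _metadata column is hidden: it only appears when selected explicitly.
# "/tmp/example_parquet" is a hypothetical path used for illustration.
df = spark.read.format("parquet").load("/tmp/example_parquet")

df.select(
    "_metadata.file_path",
    "_metadata.file_name",
    "_metadata.file_size",
    "_metadata.file_modification_time",
).show(truncate=False)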

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



[ANNOUNCE] Apache Spark 3.3.4 released

2023-12-16 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.3.4!

Spark 3.3.4 is the last maintenance release based on the
branch-3.3 maintenance branch of Spark. It contains many fixes
including security and correctness domains. We strongly
recommend all 3.3 users to upgrade to this or higher stable release.

To download Spark 3.3.4, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-3-4.html

We would like to acknowledge all community members for contributing to
this release. This release would not have been possible without you.

Dongjoon Hyun


Re: SSH Tunneling issue with Apache Spark

2023-12-06 Thread Venkatesan Muniappan
Thanks for the clarification. I will try to do plain jdbc connection on
Scala/Java and will update this thread on how it goes.

*Thanks,*
*Venkat*



On Thu, Dec 7, 2023 at 9:40 AM Nicholas Chammas 
wrote:

> PyMySQL has its own implementation
> <https://github.com/PyMySQL/PyMySQL/blob/f13f054abcc18b39855a760a84be0a517f0da658/pymysql/protocol.py>
>  of
> the MySQL client-server protocol. It does not use JDBC.
>
>
> On Dec 6, 2023, at 10:43 PM, Venkatesan Muniappan <
> venkatesa...@noonacademy.com> wrote:
>
> Thanks for the advice Nicholas.
>
> As mentioned in the original email, I have tried JDBC + SSH Tunnel using
> pymysql and sshtunnel and it worked fine. The problem happens only with
> Spark.
>
> *Thanks,*
> *Venkat*
>
>
>
> On Wed, Dec 6, 2023 at 10:21 PM Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> This is not a question for the dev list. Moving dev to bcc.
>>
>> One thing I would try is to connect to this database using JDBC + SSH
>> tunnel, but without Spark. That way you can focus on getting the JDBC
>> connection to work without Spark complicating the picture for you.
>>
>>
>> On Dec 5, 2023, at 8:12 PM, Venkatesan Muniappan <
>> venkatesa...@noonacademy.com> wrote:
>>
>> Hi Team,
>>
>> I am facing an issue with SSH Tunneling in Apache Spark. The behavior is
>> same as the one in this Stackoverflow question
>> <https://stackoverflow.com/questions/68278369/how-to-use-pyspark-to-read-a-mysql-database-using-a-ssh-tunnel>
>> but there are no answers there.
>>
>> This is what I am trying:
>>
>>
>> with SSHTunnelForwarder(
>> (ssh_host, ssh_port),
>> ssh_username=ssh_user,
>> ssh_pkey=ssh_key_file,
>> remote_bind_address=(sql_hostname, sql_port),
>> local_bind_address=(local_host_ip_address, sql_port)) as tunnel:
>> tunnel.local_bind_port
>> b1_semester_df = spark.read \
>> .format("jdbc") \
>> .option("url", b2b_mysql_url.replace("<>", 
>> str(tunnel.local_bind_port)))
>> \
>> .option("query", b1_semester_sql) \
>> .option("database", 'b2b') \
>> .option("password", b2b_mysql_password) \
>> .option("driver", "com.mysql.cj.jdbc.Driver") \
>> .load()
>> b1_semester_df.count()
>>
>> Here, the b1_semester_df is loaded but when I try count on the same Df it
>> fails saying this
>>
>> 23/12/05 11:49:17 ERROR TaskSetManager: Task 0 in stage 2.0 failed 4
>> times; aborting job
>> Traceback (most recent call last):
>>   File "", line 1, in 
>>   File "/usr/lib/spark/python/pyspark/sql/dataframe.py", line 382, in
>> show
>> print(self._jdf.showString(n, 20, vertical))
>>   File
>> "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line
>> 1257, in __call__
>>   File "/usr/lib/spark/python/pyspark/sql/utils.py", line 63, in deco
>> return f(*a, **kw)
>>   File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py",
>> line 328, in get_return_value
>> py4j.protocol.Py4JJavaError: An error occurred while calling
>> o284.showString.
>> : org.apache.spark.SparkException: Job aborted due to stage failure: Task
>> 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage
>> 2.0 (TID 11, ip-172-32-108-1.eu-central-1.compute.internal, executor 3):
>> com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link
>> failure
>>
>> However, the same is working fine with pandas df. I have tried this below
>> and it worked.
>>
>>
>> with SSHTunnelForwarder(
>> (ssh_host, ssh_port),
>> ssh_username=ssh_user,
>> ssh_pkey=ssh_key_file,
>> remote_bind_address=(sql_hostname, sql_port)) as tunnel:
>> conn = pymysql.connect(host=local_host_ip_address, user=sql_username,
>> passwd=sql_password, db=sql_main_database,
>> port=tunnel.local_bind_port)
>> df = pd.read_sql_query(b1_semester_sql, conn)
>> spark.createDataFrame(df).createOrReplaceTempView("b1_semester")
>>
>> So wanted to check what I am missing with my Spark usage. Please help.
>>
>> *Thanks,*
>> *Venkat*
>>
>>
>>
>


Re: SSH Tunneling issue with Apache Spark

2023-12-06 Thread Nicholas Chammas
PyMySQL has its own implementation 
<https://github.com/PyMySQL/PyMySQL/blob/f13f054abcc18b39855a760a84be0a517f0da658/pymysql/protocol.py>
 of the MySQL client-server protocol. It does not use JDBC.


> On Dec 6, 2023, at 10:43 PM, Venkatesan Muniappan 
>  wrote:
> 
> Thanks for the advice Nicholas. 
> 
> As mentioned in the original email, I have tried JDBC + SSH Tunnel using 
> pymysql and sshtunnel and it worked fine. The problem happens only with Spark.
> 
> Thanks,
> Venkat
> 
> 
> 
> On Wed, Dec 6, 2023 at 10:21 PM Nicholas Chammas  <mailto:nicholas.cham...@gmail.com>> wrote:
>> This is not a question for the dev list. Moving dev to bcc.
>> 
>> One thing I would try is to connect to this database using JDBC + SSH 
>> tunnel, but without Spark. That way you can focus on getting the JDBC 
>> connection to work without Spark complicating the picture for you.
>> 
>> 
>>> On Dec 5, 2023, at 8:12 PM, Venkatesan Muniappan 
>>> mailto:venkatesa...@noonacademy.com>> wrote:
>>> 
>>> Hi Team,
>>> 
>>> I am facing an issue with SSH Tunneling in Apache Spark. The behavior is 
>>> same as the one in this Stackoverflow question 
>>> <https://stackoverflow.com/questions/68278369/how-to-use-pyspark-to-read-a-mysql-database-using-a-ssh-tunnel>
>>>  but there are no answers there.
>>> 
>>> This is what I am trying:
>>> 
>>> 
>>> with SSHTunnelForwarder(
>>> (ssh_host, ssh_port),
>>> ssh_username=ssh_user,
>>> ssh_pkey=ssh_key_file,
>>> remote_bind_address=(sql_hostname, sql_port),
>>> local_bind_address=(local_host_ip_address, sql_port)) as tunnel:
>>> tunnel.local_bind_port
>>> b1_semester_df = spark.read \
>>> .format("jdbc") \
>>> .option("url", b2b_mysql_url.replace("<>", 
>>> str(tunnel.local_bind_port))) \
>>> .option("query", b1_semester_sql) \
>>> .option("database", 'b2b') \
>>> .option("password", b2b_mysql_password) \
>>> .option("driver", "com.mysql.cj.jdbc.Driver") \
>>> .load()
>>> b1_semester_df.count()
>>> 
>>> Here, the b1_semester_df is loaded but when I try count on the same Df it 
>>> fails saying this
>>> 
>>> 23/12/05 11:49:17 ERROR TaskSetManager: Task 0 in stage 2.0 failed 4 times; 
>>> aborting job
>>> Traceback (most recent call last):
>>>   File "", line 1, in 
>>>   File "/usr/lib/spark/python/pyspark/sql/dataframe.py", line 382, in show
>>> print(self._jdf.showString(n, 20, vertical))
>>>   File 
>>> "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 
>>> 1257, in __call__
>>>   File "/usr/lib/spark/python/pyspark/sql/utils.py", line 63, in deco
>>> return f(*a, **kw)
>>>   File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", 
>>> line 328, in get_return_value
>>> py4j.protocol.Py4JJavaError: An error occurred while calling 
>>> o284.showString.
>>> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
>>> in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 
>>> 2.0 (TID 11, ip-172-32-108-1.eu-central-1.compute.internal, executor 3): 
>>> com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link 
>>> failure
>>> 
>>> However, the same is working fine with pandas df. I have tried this below 
>>> and it worked.
>>> 
>>> 
>>> with SSHTunnelForwarder(
>>> (ssh_host, ssh_port),
>>> ssh_username=ssh_user,
>>> ssh_pkey=ssh_key_file,
>>> remote_bind_address=(sql_hostname, sql_port)) as tunnel:
>>> conn = pymysql.connect(host=local_host_ip_address, user=sql_username,
>>>passwd=sql_password, db=sql_main_database,
>>>port=tunnel.local_bind_port)
>>> df = pd.read_sql_query(b1_semester_sql, conn)
>>> spark.createDataFrame(df).createOrReplaceTempView("b1_semester")
>>> 
>>> So wanted to check what I am missing with my Spark usage. Please help.
>>> 
>>> Thanks,
>>> Venkat
>>> 
>> 



Re: SSH Tunneling issue with Apache Spark

2023-12-06 Thread Venkatesan Muniappan
Thanks for the advice Nicholas.

As mentioned in the original email, I have tried JDBC + SSH Tunnel using
pymysql and sshtunnel and it worked fine. The problem happens only with
Spark.

*Thanks,*
*Venkat*



On Wed, Dec 6, 2023 at 10:21 PM Nicholas Chammas 
wrote:

> This is not a question for the dev list. Moving dev to bcc.
>
> One thing I would try is to connect to this database using JDBC + SSH
> tunnel, but without Spark. That way you can focus on getting the JDBC
> connection to work without Spark complicating the picture for you.
>
>
> On Dec 5, 2023, at 8:12 PM, Venkatesan Muniappan <
> venkatesa...@noonacademy.com> wrote:
>
> Hi Team,
>
> I am facing an issue with SSH Tunneling in Apache Spark. The behavior is
> same as the one in this Stackoverflow question
> <https://stackoverflow.com/questions/68278369/how-to-use-pyspark-to-read-a-mysql-database-using-a-ssh-tunnel>
> but there are no answers there.
>
> This is what I am trying:
>
>
> with SSHTunnelForwarder(
> (ssh_host, ssh_port),
> ssh_username=ssh_user,
> ssh_pkey=ssh_key_file,
> remote_bind_address=(sql_hostname, sql_port),
> local_bind_address=(local_host_ip_address, sql_port)) as tunnel:
> tunnel.local_bind_port
> b1_semester_df = spark.read \
> .format("jdbc") \
> .option("url", b2b_mysql_url.replace("<>", 
> str(tunnel.local_bind_port)))
> \
> .option("query", b1_semester_sql) \
> .option("database", 'b2b') \
> .option("password", b2b_mysql_password) \
> .option("driver", "com.mysql.cj.jdbc.Driver") \
> .load()
> b1_semester_df.count()
>
> Here, the b1_semester_df is loaded but when I try count on the same Df it
> fails saying this
>
> 23/12/05 11:49:17 ERROR TaskSetManager: Task 0 in stage 2.0 failed 4
> times; aborting job
> Traceback (most recent call last):
>   File "", line 1, in 
>   File "/usr/lib/spark/python/pyspark/sql/dataframe.py", line 382, in show
> print(self._jdf.showString(n, 20, vertical))
>   File
> "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line
> 1257, in __call__
>   File "/usr/lib/spark/python/pyspark/sql/utils.py", line 63, in deco
> return f(*a, **kw)
>   File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py",
> line 328, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling
> o284.showString.
> : org.apache.spark.SparkException: Job aborted due to stage failure: Task
> 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage
> 2.0 (TID 11, ip-172-32-108-1.eu-central-1.compute.internal, executor 3):
> com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link
> failure
>
> However, the same is working fine with pandas df. I have tried this below
> and it worked.
>
>
> with SSHTunnelForwarder(
> (ssh_host, ssh_port),
> ssh_username=ssh_user,
> ssh_pkey=ssh_key_file,
> remote_bind_address=(sql_hostname, sql_port)) as tunnel:
> conn = pymysql.connect(host=local_host_ip_address, user=sql_username,
> passwd=sql_password, db=sql_main_database,
> port=tunnel.local_bind_port)
> df = pd.read_sql_query(b1_semester_sql, conn)
> spark.createDataFrame(df).createOrReplaceTempView("b1_semester")
>
> So wanted to check what I am missing with my Spark usage. Please help.
>
> *Thanks,*
> *Venkat*
>
>
>


SSH Tunneling issue with Apache Spark

2023-12-05 Thread Venkatesan Muniappan
Hi Team,

I am facing an issue with SSH Tunneling in Apache Spark. The behavior is
same as the one in this Stackoverflow question
<https://stackoverflow.com/questions/68278369/how-to-use-pyspark-to-read-a-mysql-database-using-a-ssh-tunnel>
but there are no answers there.

This is what I am trying:


with SSHTunnelForwarder(
        (ssh_host, ssh_port),
        ssh_username=ssh_user,
        ssh_pkey=ssh_key_file,
        remote_bind_address=(sql_hostname, sql_port),
        local_bind_address=(local_host_ip_address, sql_port)) as tunnel:
    tunnel.local_bind_port
    b1_semester_df = spark.read \
        .format("jdbc") \
        .option("url", b2b_mysql_url.replace("<>", str(tunnel.local_bind_port))) \
        .option("query", b1_semester_sql) \
        .option("database", 'b2b') \
        .option("password", b2b_mysql_password) \
        .option("driver", "com.mysql.cj.jdbc.Driver") \
        .load()
    b1_semester_df.count()

Here, the b1_semester_df is loaded but when I try count on the same Df it
fails saying this

23/12/05 11:49:17 ERROR TaskSetManager: Task 0 in stage 2.0 failed 4 times;
aborting job
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/lib/spark/python/pyspark/sql/dataframe.py", line 382, in show
    print(self._jdf.showString(n, 20, vertical))
  File
"/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line
1257, in __call__
  File "/usr/lib/spark/python/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py",
line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling
o284.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage
2.0 (TID 11, ip-172-32-108-1.eu-central-1.compute.internal, executor 3):
com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link
failure

However, the same is working fine with pandas df. I have tried this below
and it worked.


with SSHTunnelForwarder(
        (ssh_host, ssh_port),
        ssh_username=ssh_user,
        ssh_pkey=ssh_key_file,
        remote_bind_address=(sql_hostname, sql_port)) as tunnel:
    conn = pymysql.connect(host=local_host_ip_address, user=sql_username,
                           passwd=sql_password, db=sql_main_database,
                           port=tunnel.local_bind_port)
    df = pd.read_sql_query(b1_semester_sql, conn)
    spark.createDataFrame(df).createOrReplaceTempView("b1_semester")

So wanted to check what I am missing with my Spark usage. Please help.

*Thanks,*
*Venkat*
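
The thread does not reach a conclusion, but one hypothesis worth checking is
that the JDBC read runs on the executors, which cannot reach a tunnel bound
only to the driver machine's loopback interface, whereas the pandas version
works because pymysql connects entirely from the driver. The sketch below is
offered only under that assumption; it is not a confirmed fix, and every
hostname, path, and credential in it is a placeholder.

from pyspark.sql import SparkSession
from sshtunnel import SSHTunnelForwarder

spark = SparkSession.builder.appName("jdbc-over-ssh-sketch").getOrCreate()

# Hypothetical value: an address of the driver machine that the executors can
# resolve and reach over the network.
driver_host = "driver-node.internal"

with SSHTunnelForwarder(
        ("bastion.example.com", 22),              # placeholder SSH host
        ssh_username="ssh_user",
        ssh_pkey="/path/to/key.pem",
        remote_bind_address=("mysql.internal", 3306),
        # Bind on all interfaces so connections from other hosts are accepted.
        local_bind_address=("0.0.0.0", 3306)) as tunnel:
    df = (spark.read.format("jdbc")
          .option("url", f"jdbc:mysql://{driver_host}:{tunnel.local_bind_port}/b2b")
          .option("query", "SELECT 1 AS ok")
          .option("user", "sql_user")
          .option("password", "sql_password")
          .option("driver", "com.mysql.cj.jdbc.Driver")
          .load())
    # Run the action while the tunnel is still open; executors connect to
    # driver_host, which forwards through the SSH tunnel to MySQL.
    df.count()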


Re:[ANNOUNCE] Apache Spark 3.4.2 released

2023-11-30 Thread beliefer
Congratulations!







At 2023-12-01 01:23:55, "Dongjoon Hyun"  wrote:

We are happy to announce the availability of Apache Spark 3.4.2!

Spark 3.4.2 is a maintenance release containing many fixes including
security and correctness domains. This release is based on the
branch-3.4 maintenance branch of Spark. We strongly
recommend all 3.4 users to upgrade to this stable release.

To download Spark 3.4.2, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-4-2.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Dongjoon Hyun


[ANNOUNCE] Apache Spark 3.4.2 released

2023-11-30 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.4.2!

Spark 3.4.2 is a maintenance release containing many fixes including
security and correctness domains. This release is based on the
branch-3.4 maintenance branch of Spark. We strongly
recommend all 3.4 users to upgrade to this stable release.

To download Spark 3.4.2, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-4-2.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Dongjoon Hyun


Re: [ SPARK SQL ]: UPPER in WHERE condition is not working in Apache Spark 3.5.0 for Mysql ENUM Column

2023-11-07 Thread Suyash Ajmera
Any update on this?


On Fri, 13 Oct, 2023, 12:56 pm Suyash Ajmera, 
wrote:

> This issue is related to CharVarcharCodegenUtils readSidePadding method .
>
> Appending white spaces while reading ENUM data from mysql
>
> Causing issue in querying , writing the same data to Cassandra.
>
> On Thu, 12 Oct, 2023, 7:46 pm Suyash Ajmera, 
> wrote:
>
>> I have upgraded my spark job from spark 3.3.1 to spark 3.5.0, I am
>> querying to Mysql Database and applying
>>
>> `*UPPER(col) = UPPER(value)*` in the subsequent sql query. It is working
>> as expected in spark 3.3.1 , but not working with 3.5.0.
>>
>> Where Condition ::  `*UPPER(vn) = 'ERICSSON' AND (upper(st) = 'OPEN' OR
>> upper(st) = 'REOPEN' OR upper(st) = 'CLOSED')*`
>>
>> The *st *column is ENUM in the database and it is causing the issue.
>>
>> Below is the Physical Plan of *FILTER* phase :
>>
>> For 3.3.1 :
>>
>> +- Filter ((upper(vn#11) = ERICSSON) AND (((upper(st#42) = OPEN) OR
>> (upper(st#42) = REOPEN)) OR (upper(st#42) = CLOSED)))
>>
>> For 3.5.0 :
>>
>> +- Filter ((upper(vn#11) = ERICSSON) AND (((upper(staticinvoke(class
>> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
>> readSidePadding, st#42, 13, true, false, true)) = OPEN) OR
>> (upper(staticinvoke(class
>> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
>> readSidePadding, st#42, 13, true, false, true)) = REOPEN)) OR
>> (upper(staticinvoke(class
>> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
>> readSidePadding, st#42, 13, true, false, true)) = CLOSED)))
>>
>> -
>>
>> I have debug it and found that Spark added a property in version 3.4.0 ,
>> i.e. **spark.sql.readSideCharPadding** which has default value **true**.
>>
>> Link to the JIRA : https://issues.apache.org/jira/browse/SPARK-40697
>>
>> Added a new method in Class **CharVarcharCodegenUtils**
>>
>> public static UTF8String readSidePadding(UTF8String inputStr, int limit) {
>> int numChars = inputStr.numChars();
>> if (numChars == limit) {
>>   return inputStr;
>> } else if (numChars < limit) {
>>   return inputStr.rpad(limit, SPACE);
>> } else {
>>   return inputStr;
>> }
>>   }
>>
>>
>> **This method is appending some whitespace padding to the ENUM values
>> while reading and causing the Issue.**
>>
>> ---
>>
>> When I am removing the UPPER function from the where condition the
>> **FILTER** Phase looks like this :
>>
>>  +- Filter (((staticinvoke(class
>> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils,
>>  StringType, readSidePadding, st#42, 13, true, false, true) = OPEN
>> ) OR (staticinvoke(class
>> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
>> readSidePadding, st#42, 13, true, false, true) = REOPEN   )) OR
>> (staticinvoke(class
>> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
>> readSidePadding, st#42, 13, true, false, true) = CLOSED   ))
>>
>>
>> **You can see it has added some white space after the value and the query
>> runs fine giving the correct result.**
>>
>> But with the UPPER function I am not getting the data.
>>
>> --
>>
>> I have also tried to disable this Property *spark.sql.readSideCharPadding
>> = false* with following cases :
>>
>> 1. With Upper function in where clause :
>>It is not pushing the filters to Database and the *query works fine*.
>>
>>
>>   +- Filter (((upper(st#42) = OPEN) OR (upper(st#42) = REOPEN)) OR
>> (upper(st#42) = CLOSED))
>>
>> 2. But when I am removing the upper function
>>
>>  *It is pushing the filter to Mysql with the white spaces and I am not
>> getting the data. (THIS IS A CAUSING VERY BIG ISSUE)*
>>
>>   PushedFilters: [*IsNotNull(vn), *EqualTo(vn,ERICSSON),
>> *Or(Or(EqualTo(st,OPEN ),EqualTo(st,REOPEN
>> )),EqualTo(st,CLOSED   ))]
>>
>> I cannot move this filter to JDBC read query , also I can't remove this
>> UPPER function in the where clause.
>>
>>
>> 
>>
>> Also I found same data getting written to CASSANDRA with *PADDING .*
>>
>


Re: [ SPARK SQL ]: UPPER in WHERE condition is not working in Apache Spark 3.5.0 for Mysql ENUM Column

2023-10-13 Thread Suyash Ajmera
This issue is related to CharVarcharCodegenUtils readSidePadding method .

Appending white spaces while reading ENUM data from mysql

Causing issue in querying , writing the same data to Cassandra.

On Thu, 12 Oct, 2023, 7:46 pm Suyash Ajmera, 
wrote:

> I have upgraded my spark job from spark 3.3.1 to spark 3.5.0, I am
> querying to Mysql Database and applying
>
> `*UPPER(col) = UPPER(value)*` in the subsequent sql query. It is working
> as expected in spark 3.3.1 , but not working with 3.5.0.
>
> Where Condition ::  `*UPPER(vn) = 'ERICSSON' AND (upper(st) = 'OPEN' OR
> upper(st) = 'REOPEN' OR upper(st) = 'CLOSED')*`
>
> The *st *column is ENUM in the database and it is causing the issue.
>
> Below is the Physical Plan of *FILTER* phase :
>
> For 3.3.1 :
>
> +- Filter ((upper(vn#11) = ERICSSON) AND (((upper(st#42) = OPEN) OR
> (upper(st#42) = REOPEN)) OR (upper(st#42) = CLOSED)))
>
> For 3.5.0 :
>
> +- Filter ((upper(vn#11) = ERICSSON) AND (((upper(staticinvoke(class
> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
> readSidePadding, st#42, 13, true, false, true)) = OPEN) OR
> (upper(staticinvoke(class
> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
> readSidePadding, st#42, 13, true, false, true)) = REOPEN)) OR
> (upper(staticinvoke(class
> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
> readSidePadding, st#42, 13, true, false, true)) = CLOSED)))
>
> -
>
> I have debug it and found that Spark added a property in version 3.4.0 ,
> i.e. **spark.sql.readSideCharPadding** which has default value **true**.
>
> Link to the JIRA : https://issues.apache.org/jira/browse/SPARK-40697
>
> Added a new method in Class **CharVarcharCodegenUtils**
>
> public static UTF8String readSidePadding(UTF8String inputStr, int limit) {
> int numChars = inputStr.numChars();
> if (numChars == limit) {
>   return inputStr;
> } else if (numChars < limit) {
>   return inputStr.rpad(limit, SPACE);
> } else {
>   return inputStr;
> }
>   }
>
>
> **This method is appending some whitespace padding to the ENUM values
> while reading and causing the Issue.**
>
> ---
>
> When I am removing the UPPER function from the where condition the
> **FILTER** Phase looks like this :
>
>  +- Filter (((staticinvoke(class
> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils,
>  StringType, readSidePadding, st#42, 13, true, false, true) = OPEN
> ) OR (staticinvoke(class
> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
> readSidePadding, st#42, 13, true, false, true) = REOPEN   )) OR
> (staticinvoke(class
> org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
> readSidePadding, st#42, 13, true, false, true) = CLOSED   ))
>
>
> **You can see it has added some white space after the value and the query
> runs fine giving the correct result.**
>
> But with the UPPER function I am not getting the data.
>
> --
>
> I have also tried to disable this Property *spark.sql.readSideCharPadding
> = false* with following cases :
>
> 1. With Upper function in where clause :
>It is not pushing the filters to Database and the *query works fine*.
>
>   +- Filter (((upper(st#42) = OPEN) OR (upper(st#42) = REOPEN)) OR
> (upper(st#42) = CLOSED))
>
> 2. But when I am removing the upper function
>
>  *It is pushing the filter to Mysql with the white spaces and I am not
> getting the data. (THIS IS A CAUSING VERY BIG ISSUE)*
>
>   PushedFilters: [*IsNotNull(vn), *EqualTo(vn,ERICSSON),
> *Or(Or(EqualTo(st,OPEN ),EqualTo(st,REOPEN
> )),EqualTo(st,CLOSED   ))]
>
> I cannot move this filter to JDBC read query , also I can't remove this
> UPPER function in the where clause.
>
>
> 
>
> Also I found same data getting written to CASSANDRA with *PADDING .*
>


[ SPARK SQL ]: UPPER in WHERE condition is not working in Apache Spark 3.5.0 for Mysql ENUM Column

2023-10-12 Thread Suyash Ajmera
I have upgraded my spark job from spark 3.3.1 to spark 3.5.0, I am querying
to Mysql Database and applying

`*UPPER(col) = UPPER(value)*` in the subsequent sql query. It is working as
expected in spark 3.3.1 , but not working with 3.5.0.

Where Condition ::  `*UPPER(vn) = 'ERICSSON' AND (upper(st) = 'OPEN' OR
upper(st) = 'REOPEN' OR upper(st) = 'CLOSED')*`

The *st *column is ENUM in the database and it is causing the issue.

Below is the Physical Plan of *FILTER* phase :

For 3.3.1 :

+- Filter ((upper(vn#11) = ERICSSON) AND (((upper(st#42) = OPEN) OR
(upper(st#42) = REOPEN)) OR (upper(st#42) = CLOSED)))

For 3.5.0 :

+- Filter ((upper(vn#11) = ERICSSON) AND (((upper(staticinvoke(class
org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
readSidePadding, st#42, 13, true, false, true)) = OPEN) OR
(upper(staticinvoke(class
org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
readSidePadding, st#42, 13, true, false, true)) = REOPEN)) OR
(upper(staticinvoke(class
org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
readSidePadding, st#42, 13, true, false, true)) = CLOSED)))

-

I have debug it and found that Spark added a property in version 3.4.0 ,
i.e. **spark.sql.readSideCharPadding** which has default value **true**.

Link to the JIRA : https://issues.apache.org/jira/browse/SPARK-40697

Added a new method in Class **CharVarcharCodegenUtils**

public static UTF8String readSidePadding(UTF8String inputStr, int limit) {
int numChars = inputStr.numChars();
if (numChars == limit) {
  return inputStr;
} else if (numChars < limit) {
  return inputStr.rpad(limit, SPACE);
} else {
  return inputStr;
}
  }


**This method is appending some whitespace padding to the ENUM values while
reading and causing the Issue.**

---

When I am removing the UPPER function from the where condition the
**FILTER** Phase looks like this :

 +- Filter (((staticinvoke(class
org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils,
 StringType, readSidePadding, st#42, 13, true, false, true) = OPEN
) OR (staticinvoke(class
org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
readSidePadding, st#42, 13, true, false, true) = REOPEN   )) OR
(staticinvoke(class
org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType,
readSidePadding, st#42, 13, true, false, true) = CLOSED   ))


**You can see it has added some white space after the value and the query
runs fine giving the correct result.**

But with the UPPER function I am not getting the data.

--

I have also tried to disable this Property *spark.sql.readSideCharPadding =
false* with following cases :

1. With Upper function in where clause :
   It is not pushing the filters to Database and the *query works fine*.

  +- Filter (((upper(st#42) = OPEN) OR (upper(st#42) = REOPEN)) OR
(upper(st#42) = CLOSED))

2. But when I am removing the upper function

 *It is pushing the filter to Mysql with the white spaces and I am not
getting the data. (THIS IS A CAUSING VERY BIG ISSUE)*

  PushedFilters: [*IsNotNull(vn), *EqualTo(vn,ERICSSON),
*Or(Or(EqualTo(st,OPEN ),EqualTo(st,REOPEN
)),EqualTo(st,CLOSED   ))]

I cannot move this filter to JDBC read query , also I can't remove this
UPPER function in the where clause.



Also I found same data getting written to CASSANDRA with *PADDING .*
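
For readers hitting the same padding behaviour, here is a small self-contained
sketch of one possible workaround: trimming the read-side padding in Spark
before the comparison. It only illustrates the idea on stand-in data built with
createDataFrame; it is not a confirmed fix for the JDBC pushdown behaviour
described above, and the column names are taken from the query in this thread.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("padding-demo").getOrCreate()

# Stand-in data: the "st" values carry the trailing spaces that read-side char
# padding adds; in the real job this DataFrame would come from the JDBC read.
df = spark.createDataFrame(
    [("ERICSSON", "OPEN         "), ("ERICSSON", "CLOSED       ")],
    ["vn", "st"],
)

# rtrim() removes the trailing padding before upper-casing, so the value
# matches the unpadded literals again.
filtered = df.filter(
    (F.upper(F.col("vn")) == "ERICSSON")
    & (F.upper(F.rtrim(F.col("st"))).isin("OPEN", "REOPEN", "CLOSED"))
)
filtered.show()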


APACHE Spark adoption/growth chart

2023-09-12 Thread Andrew Petersen
Hello Spark community

Can anyone direct me to a simple graph/chart that shows APACHE Spark
adoption, preferably one that includes recent years? Of less importance, a
similar Databricks plot?

An internet search gave me plots only up to 2015. I also searched
spark.apache.org and databricks.com, but found no plots.

Regards
-- 
Andrew Petersen, PhD
Advanced Computing, Office of Information Technology
2620 Hillsborough Street
datascience.oit.ncsu.edu


Re: Seeking Professional Advice on Career and Personal Growth in the Apache Spark Community

2023-09-07 Thread Mich Talebzadeh
Hi Varun,

With all that said, I forgot one worthy sentence.

"It doesn't really matter what background you come from or your wealth,
everything is possible. Use every negative source in your life as a
positive and you will never ever fail!"

Cheers

Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Wed, 6 Sept 2023 at 18:33, Mich Talebzadeh 
wrote:

> Hi Varun,
>
> In answer to your questions, these are my views. However, they are just
> views and cannot be taken as facts so to speak
>
>
>1.
>
>*Focus and Time Management:* I often struggle with maintaining focus
>and effectively managing my time. This leads to productivity issues and
>affects my ability to undertake and complete projects efficiently.
>
>
>- Set clear goals.
>   - Prioritize tasks.
>   - Create a to-do list.
>   - Avoid multitasking.
>   - Eliminate distractions.
>   - Take regular breaks.
>   - Go to the gym and try to rest your mind and refresh yourself .
>
>
>1.
>
>*Graduate Studies Dilemma:*
>
>
>- Your mileage varies and it all depends on what you are trying to
>   achieve. Graduate Studies will help you to think independently and out 
> of
>   the box. Will also lead you on "how to go about solving the problem". 
> So it
>   will give you that experience.
>
>
>1.
>
>*Long-Term Project Building:* I am interested in working on long-term
>projects, but I am uncertain about the right approach and how to stay
>committed throughout the project's lifecycle.
>
>
>- I assume you have a degree. That means that you had the discipline
>   to wake up in the morning, go to lectures and not to miss the lectures
>   (hopefully you did not!). In other words, it proves that you have 
> already
>   been through a structured discipline and you have the will to do it.
>
>
>1.
>
>*Overcoming Fear of Failure and Procrastination:* I often find myself
>in a constant fear mode of failure, which leads to abandoning pet projects
>shortly after starting them or procrastinating over initiating new ones.
>
>
>- Failure is natural and can and do happen. However, the important
>   point is that you learn from your failures. Just call them experience. 
> You
>   need to overcome fear of failure and embrace the challenges.
>
>
>1.
>
>*Risk Aversion:* With no inherited wealth or financial security, I am
>often apprehensive about taking risks, even when they may potentially lead
>to significant personal or professional growth.
>- Welcome to the club! In 2020
>   
> <https://equalitytrust.org.uk/scale-economic-inequality-uk#:~:text=In%202020%2C%20the%20ONS%20calculated,and%202013%2C%20reaching%209%25.>,
>   it was estimated that in the UK, the richest 10% of households hold 43% 
> of
>   all wealth. The poorest 50% by contrast own just 9%  Risk is part of 
> life.
>   When crossing the street, you are taking a calculated view of the cars
>   coming and going.In short, risk assessment is a fundamental aspect of 
> life!
>
> HTH
>
> Mich Talebzadeh,
> Distinguished Technologist, Solutions Architect & Engineer
> London
> United Kingdom
>
>
>view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Tue, 5 Sept 2023 at 22:17, Varun Shah 
> wrote:
>
>> Dear Apache Spark Community,
>>
>> I hope this email finds you well. I am writing to seek your valuable
>> insights and advice on some challenges I've been facing in my career and
>> personal development journey, particularly in the context of Apache Spark
>> and the broader big data eco

Re: Seeking Professional Advice on Career and Personal Growth in the Apache Spark Community

2023-09-06 Thread ashok34...@yahoo.com.INVALID
 Hello Mich,
Thanking you for providing these useful feedbacks and responses.
We appreciate your contribution to this community forum. I for myself find your 
posts insightful.
+1 for me
Best,
AK
On Wednesday, 6 September 2023 at 18:34:27 BST, Mich Talebzadeh 
 wrote:  
 
 Hi Varun,
In answer to your questions, these are my views. However, they are just views 
and cannot be taken as facts so to speak
   
   1. Focus and Time Management: I often struggle with maintaining focus and
      effectively managing my time. This leads to productivity issues and affects
      my ability to undertake and complete projects efficiently.

      - Set clear goals.
      - Prioritize tasks.
      - Create a to-do list.
      - Avoid multitasking.
      - Eliminate distractions.
      - Take regular breaks.
      - Go to the gym and try to rest your mind and refresh yourself.

   2. Graduate Studies Dilemma:

      - Your mileage varies and it all depends on what you are trying to achieve.
        Graduate studies will help you to think independently and out of the box.
        They will also guide you on "how to go about solving the problem", so it
        will give you that experience.

   3. Long-Term Project Building: I am interested in working on long-term projects,
      but I am uncertain about the right approach and how to stay committed
      throughout the project's lifecycle.

      - I assume you have a degree. That means that you had the discipline to wake
        up in the morning, go to lectures and not miss them (hopefully you did
        not!). In other words, it proves that you have already been through a
        structured discipline and you have the will to do it.

   4. Overcoming Fear of Failure and Procrastination: I often find myself in a
      constant fear mode of failure, which leads to abandoning pet projects shortly
      after starting them or procrastinating over initiating new ones.

      - Failure is natural and can and does happen. However, the important point is
        that you learn from your failures. Just call them experience. You need to
        overcome the fear of failure and embrace the challenges.

   5. Risk Aversion: With no inherited wealth or financial security, I am often
      apprehensive about taking risks, even when they may potentially lead to
      significant personal or professional growth.

      - Welcome to the club! In 2020, it was estimated that in the UK, the richest
        10% of households hold 43% of all wealth. The poorest 50% by contrast own
        just 9%. Risk is part of life. When crossing the street, you are taking a
        calculated view of the cars coming and going. In short, risk assessment is
        a fundamental aspect of life!
HTH
Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom



   view my Linkedin profile




 https://en.everybodywiki.com/Mich_Talebzadeh

 

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from relying 
on this email's technical content is explicitly disclaimed. The author will in 
no case be liable for any monetary damages arising from such loss, damage or 
destruction. 

 


On Tue, 5 Sept 2023 at 22:17, Varun Shah  wrote:


Dear Apache Spark Community,

I hope this email finds you well. I am writing to seek your valuable insights 
and advice on some challenges I've been facing in my career and personal 
development journey, particularly in the context of Apache Spark and the 
broader big data ecosystem.

A little background about myself: I graduated in 2019 and have since been 
working in the field of AWS cloud and big data tools such as Spark, Airflow, 
AWS services, Databricks, and Snowflake. My interest in the world of big data 
tools dates back to 2016-17, where I initially began exploring concepts like 
big data with spark using scala, and the Scala ecosystem, including 
technologies like Akka. Additionally, I have a keen interest in functional 
programming and data structures and algorithms (DSA) applied to big data 
optimizations.

However, despite my enthusiasm and passion for these areas, I am encountering 
some challenges that are hindering my growth:
   
   -
Focus and Time Management: I often struggle with maintaining focus and 
effectively managing my time. This leads to productivity issues and affects my 
ability to undertake and complete projects efficiently.

   -
Graduate Studies Dilemma: I am unsure about whether to pursue a master's 
degree. The fear of GRE and uncertainty about getting into a reputable 
university have been holding me back. I'm unsure whether further education 
would significantly benefit my career in big data.

   -
Long-Term Project Building: I am interested in working on long-term projects, 
but I am uncertain about the right approach and how to stay committed 
throughout the project's lifecycle.

   -
Overcoming Fear of Failure and Procrastination: I o

Re: Seeking Professional Advice on Career and Personal Growth in the Apache Spark Community

2023-09-06 Thread Mich Talebzadeh
Hi Varun,

In answer to your questions, these are my views. However, they are just
views and cannot be taken as facts so to speak


   1.

   *Focus and Time Management:* I often struggle with maintaining focus and
   effectively managing my time. This leads to productivity issues and affects
   my ability to undertake and complete projects efficiently.


   - Set clear goals.
  - Prioritize tasks.
  - Create a to-do list.
  - Avoid multitasking.
  - Eliminate distractions.
  - Take regular breaks.
  - Go to the gym and try to rest your mind and refresh yourself.


   2.

   *Graduate Studies Dilemma:*


   - Your mileage varies and it all depends on what you are trying to
  achieve. Graduate studies will help you to think independently and out of
  the box. They will also guide you on "how to go about solving the
  problem", so it will give you that experience.


   3.

   *Long-Term Project Building:* I am interested in working on long-term
   projects, but I am uncertain about the right approach and how to stay
   committed throughout the project's lifecycle.


   - I assume you have a degree. That means that you had the discipline to
  wake up in the morning, go to lectures and not to miss the lectures
  (hopefully you did not!). In other words, it proves that you have already
  been through a structured discipline and you have the will to do it.


   4.

   *Overcoming Fear of Failure and Procrastination:* I often find myself in
   a constant fear mode of failure, which leads to abandoning pet projects
   shortly after starting them or procrastinating over initiating new ones.


   - Failure is natural and can and does happen. However, the important point
  is that you learn from your failures. Just call them experience. You need
  to overcome the fear of failure and embrace the challenges.


   5.

   *Risk Aversion:* With no inherited wealth or financial security, I am
   often apprehensive about taking risks, even when they may potentially lead
   to significant personal or professional growth.
   - Welcome to the club! In 2020
  
<https://equalitytrust.org.uk/scale-economic-inequality-uk#:~:text=In%202020%2C%20the%20ONS%20calculated,and%202013%2C%20reaching%209%25.>,
  it was estimated that in the UK, the richest 10% of households hold 43% of
  all wealth. The poorest 50% by contrast own just 9%. Risk is part of life.
  When crossing the street, you are taking a calculated view of the cars
  coming and going. In short, risk assessment is a fundamental aspect of life!

HTH

Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Tue, 5 Sept 2023 at 22:17, Varun Shah  wrote:

> Dear Apache Spark Community,
>
> I hope this email finds you well. I am writing to seek your valuable
> insights and advice on some challenges I've been facing in my career and
> personal development journey, particularly in the context of Apache Spark
> and the broader big data ecosystem.
>
> A little background about myself: I graduated in 2019 and have since been
> working in the field of AWS cloud and big data tools such as Spark,
> Airflow, AWS services, Databricks, and Snowflake. My interest in the world
> of big data tools dates back to 2016-17, where I initially began exploring
> concepts like big data with spark using scala, and the Scala ecosystem,
> including technologies like Akka. Additionally, I have a keen interest in
> functional programming and data structures and algorithms (DSA) applied to
> big data optimizations.
>
> However, despite my enthusiasm and passion for these areas, I am
> encountering some challenges that are hindering my growth:
>
>1.
>
>*Focus and Time Management:* I often struggle with maintaining focus
>and effectively managing my time. This leads to productivity issues and
>affects my ability to undertake and complete projects efficiently.
>2.
>
>*Graduate Studies Dilemma:* I am unsure about whether to pursue a
>master's degree. The fear of GRE and uncertainty about getting into a
>reputable university have been holding me back. I'm unsure whether further
>education would significantly benefit my career in big data.
>3.
>
>*Long-Term Project Building:* I am interested in working on long-term
>projects, but 

Seeking Professional Advice on Career and Personal Growth in the Apache Spark Community

2023-09-05 Thread Varun Shah
Dear Apache Spark Community,

I hope this email finds you well. I am writing to seek your valuable
insights and advice on some challenges I've been facing in my career and
personal development journey, particularly in the context of Apache Spark
and the broader big data ecosystem.

A little background about myself: I graduated in 2019 and have since been
working in the field of AWS cloud and big data tools such as Spark,
Airflow, AWS services, Databricks, and Snowflake. My interest in the world
of big data tools dates back to 2016-17, where I initially began exploring
concepts like big data with spark using scala, and the Scala ecosystem,
including technologies like Akka. Additionally, I have a keen interest in
functional programming and data structures and algorithms (DSA) applied to
big data optimizations.

However, despite my enthusiasm and passion for these areas, I am
encountering some challenges that are hindering my growth:

   1.

   *Focus and Time Management:* I often struggle with maintaining focus and
   effectively managing my time. This leads to productivity issues and affects
   my ability to undertake and complete projects efficiently.
   2.

   *Graduate Studies Dilemma:* I am unsure about whether to pursue a
   master's degree. The fear of GRE and uncertainty about getting into a
   reputable university have been holding me back. I'm unsure whether further
   education would significantly benefit my career in big data.
   3.

   *Long-Term Project Building:* I am interested in working on long-term
   projects, but I am uncertain about the right approach and how to stay
   committed throughout the project's lifecycle.
   4.

   *Overcoming Fear of Failure and Procrastination:* I often find myself in
   a constant fear mode of failure, which leads to abandoning pet projects
   shortly after starting them or procrastinating over initiating new ones.
   5.

   *Risk Aversion:* With no inherited wealth or financial security, I am
   often apprehensive about taking risks, even when they may potentially lead
   to significant personal or professional growth.

Given my background and aspirations, I am reaching out to the Apache Spark
Community, hoping to receive advice, guidance, and mentorship from
experienced professionals who may have faced similar challenges or can
offer valuable insights. I believe that the collective wisdom and
experience within this community can provide me with valuable perspectives
to navigate these hurdles.

If any of you have experienced similar challenges or have insights to share
on any of the mentioned points, I would greatly appreciate your guidance.
Additionally, if you are aware of resources, courses, or opportunities that
could help me address these challenges, please do let me know.

Thank you in advance for considering my request. Your advice will play a
crucial role in shaping my career and personal development journey in the
world of big data and Apache Spark.

I am looking forward to hearing from you and learning from your experiences.

Sincerely,

Varun Shah


[ANNOUNCE] Apache Spark 3.3.3 released

2023-08-22 Thread Yuming Wang
We are happy to announce the availability of Apache Spark 3.3.3!

Spark 3.3.3 is a maintenance release containing stability fixes. This
release is based on the branch-3.3 maintenance branch of Spark. We strongly
recommend all 3.3 users to upgrade to this stable release.

To download Spark 3.3.3, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-3-3.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.


Re: dockerhub does not contain apache/spark-py 3.4.1

2023-08-10 Thread Mich Talebzadeh
Hi Mark,

I created a Spark 3.4.1 Dockerfile. Details from
spark-py-3.4.1-scala_2.12-11-jre-slim-buster
<https://hub.docker.com/repository/docker/michtalebzadeh/spark_dockerfiles/tags?page=1&ordering=last_updated>

Pull instructions are given

docker pull
michtalebzadeh/spark_dockerfiles:spark-py-3.4.1-scala_2.12-11-jre-slim-buster

It is 3.4.1 spark-py with no extra Python packages.

You can tag it as you wish.

Log in to it as below:

docker run -it
michtalebzadeh/spark_dockerfiles:spark-py-3.4.1-scala_2.12-11-jre-slim-buster
bash

185@b031a15c6730:/opt/spark/work-dir$ pip list

Package   Version
- ---
asn1crypto0.24.0
cryptography  2.6.1
entrypoints   0.3
keyring   17.1.1
keyrings.alt  3.1.1
pip   23.2.1
pycrypto  2.6.1
PyGObject 3.30.4
pyxdg 0.25
SecretStorage 2.3.1
setuptools68.0.0
six   1.12.0
wheel 0.32.3

$SPARK_HOME/bin/spark-submit --version
Welcome to
    __
 / __/__  ___ _/ /__
_\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.4.1
  /_/

Using Scala version 2.12.17, OpenJDK 64-Bit Server VM, 11.0.11
Branch HEAD
Compiled by user centos on 2023-06-19T23:01:01Z
Revision 6b1ff22dde1ead51cbf370be6e48a802daae58b6
Url https://github.com/apache/spark

Built on java 11

185@b031a15c6730:/opt/spark/work-dir$ java --version
openjdk 11.0.11 2021-04-20
OpenJDK Runtime Environment 18.9 (build 11.0.11+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.11+9, mixed mode, sharing)

HTH


Mich Talebzadeh,
Solutions Architect/Engineering Lead
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Wed, 9 Aug 2023 at 17:41, Mich Talebzadeh 
wrote:

> Hi Mark,
>
> you can build it yourself, no big deal :)
>
> REPOSITORY TAG
> IMAGE ID   CREATED
>  SIZE
> sparkpy/spark-py
>  3.4.1-scala_2.12-11-jre-slim-buster-Dockerfile a876102b2206   1
> second ago1.09GB
> sparkpy/spark
> 3.4.1-scala_2.12-11-jre-slim-buster-Dockerfile 6f74f7475e01   3
> minutes ago   695MB
>
> Based on
>
> ARG java_image_tag=11-jre-slim  ## java 11
> FROM openjdk:${java_image_tag}
>
> BASE_OS="buster"
> SPARK_VERSION="3.4.1"
> SCALA_VERSION="scala_2.12"
> DOCKERFILE="Dockerfile"
> DOCKERIMAGETAG="11-jre-slim"
>
> You need to modify the file
>
> $SPARK_HOME/kubernetes/dockerfiles/spark/Dockerfile
>
> and replace
>
> #ARG java_image_tag=17-jre
> #FROM eclipse-temurin:${java_image_tag}
>
> With
>
> ARG java_image_tag=11-jre-slim
> FROM openjdk:${java_image_tag}
>
> Which is Java 11
>
> HTH
>
>
> Mich Talebzadeh,
> Solutions Architect/Engineering Lead
> London
> United Kingdom
>
>
>view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Wed, 9 Aug 2023 at 16:43, Mark Elliot 
> wrote:
>
>> Hello,
>>
>> I noticed that the apache/spark-py image for Spark's 3.4.1 release is not
>> available (apache/spark@3.4.1 is available). Would it be possible to get
>> the 3.4.1 release build for the apache/spark-py image published?
>>
>> Thanks,
>>
>> Mark
>>
>> --
>>
>> This communication, together with any attachments, is intended only for
>> the addressee(s) and may contain confidential, privileged or proprietary
>> information of Theorem Partners LLC ("Theorem"). By accepting this
>> communication you agree to keep confidential all information contained in
>> this communication, as well as any information derived by you from the
>> confidential information contained in this communication. Theorem does not
>> waive any confidentiality by misdelivery.
>>
>> If you receive this communication in error, any use,

Re: dockerhub does not contain apache/spark-py 3.4.1

2023-08-09 Thread Mich Talebzadeh
Hi Mark,

you can build it yourself, no big deal :)

REPOSITORY TAG
  IMAGE ID   CREATED
 SIZE
sparkpy/spark-py
 3.4.1-scala_2.12-11-jre-slim-buster-Dockerfile a876102b2206   1
second ago1.09GB
sparkpy/spark
3.4.1-scala_2.12-11-jre-slim-buster-Dockerfile 6f74f7475e01   3
minutes ago   695MB

Based on

ARG java_image_tag=11-jre-slim  ## java 11
FROM openjdk:${java_image_tag}

BASE_OS="buster"
SPARK_VERSION="3.4.1"
SCALA_VERSION="scala_2.12"
DOCKERFILE="Dockerfile"
DOCKERIMAGETAG="11-jre-slim"

You need to modify the file

$SPARK_HOME/kubernetes/dockerfiles/spark/Dockerfile

and replace

#ARG java_image_tag=17-jre
#FROM eclipse-temurin:${java_image_tag}

With

ARG java_image_tag=11-jre-slim
FROM openjdk:${java_image_tag}

Which is Java 11

HTH


Mich Talebzadeh,
Solutions Architect/Engineering Lead
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Wed, 9 Aug 2023 at 16:43, Mark Elliot  wrote:

> Hello,
>
> I noticed that the apache/spark-py image for Spark's 3.4.1 release is not
> available (apache/spark@3.4.1 is available). Would it be possible to get
> the 3.4.1 release build for the apache/spark-py image published?
>
> Thanks,
>
> Mark
>
> --
>
> This communication, together with any attachments, is intended only for
> the addressee(s) and may contain confidential, privileged or proprietary
> information of Theorem Partners LLC ("Theorem"). By accepting this
> communication you agree to keep confidential all information contained in
> this communication, as well as any information derived by you from the
> confidential information contained in this communication. Theorem does not
> waive any confidentiality by misdelivery.
>
> If you receive this communication in error, any use, dissemination,
> printing or copying of all or any part of it is strictly prohibited; please
> destroy all electronic and paper copies and notify the sender immediately.
> Nothing in this email is intended to constitute (1) investment, legal or
> tax advice, (2) any recommendation to purchase or sell any security, (3)
> any advertisement or offer of advisory services or (4) any offer to sell or
> solicitation of an offer to buy any securities or other financial
> instrument in any jurisdiction.
>
> Theorem, including its agents or affiliates, reserves the right to
> intercept, archive, monitor and review all communications to and from its
> network, including this email and any email response to it.
>
> Theorem makes no representation as to the accuracy or completeness of the
> information in this communication and does not accept liability for any
> errors or omissions in this communication, including any liability
> resulting from its transmission by email, and undertakes no obligation to
> update any information in this email or its attachments.
>


dockerhub does not contain apache/spark-py 3.4.1

2023-08-09 Thread Mark Elliot
Hello,

I noticed that the apache/spark-py image for Spark's 3.4.1 release is not
available (apache/spark@3.4.1 is available). Would it be possible to get
the 3.4.1 release build for the apache/spark-py image published?

Thanks,

Mark

-- 










This communication, together with any attachments, is intended 
only for the addressee(s) and may contain confidential, privileged or 
proprietary information of Theorem Partners LLC ("Theorem"). By accepting 
this communication you agree to keep confidential all information contained 
in this communication, as well as any information derived by you from the 
confidential information contained in this communication. Theorem does not 
waive any confidentiality by misdelivery.

If you receive this 
communication in error, any use, dissemination, printing or copying of all 
or any part of it is strictly prohibited; please destroy all electronic and 
paper copies and notify the sender immediately. Nothing in this email is 
intended to constitute (1) investment, legal or tax advice, (2) any 
recommendation to purchase or sell any security, (3) any advertisement or 
offer of advisory services or (4) any offer to sell or solicitation of an 
offer to buy any securities or other financial instrument in any 
jurisdiction.

Theorem, including its agents or affiliates, reserves the 
right to intercept, archive, monitor and review all communications to and 
from its network, including this email and any email response to it.

Theorem makes no representation as to the accuracy or completeness of the 
information in this communication and does not accept liability for any 
errors or omissions in this communication, including any liability 
resulting from its transmission by email, and undertakes no obligation to 
update any information in this email or its attachments.


Re: The performance difference when running Apache Spark on K8s and traditional server

2023-07-27 Thread Mich Talebzadeh
Spark on tin boxes like Google Dataproc or AWS EC2 often utilises the YARN
resource manager. YARN is the most widely used resource manager, not just
for Spark but for other artefacts as well. On-premise, YARN is used
extensively. In the cloud it is also widely used in Infrastructure as a Service
offerings such as Google Dataproc, which I mentioned.

With regard to your questions:

Q1: What are the causes and reasons for Spark on K8s to be slower than
Serverful?
--> It should be noted that Spark on Kubernetes is a work in progress, and as
of now there is future work outstanding. It is not yet at parity with Spark on
YARN.

Q2: How or is there a scenario to show the most apparent difference in
performance and cost of these two environments (Serverless (K8S) and
Serverful (Traditional server)?
--> Simple. One experiment is worth ten hypotheses. Install Spark on a
serverful cluster and on K8s, run the same workload on both, and observe
the performance through the Spark GUI.
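To make that concrete, below is a minimal, illustrative benchmark sketch (the
join, data sizes and class name are invented for the example, not taken from
any specific workload) that can be submitted unchanged to both environments:

```
// Illustrative join workload; sizes and names are made up for the sketch.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.rand

object JoinBenchmark {
  def main(args: Array[String]): Unit = {
    // Do not hard-code master(): pass it on spark-submit so the same jar
    // runs on YARN ("serverful") and on Kubernetes.
    val spark = SparkSession.builder().appName("join-benchmark").getOrCreate()
    import spark.implicits._

    val left  = spark.range(0L, 50000000L).withColumn("k", $"id" % 1000000L)
    val right = spark.range(0L, 1000000L).withColumn("v", rand())

    val start = System.nanoTime()
    val rows  = left.join(right, left("k") === right("id")).count()
    println(s"joined rows = $rows, elapsed = ${(System.nanoTime() - start) / 1e9} s")

    spark.stop()
  }
}
```

Submit the same jar with --master yarn (or whichever serverful manager you
use) and with --master k8s://https://<api-server> on Kubernetes, then compare
stage and task timings in the Spark UI.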

See this article of mine to help you with some features. A bit dated but
still covers concepts

Spark on Kubernetes, A Practitioner’s Guide


HTH

Mich Talebzadeh,
Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom


   view my Linkedin profile



 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Thu, 27 Jul 2023 at 18:20, Trường Trần Phan An 
wrote:

> Hi all,
>
> I am learning about the performance difference of Spark when performing a
> JOIN problem on Serverless (K8S) and Serverful (Traditional server)
> environments.
>
> Through experiment, Spark on K8s tends to run slower than Serverful.
> Through understanding the architecture, I know that Spark runs on K8s as
> Containers (Pods) so it takes a certain time to initialize, but when I look
> at each job, stage, and task, Spark on K8s tends to be slower than Serverful.
>
> *I have some questions:*
> Q1: What are the causes and reasons for Spark on K8s to be slower than
> Serverful?
> Q2: How or is there a scenario to show the most apparent difference in
> performance and cost of these two environments (Serverless (K8S) and
> Serverful (Traditional server)?
>
> Thank you so much!
>
> Best regards,
> Truong
>
>
>


The performance difference when running Apache Spark on K8s and traditional server

2023-07-27 Thread Trường Trần Phan An
Hi all,

I am learning about the performance difference of Spark when performing a
JOIN problem on Serverless (K8S) and Serverful (Traditional server)
environments.

Through experiment, Spark on K8s tends to run slower than Serverful.
Through understanding the architecture, I know that Spark runs on K8s as
Containers (Pods) so it takes a certain time to initialize, but when I look
at each job, stage, and task, Spark on K8s tends to be slower than Serverful.

*I have some questions:*
Q1: What are the causes and reasons for Spark on K8s to be slower than
Serverful?
Q2: How or is there a scenario to show the most apparent difference in
performance and cost of these two environments (Serverless (K8S) and
Serverful (Traditional server)?

Thank you so much!

Best regards,
Truong


Re: Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Gavin Ray
Wow, really neat -- thanks for sharing!

On Mon, Jul 3, 2023 at 8:12 PM Gengliang Wang  wrote:

> Dear Apache Spark community,
>
> We are delighted to announce the launch of a groundbreaking tool that aims
> to make Apache Spark more user-friendly and accessible - the English SDK
> <https://github.com/databrickslabs/pyspark-ai/>. Powered by the
> application of Generative AI, the English SDK
> <https://github.com/databrickslabs/pyspark-ai/> allows you to execute
> complex tasks with simple English instructions. This exciting news was 
> announced
> recently at the Data+AI Summit
> <https://www.youtube.com/watch?v=yj7XlTB1Jvc&t=511s> and also introduced
> through a detailed blog post
> <https://www.databricks.com/blog/introducing-english-new-programming-language-apache-spark>
> .
>
> Now, we need your invaluable feedback and contributions. The aim of the
> English SDK is not only to simplify and enrich your Apache Spark experience
> but also to grow with the community. We're calling upon Spark developers
> and users to explore this innovative tool, offer your insights, provide
> feedback, and contribute to its evolution.
>
> You can find more details about the SDK and usage examples on the GitHub
> repository https://github.com/databrickslabs/pyspark-ai/. If you have any
> feedback or suggestions, please feel free to open an issue directly on the
> repository. We are actively monitoring the issues and value your insights.
>
> We also welcome pull requests and are eager to see how you might extend or
> refine this tool. Let's come together to continue making Apache Spark more
> approachable and user-friendly.
>
> Thank you in advance for your attention and involvement. We look forward
> to hearing your thoughts and seeing your contributions!
>
> Best,
> Gengliang Wang
>


Re: Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Hyukjin Kwon
The demo was really amazing.

On Tue, 4 Jul 2023 at 09:17, Farshid Ashouri 
wrote:

> This is wonderful news!
>
> On Tue, 4 Jul 2023 at 01:14, Gengliang Wang  wrote:
>
>> Dear Apache Spark community,
>>
>> We are delighted to announce the launch of a groundbreaking tool that
>> aims to make Apache Spark more user-friendly and accessible - the
>> English SDK <https://github.com/databrickslabs/pyspark-ai/>. Powered by
>> the application of Generative AI, the English SDK
>> <https://github.com/databrickslabs/pyspark-ai/> allows you to execute
>> complex tasks with simple English instructions. This exciting news was 
>> announced
>> recently at the Data+AI Summit
>> <https://www.youtube.com/watch?v=yj7XlTB1Jvc&t=511s> and also introduced
>> through a detailed blog post
>> <https://www.databricks.com/blog/introducing-english-new-programming-language-apache-spark>
>> .
>>
>> Now, we need your invaluable feedback and contributions. The aim of the
>> English SDK is not only to simplify and enrich your Apache Spark experience
>> but also to grow with the community. We're calling upon Spark developers
>> and users to explore this innovative tool, offer your insights, provide
>> feedback, and contribute to its evolution.
>>
>> You can find more details about the SDK and usage examples on the GitHub
>> repository https://github.com/databrickslabs/pyspark-ai/. If you have
>> any feedback or suggestions, please feel free to open an issue directly on
>> the repository. We are actively monitoring the issues and value your
>> insights.
>>
>> We also welcome pull requests and are eager to see how you might extend
>> or refine this tool. Let's come together to continue making Apache Spark
>> more approachable and user-friendly.
>>
>> Thank you in advance for your attention and involvement. We look forward
>> to hearing your thoughts and seeing your contributions!
>>
>> Best,
>> Gengliang Wang
>>
> --
>
>
> *Farshid Ashouri*,
> Senior Vice President,
> J.P. Morgan & Chase Co.
> +44 7932 650 788
>
>


Re: Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Farshid Ashouri
This is wonderful news!

On Tue, 4 Jul 2023 at 01:14, Gengliang Wang  wrote:

> Dear Apache Spark community,
>
> We are delighted to announce the launch of a groundbreaking tool that aims
> to make Apache Spark more user-friendly and accessible - the English SDK
> <https://github.com/databrickslabs/pyspark-ai/>. Powered by the
> application of Generative AI, the English SDK
> <https://github.com/databrickslabs/pyspark-ai/> allows you to execute
> complex tasks with simple English instructions. This exciting news was 
> announced
> recently at the Data+AI Summit
> <https://www.youtube.com/watch?v=yj7XlTB1Jvc&t=511s> and also introduced
> through a detailed blog post
> <https://www.databricks.com/blog/introducing-english-new-programming-language-apache-spark>
> .
>
> Now, we need your invaluable feedback and contributions. The aim of the
> English SDK is not only to simplify and enrich your Apache Spark experience
> but also to grow with the community. We're calling upon Spark developers
> and users to explore this innovative tool, offer your insights, provide
> feedback, and contribute to its evolution.
>
> You can find more details about the SDK and usage examples on the GitHub
> repository https://github.com/databrickslabs/pyspark-ai/. If you have any
> feedback or suggestions, please feel free to open an issue directly on the
> repository. We are actively monitoring the issues and value your insights.
>
> We also welcome pull requests and are eager to see how you might extend or
> refine this tool. Let's come together to continue making Apache Spark more
> approachable and user-friendly.
>
> Thank you in advance for your attention and involvement. We look forward
> to hearing your thoughts and seeing your contributions!
>
> Best,
> Gengliang Wang
>
-- 


*Farshid Ashouri*,
Senior Vice President,
J.P. Morgan & Chase Co.
+44 7932 650 788


Introducing English SDK for Apache Spark - Seeking Your Feedback and Contributions

2023-07-03 Thread Gengliang Wang
Dear Apache Spark community,

We are delighted to announce the launch of a groundbreaking tool that aims
to make Apache Spark more user-friendly and accessible - the English SDK
<https://github.com/databrickslabs/pyspark-ai/>. Powered by the application
of Generative AI, the English SDK
<https://github.com/databrickslabs/pyspark-ai/> allows you to execute
complex tasks with simple English instructions. This exciting news was
announced
recently at the Data+AI Summit
<https://www.youtube.com/watch?v=yj7XlTB1Jvc&t=511s> and also introduced
through a detailed blog post
<https://www.databricks.com/blog/introducing-english-new-programming-language-apache-spark>
.

Now, we need your invaluable feedback and contributions. The aim of the
English SDK is not only to simplify and enrich your Apache Spark experience
but also to grow with the community. We're calling upon Spark developers
and users to explore this innovative tool, offer your insights, provide
feedback, and contribute to its evolution.

You can find more details about the SDK and usage examples on the GitHub
repository https://github.com/databrickslabs/pyspark-ai/. If you have any
feedback or suggestions, please feel free to open an issue directly on the
repository. We are actively monitoring the issues and value your insights.

We also welcome pull requests and are eager to see how you might extend or
refine this tool. Let's come together to continue making Apache Spark more
approachable and user-friendly.

Thank you in advance for your attention and involvement. We look forward to
hearing your thoughts and seeing your contributions!

Best,
Gengliang Wang


Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-24 Thread yangjie01
Thanks Dongjoon ~

On 2023/6/24 10:29, "L. C. Hsieh"  wrote:


Thanks Dongjoon!


On Fri, Jun 23, 2023 at 7:10 PM Hyukjin Kwon  wrote:
>
> Thanks!
>
> On Sat, Jun 24, 2023 at 11:01 AM Mridul Muralidharan  wrote:
>>
>>
>> Thanks Dongjoon !
>>
>> Regards,
>> Mridul
>>
>> On Fri, Jun 23, 2023 at 6:58 PM Dongjoon Hyun  wrote:
>>>
>>> We are happy to announce the availability of Apache Spark 3.4.1!
>>>
>>> Spark 3.4.1 is a maintenance release containing stability fixes. This
>>> release is based on the branch-3.4 maintenance branch of Spark. We strongly
>>> recommend all 3.4 users to upgrade to this stable release.
>>>
>>> To download Spark 3.4.1, head over to the download page:
>>> https://spark.apache.org/downloads.html
>>>
>>> To view the release notes:
>>> https://spark.apache.org/releases/spark-release-3-4-1.html
>>>
>>> We would like to acknowledge all community members for contributing to this
>>> release. This release would not have been possible without you.
>>>
>>>
>>> Dongjoon Hyun


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org






-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re:[ANNOUNCE] Apache Spark 3.4.1 released

2023-06-24 Thread beliefer
Thanks, Dongjoon Hyun!
Congratulations too!







At 2023-06-24 07:57:05, "Dongjoon Hyun"  wrote:

We are happy to announce the availability of Apache Spark 3.4.1!

Spark 3.4.1 is a maintenance release containing stability fixes. This
release is based on the branch-3.4 maintenance branch of Spark. We strongly
recommend all 3.4 users to upgrade to this stable release.

To download Spark 3.4.1, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-4-1.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Dongjoon Hyun


Apache Spark with watermark - processing data different LogTypes in same kafka topic

2023-06-24 Thread karan alang
Hello All -

I'm using Apache Spark Structured Streaming to read data from a Kafka topic
and do some processing. I'm using a watermark to account for late-arriving
records, and the code works fine.

Here is the working(sample) code:
```

from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col, to_timestamp, window, max, expr
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, IntegerType

spark = SparkSession \
.builder \
.master("local[3]") \
.appName("Sliding Window Demo") \
.config("spark.streaming.stopGracefullyOnShutdown", "true") \
.config("spark.sql.shuffle.partitions", 1) \
.getOrCreate()


stock_schema = StructType([
StructField("LogType", StringType()),
StructField("CreatedTime", StringType()),
StructField("Type", StringType()),
StructField("Amount", IntegerType()),
StructField("BrokerCode", StringType())
])

kafka_df = spark.readStream \
.format("kafka") \
.option("kafka.bootstrap.servers", "localhost:9092") \
.option("subscribe", "trades") \
.option("startingOffsets", "earliest") \
.load()

value_df = kafka_df.select(from_json(col("value").cast("string"),
stock_schema).alias("value"))

trade_df = value_df.select("value.*") \
.withColumn("CreatedTime", to_timestamp(col("CreatedTime"),
"-MM-dd HH:mm:ss")) \
.withColumn("Buy", expr("case when Type == 'BUY' then Amount
else 0 end")) \
.withColumn("Sell", expr("case when Type == 'SELL' then Amount
else 0 end"))


window_agg_df = trade_df \
.withWatermark("CreatedTime", "10 minute") \
.groupBy(window(col("CreatedTime"), "10 minute")) \
.agg({"Buy":"sum",
"Sell":"sum"}).withColumnRenamed("sum(Buy)",
"TotalBuy").withColumnRenamed("sum(Sell)", "TotalSell")

output_df = window_agg_df.select("window.start", "window.end",
"TotalBuy", "TotalSell")

window_query = output_df.writeStream \
.format("console") \
.outputMode("append") \
.option("checkpointLocation", "chk-point-dir-mar28") \
.trigger(processingTime="30 second") \
.start()

window_query.awaitTermination()


```

Currently, I'm processing a single LogType; the requirement is to process
multiple LogTypes in the same flow. The LogTypes will be config-driven (not
hard-coded). The objective is to have generic code that can process all
LogTypes.

As an example, for LogType X, I will need to group by columns col1 and col2
and get the sum of the values 'sent' & 'received'. For LogType Y, the groupBy
columns will remain the same, but the sum will be on column col3 instead.

Without the watermark, I can look at the LogType and do the processing in batch
mode (using foreachBatch). However, with the watermark, I'm unable to figure
out how to process based on LogType.

Any inputs on this ?

Here is the stackoverflow for this

https://stackoverflow.com/questions/76547349/apache-spark-with-watermark-processing-data-different-logtypes-in-same-kafka-t

tia!


Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread L. C. Hsieh
Thanks Dongjoon!

On Fri, Jun 23, 2023 at 7:10 PM Hyukjin Kwon  wrote:
>
> Thanks!
>
> On Sat, Jun 24, 2023 at 11:01 AM Mridul Muralidharan  wrote:
>>
>>
>> Thanks Dongjoon !
>>
>> Regards,
>> Mridul
>>
>> On Fri, Jun 23, 2023 at 6:58 PM Dongjoon Hyun  wrote:
>>>
>>> We are happy to announce the availability of Apache Spark 3.4.1!
>>>
>>> Spark 3.4.1 is a maintenance release containing stability fixes. This
>>> release is based on the branch-3.4 maintenance branch of Spark. We strongly
>>> recommend all 3.4 users to upgrade to this stable release.
>>>
>>> To download Spark 3.4.1, head over to the download page:
>>> https://spark.apache.org/downloads.html
>>>
>>> To view the release notes:
>>> https://spark.apache.org/releases/spark-release-3-4-1.html
>>>
>>> We would like to acknowledge all community members for contributing to this
>>> release. This release would not have been possible without you.
>>>
>>>
>>> Dongjoon Hyun

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread Hyukjin Kwon
Thanks!

On Sat, Jun 24, 2023 at 11:01 AM Mridul Muralidharan 
wrote:

>
> Thanks Dongjoon !
>
> Regards,
> Mridul
>
> On Fri, Jun 23, 2023 at 6:58 PM Dongjoon Hyun  wrote:
>
>> We are happy to announce the availability of Apache Spark 3.4.1!
>>
>> Spark 3.4.1 is a maintenance release containing stability fixes. This
>> release is based on the branch-3.4 maintenance branch of Spark. We
>> strongly
>> recommend all 3.4 users to upgrade to this stable release.
>>
>> To download Spark 3.4.1, head over to the download page:
>> https://spark.apache.org/downloads.html
>>
>> To view the release notes:
>> https://spark.apache.org/releases/spark-release-3-4-1.html
>>
>> We would like to acknowledge all community members for contributing to
>> this
>> release. This release would not have been possible without you.
>>
>>
>> Dongjoon Hyun
>>
>


Re: [ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread Mridul Muralidharan
Thanks Dongjoon !

Regards,
Mridul

On Fri, Jun 23, 2023 at 6:58 PM Dongjoon Hyun  wrote:

> We are happy to announce the availability of Apache Spark 3.4.1!
>
> Spark 3.4.1 is a maintenance release containing stability fixes. This
> release is based on the branch-3.4 maintenance branch of Spark. We strongly
> recommend all 3.4 users to upgrade to this stable release.
>
> To download Spark 3.4.1, head over to the download page:
> https://spark.apache.org/downloads.html
>
> To view the release notes:
> https://spark.apache.org/releases/spark-release-3-4-1.html
>
> We would like to acknowledge all community members for contributing to this
> release. This release would not have been possible without you.
>
>
> Dongjoon Hyun
>


[ANNOUNCE] Apache Spark 3.4.1 released

2023-06-23 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.4.1!

Spark 3.4.1 is a maintenance release containing stability fixes. This
release is based on the branch-3.4 maintenance branch of Spark. We strongly
recommend all 3.4 users to upgrade to this stable release.

To download Spark 3.4.1, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-4-1.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Dongjoon Hyun


Re: Apache Spark not reading UTC timestamp from MongoDB correctly

2023-06-08 Thread Enrico Minack
Sean is right, casting timestamps to strings (which is what show() does) 
uses the local timezone, either the Java default zone `user.timezone`, 
the Spark default zone `spark.sql.session.timeZone` or the default 
DataFrameWriter zone `timeZone` (when writing to file).


You say you are in PST, which is UTC - 8 hours. But I think this 
currently observes daylight saving, so PDT, which is UTC - 7 hours.


Then, your UTC timestamp is correctly displayed in local PDT time. Try 
changing the above settings to display in different timezones. Inspecting 
the underlying long value as suggested by Sean is best practice to get 
hold of the true timestamp.
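For illustration, here is a minimal spark-shell sketch (it reuses the epoch
value quoted in this thread instead of reading from MongoDB) showing that only
the rendering changes with the session time zone:

```
// Assumes a running SparkSession named `spark` (e.g. in spark-shell).
import spark.implicits._
import org.apache.spark.sql.functions.col

// 1683527400 seconds since the epoch == 2023-05-08 06:30:00 UTC
val df = Seq(1683527400L).toDF("timeslot")
  .withColumn("timeslot_date", col("timeslot").cast("timestamp"))

spark.conf.set("spark.sql.session.timeZone", "UTC")
df.show(false)   // timeslot_date rendered as 2023-05-08 06:30:00

spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")
df.show(false)   // same instant rendered as 2023-05-07 23:30:00 (PDT)

// The underlying value does not change with the time zone:
df.select(col("timeslot_date").cast("long")).show()   // 1683527400
```

The same reasoning applies when the timestamp comes from MongoDB: the stored
instant is a single epoch value, and only its string rendering depends on the
time zone settings above.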


Cheers,
Enrico


Am 09.06.23 um 00:53 schrieb Sean Owen:
You sure it is not just that it's displaying in your local TZ? Check 
the actual value as a long for example. That is likely the same time.


On Thu, Jun 8, 2023, 5:50 PM karan alang  wrote:

ref :

https://stackoverflow.com/questions/76436159/apache-spark-not-reading-utc-timestamp-from-mongodb-correctly

Hello All,
I've data stored in MongoDB collection and the timestamp column is
not being read by Apache Spark correctly. I'm running Apache Spark
on GCP Dataproc.

Here is sample data :

-

In Mongo :

timeslot  |timeslot_date         |
----------+----------------------+
1683527400|{2023-05-08T06:30:00Z}|

When I use pyspark to read this :

timeslot  |timeslot_date      |
----------+-------------------+
1683527400|2023-05-07 23:30:00|

My understanding is, data in Mongo is in UTC format i.e.
2023-05-08T06:30:00Z is in UTC format. I'm in PST timezone. I'm
not clear why spark is reading it a different timezone format
(neither PST nor UTC) Note - it is not reading it as PST timezone,
if it was doing that it would advance the time by 7 hours, instead
it is doing the opposite.

Where is the default timezone format taken from, when Spark is
reading data from MongoDB ?

Any ideas on this ?

tia!

|




Re: Apache Spark not reading UTC timestamp from MongoDB correctly

2023-06-08 Thread Sean Owen
You sure it is not just that it's displaying in your local TZ? Check the
actual value as a long for example. That is likely the same time.

On Thu, Jun 8, 2023, 5:50 PM karan alang  wrote:

> ref :
> https://stackoverflow.com/questions/76436159/apache-spark-not-reading-utc-timestamp-from-mongodb-correctly
>
> Hello All,
> I've data stored in MongoDB collection and the timestamp column is not
> being read by Apache Spark correctly. I'm running Apache Spark on GCP
> Dataproc.
>
> Here is sample data :
>
> -
>
> In Mongo :
>
> timeslot  |timeslot_date         |
> ----------+----------------------+
> 1683527400|{2023-05-08T06:30:00Z}|
>
>
> When I use pyspark to read this  :
>
> timeslot  |timeslot_date      |
> ----------+-------------------+
> 1683527400|2023-05-07 23:30:00|
>
> -
>
> My understanding is, data in Mongo is in UTC format i.e. 2023-05-08T06:30:00Z 
> is in UTC format. I'm in PST timezone. I'm not clear why spark is reading it 
> a different timezone format (neither PST nor UTC) Note - it is not reading it 
> as PST timezone, if it was doing that it would advance the time by 7 hours, 
> instead it is doing the opposite.
>
> Where is the default timezone format taken from, when Spark is reading data 
> from MongoDB ?
>
> Any ideas on this ?
>
> tia!
>
>
>
>
>


Apache Spark not reading UTC timestamp from MongoDB correctly

2023-06-08 Thread karan alang
ref :
https://stackoverflow.com/questions/76436159/apache-spark-not-reading-utc-timestamp-from-mongodb-correctly

Hello All,
I have data stored in a MongoDB collection, and the timestamp column is not
being read by Apache Spark correctly. I'm running Apache Spark on GCP
Dataproc.

Here is sample data :

-

In Mongo :

timeslot  |timeslot_date         |
----------+----------------------+
1683527400|{2023-05-08T06:30:00Z}|


When I use pyspark to read this  :

timeslot  |timeslot_date      |
----------+-------------------+
1683527400|2023-05-07 23:30:00|

-

My understanding is that the data in Mongo is in UTC format, i.e.
2023-05-08T06:30:00Z is UTC. I'm in the PST timezone. I'm not
clear why Spark is reading it in a different timezone format (neither PST
nor UTC). Note - it is not reading it as the PST timezone; if it was doing
that it would advance the time by 7 hours, whereas instead it is doing the
opposite.

Where is the default timezone format taken from, when Spark is reading
data from MongoDB ?

Any ideas on this ?

tia!


Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-05-02 Thread Trường Trần Phan An
Hi all,

I have written a program and overridden two events onStageCompleted and
onTaskEnd. However, these two events do not provide information on when a
Task/Stage is completed.

What I want to know is which Task corresponds to which stage of a DAG (the
Spark history server only tells me how many stages a Job has and how many
Jobs a Stage has).

Can I print out the edges of the Tasks according to the DAGScheduler?
Below is the program I have written:

import org.apache.spark.rdd.RDD
import org.apache.spark.TaskContext
import org.apache.spark.sql.SparkSession
import org.apache.spark.{SparkConf, SparkContext, TaskEndReason}
import org.apache.spark.scheduler.{SparkListener,
SparkListenerEnvironmentUpdate, SparkListenerStageCompleted,
SparkListenerTaskEnd}
import scala.collection.mutable
import org.apache.spark.sql.execution.SparkPlan

class CustomListener extends SparkListener {
  override def onStageCompleted(stageCompleted:
SparkListenerStageCompleted): Unit = {
val rdds = stageCompleted.stageInfo.rddInfos
val stageInfo = stageCompleted.stageInfo
println(s"Stage ${stageInfo.stageId}")
println(s"Number of tasks: ${stageInfo.numTasks}")

stageInfo.rddInfos.foreach { rddInfo =>
  println(s"RDD ${rddInfo.id} has ${rddInfo.numPartitions} partitions.")
}
  }

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
val stageId = taskEnd.stageId
val stageAttemptId = taskEnd.stageAttemptId
val taskInfo = taskEnd.taskInfo
println(s"Task: ${taskInfo.taskId}; Stage: $stageId; Duration:
${taskInfo.duration} ms.")
  }

  def wordCount(sc: SparkContext, inputPath: String): Unit = {
val data = sc.textFile(inputPath)
val flatMap = data.flatMap(line => line.split(","))
val map = flatMap.map(word => (word, 1))
val reduceByKey = map.reduceByKey(_ + _)
reduceByKey.foreach(println)
  }
}

object Scenario1 {
  def main(args: Array[String]): Unit = {

val appName = "scenario1"
val spark = SparkSession.builder()
  .master("local[*]")
  .appName(appName)
  .getOrCreate()

val sc = spark.sparkContext
val sparkListener = new CustomListener()
sc.addSparkListener(sparkListener)
val inputPath = "s3a://data-join/file00"
sparkListener.wordCount(sc, inputPath)
sc.stop()

  }
}

Best regards,

Truong


On Sun, 16 Apr 2023 at 09:32, Trường Trần Phan An <
truong...@vlute.edu.vn> wrote:

> Dear Jacek Laskowski,
>
> Thank you for your guide. I will try it out for my problem.
>
> Best regards,
> Truong
>
>
> On Fri, 14 Apr 2023 at 21:00, Jacek Laskowski 
> wrote:
>
>> Hi,
>>
>> Start with intercepting stage completions
>> using SparkListenerStageCompleted [1]. That's Spark Core (jobs, stages and
>> tasks).
>>
>> Go up the execution chain to Spark SQL
>> with SparkListenerSQLExecutionStart [2] and SparkListenerSQLExecutionEnd
>> [3], and correlate infos.
>>
>> You may want to look at how web UI works under the covers to collect all
>> the information. Start from SQLTab that should give you what is displayed
>> (that should give you then what's needed and how it's collected).
>>
>> [1]
>> https://github.com/apache/spark/blob/8cceb3946bdfa5ceac0f2b4fe6a7c43eafb76d59/core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala#L46
>> [2]
>> https://github.com/apache/spark/blob/24cdae8f3dcfc825c6c0b8ab8aa8505ae194050b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala#L44
>> [3]
>> https://github.com/apache/spark/blob/24cdae8f3dcfc825c6c0b8ab8aa8505ae194050b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala#L60
>> [4]
>> https://github.com/apache/spark/blob/c124037b97538b2656d29ce547b2a42209a41703/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLTab.scala#L24
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> 
>> "The Internals Of" Online Books <https://books.japila.pl/>
>> Follow me on https://twitter.com/jaceklaskowski
>>
>> <https://twitter.com/jaceklaskowski>
>>
>>
>> On Thu, Apr 13, 2023 at 10:40 AM Trường Trần Phan An <
>> truong...@vlute.edu.vn> wrote:
>>
>>> Hi,
>>>
>>> Can you give me more details or give me a tutorial on "You'd have to
>>> intercept execution events and correlate them. Not an easy task yet doable"
>>>
>>> Thank
>>>
>>> On Wed, 12 Apr 2023 at 21:04, Jacek Laskowski <
>>> ja...@japila.pl> wrote:
>>>
>>>> Hi,
>>>>
>>>> tl;dr it's not possible to "reverse-engineer" tasks to fun

CVE-2023-32007: Apache Spark: Shell command injection via Spark UI

2023-05-02 Thread Arnout Engelen
Severity: important

Affected versions:

- Apache Spark 3.1.1 before 3.2.2

Description:

** UNSUPPORTED WHEN ASSIGNED ** The Apache Spark UI offers the possibility to 
enable ACLs via the configuration option spark.acls.enable. With an 
authentication filter, this checks whether a user has access permissions to 
view or modify the application. If ACLs are enabled, a code path in 
HttpSecurityFilter can allow someone to perform impersonation by providing an 
arbitrary user name. A malicious user might then be able to reach a permission 
check function that will ultimately build a Unix shell command based on their 
input, and execute it. This will result in arbitrary shell command execution as 
the user Spark is currently running as. This issue was disclosed earlier as 
CVE-2022-33891, but incorrectly claimed version 3.1.3 (which has since gone 
EOL) would not be affected.

NOTE: This vulnerability only affects products that are no longer supported by 
the maintainer.

Users are recommended to upgrade to a supported version of Apache Spark, such 
as version 3.4.0.

Credit:

Sven Krewitt, Flashpoint (reporter)

References:

https://www.cve.org/CVERecord?id=CVE-2022-33891
https://spark.apache.org/security.html
https://spark.apache.org/
https://www.cve.org/CVERecord?id=CVE-2023-32007


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



CVE-2023-22946: Apache Spark proxy-user privilege escalation from malicious configuration class

2023-04-15 Thread Sean R. Owen
Description:

In Apache Spark versions prior to 3.4.0, applications using spark-submit can 
specify a 'proxy-user' to run as, limiting privileges. The application can 
execute code with the privileges of the submitting user, however, by providing 
malicious configuration-related classes on the classpath. This affects 
architectures relying on proxy-user, for example those using Apache Livy to 
manage submitted applications.

This issue is being tracked as SPARK-41958 

Work Arounds:

Update to Apache Spark 3.4.0 or later, and ensure that 
spark.submit.proxyUser.allowCustomClasspathInClusterMode is set to its default 
of "false", and is not overridden by submitted applications.

Credit:

Hideyuki Furue (finder)
Yi Wu (Databricks) (remediation developer)

References:

https://spark.apache.org/
https://www.cve.org/CVERecord?id=CVE-2023-22946
https://issues.apache.org/jira/browse/SPARK-41958


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-04-14 Thread Jacek Laskowski
Hi,

Start with intercepting stage completions using SparkListenerStageCompleted
[1]. That's Spark Core (jobs, stages and tasks).

Go up the execution chain to Spark SQL with SparkListenerSQLExecutionStart
[2] and SparkListenerSQLExecutionEnd [3], and correlate infos.

You may want to look at how web UI works under the covers to collect all
the information. Start from SQLTab that should give you what is displayed
(that should give you then what's needed and how it's collected).

[1]
https://github.com/apache/spark/blob/8cceb3946bdfa5ceac0f2b4fe6a7c43eafb76d59/core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala#L46
[2]
https://github.com/apache/spark/blob/24cdae8f3dcfc825c6c0b8ab8aa8505ae194050b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala#L44
[3]
https://github.com/apache/spark/blob/24cdae8f3dcfc825c6c0b8ab8aa8505ae194050b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala#L60
[4]
https://github.com/apache/spark/blob/c124037b97538b2656d29ce547b2a42209a41703/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLTab.scala#L24
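For a rough, untested sketch of that correlation (assuming only the standard
listener API plus the internal job property "spark.sql.execution.id" that
Spark SQL sets on jobs launched for an execution), something like this:

```
// Correlates SQL executions -> jobs -> stages using listener events only.
import scala.collection.concurrent.TrieMap
import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent, SparkListenerJobStart, SparkListenerStageCompleted}
import org.apache.spark.sql.execution.ui.{SparkListenerSQLExecutionEnd, SparkListenerSQLExecutionStart}

class SqlStageCorrelator extends SparkListener {
  private val stageToJob     = TrieMap.empty[Int, Int]     // stageId -> jobId
  private val jobToExecution = TrieMap.empty[Int, String]  // jobId   -> SQL execution id
  private val executionDescr = TrieMap.empty[Long, String] // execId  -> description

  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    val execId = Option(jobStart.properties)
      .flatMap(p => Option(p.getProperty("spark.sql.execution.id")))
    execId.foreach(id => jobToExecution(jobStart.jobId) = id)
    jobStart.stageIds.foreach(stageId => stageToJob(stageId) = jobStart.jobId)
  }

  override def onStageCompleted(stage: SparkListenerStageCompleted): Unit = {
    val stageId = stage.stageInfo.stageId
    val execId  = stageToJob.get(stageId).flatMap(jobToExecution.get)
    val descr   = execId.flatMap(id => executionDescr.get(id.toLong)).getOrElse("<non-SQL>")
    println(s"stage $stageId (${stage.stageInfo.name}) belongs to SQL execution " +
      s"${execId.getOrElse("-")}: $descr")
  }

  override def onOtherEvent(event: SparkListenerEvent): Unit = event match {
    case e: SparkListenerSQLExecutionStart => executionDescr(e.executionId) = e.description
    case e: SparkListenerSQLExecutionEnd   => println(s"SQL execution ${e.executionId} finished")
    case _                                 => // ignore everything else
  }
}
```

Register it with spark.sparkContext.addSparkListener(new SqlStageCorrelator())
or via spark.extraListeners, and add onTaskEnd handling if task-level
correlation is needed.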

Pozdrawiam,
Jacek Laskowski

"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski

<https://twitter.com/jaceklaskowski>


On Thu, Apr 13, 2023 at 10:40 AM Trường Trần Phan An 
wrote:

> Hi,
>
> Can you give me more details or give me a tutorial on "You'd have to
> intercept execution events and correlate them. Not an easy task yet doable"
>
> Thank
>
> On Wed, 12 Apr 2023 at 21:04, Jacek Laskowski 
> wrote:
>
>> Hi,
>>
>> tl;dr it's not possible to "reverse-engineer" tasks to functions.
>>
>> In essence, Spark SQL is an abstraction layer over RDD API that's made up
>> of partitions and tasks. Tasks are Scala functions (possibly with some
>> Python for PySpark). A simple-looking high-level operator like
>> DataFrame.join can end up with multiple RDDs, each with a set of partitions
>> (and hence tasks). What the tasks do is an implementation detail that you'd
>> have to know about by reading the source code of Spark SQL that produces
>> the "bytecode".
>>
>> Just looking at the DAG or the tasks screenshots won't give you that
>> level of detail. You'd have to intercept execution events and correlate
>> them. Not an easy task yet doable. HTH.
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> 
>> "The Internals Of" Online Books <https://books.japila.pl/>
>> Follow me on https://twitter.com/jaceklaskowski
>>
>>
>>
>> On Tue, Apr 11, 2023 at 6:53 PM Trường Trần Phan An <
>> truong...@vlute.edu.vn> wrote:
>>
>>> Hi all,
>>>
>>> I am conducting a study comparing the execution time of Bloom Filter
>>> Join operation on two environments: Apache Spark Cluster and Apache Spark.
>>> I have compared the overall time of the two environments, but I want to
>>> compare specific "tasks on each stage" to see which computation has the
>>> most significant difference.
>>>
>>> I have taken a screenshot of the DAG of Stage 0 and the list of tasks
>>> executed in Stage 0.
>>> - DAG.png
>>> - Task.png
>>>
>>> *I have questions:*
>>> 1. Can we determine which tasks are responsible for executing each step
>>> scheduled on the DAG during the processing?
>>> 2. Is it possible to know the function of each task (e.g., what is task
>>> ID 0 responsible for? What is task ID 1 responsible for? ... )?
>>>
>>> Best regards,
>>> Truong
>>>
>>> -
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>


[ANNOUNCE] Apache Spark 3.2.4 released

2023-04-13 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.2.4!

Spark 3.2.4 is a maintenance release containing stability fixes. This
release is based on the branch-3.2 maintenance branch of Spark. We strongly
recommend all 3.2 users to upgrade to this stable release.

To download Spark 3.2.4, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-2-4.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Dongjoon Hyun


Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-04-13 Thread Trường Trần Phan An
Hi,

Can you give me more details or point me to a tutorial on "You'd have to
intercept execution events and correlate them. Not an easy task yet doable"?

Thank you

On Wed, Apr 12, 2023 at 21:04 Jacek Laskowski 
wrote:

> Hi,
>
> tl;dr it's not possible to "reverse-engineer" tasks to functions.
>
> In essence, Spark SQL is an abstraction layer over RDD API that's made up
> of partitions and tasks. Tasks are Scala functions (possibly with some
> Python for PySpark). A simple-looking high-level operator like
> DataFrame.join can end up with multiple RDDs, each with a set of partitions
> (and hence tasks). What the tasks do is an implementation detail that you'd
> have to know about by reading the source code of Spark SQL that produces
> the "bytecode".
>
> Just looking at the DAG or the tasks screenshots won't give you that level
> of detail. You'd have to intercept execution events and correlate them. Not
> an easy task yet doable. HTH.
>
> Pozdrawiam,
> Jacek Laskowski
> 
> "The Internals Of" Online Books <https://books.japila.pl/>
> Follow me on https://twitter.com/jaceklaskowski
>
>
>
> On Tue, Apr 11, 2023 at 6:53 PM Trường Trần Phan An <
> truong...@vlute.edu.vn> wrote:
>
>> Hi all,
>>
>> I am conducting a study comparing the execution time of Bloom Filter Join
>> operation on two environments: Apache Spark Cluster and Apache Spark. I
>> have compared the overall time of the two environments, but I want to
>> compare specific "tasks on each stage" to see which computation has the
>> most significant difference.
>>
>> I have taken a screenshot of the DAG of Stage 0 and the list of tasks
>> executed in Stage 0.
>> - DAG.png
>> - Task.png
>>
>> *I have questions:*
>> 1. Can we determine which tasks are responsible for executing each step
>> scheduled on the DAG during the processing?
>> 2. Is it possible to know the function of each task (e.g., what is task
>> ID 0 responsible for? What is task ID 1 responsible for? ... )?
>>
>> Best regards,
>> Truong
>>
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-04-12 Thread Maytas Monsereenusorn
Hi,

I was wondering: if it's not possible to map tasks back to functions, is it
still possible to easily figure out from the UI which job and stage completed
which part of the query?
For example, in the SQL tab of the Spark UI, I am able to see the query and
the Job IDs for that query. However, when looking at the details for the
Query, how do I know which part of the execution plan was completed by
which job/stage?

Thanks,
Maytas


On Wed, Apr 12, 2023 at 7:06 AM Jacek Laskowski  wrote:

> Hi,
>
> tl;dr it's not possible to "reverse-engineer" tasks to functions.
>
> In essence, Spark SQL is an abstraction layer over RDD API that's made up
> of partitions and tasks. Tasks are Scala functions (possibly with some
> Python for PySpark). A simple-looking high-level operator like
> DataFrame.join can end up with multiple RDDs, each with a set of partitions
> (and hence tasks). What the tasks do is an implementation detail that you'd
> have to know about by reading the source code of Spark SQL that produces
> the "bytecode".
>
> Just looking at the DAG or the tasks screenshots won't give you that level
> of detail. You'd have to intercept execution events and correlate them. Not
> an easy task yet doable. HTH.
>
> Pozdrawiam,
> Jacek Laskowski
> 
> "The Internals Of" Online Books <https://books.japila.pl/>
> Follow me on https://twitter.com/jaceklaskowski
>
>
>
> On Tue, Apr 11, 2023 at 6:53 PM Trường Trần Phan An <
> truong...@vlute.edu.vn> wrote:
>
>> Hi all,
>>
>> I am conducting a study comparing the execution time of Bloom Filter Join
>> operation on two environments: Apache Spark Cluster and Apache Spark. I
>> have compared the overall time of the two environments, but I want to
>> compare specific "tasks on each stage" to see which computation has the
>> most significant difference.
>>
>> I have taken a screenshot of the DAG of Stage 0 and the list of tasks
>> executed in Stage 0.
>> - DAG.png
>> - Task.png
>>
>> *I have questions:*
>> 1. Can we determine which tasks are responsible for executing each step
>> scheduled on the DAG during the processing?
>> 2. Is it possible to know the function of each task (e.g., what is task
>> ID 0 responsible for? What is task ID 1 responsible for? ... )?
>>
>> Best regards,
>> Truong
>>
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-04-12 Thread Jacek Laskowski
Hi,

tl;dr it's not possible to "reverse-engineer" tasks to functions.

In essence, Spark SQL is an abstraction layer over RDD API that's made up
of partitions and tasks. Tasks are Scala functions (possibly with some
Python for PySpark). A simple-looking high-level operator like
DataFrame.join can end up with multiple RDDs, each with a set of partitions
(and hence tasks). What the tasks do is an implementation detail that you'd
have to know about by reading the source code of Spark SQL that produces
the "bytecode".

Just looking at the DAG or the tasks screenshots won't give you that level
of detail. You'd have to intercept execution events and correlate them. Not
an easy task yet doable. HTH.
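
One way to see this layering is a minimal local sketch (the data, the column
names and the disabled broadcast-join threshold are illustrative; the
threshold is only lowered to force a shuffle so that several RDDs and stage
boundaries show up):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("join-lineage-sketch")
      .master("local[*]")
      .config("spark.sql.autoBroadcastJoinThreshold", "-1") // force a shuffle join
      .getOrCreate()
    import spark.implicits._

    val left  = Seq((1, "a"), (2, "b")).toDF("id", "l")
    val right = Seq((1, "x"), (3, "y")).toDF("id", "r")

    val joined = left.join(right, "id")

    // The physical plan Spark SQL generates for the high-level join operator.
    joined.explain()

    // The underlying RDD lineage: the shuffle boundaries in this output are
    // where the scheduler cuts stages, and each partition becomes a task.
    println(joined.rdd.toDebugString)

Reading the two outputs side by side shows how a single DataFrame.join expands
into several RDDs, partitions and hence tasks, without telling you which task
did what.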

Pozdrawiam,
Jacek Laskowski

"The Internals Of" Online Books <https://books.japila.pl/>
Follow me on https://twitter.com/jaceklaskowski



On Tue, Apr 11, 2023 at 6:53 PM Trường Trần Phan An 
wrote:

> Hi all,
>
> I am conducting a study comparing the execution time of Bloom Filter Join
> operation on two environments: Apache Spark Cluster and Apache Spark. I
> have compared the overall time of the two environments, but I want to
> compare specific "tasks on each stage" to see which computation has the
> most significant difference.
>
> I have taken a screenshot of the DAG of Stage 0 and the list of tasks
> executed in Stage 0.
> - DAG.png
> - Task.png
>
> *I have questions:*
> 1. Can we determine which tasks are responsible for executing each step
> scheduled on the DAG during the processing?
> 2. Is it possible to know the function of each task (e.g., what is task ID
> 0 responsible for? What is task ID 1 responsible for? ... )?
>
> Best regards,
> Truong
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org


Re: Help me learn about JOB TASK and DAG in Apache Spark

2023-04-01 Thread Mich Talebzadeh
Good stuff Khalid.

I have created a section in the Apache Spark Community Slack called
spark-foundation: spark-foundation - Apache Spark Community - Slack
<https://app.slack.com/client/T04URTRBZ1R/C051CL5T1KL/thread/C0501NBTNQG-1680132989.091199>

I invite you to add your weblink to that section.

HTH
Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Sat, 1 Apr 2023 at 13:12, Khalid Mammadov 
wrote:

> Hey AN-TRUONG
>
> I have got some articles about this subject that should help.
> E.g.
> https://khalidmammadov.github.io/spark/spark_internals_rdd.html
>
> Also check other Spark Internals on web.
>
> Regards
> Khalid
>
> On Fri, 31 Mar 2023, 16:29 AN-TRUONG Tran Phan, 
> wrote:
>
>> Thank you for the information.
>>
>> I have tracked the Spark history server on port 18080 and the Spark UI on
>> port 4040. The results from these two tools look similar, right?
>>
>> I want to know what each Task ID in the images does (for example, Task ID
>> 0, 1, 3, 4, 5, ...). Is that possible?
>> https://i.stack.imgur.com/Azva4.png
>>
>> Best regards,
>>
>> An - Truong
>>
>>
>> On Fri, Mar 31, 2023 at 9:38 PM Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Are you familiar with the Spark GUI, which runs on port 4040 by default?
>>>
>>> have a look.
>>>
>>> HTH
>>>
>>> Mich Talebzadeh,
>>> Lead Solutions Architect/Engineering Lead
>>> Palantir Technologies Limited
>>>
>>>
>>>view my Linkedin profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>>
>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>
>>>
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>>
>>>
>>>
>>> On Fri, 31 Mar 2023 at 15:15, AN-TRUONG Tran Phan <
>>> tr.phan.tru...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am learning about Apache Spark and want to know the meaning of each
>>>> task created in the jobs recorded in the Spark history server.
>>>>
>>>> For example, the application I wrote creates 17 jobs; job 0 runs for 10
>>>> minutes and contains 2384 small tasks, and I want to understand what
>>>> those 2384 tasks do. Is that possible?
>>>>
>>>> I found a picture of the DAG in the Jobs tab and want to know the
>>>> relationship between the DAG and the tasks. Is that possible
>>>> (specifically, for the attached DAG file and the 2384 tasks below)?
>>>>
>>>> Thank you very much, have a nice day everyone.
>>>>
>>>> Best regards,
>>>>
>>>> An-Trường.
>>>>
>>>> -
>>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>
>>>
>>
>> --
>> Best regards,
>>
>> An Trường.
>>
>

