Hi,
On the issue of Spark shuffle, it is generally accepted that a shuffle often involves some, if not all, of the following:
- Disk I/O
- Data serialization and deserialization
- Network I/O
This is excluding the external shuffle service, and without relying on the configuration options provided by Spark.
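As a rough illustration (a minimal sketch with hypothetical names, assuming a local PySpark session), a wide transformation such as groupBy forces a shuffle, and spark.sql.shuffle.partitions controls how many shuffle partitions are produced:
```
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("shuffle-demo").getOrCreate()

# A wide transformation (groupBy) forces a shuffle: rows are serialized,
# written as shuffle files to local disk, and fetched over the network
# by the tasks of the next stage.
spark.conf.set("spark.sql.shuffle.partitions", "64")  # default is 200

df = spark.range(1_000_000)
counts = df.groupBy((df.id % 100).alias("bucket")).count()
counts.show()
```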
On Mon, 15 May 2023 at 13:11, Faiz Halde wrote:
Hello,
We've been in touch with a few Spark specialists who suggested to us a
potential solution to improve the reliability of our jobs that are shuffle
heavy.
Here is what our setup looks like:
- Spark version: 3.3.1
- Java version: 1.8
- We do not use the external shuffle service
- We use
In my view Spark is behaving as expected.
TL;DR: Every time a DataFrame is reused, branched, or forked, the sequence of
operations that produced it is evaluated again. Use cache or persist to avoid
this behaviour, and unpersist when no longer required; Spark does not
unpersist automatically.
A couple of things. Please see if this works:
-- aggregate an array into a map of element counts
SELECT aggregate(array(1, 2, 3, 4, 5),
                 map('cnt', 0),
                 (acc, x) -> map('cnt', acc.cnt + 1)) AS array_count
thanks
Vijay
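As a minimal sketch of that cache/unpersist lifecycle in PySpark (the path and column names are hypothetical, assuming an active SparkSession `spark`):
```
df = spark.read.parquet("/data/events")   # hypothetical input
df.cache()                                # lazily marks df for caching
df.count()                                # first action materializes the cache
df.groupBy("user_id").count().show()      # reuses the cached data
df.unpersist()                            # Spark does not do this automatically
```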
On 2023/05/05 19:32:04 Yong Zhang wrote:
> Hi, This is on Spark 3.1 environment.
When I run this job in local mode with spark-submit --master local[4]:

spark = SparkSession.builder \
    .appName("tests") \
    .enableHiveSupport() \
    .getOrCreate()
spark.conf.set("spark.sql.adaptive.enabled", "true")

df3.explain(extended=True)
acc -> acc) AS feq_cnt
Here are my questions:
* Is using "map()" above the best way? The "start" structure in this case
should be Map.empty[String, Int], but of course that won't work in pure Spark
SQL, so the best solution I can think of is "map()", and it is
here. It has to read it twice to perform this operation. Hash join (HJ) was
not invented by Spark; it has been around in databases for years, along with
nested loop join (NLJ) and merge join (MJ).
Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom
I don't think Spark validating the file's existence qualifies as an action
in Spark parlance. Sure, there would be an AnalysisException in case the
file is not found at the location provided; however, if you provided a
schema and a valid path, then no job would be triggered.
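A small sketch of that point (schema and path are hypothetical, assuming an active SparkSession `spark`): with an explicit schema the read is purely lazy, whereas without one Spark would run a small job just to infer the schema.
```
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

schema = StructType([
    StructField("id", IntegerType()),
    StructField("name", StringType()),
])
df = spark.read.schema(schema).json("/data/input.json")  # no job triggered yet
df.count()                                               # first action runs a job
```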
You have started with a pandas DF, which won't scale outside of the driver
itself.
Let us put that aside.

df1.to_csv("./df1.csv", index_label = "index")  ## write the dataframe to
the underlying file system

starting with Spark:

df1 = spark.read.csv("./df1.csv", header=True)
I get how using .cache I can ensure that the data from a particular
checkpoint is reused and the computations do not happen again.
However, in my case I am calling just one action. Within the purview of one
action, Spark should not rerun the overlapping parts of the DAG. I do not
understand why the file scan is happening several times. I can easily
mitigate the issue by using window functions and creating all the columns
in one go.
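A hedged sketch of that window-function mitigation (column names are hypothetical): deriving several columns from one window specification lets Spark compute them in a single pass over the input.
```
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# One window spec, several derived columns, one scan of the input.
w = Window.partitionBy("user_id").orderBy("event_time")

df2 = (df
       .withColumn("prev_amount", F.lag("amount").over(w))
       .withColumn("running_total", F.sum("amount").over(w))
       .withColumn("event_rank", F.row_number().over(w)))
```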
When your memory is not sufficient to keep the cached data for your jobs in
two different stages, it might be read twice because Spark might have to
clear the previous cache for other jobs. In those cases, a spill may be
triggered when Spark writes your data from memory to disk.
One way
On Fri, 5 May 2023 at 20:33, Yong Zhang wrote:
Hi, this is on a Spark 3.1 environment.
For some reason, I can ONLY do this in Spark SQL, instead of either a Scala
or PySpark environment.
I want to aggregate an array into a map of element counts, within that array,
but in Spark SQL.
I know that there is an aggregate function available, like
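For what it's worth, one hedged sketch of a pure Spark SQL approach (shown via spark.sql for readability; map_filter and map_concat are Spark 3.0+ built-ins, and the cast gives a correctly typed empty map as the start value):
```
spark.sql("""
  SELECT aggregate(
           array('a', 'b', 'a', 'c', 'b', 'a'),
           cast(map() AS map<string, int>),
           (acc, x) -> map_concat(
             map_filter(acc, (k, v) -> k != x),
             map(x, coalesce(acc[x], 0) + 1)
           )
         ) AS freq_cnt
""").show(truncate=False)
# expected: {a -> 3, b -> 2, c -> 1}
```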
We are using Spark. Today I saw the FunctionCatalog, and I have seen the
source of
spark\sql\core\src\test\scala\org\apache\spark\sql\connector\DataSourceV2FunctionSuite.scala
and have implemented the ScalarFunction, but I still do not know how to
register it in SQL.
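For context, a heavily hedged sketch of how a DSv2 function is typically exposed to SQL: through a catalog plugin that implements FunctionCatalog on the JVM side, after which the function is called by its fully qualified name. The class and names below are hypothetical, not a confirmed recipe:
```
from pyspark.sql import SparkSession

# Assumes com.example.MyFunctionCatalog (hypothetical) implements
# org.apache.spark.sql.connector.catalog.FunctionCatalog and serves "strlen".
spark = (SparkSession.builder
         .config("spark.sql.catalog.mycat", "com.example.MyFunctionCatalog")
         .getOrCreate())
spark.sql("SELECT mycat.ns.strlen('hello')").show()
```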
Hi all,
I have written a program and overridden two events onStageCompleted and
onTaskEnd. However, these two events do not provide information on when a
Task/Stage is completed.
What I want to know is which Task corresponds to which stage of a DAG (the
Spark history server only tells me how
Severity: important
Affected versions:
- Apache Spark 3.1.1 before 3.2.2
Description:
** UNSUPPORTED WHEN ASSIGNED ** The Apache Spark UI offers the possibility to
enable ACLs via the configuration option spark.acls.enable. With an
authentication filter, this checks whether a user has access
Hello all,
Is there any way to use the PySpark core to read text files with GBK
encoding?
Although Spark SQL has an option to set the encoding, these text files are
not in a structured format.
Any advice is appreciated.
Thank you,
Lianyou Li
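One hedged possibility (the path is hypothetical): read the files as raw bytes with binaryFiles and decode them as GBK on the Python side, since textFile assumes UTF-8.
```
# Each element of binaryFiles is (path, file_content_as_bytes).
rdd = (spark.sparkContext
       .binaryFiles("/data/gbk_texts/*.txt")
       .flatMap(lambda kv: kv[1].decode("gbk").splitlines()))
df = rdd.map(lambda line: (line,)).toDF(["value"])
```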
You don't want to use CPUs with Tensorflow.
If it's not scaling, you may have a problem that is far too small to
distribute.
On Sat, Apr 29, 2023 at 7:30 AM second_co...@yahoo.com.INVALID wrote:
Has anyone successfully run native TensorFlow on Spark? I tested the example at
https://github.com/tensorflow/ecosystem/tree/master/spark/spark-tensorflow-distributor
on Kubernetes CPUs, running it on multiple workers' CPUs. I do not see any
speed-up in training time by setting the number of slots from 1
Hi team,
I am Akhil.
I am trying to create a Spark cluster on a virtual machine through Chef.
Could you please help us with how we can do it?
If possible, could you please share the documentation.
Regards,
Akhil
Hi,
For an aggregating UDF, use spark.udf.registerJavaUDAF(name, className).
Enrico
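A short usage sketch of that call from PySpark (the class and names are hypothetical; the JVM class must be on the classpath):
```
# Register a JVM aggregate function and use it from SQL.
spark.udf.registerJavaUDAF("my_agg", "com.example.MyAggregatorUDAF")
spark.sql("SELECT group_id, my_agg(value) FROM events GROUP BY group_id").show()
```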
On 23.04.23 at 23:42, Thomas Wang wrote:
Hi Spark Community,
I have implemented a custom Spark Aggregator (a subclass to
org.apache.spark.sql.expressions.Aggregator). Now I'm trying to use it in a
PySpark application, but for some reason, I'm not able to trigger the
function. Here is what I'm doing, could someone help me take a look
For simple array types setting encoder to ExpressionEncoder() should work.
--
Raghavendra
On Sun, Apr 23, 2023 at 9:20 PM Thomas Wang wrote:
Hi Spark Community,
I'm trying to implement a custom Spark Aggregator (a subclass to
org.apache.spark.sql.expressions.Aggregator). Correct me if I'm wrong, but
I'm assuming I will be able to use it as an aggregation function like SUM.
What I'm trying to do is that I have a column of ARRAY and I
I am writing a Spark application which uses Java and Spring Boot to process
rows. For every row it performs some logic and saves data into the database.
The logic is performed using some services defined in my application and
some ex
You can reproduce the behavior in ordinary Scala code if you keep reduce in
an object outside the main method. Hope it might help
Thanks Elliot! Let me check it out!
Hi Team,
I was trying to run Spark using `sbt console` on the terminal. I am able to
build the project successfully using build.sbt, and the following piece of
code runs fine in IntelliJ. The only issue I am facing while running the
same on the terminal is that the executor keeps running
There is a DSv2-based Hive connector in Apache Kyuubi [1] that supports
connecting to multiple HMS in a single Spark application.
Some limitations:
- currently only supports Spark 3.3
- has a known issue when used with `spark-sql`, but is OK with spark-shell and
normal jar-based Spark applications.
[1]
Hi Ankit,
While not a part of Spark, there is a project called 'WaggleDance' that can
federate multiple Hive metastores so that they are accessible via a single
URI: https://github.com/ExpediaGroup/waggle-dance
This may be useful or perhaps serve as inspiration.
Thanks,
Elliot.
Greetings everyone!
We need to ship Spark (driver and executor) logs (not Spark event logs)
from K8s to a cloud bucket (ADLS/S3).
Using Fluent Bit we are able to ship the log files, but only to one single
path, container/logs/.
This will cause a huge number of files in a single folder
Just a reminder, anyone who can help on this.
Thanks a lot !
Ankit Prakash Gupta
On Wed, Apr 12, 2023 at 8:22 AM Ankit Gupta wrote:
> Hi All
>
> The question is regarding the support of multiple Remote Hive Metastore
> catalogs with Spark. Starting Spark
Description:
In Apache Spark versions prior to 3.4.0, applications using spark-submit can
specify a 'proxy-user' to run as, limiting privileges. The application can
execute code with the privileges of the submitting user, however, by providing
malicious configuration-related classes
I'm not running on GKE. I am wondering what's the long term strategy around
a Spark operator. Operators are the de-facto way to run complex
deployments. The Flink community now has an official community led
operator, and I was wondering if there are any similar plans for Spark.
On Fri, Apr 14
Hi,
What exactly are you trying to achieve? Spark on GKE works fine, and you can
run Dataproc now on GKE
https://www.linkedin.com/pulse/running-google-dataproc-kubernetes-engine-gke-spark-mich/?trackingId=lz12GC5dRFasLiaJm5qDSw%3D%3D
Unless I misunderstood your point.
HTH
Mich Talebzadeh,
Lead
Hi,
ATM I see the most used option for a Spark operator is the one provided by
Google: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator
Unfortunately, it doesn't seem actively maintained. Are there any plans to
support an official Apache Spark community driven operator?
Hi,
Start with intercepting stage completions using SparkListenerStageCompleted
[1]. That's Spark Core (jobs, stages and tasks).
Go up the execution chain to Spark SQL with SparkListenerSQLExecutionStart
[2] and SparkListenerSQLExecutionEnd [3], and correlate the information.
You may want to look at how
We are happy to announce the availability of Apache Spark 3.2.4!
Spark 3.2.4 is a maintenance release containing stability fixes. This
release is based on the branch-3.2 maintenance branch of Spark. We strongly
recommend all 3.2 users to upgrade to this stable release.
To download Spark 3.2.4
Hi,
I was wondering, if it's not possible to map tasks to functions, is it
still possible to easily figure out which job and stage completed which
part of the query from the UI?
For example, in the SQL tab of the Spark UI, I am able to see the query and
the Job IDs for that query. However
Hi,
tl;dr it's not possible to "reverse-engineer" tasks to functions.
In essence, Spark SQL is an abstraction layer over the RDD API that's made up
of partitions and tasks. Tasks are Scala functions (possibly with some
Python for PySpark). A simple-looking high-level operator like
DataFrame.join can end up with
Hi Rajesh,
It's working fine, at least for now. But you'll need to build your own spark
image using later versions.
Lingzhe Sun
Hirain Technologies
unsubscribe
Hi Lingzhe,
We have also started using this operator. Do you see any issues with it?
On Wed, 12 Apr, 2023, 7:25 am Lingzhe Sun, wrote:
Hi Mich,
FYI, we have been using the Spark operator
(https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) to build
stateful structured streaming on k8s for a year. We haven't tested it using
the non-operator way.
Besides that, the main contributor of the Spark operator, Yinan Li, has been
inactive
Just to clarify, a major benefit of k8s in this case is to host your Spark
applications in the form of containers in an automated fashion so that one
can easily deploy as many instances of the application as required
(autoscaling). From below:
https://price2meet.com/gcp/docs
What I said was this:
"In so far as I know k8s does not support spark structured streaming?"
So it is an open question. I just recalled it. I have not tested it myself. I
know Structured Streaming works on a Google Dataproc cluster, but I have not
seen any official link that says Spark
Do you have any link or ticket which justifies that k8s does not support
Spark streaming?
On Thu, 6 Apr, 2023, 9:15 pm Mich Talebzadeh wrote:
> Do you have a high level diagram of the proposed solution?
>
> In so far as I know k8s does not support spark structured streaming?
remove
The use case is: we want to read/write to Kinesis streams using k8s.
Officially I could not find a connector or reader for Kinesis from Spark
like it has for Kafka.
Checking here if anyone has used the Kinesis and Spark streaming combination?
On Thu, 6 Apr, 2023, 7:23 pm Mich Talebzadeh wrote:
Do you have a high level diagram of the proposed solution?
In so far as I know k8s does not support spark structured streaming?
Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies
London
United Kingdom
On Thu, 6 Apr 2023 at 13:08, Rajesh Katkar wrote:
Hi Spark Team,
We need to read/write Kinesis streams using Spark streaming.
We checked the official documentation:
https://spark.apache.org/docs/latest/streaming-kinesis-integration.html
It does not mention a Kinesis connector. An alternative is
https://github.com/qubole/kinesis-sql, which
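For reference, a heavily hedged sketch of the DStream-based reader described on that docs page (it needs the spark-streaming-kinesis-asl package on the classpath; app/stream/region names are hypothetical, and this is the legacy DStream API, not Structured Streaming):
```
from pyspark.streaming import StreamingContext
from pyspark.streaming.kinesis import KinesisUtils, InitialPositionInStream

ssc = StreamingContext(spark.sparkContext, 10)  # 10-second batches
lines = KinesisUtils.createStream(
    ssc, "my-app", "my-stream",
    "https://kinesis.us-east-1.amazonaws.com", "us-east-1",
    InitialPositionInStream.LATEST, 10)
lines.pprint()
ssc.start()
ssc.awaitTermination()
```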
OK, Spark Structured Streaming.
How are you getting messages into Spark? Is it Kafka?
This to me indicates that the message is incomplete or has another value in
the JSON.
HTH
Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies
London
United Kingdom
Dear Apache Spark users,
I have a long running Spark application that is encountering an
ArrayIndexOutOfBoundsException once every two weeks. The exception does not
disrupt the operation of my app, but I'm still concerned about it and would
like to find a solution.
Here's some
Good stuff, Khalid.
I have created a section in the Apache Spark Community Slack called
spark-foundation:
<https://app.slack.com/client/T04URTRBZ1R/C051CL5T1KL/thread/C0501NBTNQG-1680132989.091199>
I invite you to add your weblink to that s
Hey AN-TRUONG
I have got some articles about this subject that should help.
E.g.
https://khalidmammadov.github.io/spark/spark_internals_rdd.html
Also check other Spark Internals on web.
Regards
Khalid
On Fri, 31 Mar 2023 at 16:17, AN-TRUONG Tran Phan wrote:
Thank you for your information,
I have tracked the Spark history server on port 18080 and the Spark UI on
port 4040. I see the results of these two tools as similar, right?
I want to know what each Task ID (for example Task ID 0, 1, 3, 4, 5, ...) in
the images does. Is that possible?
https
Are you familiar with the Spark GUI, by default on port 4040?
Have a look.
HTH
Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
Hi all,
There is a section in Slack called webinars:
https://sparkcommunitytalk.slack.com/x-p4977943407059-5006939220983-5006939446887/messages/C0501NBTNQG
Asma Zgolli agreed to prepare materials for Spark internals and/or
comparing Spark 3 and 2.
I would like to contribute to "Spark Streaming & Spark Structured Streaming"
plus "Spark on k8s for both GCP and EKS concepts and con
Hello everyone,
I suggest using the Slack workspace recently created for the Spark community
to collaborate and work together on these topics, and using the LinkedIn page
to publish the events and the webinars.
Cheers,
Asma
On Thu, 16 Mar 2023 at 01:39, Denny Lee wrote:
> What we can do is
Agreed. How does asynchronous communication relate to Spark Structured
Streaming?
In your previous post, you made your Spark run on the driver in a single JVM.
You attempted to increase the number of executors to 3 after submission of
the job, which (as Sean alluded to) would not work
What do you mean by asynchronously here?
On Sun, Mar 26, 2023, 10:22 AM Emmanouil Kritharakis <
kritharakismano...@gmail.com> wrote:
Hello again,
Do we have any news for the above question?
I would really appreciate it.
Thank you,
--
Emmanouil (Manos) Kritharakis
Ph.D. candidate in the Department of Computer Science
Hi all,
As you may be aware, we are proposing to set up community classes and
webinars for the Spark interest group, or simply for those who could benefit
from them.
@Denny Lee and myself had a discussion on how to put this framework forward.
The idea is first and foremost getting support from
Hi team,
On a single Linux node, I would like to set up RStudio with sparklyr. The
dev team consists of three to four people.
I am aware of the single-node Spark cluster's constraints. I want to know
when there will be a resource problem with Spark as more users join in to
use sparklyr in RStudio
Fyi .. apache spark version is 3.1.3
On Wed, Mar 15, 2023 at 4:34 PM karan alang wrote:
> Hi Mich, this doesn't seem to be working for me .. the watermark seems to
> be getting ignored !
>
> Here is the data pu
"2023-03-13T10:12:00.000-07:00" should have got dropped,
it is more than 2 days old (i.e. dated - 2023-03-13)!
Any ideas what needs to be changed to make this work ?
Here is the code (modified for my requirement, but essentially the same)
```
schema = StructType([
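For reference, a hedged sketch of the watermark pattern involved (topic, column names and schema are stand-ins): the watermark only causes rows to be dropped inside stateful operations such as windowed aggregations, and only once the watermark has actually advanced past them.
```
from pyspark.sql import functions as F

events = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "events")
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

late_tolerant_counts = (events
                        .withWatermark("event_time", "2 days")
                        .groupBy(F.window("event_time", "1 hour"))
                        .count())
```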
Great.
A case that I hope can be better documented, especially now that we have the
Pandas API on Spark and many potential new users coming from pandas, is how
to start Spark with the full available memory and CPU.
I use this function to do this in a notebook:

import multiprocessing
import os
import sys
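The function body is cut off above; a hedged reconstruction of the idea (the sizing heuristics are my assumptions, not the author's code):
```
import multiprocessing
import os

from pyspark.sql import SparkSession

def start_spark_full_node(app_name="notebook"):
    """Start a local SparkSession sized to the whole machine (sketch)."""
    cores = multiprocessing.cpu_count()
    # Rough total-RAM lookup (Unix); leave ~20% headroom for OS and Python.
    total_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") // (1024 ** 3)
    driver_mem = max(1, int(total_gb * 0.8))
    return (SparkSession.builder
            .master(f"local[{cores}]")
            .appName(app_name)
            .config("spark.driver.memory", f"{driver_mem}g")
            .getOrCreate())
```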
Hello,
I hope this email finds you well!
I have a simple dataflow in which I read from a kafka topic, perform a map
transformation and then I write the result to another topic. Based on your
documentation here
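A minimal sketch of that dataflow (broker, topics and the map step are hypothetical, assuming an active SparkSession `spark`):
```
from pyspark.sql import functions as F

src = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "in-topic")
       .load())

# Hypothetical map step: uppercase the payload; the Kafka sink needs a "value" column.
out = src.select(F.upper(F.col("value").cast("string")).alias("value"))

query = (out.writeStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("topic", "out-topic")
         .option("checkpointLocation", "/tmp/ckpt")
         .start())
query.awaitTermination()
```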
Hi Denny,
That Apache Spark Linkedin page
https://www.linkedin.com/company/apachespark/ looks fine. It also allows a
wider audience to benefit from it.
+1 for me
In the past, we've been using the Apache Spark LinkedIn page
<https://www.linkedin.com/company/apachespark/> and group to broadcast
these type of events - if you're cool with this? Or we could go through
the process of submitting and updating the current https://spark.apache.org
or r
Mich Talebzadeh wrote:
Apologies, I missed the list.
To move forward I selected these topics from the thread "Online classes for
spark topics".
To take this further I propose a Confluence page to be set up.
1. Spark UI
2. Dynamic allocation
3. Tuning of jobs
4. Collec
From the release notes of antlr4, there are two key changes in antlr4 4.10:
1. 4.10-generated parsers are incompatible with previous runtimes
2. The minimum Java version increases to Java 11
So I personally think it is temporarily impossible for Spark to upgrade to an
antlr4 version above 4.10.
You want ANTLR 3 and Spark is on 4? No, I don't think Spark would downgrade.
You can shade your app's dependencies, maybe.
On Tue, Mar 14, 2023 at 8:21 AM Sahu, Karuna wrote:
> Hi Team
>
> We are upgrading a legacy application using Spring Boot, Spark and
> Hibernat