Re: Log4j2 upgrade

2022-01-12 Thread Jörn Franke
You cannot simply replace it - log4j2 has a slightly different API than log4j. The Spark source code needs to be changed in a couple of places. > On 12.01.2022 at 20:53, Amit Sharma wrote: > > Hello, everyone. I am replacing log4j with log4j2 in my Spark Streaming > application. When I

Log4j2 upgrade

2022-01-12 Thread Amit Sharma
Hello, everyone. I am replacing log4j with log4j2 in my Spark Streaming application. When I deployed my application to the Spark cluster, it gave me the error below: "ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to

RE: Does Spark 3.1.2/3.2 support log4j 2.17.1+, and how? your target release day for Spark3.3?

2022-01-12 Thread Crowe, John
I get that, Sean, I really do, but customers being customers, they see Log4j and they panic. I’ve been telling them since this began that version 1.x is not affected, but... but... Letting them know that 2.17.1 is on the way IS helpful, but of course they ask us when it is coming. Just trying

Re: Does Spark 3.1.2/3.2 support log4j 2.17.1+, and how? your target release day for Spark3.3?

2022-01-12 Thread Sean Owen
Again: the CVE has no known effect on released Spark versions. Spark 3.3 will have log4j 2.x anyway. On Wed, Jan 12, 2022 at 10:21 AM Crowe, John wrote: > I too would like to know when you anticipate Spark 3.3.0 to be released > due to the Log4j CVEs. > > Our customers are all quite concerned.

Re: Does Spark 3.1.2/3.2 support log4j 2.17.1+, and how? your target release day for Spark3.3?

2022-01-12 Thread Sean Owen
As noted, there is no known effect on Spark, as released versions do not use an affected log4j version and configuration, thus no documentation about remediation. It is in any event a good idea to update to 2.x; please see JIRA for the log4j 2.x update, which will come in Spark 3.3.0 as this is

RE: Does Spark 3.1.2/3.2 support log4j 2.17.1+, and how? your target release day for Spark3.3?

2022-01-12 Thread Crowe, John
I too would like to know when you anticipate Spark 3.3.0 to be released due to the Log4j CVEs. Our customers are all quite concerned. Regards; John Crowe TDi Technologies, Inc. 1600 10th Street Suite B Plano, TX 75074 (800) 695-1258

Re: Does Spark 3.1.2/3.2 support log4j 2.17.1+, and how? your target release day for Spark3.3?

2022-01-12 Thread Artemis User
There was a discussion on this issue a couple of weeks ago. Basically, if you look at the CVE definition for Log4j, the vulnerability only affects certain versions of log4j 2.x, not 1.x. Since Spark doesn't use any of the affected log4j versions, this shouldn't be a concern.

Does Spark 3.1.2/3.2 support log4j 2.17.1+, and how? your target release day for Spark3.3?

2022-01-12 Thread Juan Liu
Dear Spark support, Due to the known log4j security issue, we are required to upgrade log4j to version 2.17.1. Currently, we use Spark 3.1.2 with the default log4j 1.2.17. We also found the log4j configuration document here: https://spark.apache.org/docs/3.2.0/configuration.html#configuring-logging

RE: Re: [Spark ML Pipeline]: Error Loading Pipeline Model with Custom Transformer

2022-01-12 Thread Alana Young
I have updated the gist (https://gist.github.com/ally1221/5acddd9650de3dc67f6399a4687893aa). Please let me know if there are any additional questions.

Re: pyspark loop optimization

2022-01-12 Thread Ramesh Natarajan
Sorry for confusing with cume_dist and percent_rank. I was playing around with these to see if the difference in computation made any difference. I must have copied the percent rank accidentally. My requirement is to compute cume_dist. I have a dataframe with a bunch of columns (10+ columns)
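
For readers unsure how the two functions named in this message differ, here is a minimal pure-Python sketch. It is not code from the thread and does not use PySpark; it assumes the usual SQL window-function definitions (cume_dist = rows with value <= current / total rows; percent_rank = (rank - 1) / (total rows - 1)). The helper names and the `scores` list are made up for illustration.

```python
from bisect import bisect_left, bisect_right


def cume_dist(values):
    """Fraction of rows whose value is <= the current row's value."""
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right counts how many sorted values are <= v (ties included).
    return [bisect_right(ordered, v) / n for v in values]


def percent_rank(values):
    """(rank - 1) / (n - 1), where rank is 1 + the number of smaller values."""
    ordered = sorted(values)
    n = len(ordered)
    if n <= 1:
        # A single-row partition has percent_rank 0 by convention.
        return [0.0] * n
    # bisect_left counts how many sorted values are strictly < v, i.e. rank - 1.
    return [bisect_left(ordered, v) / (n - 1) for v in values]


# Hypothetical sample data with one tie.
scores = [10, 20, 20, 40]
print(cume_dist(scores))
print(percent_rank(scores))
```

The two differ in how ties and the first row are treated: cume_dist never returns 0 because it counts the current row, while percent_rank always starts at 0.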

Re: [Spark ML Pipeline]: Error Loading Pipeline Model with Custom Transformer

2022-01-12 Thread Gourav Sengupta
Hi, maybe I am short on time, but can you please add some inline comments in your code to explain what you are trying to do? Regards, Gourav Sengupta On Tue, Jan 11, 2022 at 5:29 PM Alana Young wrote: > I am experimenting with creating and persisting ML pipelines using custom > transformers