Re: Log4j 1.2.17 spark CVE
My understanding is that we don’t need to do anything: log4j-core 2.x is not used in Spark.

> On Dec 13, 2021, at 12:45 PM, Pralabh Kumar wrote:
>
> Hi developers, users
>
> Spark is built using log4j 1.2.17. Is there a plan to upgrade, based on the recent CVE detected?
>
> Regards
> Pralabh Kumar

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
Re: Log4j 1.2.17 spark CVE
You would want to shade this dependency in your app, in which case you would be using log4j 2. If you don't shade it and just include it, you will also be using log4j 2, as some of the API classes are different. Where they overlap with log4j 1, you will probably hit errors anyway.

On Mon, Dec 13, 2021 at 6:33 PM James Yu wrote:

> Question: Spark uses log4j 1.2.17. If my application jar contains log4j 2.x and gets submitted to the Spark cluster, which version of log4j actually gets used during the Spark session?
>
> *From:* Sean Owen
> *Sent:* Monday, December 13, 2021 8:25 AM
> *To:* Jörn Franke
> *Cc:* Pralabh Kumar; dev; user.spark
> *Subject:* Re: Log4j 1.2.17 spark CVE
>
> This has come up several times over the years - search JIRA. The very short summary is: Spark does not use log4j 1.x, but its dependencies do, and that's the issue.
> Anyone who can successfully complete the surgery at this point is welcome to, but I failed ~2 years ago.
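The shading approach described above can be sketched with sbt-assembly's ShadeRule API. This is a hedged build.sbt fragment: the rename target package "myapp.shaded" and the assumption that the app is built with the sbt-assembly plugin are illustrative, not from this thread.

```scala
// build.sbt fragment (assumes the sbt-assembly plugin). Relocating the log4j 2
// packages bundled in the application jar keeps them from colliding with the
// logging classes on Spark's own classpath. "myapp.shaded" is a made-up prefix.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("org.apache.logging.log4j.**" -> "myapp.shaded.log4j.@1").inAll
)
```

In practice log4j 2 can be awkward to relocate because of its plugin descriptor files, so treat this as a starting point rather than a complete recipe.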
Re: Log4j 1.2.17 spark CVE
Question: Spark uses log4j 1.2.17. If my application jar contains log4j 2.x and gets submitted to the Spark cluster, which version of log4j actually gets used during the Spark session?

*From:* Sean Owen
*Sent:* Monday, December 13, 2021 8:25 AM
*To:* Jörn Franke
*Cc:* Pralabh Kumar; dev; user.spark
*Subject:* Re: Log4j 1.2.17 spark CVE

This has come up several times over the years - search JIRA. The very short summary is: Spark does not use log4j 1.x, but its dependencies do, and that's the issue. Anyone who can successfully complete the surgery at this point is welcome to, but I failed ~2 years ago.
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
You are correct, I understand. My only concern is the backward-compatibility problem: this worked in previous versions of Apache Spark. It's painful when an out-of-the-box feature breaks without documentation or a workaround such as a "spark.sql.legacy.keepSqlRecursive" true/false flag. It's not about "my code"; it's about all the production code running out there. Thank you so much.

On Mon, Dec 13, 2021 at 2:32 PM Sean Owen wrote:

> I think we're going around in circles - you should not do this. You essentially have "__TABLE__ = SELECT * FROM __TABLE__", and I hope it's clear why that can't work in general.
> At first execution, sure, maybe the "old" __TABLE__ refers to "SELECT 1", but what about the second time? If you stick to that interpretation, it's actually not executing correctly, though it 'works'. If you execute it as is, it fails for circularity. Both are bad, so it's just disallowed.
> Just fix your code?

--
Daniel Mantovani
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
I think we're going around in circles - you should not do this. You essentially have "__TABLE__ = SELECT * FROM __TABLE__", and I hope it's clear why that can't work in general.
At first execution, sure, maybe the "old" __TABLE__ refers to "SELECT 1", but what about the second time? If you stick to that interpretation, it's actually not executing correctly, though it 'works'. If you execute it as is, it fails for circularity. Both are bad, so it's just disallowed.
Just fix your code?

On Mon, Dec 13, 2021 at 11:27 AM Daniel de Oliveira Mantovani wrote:

> I've reduced the code to reproduce the issue:
>
> val df = spark.sql("SELECT 1")
> df.createOrReplaceTempView("__TABLE__")
> spark.sql("SELECT * FROM __TABLE__").show
> val df2 = spark.sql("SELECT *,2 FROM __TABLE__")
> df2.createOrReplaceTempView("__TABLE__") // Exception in Spark 3.2, but works in Spark 2.4.x and 3.1.x
> spark.sql("SELECT * FROM __TABLE__").show
>
> org.apache.spark.sql.AnalysisException: Recursive view `__TABLE__` detected (cycle: `__TABLE__` -> `__TABLE__`)
>   at org.apache.spark.sql.errors.QueryCompilationErrors$.recursiveViewDetectedError(QueryCompilationErrors.scala:2045)
>   at org.apache.spark.sql.execution.command.ViewHelper$.checkCyclicViewReference(views.scala:515)
>   at org.apache.spark.sql.execution.command.ViewHelper$.$anonfun$checkCyclicViewReference$2(views.scala:522)
>   at org.apache.spark.sql.execution.command.ViewHelper$.$anonfun$checkCyclicViewReference$2$adapted(views.scala:522)
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
I've reduced the code to reproduce the issue:

val df = spark.sql("SELECT 1")
df.createOrReplaceTempView("__TABLE__")
spark.sql("SELECT * FROM __TABLE__").show
val df2 = spark.sql("SELECT *,2 FROM __TABLE__")
df2.createOrReplaceTempView("__TABLE__") // Exception in Spark 3.2, but works in Spark 2.4.x and 3.1.x
spark.sql("SELECT * FROM __TABLE__").show

org.apache.spark.sql.AnalysisException: Recursive view `__TABLE__` detected (cycle: `__TABLE__` -> `__TABLE__`)
  at org.apache.spark.sql.errors.QueryCompilationErrors$.recursiveViewDetectedError(QueryCompilationErrors.scala:2045)
  at org.apache.spark.sql.execution.command.ViewHelper$.checkCyclicViewReference(views.scala:515)
  at org.apache.spark.sql.execution.command.ViewHelper$.$anonfun$checkCyclicViewReference$2(views.scala:522)
  at org.apache.spark.sql.execution.command.ViewHelper$.$anonfun$checkCyclicViewReference$2$adapted(views.scala:522)

On Mon, Dec 13, 2021 at 2:10 PM Sean Owen wrote:

> _shrug_ I think this is a bug fix, unless I am missing something here. You shouldn't just use __TABLE__ for everything, and I'm not seeing a good reason to do that other than that it's what you do now.
> I'm not clear whether it's coming across that this _can't_ work in the general case.

--
Daniel Mantovani
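For readers hitting this thread: the cycle check in the stack trace above can be modeled with a small sketch. This is an assumption-laden simplification in Python, not Spark's actual implementation: pretend each temp view records which view names its definition references, and registering a definition that can reach its own name is rejected.

```python
# Simplified model (an assumption, not Spark's real ViewHelper code) of the
# cycle check behind "Recursive view detected": each temp view records which
# view names its definition references; a definition that can transitively
# reach its own name is rejected at registration time.

class ViewRegistry:
    def __init__(self):
        self.refs = {}  # view name -> set of view names its definition references

    def create_or_replace_temp_view(self, name, referenced):
        def reaches(view, seen):
            # True if `view` can transitively reach the name being (re)defined.
            if view == name:
                return True
            return any(r not in seen and reaches(r, seen | {r})
                       for r in self.refs.get(view, set()))
        if any(reaches(r, {r}) for r in referenced):
            raise ValueError(f"Recursive view `{name}` detected")
        self.refs[name] = set(referenced)

reg = ViewRegistry()
reg.create_or_replace_temp_view("__TABLE__", set())  # like SELECT 1: fine
try:
    # like registering SELECT *,2 FROM __TABLE__ back under __TABLE__: a cycle
    reg.create_or_replace_temp_view("__TABLE__", {"__TABLE__"})
except ValueError as e:
    print(e)  # Recursive view `__TABLE__` detected
```

In older Spark versions the second registration effectively captured the old view's contents, so no literal cycle remained; under this model, 3.2's check fires because the new definition still refers to the name being replaced.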
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
_shrug_ I think this is a bug fix, unless I am missing something here. You shouldn't just use __TABLE__ for everything, and I'm not seeing a good reason to do that other than that it's what you do now.
I'm not clear whether it's coming across that this _can't_ work in the general case.

On Mon, Dec 13, 2021 at 11:03 AM Daniel de Oliveira Mantovani wrote:

> In this context, I don't want to worry about the name of the temporary table. That's why it is "__TABLE__".
> The point is that this behavior in Spark 3.2.x breaks backward compatibility with all previous versions of Apache Spark. In my opinion, we should at least have some flag like "spark.sql.legacy.keepSqlRecursive" true/false.
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
In this context, I don't want to worry about the name of the temporary table. That's why it is "__TABLE__".
The point is that this behavior in Spark 3.2.x breaks backward compatibility with all previous versions of Apache Spark. In my opinion, we should at least have some flag like "spark.sql.legacy.keepSqlRecursive" true/false.

On Mon, Dec 13, 2021 at 1:47 PM Sean Owen wrote:

> You can replace temp views. Again: what you can't do here is define a temp view in terms of itself. If you are reusing the same name over and over, it's probably easy to do that, so you don't want to do that. You want different names for different temp views, or else ensure you aren't doing the kind of thing shown in the SO post. You get the problem, right?

--
Daniel Mantovani
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
You can replace temp views. Again: what you can't do here is define a temp view in terms of itself. If you are reusing the same name over and over, it's probably easy to do that, so you don't want to do that. You want different names for different temp views, or else ensure you aren't doing the kind of thing shown in the SO post. You get the problem, right?

On Mon, Dec 13, 2021 at 10:43 AM Daniel de Oliveira Mantovani wrote:

> I didn't post the SO issue; I just found the same exception I'm facing with Spark 3.2. The Almaren Framework has a concept of creating temporary views with the name "__TABLE__".
>
> For example, if you want to use the SQL dialect on a DataFrame to join a table, aggregate, or apply a function, instead of creating a temporary table you just use the "__TABLE__" alias. You don't really care about the name of the table. You may use this "__TABLE__" approach in different parts of your code.
>
> Why can't I create or replace temporary views on different DataFrames with the same name as before?
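The "different names for different temp views" advice above can be sketched as a tiny helper; this is illustrative Python, and the helper name and the idea of substituting a generated name into each SQL step are assumptions, not a Spark or Almaren API.

```python
# Hypothetical workaround sketch: instead of reusing one "__TABLE__" alias for
# every step, generate a fresh view name per step, so no view definition ever
# refers to its own name and no recursive-view check can fire.
import itertools

_counter = itertools.count(1)

def next_view_name(prefix="__TABLE__"):
    """Return a unique temp view name such as __TABLE___1, __TABLE___2, ..."""
    return f"{prefix}_{next(_counter)}"

# Each pipeline step would then do something like (PySpark, sketch only):
#   name = next_view_name()
#   df.createOrReplaceTempView(name)
#   df2 = spark.sql(f"SELECT *, 2 FROM {name}")
```

Because every registration uses a fresh name, redefining a "current table" alias in terms of its previous contents never creates a name-level cycle.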
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
I didn't post the SO issue; I just found the same exception I'm facing with Spark 3.2. The Almaren Framework has a concept of creating temporary views with the name "__TABLE__".

For example, if you want to use the SQL dialect on a DataFrame to join a table, aggregate, or apply a function, instead of creating a temporary table you just use the "__TABLE__" alias. You don't really care about the name of the table. You may use this "__TABLE__" approach in different parts of your code.

Why can't I create or replace temporary views on different DataFrames with the same name as before?

On Mon, Dec 13, 2021 at 1:27 PM Sean Owen wrote:

> If the issue is what you posted on SO, I think the stack trace explains it already. You want to avoid this recursive definition, which in general can't work.
> I think it's simply explicitly disallowed in all cases now, but you should not be depending on this anyway - why can't this just be avoided?

--
Daniel Mantovani
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
If the issue is what you posted on SO, I think the stack trace explains it already. You want to avoid this recursive definition, which in general can't work.
I think it's simply explicitly disallowed in all cases now, but you should not be depending on this anyway - why can't this just be avoided?

On Mon, Dec 13, 2021 at 10:06 AM Daniel de Oliveira Mantovani wrote:

> Sean,
>
> https://github.com/music-of-the-ainur/almaren-framework/tree/spark-3.2
>
> Just executing "sbt test" will reproduce the error. The same code works for Spark 2.3.x, 2.4.x and 3.1.x; why doesn't it work for Spark 3.2?
>
> Thank you so much
Re: Log4j 1.2.17 spark CVE
This has come up several times over the years - search JIRA. The very short summary is: Spark does not use log4j 1.x, but its dependencies do, and that's the issue. Anyone who can successfully complete the surgery at this point is welcome to, but I failed ~2 years ago.

On Mon, Dec 13, 2021 at 10:02 AM Jörn Franke wrote:

> Is it in any case appropriate to use log4j 1.x, which is not maintained anymore and has other security vulnerabilities that won't be fixed?
Re: Log4j 1.2.17 spark CVE
There is a discussion on GitHub on this topic, and the recommendation is to upgrade from 1.x to 2.15.0, due to the vulnerability of 1.x:
https://github.com/apache/logging-log4j2/pull/608

This discussion is also referenced by the German Federal Office for Information Security:
https://www.bsi.bund.de/EN/Home/home_node.html

Cheers,
Martin

On Dec 13, 2021, at 5:02 PM, Jörn Franke wrote:

> Is it in any case appropriate to use log4j 1.x, which is not maintained anymore and has other security vulnerabilities that won't be fixed?
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
Sean,

https://github.com/music-of-the-ainur/almaren-framework/tree/spark-3.2

Just executing "sbt test" will reproduce the error. The same code works for Spark 2.3.x, 2.4.x and 3.1.x; why doesn't it work for Spark 3.2?

Thank you so much

On Mon, Dec 13, 2021 at 12:59 PM Sean Owen wrote:

> ... but the error is not "because that already exists". See your stack trace. It's because the definition is recursive. You define temp view test1, create a second DF from it, and then redefine test1 as that result. test1 depends on test1.

--
Daniel Mantovani
Re: Log4j 1.2.17 spark CVE
Is it in any case appropriate to use log4j 1.x, which is no longer maintained and has other security vulnerabilities that won't be fixed?

> On Dec 13, 2021, at 06:06, Sean Owen wrote:
>
> Check the CVE - the log4j vulnerability appears to affect log4j 2, not 1.x. There was mention that it could affect 1.x when used with the JNDI or JMS handlers, but Spark uses neither. (Unless anyone can think of something I'm missing, but I have never heard of or seen that come up at all in 7 years of Spark.)
>
> The big issue would be applications that themselves configure log4j 2.x, but that's not a Spark issue per se.
>
>> On Sun, Dec 12, 2021 at 10:46 PM Pralabh Kumar wrote:
>> Hi developers, users
>>
>> Spark is built using log4j 1.2.17. Is there a plan to upgrade based on the recently detected CVE?
>>
>> Regards
>> Pralabh Kumar
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
... but the error is not "because that already exists". See your stack trace. It's because the definition is recursive: you define temp view test1, create a second DataFrame from it, and then redefine test1 as that result, so test1 depends on test1.

On Mon, Dec 13, 2021 at 9:58 AM Daniel de Oliveira Mantovani wrote:
> Sean,
>
> The method name is very clear: "createOrReplaceTempView". It doesn't make any sense to throw an exception because the view already exists. Spark 3.2.x is breaking backward compatibility for no reason.
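The recursive definition described in this thread can be sketched in a few lines of Spark Scala. This is a minimal sketch, assuming a running Spark 3.2.x session; the names (`test1`, the sample data) are illustrative, and the workaround shown is one possibility, not something prescribed by the thread.

```scala
// Minimal sketch of the pattern Spark 3.2 rejects (assumes Spark 3.2.x).
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("demo").getOrCreate()
import spark.implicits._

val df = Seq((1, "a"), (2, "b")).toDF("id", "name")
df.createOrReplaceTempView("test1")

// Derive a new DataFrame *from the view itself*...
val derived = spark.sql("SELECT id FROM test1")

// ...and redefine the same view from it. Because Spark plans lazily,
// `derived` still reads from view `test1`, so the new definition refers
// to itself. On 3.2 this throws:
//   AnalysisException: Recursive view `test1` detected
derived.createOrReplaceTempView("test1")

// One possible workaround: break the lineage before replacing the view,
// e.g. materialize the intermediate result first:
//   spark.sql("SELECT id FROM test1")
//        .localCheckpoint()
//        .createOrReplaceTempView("test1")
```

Earlier Spark versions (2.3.x through 3.1.x) accepted this pattern silently, which is why code like the almaren-framework tests only started failing on 3.2.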
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
Sean,

The method name is very clear: "createOrReplaceTempView". It doesn't make any sense to throw an exception because the view already exists. Spark 3.2.x is breaking backward compatibility for no reason.

On Mon, Dec 13, 2021 at 12:53 PM Sean Owen wrote:
> The error looks 'valid': you define a temp view in terms of its own previous version, which doesn't quite make sense; somewhere the new definition depends on the old definition. I think it just correctly surfaces as an error now.

--
Daniel Mantovani
Re: spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
The error looks 'valid': you define a temp view in terms of its own previous version, which doesn't quite make sense; somewhere the new definition depends on the old definition. I think it just correctly surfaces as an error now.

On Mon, Dec 13, 2021 at 9:41 AM Daniel de Oliveira Mantovani wrote:
> Hello team,
>
> I've found this issue while porting my project from Apache Spark 3.1.x to 3.2.x:
>
> https://stackoverflow.com/questions/69937415/spark-3-2-0-the-different-dataframe-createorreplacetempview-the-same-name-tempvi
>
> Do we have a bug for that in apache-spark, or do I need to create one?
spark 3.2.0 the different dataframe createOrReplaceTempView the same name TempView
Hello team,

I've found this issue while porting my project from Apache Spark 3.1.x to 3.2.x:

https://stackoverflow.com/questions/69937415/spark-3-2-0-the-different-dataframe-createorreplacetempview-the-same-name-tempvi

Do we have a bug for that in apache-spark, or do I need to create one?

Thank you so much

[info] com.github.music.of.the.ainur.almaren.Test *** ABORTED ***
[info]   org.apache.spark.sql.AnalysisException: Recursive view `__TABLE__` detected (cycle: `__TABLE__` -> `__TABLE__`)
[info]   at org.apache.spark.sql.errors.QueryCompilationErrors$.recursiveViewDetectedError(QueryCompilationErrors.scala:2045)
[info]   at org.apache.spark.sql.execution.command.ViewHelper$.checkCyclicViewReference(views.scala:515)
[info]   at org.apache.spark.sql.execution.command.ViewHelper$.$anonfun$checkCyclicViewReference$2(views.scala:522)
[info]   at org.apache.spark.sql.execution.command.ViewHelper$.$anonfun$checkCyclicViewReference$2$adapted(views.scala:522)
[info]   at scala.collection.Iterator.foreach(Iterator.scala:941)
[info]   at scala.collection.Iterator.foreach$(Iterator.scala:941)
[info]   at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
[info]   at scala.collection.IterableLike.foreach(IterableLike.scala:74)
[info]   at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
[info]   at scala.collection.AbstractIterable.foreach(Iterable.scala:56)

--
Daniel Mantovani
Re: About some Spark technical assistance
You were added to the repo to contribute, thanks. I included the Java class and the paper I am replicating.

On Mon, Dec 13, 2021 at 04:27, wrote:
> github url please.
>
> On 2021-12-13 01:06, sam smith wrote:
>> Hello guys,
>>
>> I am replicating a paper's algorithm (a graph coloring algorithm) in Spark under Java, and thought about asking you guys for some assistance to validate/review my 600 lines of code. Any volunteers to share the code with?
>>
>> Thanks