Re: Spark 3.1.2 full thread dumps

2022-02-04 Thread Mich Talebzadeh
Indeed. Apologies for going on a tangent.

Re: Spark 3.1.2 full thread dumps

2022-02-04 Thread Maksim Grinman
Not that this discussion is not interesting (it is), but it has strayed pretty far from my original question, which was: how do I prevent Spark from dumping huge full Java thread dumps when an executor appears not to be doing anything (in my case, there's a loop where it sleeps waiting for a

Re: Spark 3.1.2 full thread dumps

2022-02-04 Thread Mich Talebzadeh
OK basically, do we have a scenario where Spark, or for that matter any cluster manager, can deploy a new node (after the loss of an existing node) with the view of running the failed tasks on the new executor(s) deployed on that newly spun-up node?

Re: Spark 3.1.2 full thread dumps

2022-02-04 Thread Holden Karau
We don’t block scaling up after node failure in classic Spark if that’s the question. On Fri, Feb 4, 2022 at 6:30 PM Mich Talebzadeh wrote: > From what I can see in auto scaling setup, you will always need a min of > two worker nodes as primary. It also states and I quote "Scaling primary >

Re: Spark 3.1.2 full thread dumps

2022-02-04 Thread Mich Talebzadeh
From what I can see in the auto scaling setup, you will always need a min of two worker nodes as primary. It also states, and I quote, "Scaling primary workers is not recommended due to HDFS limitations which result in instability while scaling. These limitations do not exist for secondary workers". So

Re: Spark 3.1 Json4s-native jar compatibility

2022-02-04 Thread Amit Sharma
Thanks Sean/Martin, my bad, the Spark version was actually 3.0.1, so using json4s 3.6.6 fixed the issue. Thanks Amit On Fri, Feb 4, 2022 at 3:37 PM Sean Owen wrote: > My guess is that something else you depend on is actually bringing in a > different json4s, or you're otherwise mixing library/Spark

Re: Spark 3.1 Json4s-native jar compatibility

2022-02-04 Thread Sean Owen
My guess is that something else you depend on is actually bringing in a different json4s, or you're otherwise mixing library/Spark versions. Use mvn dependency:tree or equivalent on your build to see what you actually bring in. You probably do not need to include json4s at all, as it is in Spark
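Sean's suggestion can be narrowed to just the json4s artifacts (a sketch, assuming a Maven build; for sbt builds the built-in `dependencyTree` task gives the equivalent view):

```shell
# Show only json4s artifacts in the resolved dependency tree,
# revealing which of your dependencies pulls them in.
mvn dependency:tree -Dincludes=org.json4s
```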

Re: Spark 3.1 Json4s-native jar compatibility

2022-02-04 Thread Amit Sharma
Martin/Sean, I changed it to 3.7.0-M5 and am still getting the same error: Exception in thread "streaming-job-executor-0" java.lang.NoSuchMethodError: org.json4s.ShortTypeHints$.apply$default$2()Ljava/lang/String; Thanks Amit On Fri, Feb 4, 2022 at 9:03 AM Martin

Re: how can I remove the warning message

2022-02-04 Thread Martin Grigorov
Hi, This is a JVM warning, as Sean explained. You cannot control it via loggers. You can disable it by passing --illegal-access=permit to java. Read more about it at https://softwaregarden.dev/en/posts/new-java/illegal-access-in-java-16/ On Sun, Jan 30, 2022 at 4:32 PM Sean Owen wrote: > This
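For a spark-submit launch, the JVM flag Martin mentions would be passed through the driver and executor Java options (a sketch; the application jar name is a placeholder, and note the flag only exists on Java 9 through 16, as it was removed in Java 17):

```shell
# Pass --illegal-access=permit to the JVMs Spark launches
# (valid on Java 9-16 only; removed in Java 17).
spark-submit \
  --conf "spark.driver.extraJavaOptions=--illegal-access=permit" \
  --conf "spark.executor.extraJavaOptions=--illegal-access=permit" \
  your-app.jar
```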

Re: Spark 3.1 Json4s-native jar compatibility

2022-02-04 Thread Martin Grigorov
Hi, Amit said that he uses Spark 3.1, so the link should be https://github.com/apache/spark/blob/branch-3.1/pom.xml#L879 (3.7.0-M5) @Amit: check your classpath; there may be more than one jar of this dependency on it. On Thu, Feb 3, 2022 at 10:53 PM Sean Owen wrote: > You can look it up: >
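A quick way to look for duplicate json4s jars, assuming a standard Spark distribution layout; the assembly jar name below is a hypothetical placeholder:

```shell
# json4s jars shipped with the Spark distribution itself
ls "$SPARK_HOME/jars" | grep -i json4s

# json4s classes bundled into your own fat jar (name is a placeholder)
unzip -l my-app-assembly.jar | grep 'org/json4s' | head
```

If both show json4s but at different versions, the NoSuchMethodError above is the expected symptom.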

Re: Python performance

2022-02-04 Thread Sean Owen
Yes, in the sense that any transformation that can be expressed in the SQL-like DataFrame API will push down to the JVM, and take advantage of other optimizations, avoiding the data movement to/from Python and more. But you can't do this if you're expressing operations that are not in the

Re: Spark 3.1.2 full thread dumps

2022-02-04 Thread Sean Owen
I have not seen stack traces under autoscaling, so not even sure what the error in question is. There is always delay in acquiring a whole new executor in the cloud as it usually means a new VM is provisioned. Spark treats the new executor like any other, available for executing tasks. On Fri,

Re: Spark on K8s : property similar to yarn.max.application.attempt

2022-02-04 Thread Mich Talebzadeh
Not as far as I know. If your driver pod fails, then you need to resubmit the job. I cannot see what else can be done. HTH

Re: Spark 3.1.2 full thread dumps

2022-02-04 Thread Mich Talebzadeh
Thanks for the info. My concern has always been how Spark handles autoscaling (adding new executors) when the load pattern changes. I have tried to test this by setting the following parameters (Spark 3.1.2 on GCP) spark-submit --verbose \ ... --conf

Spark on K8s : property similar to yarn.max.application.attempt

2022-02-04 Thread Pralabh Kumar
Hi Spark Team, I am running Spark on K8s and looking for a property/mechanism similar to yarn.max.application.attempt. I know this is not really a Spark question, but I thought someone might have faced a similar issue. Basically, if my driver pod fails, I want it to be retried on a different

Re: Python performance

2022-02-04 Thread Bitfox
Please see this test of mine: https://blog.cloudcache.net/computing-performance-comparison-for-words-statistics/ Don't use Python RDDs; use DataFrames instead. Regards On Fri, Feb 4, 2022 at 5:02 PM Hinko Kocevar wrote: > I'm looking into using Python interface with Spark and came across this >

Python performance

2022-02-04 Thread Hinko Kocevar
I'm looking into using the Python interface with Spark and came across this [1] chart showing some performance hit when going with Python RDDs. The data is ~7 years old and for an older version of Spark. Is this still the case with more recent Spark releases? I'm trying to understand what to expect from