Re: Attaching Remote Debugger to Executor Threads

2020-10-15 Thread Jeff Evans
Use spark.executor.extraJavaOptions https://spark.apache.org/docs/latest/configuration.html#runtime-environment On Thu, Oct 15, 2020 at 1:22 PM Akshat Bordia wrote: > Hi, > > We are trying to debug an issue with Spark and need to connect a remote > debugger to the executors thread. The general
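
A minimal sketch of how the property suggested above might be filled in, assuming the standard JVM JDWP agent flag is what gets passed through spark.executor.extraJavaOptions. The helper names and the port 5005 are hypothetical, chosen only for illustration; they are not part of Spark or of the original reply.

```python
def jdwp_agent_option(address="5005", suspend=False):
    """Build the JVM flag that starts a JDWP debug agent in server mode.

    `address` is the listen address; the value 5005 is an arbitrary example.
    """
    suspend_flag = "y" if suspend else "n"
    return (
        "-agentlib:jdwp=transport=dt_socket,server=y,"
        "suspend={},address={}".format(suspend_flag, address)
    )


def spark_submit_conf_args(address="5005", suspend=False):
    """Return the --conf arguments that hand the flag to every executor JVM."""
    option = jdwp_agent_option(address, suspend)
    return ["--conf", "spark.executor.extraJavaOptions={}".format(option)]


if __name__ == "__main__":
    # Print the arguments so they can be pasted into a spark-submit command.
    print(" ".join(spark_submit_conf_args()))
```

On JDK 9 and later a bare port binds to localhost only, so an address such as `*:5005` may be needed before a debugger on another machine can attach. Two executors on the same host would also compete for a fixed port, so running a single executor per host while debugging is probably the simpler setup.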

Supporting Row and DataFrame level metadata?

2020-09-24 Thread Jeff Evans
Hi, I'm wondering if there has been any past discussion on supporting metadata attributes as a first-class concept, at both the row level and the DataFrame level? I did a Jira search, but most of the items I found were unrelated to this concept, or pertained to column

Re: Unsubscribe

2020-06-23 Thread Jeff Evans
That is not how you unsubscribe. See here: https://gist.github.com/jeff303/ba1906bb7bcb2f2501528a8bb1521b8e On Tue, Jun 23, 2020 at 5:02 AM Kiran Kumar Dusi wrote: > Unsubscribe > > On Tue, 23 Jun 2020 at 15:18 Akhil Anil wrote: > >> -- >> Sent from Gmail Mobile >> >

Re: ./dev/run-tests failing at master

2020-05-14 Thread Jeff Evans
Are you positive you set up your Python environment correctly? To me, those error messages look like you are running Python 2, but it should be Python 3. On Thu, May 14, 2020 at 1:34 PM Sudharshann D wrote: > Hello! ;) > > I'm new to spark development and have been trying to set up my dev >
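
One quick way to check which interpreter a shell resolves to is a small script like the following. This is only a sketch: the function name is hypothetical, and it assumes the script is run with the same `python` command that the test runner would pick up.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple belongs to Python 3 or later."""
    return version_info[0] >= 3


if __name__ == "__main__":
    # Show which interpreter binary is actually being used.
    print(sys.executable)
    print("Python %d.%d" % sys.version_info[:2])
    if not is_python3():
        sys.exit("This interpreter is Python 2; a Python 3 environment is expected.")
```

The script deliberately avoids Python 3 only syntax so that it still runs, and reports the problem, under Python 2.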

Re: What options do I have to handle third party classes that are not serializable?

2020-02-25 Thread Jeff Evans
Did you try this? https://stackoverflow.com/a/2114387/375670 On Tue, Feb 25, 2020 at 10:23 AM yeikel valdes wrote: > I am currently using a third party library(Lucene) with Spark that is not > serializable. Due to that reason, it generates the following exception : > > Job aborted due to

PySpark setup for IntelliJ IDEA

2020-01-24 Thread Jeff Evans
I couldn't find any specific information on setting up IntelliJ to debug PySpark correctly, so after fumbling my way through it, I did a short writeup here: https://github.com/jeff303/spark-development-tips Any improvements, corrections, or suggestions are welcome.

Re: Build error: python/lib/pyspark.zip is not a ZIP archive

2020-01-10 Thread Jeff Evans
Actually, there is a really trivial fix for that (an existing file not being deleted when packaging). Opened SPARK-30489 for it. On Fri, Jan 10, 2020 at 3:52 PM Jeff Evans wrote: > Thanks for the tip. Fixed by simply removing python/lib/pyspark.zip > (since it's apparently gen

Re: Build error: python/lib/pyspark.zip is not a ZIP archive

2020-01-10 Thread Jeff Evans
> any of the automated test builders failing. Nuke your local assembly > build and try again? > > On Fri, Jan 10, 2020 at 3:49 PM Jeff Evans > wrote: > > > > Greetings, > > > > I'm getting an error when building, on latest master (2bd873181 as of > this writing). F

Build error: python/lib/pyspark.zip is not a ZIP archive

2020-01-10 Thread Jeff Evans
Greetings, I'm getting an error when building, on latest master (2bd873181 as of this writing). Full build command I'm running is: ./build/mvn -DskipTests clean package [ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.8:run (create-tmp-dir) on project