How to overwrite PySpark DataFrame schema without data scan?

2022-04-12 Thread Rafał Wojdyła
Hello, does anyone have any comments or ideas regarding https://stackoverflow.com/questions/71610435/how-to-overwrite-pyspark-dataframe-schema-without-data-scan, please? Cheers - Rafal
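For context, the linked question asks how to replace a DataFrame's schema without triggering a scan of the underlying data. A minimal sketch of one common approach, assuming the goal is to supply an explicit schema at read time so Spark skips its inference scan; the path and column names below are illustrative, not from the thread:

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import LongType, StringType, StructField, StructType

spark = SparkSession.builder.getOrCreate()

# Declaring the schema up front means Spark never has to sample
# the files to infer column names and types.
schema = StructType([
    StructField("id", LongType(), True),
    StructField("name", StringType(), True),
])

# "data/events.json" is a hypothetical path used for illustration.
df = spark.read.schema(schema).json("data/events.json")
df.printSchema()  # reflects the declared schema; no data scan needed
```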

Re: [SPARK-38438] pyspark - how to update spark.jars.packages on existing default context?

2022-03-11 Thread Rafał Wojdyła
…your feedback about **that specific workaround**, please. Any reason not to use it? Cheers - Rafal

On Thu, 10 Mar 2022 at 18:50, Rafał Wojdyła wrote:
> If you have a long-running Python orchestrator worker (e.g. a Luigi worker), and say it gets a DAG of A -> B -> C, and say the worker…
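For reference, a hedged sketch of the kind of workaround under discussion: stopping the current session and tearing down PySpark's JVM gateway so that the next session launches a fresh JVM, which re-resolves `spark.jars.packages`. Note that `SparkContext._gateway` and `SparkContext._jvm` are PySpark internals, not public API, and the package coordinate is illustrative:

```python
from pyspark import SparkContext
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.stop()

# Shut down the Py4J gateway so the next session starts a brand-new JVM;
# these are private internals and may change between Spark versions.
if SparkContext._gateway is not None:
    SparkContext._gateway.shutdown()
SparkContext._gateway = None
SparkContext._jvm = None

# The new JVM resolves spark.jars.packages at launch.
spark = (
    SparkSession.builder
    .config("spark.jars.packages", "org.apache.spark:spark-avro_2.12:3.2.1")
    .getOrCreate()
)
```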

Re: [SPARK-38438] pyspark - how to update spark.jars.packages on existing default context?

2022-03-10 Thread Rafał Wojdyła
> …packages or jars isn't your concern, why not just specify ALL packages that you would need for the Spark environment? You know you can define multiple packages under the packages option. This shouldn't cause memory issues since the JVM uses dynamic class loading...
>
> On 3/9/22 10:03 PM, R…
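The suggestion quoted above — declare every package the job might ever need when the session is first created — looks roughly like this; the coordinates are examples only:

```python
from pyspark.sql import SparkSession

# All potentially needed packages, comma-separated, supplied once
# before the JVM starts.
spark = (
    SparkSession.builder
    .config(
        "spark.jars.packages",
        "org.apache.spark:spark-avro_2.12:3.2.1,"
        "org.apache.hadoop:hadoop-aws:3.3.1",
    )
    .getOrCreate()
)
```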

Re: [SPARK-38438] pyspark - how to update spark.jars.packages on existing default context?

2022-03-09 Thread Rafał Wojdyła
> …warning or error message when changing the configuration, since it doesn't do any harm. Spark uses lazy binding, so you can do a lot of such "unharmful" things. Developers will have to understand the behaviors of each API before using them.
>
> On 3/9/22 9:31 AM, Raf…
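To illustrate the point about the missing warning, a small sketch assuming a session that is already running: setting this configuration is silently accepted, but the already-running JVM never re-resolves its packages, so the value has no effect:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Accepted without any warning or error, yet the running JVM's
# classpath is unchanged; the coordinate is illustrative.
spark.conf.set(
    "spark.jars.packages",
    "org.apache.spark:spark-avro_2.12:3.2.1",
)
```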

Re: [SPARK-38438] pyspark - how to update spark.jars.packages on existing default context?

2022-03-09 Thread Rafał Wojdyła
…- Rafal

On Wed, 9 Mar 2022 at 14:24, Sean Owen wrote:
> Unfortunately this opens a lot more questions and problems than it solves. What if you take something off the classpath, for example? Change a class?
>
> On Wed, Mar 9, 2022 at 8:22 AM Rafał Wojdyła wrote:
>> Thanks…

Re: [SPARK-38438] pyspark - how to update spark.jars.packages on existing default context?

2022-03-09 Thread Rafał Wojdyła
…orchestration process (Python process, if that matters). Cheers - Rafal

On Wed, 9 Mar 2022 at 13:15, Sean Owen wrote:
> That isn't a bug - you can't change the classpath once the JVM is executing.
>
> On Wed, Mar 9, 2022 at 7:11 AM Rafał Wojdyła wrote:
>> Hi, My use case is…

[SPARK-38438] pyspark - how to update spark.jars.packages on existing default context?

2022-03-09 Thread Rafał Wojdyła
Hi, my use case is that I have a long-running process (an orchestrator) with multiple tasks, and some tasks might require extra Spark dependencies. It seems that once the Spark context is started, it's not possible to update `spark.jars.packages`? I have reported an issue at…
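A sketch of the scenario being described, with hypothetical tasks and an illustrative Avro path: the package a later task needs is unknown when the default session is created, and cannot be added after the JVM has started:

```python
from pyspark.sql import SparkSession

def task_a(spark):
    # First task: needs no extra dependencies.
    spark.range(10).count()

def task_b(spark):
    # Later task: needs spark-avro, which was not on spark.jars.packages
    # when the JVM started, so the data source cannot be found.
    spark.read.format("avro").load("events.avro")  # hypothetical path

spark = SparkSession.builder.getOrCreate()  # long-lived default context
task_a(spark)
task_b(spark)  # fails: Avro data source is not on the classpath
```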