Hello,
Does anyone have any comments or ideas regarding
https://stackoverflow.com/questions/71610435/how-to-overwrite-pyspark-dataframe-schema-without-data-scan
please?
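In case it's useful context: the question is about replacing a DataFrame's
schema without triggering a data scan. A minimal sketch of the general
pattern, which may differ in details from the exact workaround in the link
(the schema below is made up for illustration):

```python
from pyspark.sql import SparkSession
from pyspark.sql import types as T

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, "a")], "id long, name string")

# Hypothetical target schema: same columns, different nullability.
new_schema = T.StructType([
    T.StructField("id", T.LongType(), nullable=False),
    T.StructField("name", T.StringType(), nullable=True),
])

# Because an explicit schema is supplied, Spark doesn't sample the data to
# infer one, and verifySchema=False skips per-row validation, so building
# the new DataFrame doesn't scan the data (although actions on df2 will
# still round-trip rows through the RDD API).
df2 = spark.createDataFrame(df.rdd, new_schema, verifySchema=False)
```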
Cheers - Rafal
I would appreciate your feedback about **that specific
workaround**, please. Any reason not to use it?
Cheers - Rafal
On Thu, 10 Mar 2022 at 18:50, Rafał Wojdyła wrote:
> If you have a long running python orchestrator worker (e.g. Luigi worker),
> and say it gets a DAG of A -> B -> C, and say the worker needs extra spark
> packages for some of those tasks but not for others...
> If pulling in extra packages or jars isn't your concern, why not just specify ALL
> packages that you would need for the Spark environment? You know you can
> define multiple packages under the packages option. This shouldn't cause
> memory issues since the JVM uses dynamic class loading...
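> Something like this sketch, with made-up coordinates:
>
> ```python
> from pyspark.sql import SparkSession
>
> # spark.jars.packages accepts a comma-separated list of Maven coordinates,
> # so every dependency any task might need can be declared at startup.
> spark = (
>     SparkSession.builder
>     .config(
>         "spark.jars.packages",
>         ",".join([
>             "org.apache.spark:spark-avro_2.12:3.2.1",
>             "org.apache.hadoop:hadoop-aws:3.3.1",
>         ]),
>     )
>     .getOrCreate()
> )
> ```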
>
> On 3/9/22 10:03 PM, Rafał Wojdyła wrote:
>
> Spark won't show any warning or error message
> when changing the configuration, since it doesn't do any harm. Spark
> uses lazy binding, so you can do a lot of such "unharmful" things.
> Developers will have to understand the behavior of each API before
> using them.
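> For example, this is silently accepted on a running session even though it
> can no longer have any effect (the coordinate is just an example):
>
> ```python
> from pyspark.sql import SparkSession
>
> spark = SparkSession.builder.getOrCreate()
>
> # No warning, no error: the value is stored, but packages are only
> # resolved while the context starts up, so the running JVM's classpath
> # is unchanged.
> spark.conf.set(
>     "spark.jars.packages",
>     "org.apache.spark:spark-avro_2.12:3.2.1",
> )
> ```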
>
>
> On 3/9/22 9:31 AM, Rafał Wojdyła wrote:

Cheers - Rafal
On Wed, 9 Mar 2022 at 14:24, Sean Owen wrote:
> Unfortunately this opens a lot more questions and problems than it solves.
> What if you take something off the classpath, for example, or change a class?
>
> On Wed, Mar 9, 2022 at 8:22 AM Rafał Wojdyła wrote:
>
>> Thanks
To be clear, everything here happens inside the same long-running
orchestration process (python process if that matters).
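For concreteness, the restart I've been experimenting with looks roughly
like this sketch (the helper name and coordinate are mine); whether the new
packages actually get resolved on restart within the same python process is
exactly the open question:

```python
from pyspark.sql import SparkSession

def restart_spark_with_packages(packages):
    """Stop any active session and start a fresh one with extra packages."""
    active = SparkSession.getActiveSession()
    if active is not None:
        # Tears down the JVM-side context; cached data, temp views and
        # broadcast state of the old context are lost.
        active.stop()
    return (
        SparkSession.builder
        .config("spark.jars.packages", ",".join(packages))
        .getOrCreate()
    )

# e.g. before a task that needs Avro support:
spark = restart_spark_with_packages(["org.apache.spark:spark-avro_2.12:3.2.1"])
```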
Cheers - Rafal
On Wed, 9 Mar 2022 at 13:15, Sean Owen wrote:
> That isn't a bug - you can't change the classpath once the JVM is
> executing.
>
> On Wed, Mar 9, 2022 at 7:11 AM Rafał Wojdyła wrote:
>
>> Hi,
>> My use case is that I have a long running process (orchestrator) with
>> multiple tasks; some tasks might require extra spark dependencies.
Hi,
My use case is that I have a long running process (orchestrator) with
multiple tasks; some tasks might require extra spark dependencies. It seems
that once the spark context is started it's not possible to update
`spark.jars.packages`? I have reported an issue at
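For reference, the failure mode this leads to, assuming a task needs the
external Avro data source that wasn't declared when the context started
(the path is hypothetical):

```python
from pyspark.sql import SparkSession

# Context started without spark-avro among its packages.
spark = SparkSession.builder.getOrCreate()

# Fails with AnalysisException ("Failed to find data source: avro ..."),
# and there is no supported way to add the package to the running context.
df = spark.read.format("avro").load("/tmp/events.avro")
```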