This is indeed a JVM issue, not a Spark issue.  You may want to ask yourself why it is necessary to change the jar packages at runtime.  Changing the packages does not mean the classes are reloaded: there is no way to reload a class that has already been loaded unless you customize Spark's classloader.  I also don't think it is necessary to implement a warning or error message when the configuration is changed, since it does no harm.  Spark uses lazy binding, so you can do a lot of such "harmless" things.  Developers have to understand the behaviour of each API before using it.
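
For example, in PySpark something like the following (a rough sketch; the avro coordinates are just an illustration) runs without any complaint, yet it adds nothing to the classpath of the already-running JVM:

    from pyspark.sql import SparkSession

    # The first session launches the JVM; packages listed at this point are
    # resolved by spark-submit when the JVM starts.
    spark = SparkSession.builder.getOrCreate()

    # Later, a task asks for an extra package.  getOrCreate() simply hands
    # back the existing session; the new spark.jars.packages value is stored
    # in the conf but never acted on, and no error or warning is raised.
    spark2 = (
        SparkSession.builder
        .config("spark.jars.packages", "org.apache.spark:spark-avro_2.12:3.2.1")
        .getOrCreate()
    )
    assert spark2 is spark  # same session, same JVM, same classpath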

On 3/9/22 9:31 AM, Rafał Wojdyła wrote:
 Sean,
I understand you might be sceptical about adding this functionality into (py)spark. I'm curious:
* would an error/warning on updating a configuration that currently cannot effectively be changed (it requires a restart of the JVM) be reasonable?
* what do you think about the workaround in the issue?
Cheers - Rafal

On Wed, 9 Mar 2022 at 14:24, Sean Owen <sro...@gmail.com> wrote:

    Unfortunately this opens a lot more questions and problems than it
    solves. What if you take something off the classpath, for example?
    Or change a class?

    On Wed, Mar 9, 2022 at 8:22 AM Rafał Wojdyła
    <ravwojd...@gmail.com> wrote:

        Thanks Sean,
        To be clear, if you prefer to change the label on this issue
        from bug to something else, feel free to do so, no strong
        opinions on my end. What happens to the classpath, and whether
        Spark uses some classloader magic, is probably an
        implementation detail. That said, it's definitely not intuitive
        that you can change the configuration and get the context back
        (with the updated config) without any warnings or errors. Also,
        what would you recommend as a workaround or solution to this
        problem? Any comments about the workaround in the issue? Keep
        in mind that I can't restart the long-running orchestration
        process (a Python process, if that matters).
        Cheers - Rafal

        On Wed, 9 Mar 2022 at 13:15, Sean Owen <sro...@gmail.com> wrote:

            That isn't a bug - you can't change the classpath once the
            JVM is executing.

            On Wed, Mar 9, 2022 at 7:11 AM Rafał Wojdyła
            <ravwojd...@gmail.com> wrote:

                Hi,
                My use case is that I have a long-running process
                (orchestrator) with multiple tasks, and some tasks
                might require extra Spark dependencies. It seems that
                once the Spark context is started it's not possible to
                update `spark.jars.packages`? I have reported an issue
                at https://issues.apache.org/jira/browse/SPARK-38438,
                together with a workaround ("hard reset of the
                cluster", sketched below). I wonder if anyone has a
                solution for this?
                Cheers - Rafal
