While I understand your concern that reverting the deprecation decision
could cause confusion, we do have a precedent: we reverted the deprecation
of an API that had been deprecated for multiple years. See SPARK-32686
<https://issues.apache.org/jira/browse/SPARK-32686>. We may have had more
such cases, and a quick search suggests that various other projects have
also reverted their deprecation decisions.

Also, while we say we shouldn't remove an API even though we deprecate it,
we never state that in the deprecation message. Users still understand
deprecation the way they understand it in other projects: "refrain from
using this and migrate sooner rather than later". Establishing a project
policy and making a guarantee to users are different things - if we
"guarantee" that an API is deprecated but will never be removed, that
leads to another sort of confusion. Would other deprecations then
implicitly mean the API can be removed in future, once we start
guaranteeing for some deprecations that we will never remove the API?

That said, deprecation would be the right way if Trigger.AvailableNow
covered every workload Trigger.Once has covered. We have just found that
there are some gaps, and people tried to migrate by themselves with
tricky/hacky approaches and ended up complaining to us. I'd say we can
truly deprecate (I'd really like to) once we are confident that users
don't need any trick/hack for migration, and that it's a piece of cake:
change the trigger and done. Unfortunately, that turned out not to be the
case.

We need some time to figure out the gaps and address them in
Trigger.AvailableNow - if they can't be addressed, we will never be able
to deprecate Trigger.Once. Until then, I feel that strongly advising users
to migrate "if possible" in the documentation (or via a warning message at
runtime) is the best bet. To be clear, it's my bad for not truly
understanding users. I actually knew about the gap and thought it was not
something Spark should guarantee, but I never imagined users relied so
heavily on the behavior (not on the semantics but on the behavior itself).
They consider it a breaking change if the semantics are the same but the
behavior is not.


On Sat, Apr 20, 2024 at 1:16 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
wrote:

> For that case, I believe it's enough for us to revise the deprecation
> message only by making sure that Apache Spark will keep it without removal
> for backward-compatibility purposes only. That's what the users asked,
> isn't that?
>
> > deprecation of Trigger.Once confuses users that the trigger won't be
> > available soon (though we rarely remove public API).
>
> The feature was deprecated in Apache Spark 3.4.0 and `Undeprecation(?)`
> may cause another confusion in the community, not only for Trigger.Once but
> also for all historic `Deprecated` items.
>
> Dongjoon.
>
>
> On Fri, Apr 19, 2024 at 7:44 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
>> Hi dev,
>>
>> I'd like to raise a discussion to un-deprecate Trigger.Once in future
>> releases.
>>
>> I proposed the deprecation of Trigger.Once because it's semantically
>> broken and we made a change, but we've realized that there really are
>> users who strictly require the behavior of Trigger.Once (only run a
>> single batch, for whatever reason) despite the semantic issue, and the
>> workaround with Trigger.AvailableNow is arguably much more hacky or
>> sometimes not even possible.
>>
>> I still think we have to advise using Trigger.AvailableNow whenever
>> feasible, but deprecation of Trigger.Once confuses users that the trigger
>> won't be available soon (though we rarely remove public API). So maybe a
>> warning log on usage sounds to me like a reasonable alternative.
>>
>> Thoughts?
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>
