Re: [DISCUSS] @Experimental annotations - processes and alternatives

Ismaël Mejía Sun, 08 Mar 2020 13:56:11 -0700

Kenn, can you adjust the script to match only source code files (`--include
\*.java --include \*.py --include \*.go`)? Otherwise it produces a lot of extra
false positives due to HTML files and cache files. Also, can we extract the full
annotation as a column so we can filter/group by the full kind (type) of the
experimental annotation, e.g. @Experimental(Kind.SCHEMAS),
@Experimental(Kind.SOURCE_SINK), etc.?


This way we can group occurrences per kind and quickly triage some of them that
are clearly still experimental (and with ongoing independent stabilization
efforts [1]), like these:
@Experimental(Kind.SCHEMAS)
@Experimental(Kind.SPLITTABLE_DO_FN)
@Experimental(Kind.PORTABILITY)
(and probably @Experimental(Kind.CONTEXTFUL))
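
For illustration, a minimal sketch of what such a per-kind grouping could look
like. This is not Kenn's actual script: the names `find_annotations` and
`count_by_kind` and the regular expression are assumptions made for the example.
Unlike the original `grep --ignore-case`, it matches only the Java-style
spelling `@Experimental`.

```python
import collections
import os
import re

# Assumed pattern: @Experimental with an optional argument list,
# e.g. "@Experimental" or "@Experimental(Kind.SCHEMAS)".
ANNOTATION = re.compile(r"@Experimental\b(?:\(([^)]*)\))?")

# Only source files, mirroring --include \*.java / \*.py / \*.go.
SOURCE_EXTENSIONS = (".java", ".py", ".go")


def find_annotations(text):
    """Return the argument string of every @Experimental occurrence in text.

    A bare @Experimental yields "" so it can be grouped separately.
    """
    return [(m.group(1) or "").strip() for m in ANNOTATION.finditer(text)]


def count_by_kind(root):
    """Walk root and count annotation occurrences per kind in source files."""
    counts = collections.Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(SOURCE_EXTENSIONS):
                continue  # skip HTML, cache files, and other non-source files
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                counts.update(find_annotations(f.read()))
    return counts
```

Calling `count_by_kind(".")` from the repository root returns a counter keyed by
the annotation argument (for example `Kind.SCHEMAS`), which could then be pasted
into the spreadsheet as an extra column. Note that this sketch also counts
occurrences inside comments, just as grep does.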

Over the last few weeks I have been adjusting the Experimental annotations to
follow the @Experimental(Kind.FOO) pattern with this future triage in mind, so
it is good to see the effort may pay off :) As part of this work, one idea that
Luke Cwik and I agreed on was to remove the Experimental annotations from
‘runners/core*’ because historically Beam has not had strong compatibility
guarantees for users of these APIs (runner authors). It is probably worth
re-running the script against the latest master because the results in the
spreadsheet do not correspond to the current master. (Note that the remaining
External class is still tagged as Experimental because moving it into
‘sdks/java/core’ is still pending.)

Not related to Experimental but worth mentioning is that we also
started tagging:
sdks/java/core/src/main/java/org/apache/beam/sdk/util/*
sdks/java/core/src/main/java/org/apache/beam/sdk/testing/*
as @Internal for the same reasons: classes in both packages are basically for
internal use in the Beam SDK harness, by runner authors, and in tests, so
pipeline authors should not rely on their stability.

We also introduced package-level Experimental annotations (package-info.java),
so these can easily account for 50 duplicates, which should probably be trimmed
by the same person who is covering the corresponding files in the package. With
all these adjustments we will easily be below 250 matches.

Regards,
Ismaël

[1] 
https://lists.apache.org/thread.html/r73d3b19506ea435ee6be568ccc32065e36cd873dbbcf2a3e9049254e%40%3Cdev.beam.apache.org%3E



On Fri, Mar 6, 2020 at 11:54 PM Kenneth Knowles <[email protected]> wrote:
>
> OK I tried to make a tiny bit of progress on this, with `grep --ignore-case 
> --line-number --recursive '@experimental' .` there are 578 occurrences 
> (includes website and comments). Via `| cut -d ':' -f 1 | sort | uniq | wc 
> -l` there are 377 distinct code files.
>
> So that's a big project but easily scales to the contributors. I suggest we 
> need to crowdsource a bit.
>
> I created 
> https://docs.google.com/spreadsheets/d/1T98I7tFoUgwW2tegS5xbNRjaVDvZiBBLn7jg0Ef_IwU/edit?usp=sharing
>  where you can suggest/comment adding your name to a file to volunteer to own 
> going through the file.
>
> I have not checked git history to try to find owners.
>
> Kenn
>
> On Mon, Dec 2, 2019 at 10:26 AM Alexey Romanenko <[email protected]> 
> wrote:
>>
>> Thank you Kenn for starting this discussion.
>>
>> As I see it, the main goal for now is for the “@Experimental” annotation to
>> be revived and be useful in the sense its name suggests (this is obviously
>> not the case at the moment). I'd suggest a somewhat simpler scenario for this:
>>
>> 1. We review all uses of the “@Experimental” annotation now. For the code
>> (IOs/libs/etc) that we know 100% has been used in production for a long
>> time with the current stable API, we just take this annotation away since
>> it’s no longer needed.
>>
>> 2. For the code that is left after item 1, we leave it as “@Experimental”,
>> wait for N releases (N=3 ?) and then take the annotation away if no breaking
>> changes have happened. We may want to add a new argument to “@Experimental”
>> to keep track of the release number in which it was added.
>>
>> 3. We would need a regular “Experimental annotation report” (like we have
>> for dependencies) sent to dev@, which would allow us to track new and
>> outdated annotations.
>>
>> 4. And of course we update the contributor documentation about that.
>>
>> The idea of graduation by voting seems a bit complicated - to me it means
>> that all newly added user APIs would have to go through this process, and
>> I’m afraid that in the end we could be overwhelmed by the number of such
>> polls. I think that several releases of maturation and a final decision by
>> the person(s) responsible for the component should be enough.
>>
>> At the same time, I like Andrew’s idea about checking for breaking changes
>> with an external tool. It could guarantee that we can remove the
>> experimental state without any fear of breaking the API.
>>
>> In case of breaking changes to a stable API that cannot be avoided, we can
>> still use @Deprecated and wait for 3 releases before removal (as we already
>> did before). So, having up-to-date @Experimental and @Deprecated
>> annotations won’t be confusing for users.
>>
>>
>>
>>
>>
>> On 28 Nov 2019, at 04:48, Kenneth Knowles <[email protected]> wrote:
>>
>>
>>
>> On Wed, Nov 27, 2019 at 1:04 PM Elliotte Rusty Harold <[email protected]> 
>> wrote:
>>>
>>> On Wed, Nov 27, 2019 at 1:12 PM Kenneth Knowles <[email protected]> wrote:
>>> >
>>>
>>> > *Opt-in*: This is a powerful idea that I think changes everything.
>>> >    - for an experimental new IO, a separate artifact; this way we can 
>>> > also see downloads
>>> >    - for experimental code fragments, add checkState that the relevant 
>>> > experiment is turned on via flags
>>>
>>> To be clear the experimental artifact would have the same group ID and
>>> artifact ID but a different version than the non-experimental
>>> artifacts?  E.g.
>>> org.apache.beam:beam-runners-gcp-gcemd:2.4.0-experimental
>>>
>>> That could work. Changing the artifact ID or the package name would
>>> risk split package issues and diamond dependency problems. We'd still
>>> need to be careful about mixing experimental and non-experimental
>>> artifacts.
>>
>>
>> That's clever! I think using the classifier might be better than a modified 
>> version number, e.g. org.apache.beam:beam-io-mydb:2.4.0:experimental
>>
>> My prior idea was much less clever: for any version 2.X there would either 
>> be beam-io-mydb-experimental or beam-io-mydb (after graduation) so no 
>> problem with a split package. There would be no "same artifact id" concern.
>>
>> Your idea would allow us to ship two variants of the library, if we
>> developed the tooling for it. I think stripping the experimental bits and
>> ensuring that both variants compile might be tricky unless we are stripping
>> rather disjoint pieces of the library.
>>
>> Kenn
>>
>>
