Awesome, Ash! Looking forward to beta2. And don't forget there is still a minor bug in beta1:
https://issues.apache.org/jira/browse/AIRFLOW-3623?focusedCommentId=16803866&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16803866

Best wishes.
-- Jiajie

________________________________
From: Ash Berlin-Taylor <a...@apache.org>
Sent: Friday, March 29, 2019 23:16
To: dev@airflow.apache.org
Subject: Re: API Reference - current confusion and improvement plan

It took pulling in about another 30 commits to get it without conflicts, but I've pulled this into the v1-10-stable branch, so it will be in 1.10.3 too!

(There are already 68 commits on the branch since 1.10.3b1. Time for a beta2, I think.)

-a

> On 29 Mar 2019, at 13:28, Jiajie Zhong <zhongjiajie...@hotmail.com> wrote:
>
> Thanks Kamil, this is really a great change in our documentation.
>
> Best wishes.
> -- Jiajie
>
> ________________________________
> From: Driesprong, Fokko <fo...@driesprong.frl>
> Sent: Friday, March 29, 2019 19:16
> To: dev@airflow.apache.org
> Subject: Re: API Reference - current confusion and improvement plan
>
> Awesome work Kamil. Thanks for giving some love to the documentation. It
> really needed some :-)
>
> Don't forget to remove the line from the GitHub template: "When adding new
> operators/hooks/sensors, the autoclass documentation generation needs to be
> added."
> https://github.com/apache/airflow/blob/master/.github/PULL_REQUEST_TEMPLATE.md
>
> Cheers, Fokko
>
> On Wed, 27 Mar 2019 at 05:59, Kamil Breguła <kamil.breg...@polidea.com> wrote:
>
>> Hi.
>>
>> Work on this has been completed.
>> New documentation is available:
>> https://airflow.readthedocs.io/en/latest/_api/index.html
>>
>> Greetings
>> Kamil Breguła
>>
>> On Wed, Feb 27, 2019 at 12:51 PM Kamil Breguła
>> <kamil.breg...@polidea.com> wrote:
>>>
>>> Hi.
>>>
>>> Jarek Potiuk and I have recently worked to finish these changes.
>>> As a result, a series of PRs was created:
>>>
>>> - [AIRFLOW-XXX][1/3] Syntax docs improvements -
>>>   https://github.com/apache/airflow/pull/4789
>>> - [AIRFLOW-3968][2/3] Refactor base GCP hook -
>>>   https://github.com/apache/airflow/pull/4790
>>> - [AIRFLOW-3811][3/3] Add automatic generation of API Reference -
>>>   https://github.com/apache/airflow/pull/4788
>>>
>>> I invite you to review them. A preview is available in the description of
>>> each PR.
>>>
>>> Greetings,
>>> Kamil Breguła
>>>
>>> On Wed, Feb 6, 2019 at 2:09 PM Szymon Przedwojski
>>> <szymon.przedwoj...@polidea.com> wrote:
>>>>
>>>> +1
>>>> I also like the new docs layout, and the big win is that it’s generated
>>>> automatically from all files, so we won’t have to modify code.rst /
>>>> integration.rst manually anymore.
>>>>
>>>> Szymon Przedwojski
>>>> Polidea | Software Engineer
>>>>
>>>> M: +48 500 330 790
>>>> E: szymon.przedwoj...@polidea.com
>>>>
>>>>> On 5 Feb 2019, at 21:33, Ash Berlin-Taylor <a...@apache.org> wrote:
>>>>>
>>>>> I have idly wondered about something like this as a layout:
>>>>>
>>>>> from airflow.$something.aws.operators import EmrAddStepOperator
>>>>>
>>>>> - Grouping by service provider is more helpful
>>>>> - Having more than one operator per module
>>>>> - Not having the `_operator` (etc.) suffix on the module, and the class,
>>>>>   and the module path
>>>>>
>>>>> Perhaps a bigger change, though to make it much less painful for our
>>>>> users we could keep the old names with a deprecation warning or two (even
>>>>> past 2.0, to say 2.1). Out of scope for the current discussion.
>>>>>
>>>>> -ash
>>>>>
>>>>>> On 5 Feb 2019, at 20:22, Kamil Breguła <kamil.breg...@polidea.com> wrote:
>>>>>>
>>>>>> I think that we should group operators by service (e.g. Amazon Web
>>>>>> Services: Simple Cloud Storage). One module per service. It will be much
>>>>>> easier to navigate through them.
>>>>>> A similar problem occurs with the Google Cloud
>>>>>> Storage service, but we have a solution (PR:
>>>>>> https://github.com/apache/airflow/pull/3000 ). A large part of the
>>>>>> operators, as well as future ones written in accordance with the
>>>>>> recommendations (
>>>>>> https://lists.apache.org/thread.html/e8534d82be611ae7bcb21ba371546a4278aad117d5e50361fd8f14fe@%3Cdev.airflow.apache.org%3E
>>>>>> ), follow these rules.
>>>>>>
>>>>>> The problem will be with operators that integrate two services at the same
>>>>>> time. I think that we can leave them in a separate module and link to this
>>>>>> class in the description of the module.
>>>>>>
>>>>>> However, this is not a current problem. I just wanted to point out future
>>>>>> improvements that become possible if we introduce the proposed solution.
>>>>>>
>>>>>> On Tue, Feb 5, 2019 at 8:57 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>>>>>>
>>>>>>> I like the API reference v2 layout a lot! Much easier to navigate and see
>>>>>>> what classes are available, for me at least.
>>>>>>>
>>>>>>> Documenting modules will help somewhat with a few things but, let's say,
>>>>>>> the "AWS" section of the integration doc is spread across the following
>>>>>>> modules:
>>>>>>>
>>>>>>> airflow.contrib.operators.aws_athena_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/aws_athena_operator/index.html>
>>>>>>> airflow.contrib.operators.awsbatch_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/awsbatch_operator/index.html>
>>>>>>> airflow.contrib.operators.ecs_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/ecs_operator/index.html>
>>>>>>> airflow.contrib.operators.emr_add_steps_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/emr_add_steps_operator/index.html>
>>>>>>> airflow.contrib.operators.emr_create_job_flow_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/emr_create_job_flow_operator/index.html>
>>>>>>> airflow.contrib.operators.emr_terminate_job_flow_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/emr_terminate_job_flow_operator/index.html>
>>>>>>> airflow.contrib.operators.s3_copy_object_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/s3_copy_object_operator/index.html>
>>>>>>> airflow.contrib.operators.s3_delete_objects_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/s3_delete_objects_operator/index.html>
>>>>>>> airflow.contrib.operators.s3_list_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/s3_list_operator/index.html>
>>>>>>> airflow.contrib.operators.s3_to_gcs_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/s3_to_gcs_operator/index.html>
>>>>>>> airflow.contrib.operators.s3_to_gcs_transfer_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/s3_to_gcs_transfer_operator/index.html>
>>>>>>> airflow.contrib.operators.s3_to_sftp_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/s3_to_sftp_operator/index.html>
>>>>>>> airflow.contrib.operators.sagemaker_base_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/sagemaker_base_operator/index.html>
>>>>>>> airflow.contrib.operators.sagemaker_endpoint_config_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/sagemaker_endpoint_config_operator/index.html>
>>>>>>> airflow.contrib.operators.sagemaker_endpoint_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/sagemaker_endpoint_operator/index.html>
>>>>>>> airflow.contrib.operators.sagemaker_model_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/sagemaker_model_operator/index.html>
>>>>>>> airflow.contrib.operators.sagemaker_training_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/sagemaker_training_operator/index.html>
>>>>>>> airflow.contrib.operators.sagemaker_transform_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/sagemaker_transform_operator/index.html>
>>>>>>> airflow.contrib.operators.sagemaker_tuning_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/sagemaker_tuning_operator/index.html>
>>>>>>> airflow.contrib.operators.segment_track_event_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/operators/segment_track_event_operator/index.html>
>>>>>>> airflow.operators.redshift_to_s3_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/operators/redshift_to_s3_operator/index.html>
>>>>>>> airflow.operators.s3_file_transform_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/operators/s3_file_transform_operator/index.html>
>>>>>>> airflow.operators.s3_to_hive_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/operators/s3_to_hive_operator/index.html>
>>>>>>> airflow.operators.s3_to_redshift_operator
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/operators/s3_to_redshift_operator/index.html>
>>>>>>> airflow.sensors.s3_key_sensor
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/sensors/s3_key_sensor/index.html>
>>>>>>> airflow.sensors.s3_prefix_sensor
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/sensors/s3_prefix_sensor/index.html>
>>>>>>> airflow.contrib.sensors.emr_base_sensor
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/sensors/emr_base_sensor/index.html>
>>>>>>> airflow.contrib.sensors.emr_job_flow_sensor
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/sensors/emr_job_flow_sensor/index.html>
>>>>>>> airflow.contrib.sensors.emr_step_sensor
>>>>>>> <http://level-can.surge.sh/html/autoapi/airflow/contrib/sensors/emr_step_sensor/index.html>
>>>>>>>
>>>>>>> And that was just before I got bored of looking for them :)
>>>>>>>
>>>>>>>> On 5 Feb 2019, at 16:25, Kamil Breguła <kamil.breg...@polidea.com> wrote:
>>>>>>>>
>>>>>>>> I already have a POC: :-)
>>>>>>>>
>>>>>>>> Available at: http://level-can.surge.sh/html/autoapi/index.html
>>>>>>>>
>>>>>>>> I would like to point out that in addition to class documentation, you can
>>>>>>>> also document modules.
>>>>>>>> http://level-can.surge.sh/html/autoapi/airflow/executors/local_executor/index.html
>>>>>>>> Currently, the `howto/operators.rst` file is used for this (Related link:
>>>>>>>> https://airflow.readthedocs.io/en/latest/howto/operator.html#cloudsqlqueryoperator
>>>>>>>> )
>>>>>>>>
>>>>>>>> On Tue, Feb 5, 2019 at 5:18 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>>>>>>>>
>>>>>>>>>> We want to rewrite the `integration.rst` file so that it does not contain
>>>>>>>>>> duplicates from `code.rst` (API Reference). In the next step, introduce
>>>>>>>>>> the reference API generation based on the source code that will replace
>>>>>>>>>> the `code.rst` file.
>>>>>>>>>
>>>>>>>>> :100: Yes please!
>>>>>>>>>
>>>>>>>>> Given a number of integrations are across multiple files (n operators and
>>>>>>>>> m hooks), my first thought is that something in integration.rst, or a file
>>>>>>>>> elsewhere in the docs/ tree, is the place to put this.
>>>>>>>>>
>>>>>>>>> On epydoc vs a Sphinx extension, I lean very heavily towards the Sphinx
>>>>>>>>> extension, as we are already using much of Sphinx.
>>>>>>>>>
>>>>>>>>> Can you create a _small_ example of what you'd propose for no. 4? (I don't
>>>>>>>>> want you to do a lot of work that might be wasted.)
>>>>>>>>>
>>>>>>>>> -ash
>>>>>>>>>
>>>>>>>>>> On 5 Feb 2019, at 15:55, Kamil Breguła <kamil.breg...@polidea.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hello community,
>>>>>>>>>>
>>>>>>>>>> While working on the documentation for the GCP operators, my team at
>>>>>>>>>> Polidea encountered some confusion related to the structure of the
>>>>>>>>>> documentation.
>>>>>>>>>>
>>>>>>>>>> Short story:
>>>>>>>>>>
>>>>>>>>>> We want to rewrite the `integration.rst` file so that it does not contain
>>>>>>>>>> duplicates from `code.rst` (API Reference). In the next step, introduce
>>>>>>>>>> the reference API generation based on the source code that will replace
>>>>>>>>>> the `code.rst` file.
>>>>>>>>>>
>>>>>>>>>> Long story:
>>>>>>>>>>
>>>>>>>>>> Currently, the documentation contains two places where the description of
>>>>>>>>>> classes related to operators is included. They are the `code.rst` and
>>>>>>>>>> `integration.rst` files.
>>>>>>>>>>
>>>>>>>>>> The `integration.rst` file contains information about integrations, in
>>>>>>>>>> particular for Azure: Microsoft Azure, AWS: Amazon Web Services,
>>>>>>>>>> Databricks, GCP: Google Cloud Platform, Qubole. Other integrations,
>>>>>>>>>> however, do not have descriptions.
>>>>>>>>>>
>>>>>>>>>> The `code.rst` file contains the “API Reference”, which contains
>>>>>>>>>> information about *all* classes, including those included in the
>>>>>>>>>> `integration.rst` file.
>>>>>>>>>>
>>>>>>>>>> Such duplication, however, is problematic for several reasons:
>>>>>>>>>>
>>>>>>>>>> 1. Users may feel lost and may not know which section they should look
>>>>>>>>>>    into.
>>>>>>>>>> 2. Changes must be made in many places, which leads to desynchronization.
>>>>>>>>>>    Most often, changes are made only in the source code, so they do not
>>>>>>>>>>    appear in the generated documentation.
>>>>>>>>>> 3. Linking to classes using the `class` role for Sphinx is ambiguous: if
>>>>>>>>>>    the code is embedded both in `integration.rst` and `code.rst` using
>>>>>>>>>>    the `autoclass` directive, we’re not sure where the user will be
>>>>>>>>>>    navigated.
>>>>>>>>>>
>>>>>>>>>> There are several solutions:
>>>>>>>>>>
>>>>>>>>>> 1. Leave it as is. Then we need to agree on which `autoclass` directive
>>>>>>>>>>    should have the `no-index` flag.
>>>>>>>>>> 2. Delete duplicates from the `code.rst` file and add a note about the
>>>>>>>>>>    `integration.rst` file in the `code.rst` file.
>>>>>>>>>> 3. Delete duplicates from the `integration.rst` file and add a note about
>>>>>>>>>>    the `code.rst` file in the `integration.rst` file.
>>>>>>>>>> 4. Delete information from both files and generate the API documentation
>>>>>>>>>>    always based only on the source code. This solution means that we would
>>>>>>>>>>    have to write less documentation.
>>>>>>>>>>    There are ready tools that we can use:
>>>>>>>>>>    1. epydoc - http://epydoc.sourceforge.net/ ;
>>>>>>>>>>    2. autoapi extension for Sphinx -
>>>>>>>>>>       https://github.com/rtfd/sphinx-autoapi ;
>>>>>>>>>>    3. other - https://wiki.python.org/moin/DocumentationTools
>>>>>>>>>>
>>>>>>>>>> The first, second, and third solutions do not solve all problems. In
>>>>>>>>>> particular, we still need to complete the `code.rst` and `integration.rst`
>>>>>>>>>> files. The fourth solution solves all problems, but is the most complex. It
>>>>>>>>>> is worth noting that mixing solutions is possible. For example, we can
>>>>>>>>>> delete information from the file `integration.rst` as a short-term solution
>>>>>>>>>> and start working on creating another form of documentation as a long-term
>>>>>>>>>> solution. This is the best option in our opinion.
>>>>>>>>>>
>>>>>>>>>> I’ve recently done a few activities related to this topic.
>>>>>>>>>>
>>>>>>>>>> First, I added the noindex flag to autoclass directives for all operators
>>>>>>>>>> in the `integration.rst` file. In rare cases (if any), this caused links
>>>>>>>>>> that were previously directed to the file `integration.rst` to be
>>>>>>>>>> redirected to the `code.rst` file. Elements had to be linked using
>>>>>>>>>> `:class:` instead of a section link. Each operator is included in the new
>>>>>>>>>> section in this file.
>>>>>>>>>>
>>>>>>>>>> PR: https://github.com/apache/airflow/pull/4585
>>>>>>>>>> <https://github.com/apache/airflow/pull/4585/files>
>>>>>>>>>>
>>>>>>>>>> Second, I completed the `code.rst` file with the missing classes.
>>>>>>>>>>
>>>>>>>>>> PR: https://github.com/apache/airflow/pull/4644
>>>>>>>>>>
>>>>>>>>>> I would like to ask which solution is the best in your opinion? What steps
>>>>>>>>>> should we take to make the documentation more enjoyable?
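
For illustration, the fourth option (generating the API reference from the source code with the autoapi extension for Sphinx) might be wired up roughly as follows. This is only a sketch of a `docs/conf.py` fragment: the option names come from sphinx-autoapi, but the source path, the ignore pattern, and the choice to discard generated files are assumptions and not taken from the PRs in this thread. The `_api` root is chosen to match the published `_api/index.html` URL.

```python
# Sketch of a Sphinx conf.py fragment; paths and ignore patterns are assumptions.

extensions = [
    "sphinx.ext.autodoc",  # used by the existing autoclass directives
    "autoapi.extension",   # sphinx-autoapi: builds the reference from source files
]

autoapi_type = "python"
autoapi_dirs = ["../airflow"]        # assumed layout: conf.py in docs/, package in airflow/
autoapi_root = "_api"                # output folder, giving URLs like /_api/index.html
autoapi_ignore = ["*/migrations/*"]  # hypothetical: skip modules not worth documenting
autoapi_keep_files = False           # do not leave generated .rst files in the tree
```

With a setup like this, new operators, hooks, and sensors would show up in the reference without any manual `autoclass` entry in `code.rst` or `integration.rst`.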
>>>>>>>>>>
>>>>>>>>>> Greetings
>>>>>>>>>>
>>>>>>>>>> Kamil Breguła
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Kamil Breguła
>>>>>>>> Polidea <https://www.polidea.com/> | Software Engineer
>>>>>>>>
>>>>>>>> M: +48 505 458 451
>>>>>>>> E: kamil.breg...@polidea.com
>>>>>>>>
>>>>>>>> We create human & business stories through technology.
>>>>>>>> Check out our projects! <https://www.polidea.com/our-work>
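
Ash's aside in the thread above, about keeping the old operator names with a deprecation warning while grouping modules by service provider, could look roughly like the sketch below. Everything here is hypothetical: the `deprecated_alias` helper is invented for illustration, the class is a stand-in rather than the real Airflow operator, and the new module path keeps the `$something` placeholder from the thread.

```python
import warnings


class EmrAddStepsOperator:
    """Stand-in for an operator living at its new location (hypothetical)."""

    def __init__(self, task_id):
        self.task_id = task_id


def deprecated_alias(new_cls, old_path, new_path):
    """Return a subclass of new_cls that warns when created via the old name."""

    class _Deprecated(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                f"{old_path} is deprecated; import from {new_path} instead.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not this shim
            )
            super().__init__(*args, **kwargs)

    _Deprecated.__name__ = new_cls.__name__
    return _Deprecated


# What the old module would keep exporting during the transition period.
OldEmrAddStepsOperator = deprecated_alias(
    EmrAddStepsOperator,
    "airflow.contrib.operators.emr_add_steps_operator",
    "airflow.$something.aws.operators",
)
```

Existing DAGs that import the old name would keep working and only see a `DeprecationWarning`, which is what makes the rename less painful for users.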