I believe that beam.apache.org is populated from the asf-site branch of the
apache/beam-site repo. (gitpubsub:
https://www.apache.org/dev/project-site.html#intro)
If we move the markdown-based docs to apache/beam, leave generated javadoc
and pydoc in apache/beam-site, and point gitpubsub to apache/beam, then
javadoc and pydoc will not get pushed to the website.

Is there some place where we can push javadoc and pydoc files? Or perhaps
there is an alternative way to push updates to beam.apache.org? (not requiring
the asf-site branch)

On Fri, Sep 21, 2018 at 6:40 PM Thomas Weise <t...@apache.org> wrote:

> Hi Scott,
>
> Thanks for bringing the discussion back here.
>
> I agree that we should separate the changes for hosting of generated
> java/pydocs from the rest of website automation so that we can make the
> switch and fix the contributor headache soon.
>
> But perhaps we can avoid adding 4m lines of generated code to the main
> beam repository (and keep on adding with every release) if we continue to
> serve the site from the old beam-site repo? (I left a comment in the doc.)
>
> About trying buildbot, as mentioned earlier I would be happy to help with
> it. I prefer a setup that keeps the docs separate from the web site.
>
> Thomas
>
>
> On Fri, Sep 21, 2018 at 10:28 AM Scott Wegner <sc...@apache.org> wrote:
>
>> Re-opening this thread as it came up today in the discussion for PR#6458
>> [1]. This PR is part of the work for Beam-Site Automation Reliability
>> improvements; design doc here: https://s.apache.org/beam-site-automation
>>
>> The current plan is to keep generated javadoc/pydoc sources only on the
>> asf-site branch, which is necessary for the current gitpubsub publishing
>> mechanism. This maintains our current approach, the only change being that
>> we're moving the asf-site branch from the retiring apache/beam-site
>> repository into a new apache/beam repo branch.
>>
>> The concern for committing generated content is the extra overhead during
>> git fetch. I did some analysis to measure the impact [2], and found that
>> fetching a week of source + generated content history from apache/beam-site
>> took 0.39 seconds.
>>
>> I like the idea of publishing javadoc/pydoc snapshots to an external
>> location like Flink does with buildbot, but that work is separable and
>> shouldn't be a prerequisite for this effort. The goal of this work is to
>> improve the reliability of automation for contributing website changes. At
>> last measure, only about half of beam-site PR merges use Mergebot without
>> experiencing some reliability issue [3].
>>
>> I've opened BEAM-5459 [4] to track moving our generated docs out of git.
>> Thomas, would you have bandwidth to look into this?
>>
>> [1] https://github.com/apache/beam/pull/6458#issuecomment-423406643
>> [2]
>> https://docs.google.com/document/d/1lfbMhdIyDzIaBTgc9OUByhSwR94kfOzS_ozwKWTVl5U/edit#heading=h.uqzivheohd7j
>> [3]
>> https://docs.google.com/document/d/1lfbMhdIyDzIaBTgc9OUByhSwR94kfOzS_ozwKWTVl5U/edit#heading=h.a208cwi78xmu
>> [4] https://issues.apache.org/jira/browse/BEAM-5459
>>
>> On Fri, Aug 24, 2018 at 11:48 AM Thomas Weise <t...@apache.org> wrote:
>>
>>> Hi Udi,
>>>
>>> Good to know you will continue this work.
>>>
>>> Let me know if you want to try the buildbot route (which does not
>>> require generated documentation to be checked into the repo). Happy to help
>>> with that.
>>>
>>> Thomas
>>>
>>> On Fri, Aug 24, 2018 at 11:36 AM Udi Meiri <eh...@google.com> wrote:
>>>
>>>> I'm picking up the website migration. The plan is to not include
>>>> generated files in the master branch.
>>>>
>>>> However, I've been told that even putting generated files in a separate
>>>> branch could blow up the git repository for all (e.g. make git pulls a lot
>>>> longer?).
>>>> Not sure if this is a real issue or not.
>>>>
>>>> On Mon, Aug 20, 2018 at 2:53 AM Robert Bradshaw <rober...@google.com>
>>>> wrote:
>>>>
>>>>> On Sun, Aug 5, 2018 at 5:28 AM Thomas Weise <t...@apache.org> wrote:
>>>>> >
>>>>> > Yes, I think the separation of generated code will need to occur
>>>>> prior to completing the merge and switching the web site to the main repo.
>>>>> >
>>>>> > There should be no reason to check generated documentation into
>>>>> either of the repos/branches.
>>>>>
>>>>> Huge +1 to this. Thomas, would you have time to set something like this up
>>>>> for Beam? If not, could anyone else pick this up?
>>>>>
>>>>> > Please see as an example how this was solved in Flink, using the ASF
>>>>> buildbot infrastructure.
>>>>> >
>>>>> > Documentation per version/release, for example:
>>>>> >
>>>>> > https://ci.apache.org/projects/flink/flink-docs-release-1.5/
>>>>> >
>>>>> > The buildbot configuration is here (requires committer access):
>>>>> >
>>>>> >
>>>>> https://svn.apache.org/repos/infra/infrastructure/buildbot/aegis/buildmaster/master1/projects/flink.conf
>>>>> >
>>>>> > Thanks,
>>>>> > Thomas
>>>>> >
>>>>> > On Thu, Aug 2, 2018 at 6:46 PM Mikhail Gryzykhin <mig...@google.com>
>>>>> wrote:
>>>>> >>
>>>>> >> Last time I talked with Scott I brought this idea up. I believe the
>>>>> >> plan was either to publish the compiled site to the website directly,
>>>>> >> or keep it in storage separate from the apache/beam repo.
>>>>> >>
>>>>> >> One of the main reasons not to check in the compiled version of the
>>>>> >> website is that every developer would have to pull all the versions of
>>>>> >> the website every time they clone the repo, which is not a good idea.
>>>>> >>
>>>>> >> Regards,
>>>>> >> --Mikhail
>>>>> >>
>>>>> >>
>>>>> >> On Thu, Aug 2, 2018 at 6:42 PM Udi Meiri <eh...@google.com> wrote:
>>>>> >>>
>>>>> >>> Pablo, the docs are generated into versioned paths, e.g.,
>>>>> https://beam.apache.org/documentation/sdks/javadoc/2.5.0/ so tags are
>>>>> not necessary?
>>>>> >>> Also, once apache/beam-site is merged with apache/beam the release
>>>>> branch should have the relevant docs (although perhaps it's better to put
>>>>> them in a different repo or storage system).
>>>>> >>>
>>>>> >>> Thomas, I would very much like to not have javadoc/pydoc
>>>>> >>> generation be part of the website review process, as it takes up a lot of
>>>>> >>> time when changes are staged (10s of thousands of files), especially when a
>>>>> >>> PR is updated and existing staged files need to be deleted.
>>>>> >>>
>>>>> >>>
>>>>> >>> On Thu, Aug 2, 2018 at 1:15 PM Mikhail Gryzykhin <
>>>>> mig...@google.com> wrote:
>>>>> >>>>
>>>>> >>>> +1 For removing old documentation.
>>>>> >>>>
>>>>> >>>> @Thomas: Migration work is in the backlog and will be picked up in
>>>>> >>>> the near future.
>>>>> >>>>
>>>>> >>>> --Mikhail
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> On Thu, Aug 2, 2018 at 12:54 PM Thomas Weise <t...@apache.org>
>>>>> wrote:
>>>>> >>>>>
>>>>> >>>>> +1 for removing pre 2.0 documentation (as well as the entries
>>>>> from https://beam.apache.org/get-started/downloads/)
>>>>> >>>>>
>>>>> >>>>> Isn't it part of the beam-site changes that we will no longer
>>>>> check in generated documentation into the repository? Those can be
>>>>> generated and deployed independently (when a commit to a branch occurs),
>>>>> such as done in the Apex and Flink projects.
>>>>> >>>>>
>>>>> >>>>> I was told that Scott, who was working on the beam-site changes,
>>>>> >>>>> is on leave now and the migration is still pending (see note at
>>>>> >>>>> https://github.com/apache/beam/tree/master/website). Is anyone else
>>>>> >>>>> going to pick it up?
>>>>> >>>>>
>>>>> >>>>> Thanks,
>>>>> >>>>> Thomas
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> On Thu, Aug 2, 2018 at 12:33 PM Pablo Estrada <
>>>>> pabl...@google.com> wrote:
>>>>> >>>>>>
>>>>> >>>>>> Is it worth adding a tag / branch to the repositories every
>>>>> time we make a release, so that people are able to dive in and find the
>>>>> docs?
>>>>> >>>>>> Best
>>>>> >>>>>> -P.
>>>>> >>>>>>
>>>>> >>>>>> On Thu, Aug 2, 2018 at 12:09 PM Ahmet Altay <al...@google.com>
>>>>> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> I would guess that users are still using some of these old
>>>>> >>>>>>> releases. It is unclear from the Beam website which releases are
>>>>> >>>>>>> still supported. It probably makes sense to drop documentation for
>>>>> >>>>>>> releases < 2.0. (I would suggest keeping docs for 2.0). For the
>>>>> >>>>>>> future I can work on updating the Beam website to clarify the state
>>>>> >>>>>>> of each release.
>>>>> >>>>>>>
>>>>> >>>>>>> On Thu, Aug 2, 2018 at 12:06 PM, Udi Meiri <eh...@google.com>
>>>>> wrote:
>>>>> >>>>>>>>
>>>>> >>>>>>>> The older docs are not directly linked to and are in Github
>>>>> commit history.
>>>>> >>>>>>>>
>>>>> >>>>>>>> If there are no objections I'm going to delete javadocs and
>>>>> pydocs for releases older than 1 year,
>>>>> >>>>>>>> meaning 2.0.0 and older (going by the dates here).
>>>>> >>>>>>>>
>>>>> >>>>>>>> On Thu, Aug 2, 2018 at 11:51 AM Daniel Oliveira <
>>>>> danolive...@google.com> wrote:
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> The older docs should be recorded in the commit history of
>>>>> >>>>>>>>> the website repository, right? If they're not currently used in the
>>>>> >>>>>>>>> website and they're in the commit history then I don't see a reason
>>>>> >>>>>>>>> to save them.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> On Tue, Jul 31, 2018 at 1:51 PM Udi Meiri <eh...@google.com>
>>>>> wrote:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Hi all,
>>>>> >>>>>>>>>> I'm writing a PR for apache/beam-site and
>>>>> >>>>>>>>>> beam_PreCommit_Website_Stage is timing out after 100 minutes, because
>>>>> >>>>>>>>>> it's trying to delete 22k files and then copy 22k files (warning
>>>>> >>>>>>>>>> large file).
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> It seems that we could save a lot of time by deleting the
>>>>> >>>>>>>>>> javadoc and pydoc files for older versions. Is there a good reason to
>>>>> >>>>>>>>>> keep around this kind of documentation for older versions (say 1 year
>>>>> >>>>>>>>>> back)?
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> >>>>>> --
>>>>> >>>>>> Got feedback? go/pabloem-feedback
>>>>> <https://goto.google.com/pabloem-feedback>
>>>>>
>>>>
>>
>> --
>> Got feedback? tinyurl.com/swegner-feedback
>>
>
