My thought was to leave the asf-site branch in the beam-site repository, add generated docs to that branch (until we have a better solution), and have only sources in the beam repo.
Scott had filed https://issues.apache.org/jira/browse/BEAM-5459; it would eliminate the need to place generated docs into git repos.

On Mon, Sep 24, 2018 at 11:06 AM Udi Meiri <eh...@google.com> wrote:

> I believe that beam.apache.org is populated from the asf-site branch of the apache/beam-site repo (gitpubsub: https://www.apache.org/dev/project-site.html#intro).
> If we move the markdown-based docs to apache/beam, leave generated javadoc and pydoc in apache/beam-site, and point gitpubsub to apache/beam, then javadoc and pydoc will not get pushed to the website.
>
> Is there some place where we can push javadoc and pydoc files? Or perhaps there is an alternative way to push updates to beam.apache.org (one not requiring the asf-site branch)?
>
> On Fri, Sep 21, 2018 at 6:40 PM Thomas Weise <t...@apache.org> wrote:
>
>> Hi Scott,
>>
>> Thanks for bringing the discussion back here.
>>
>> I agree that we should separate the changes for hosting of generated java/pydocs from the rest of website automation so that we can make the switch and fix the contributor headache soon.
>>
>> But perhaps we can avoid adding 4m lines of generated code to the main beam repository (and keep on adding with every release) if we continue to serve the site from the old beam-site repo? (I left a comment in the doc.)
>>
>> About trying buildbot, as mentioned earlier I would be happy to help with it. I prefer a setup that keeps the docs separate from the web site.
>>
>> Thomas
>>
>> On Fri, Sep 21, 2018 at 10:28 AM Scott Wegner <sc...@apache.org> wrote:
>>
>>> Re-opening this thread as it came up today in the discussion for PR#6458 [1]. This PR is part of the work for Beam-Site Automation Reliability improvements; design doc here: https://s.apache.org/beam-site-automation
>>>
>>> The current plan is to keep generated javadoc/pydoc sources only on the asf-site branch, which is necessary for the current gitpubsub publishing mechanism.
>>> This maintains our current approach, the only change being that we're moving the asf-site branch from the retiring apache/beam-site repository into a new apache/beam repo branch.
>>>
>>> The concern for committing generated content is the extra overhead during git fetch. I did some analysis to measure the impact [2], and found that fetching a week of source + generated content history from apache/beam-site took 0.39 seconds.
>>>
>>> I like the idea of publishing javadoc/pydoc snapshots to an external location like Flink does with buildbot, but that work is separable and shouldn't be a prerequisite for this effort. The goal of this work is to improve the reliability of automation for contributing website changes. At last measure, only about half of beam-site PR merges use Mergebot without experiencing some reliability issue [3].
>>>
>>> I've opened BEAM-5459 [4] to track moving our generated docs out of git.
>>> Thomas, would you have bandwidth to look into this?
>>>
>>> [1] https://github.com/apache/beam/pull/6458#issuecomment-423406643
>>> [2] https://docs.google.com/document/d/1lfbMhdIyDzIaBTgc9OUByhSwR94kfOzS_ozwKWTVl5U/edit#heading=h.uqzivheohd7j
>>> [3] https://docs.google.com/document/d/1lfbMhdIyDzIaBTgc9OUByhSwR94kfOzS_ozwKWTVl5U/edit#heading=h.a208cwi78xmu
>>> [4] https://issues.apache.org/jira/browse/BEAM-5459
>>>
>>> On Fri, Aug 24, 2018 at 11:48 AM Thomas Weise <t...@apache.org> wrote:
>>>
>>>> Hi Udi,
>>>>
>>>> Good to know you will continue this work.
>>>>
>>>> Let me know if you want to try the buildbot route (which does not require generated documentation to be checked into the repo). Happy to help with that.
>>>>
>>>> Thomas
>>>>
>>>> On Fri, Aug 24, 2018 at 11:36 AM Udi Meiri <eh...@google.com> wrote:
>>>>
>>>>> I'm picking up the website migration. The plan is to not include generated files in the master branch.
>>>>>
>>>>> However, I've been told that even putting generated files in a separate branch could blow up the git repository for all (e.g. make git pulls a lot longer?).
>>>>> Not sure if this is a real issue or not.
>>>>>
>>>>> On Mon, Aug 20, 2018 at 2:53 AM Robert Bradshaw <rober...@google.com> wrote:
>>>>>
>>>>>> On Sun, Aug 5, 2018 at 5:28 AM Thomas Weise <t...@apache.org> wrote:
>>>>>> >
>>>>>> > Yes, I think the separation of generated code will need to occur prior to completing the merge and switching the web site to the main repo.
>>>>>> >
>>>>>> > There should be no reason to check generated documentation into either of the repos/branches.
>>>>>>
>>>>>> Huge +1 to this. Thomas, would you have time to set something like this up for Beam? If not, could anyone else pick this up?
>>>>>>
>>>>>> > Please see as an example how this was solved in Flink, using the ASF buildbot infrastructure.
>>>>>> >
>>>>>> > Documentation per version/release, for example:
>>>>>> >
>>>>>> > https://ci.apache.org/projects/flink/flink-docs-release-1.5/
>>>>>> >
>>>>>> > The buildbot configuration is here (requires committer access):
>>>>>> >
>>>>>> > https://svn.apache.org/repos/infra/infrastructure/buildbot/aegis/buildmaster/master1/projects/flink.conf
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Thomas
>>>>>> >
>>>>>> > On Thu, Aug 2, 2018 at 6:46 PM Mikhail Gryzykhin <mig...@google.com> wrote:
>>>>>> >>
>>>>>> >> Last time I talked with Scott I brought this idea up. I believe the plan was either to publish the compiled site to the website directly, or to keep it in storage separate from the apache/beam repo.
>>>>>> >>
>>>>>> >> One of the main reasons not to check in the compiled version of the website is that every developer would have to pull all versions of the website every time they clone the repo, which is not a good idea.
>>>>>> >>
>>>>>> >> Regards,
>>>>>> >> --Mikhail
>>>>>> >>
>>>>>> >> On Thu, Aug 2, 2018 at 6:42 PM Udi Meiri <eh...@google.com> wrote:
>>>>>> >>>
>>>>>> >>> Pablo, the docs are generated into versioned paths, e.g., https://beam.apache.org/documentation/sdks/javadoc/2.5.0/ so tags are not necessary?
>>>>>> >>> Also, once apache/beam-site is merged with apache/beam the release branch should have the relevant docs (although perhaps it's better to put them in a different repo or storage system).
>>>>>> >>>
>>>>>> >>> Thomas, I would very much like to not have javadoc/pydoc generation be part of the website review process, as it takes up a lot of time when changes are staged (10s of thousands of files), especially when a PR is updated and existing staged files need to be deleted.
>>>>>> >>>
>>>>>> >>> On Thu, Aug 2, 2018 at 1:15 PM Mikhail Gryzykhin <mig...@google.com> wrote:
>>>>>> >>>>
>>>>>> >>>> +1 for removing old documentation.
>>>>>> >>>>
>>>>>> >>>> @Thomas: Migration work is in the backlog and will be picked up in the near future.
>>>>>> >>>>
>>>>>> >>>> --Mikhail
>>>>>> >>>>
>>>>>> >>>> On Thu, Aug 2, 2018 at 12:54 PM Thomas Weise <t...@apache.org> wrote:
>>>>>> >>>>>
>>>>>> >>>>> +1 for removing pre 2.0 documentation (as well as the entries from https://beam.apache.org/get-started/downloads/)
>>>>>> >>>>>
>>>>>> >>>>> Isn't it part of the beam-site changes that we will no longer check generated documentation into the repository? Those can be generated and deployed independently (when a commit to a branch occurs), as is done in the Apex and Flink projects.
>>>>>> >>>>>
>>>>>> >>>>> I was told that Scott, who was working on the beam-site changes, is on leave now and the migration is still pending (see note at https://github.com/apache/beam/tree/master/website). Is anyone else going to pick it up?
>>>>>> >>>>>
>>>>>> >>>>> Thanks,
>>>>>> >>>>> Thomas
>>>>>> >>>>>
>>>>>> >>>>> On Thu, Aug 2, 2018 at 12:33 PM Pablo Estrada <pabl...@google.com> wrote:
>>>>>> >>>>>>
>>>>>> >>>>>> Is it worth adding a tag / branch to the repositories every time we make a release, so that people are able to dive in and find the docs?
>>>>>> >>>>>> Best
>>>>>> >>>>>> -P.
>>>>>> >>>>>>
>>>>>> >>>>>> On Thu, Aug 2, 2018 at 12:09 PM Ahmet Altay <al...@google.com> wrote:
>>>>>> >>>>>>>
>>>>>> >>>>>>> I would guess that users are still using some of these old releases. It is unclear from the Beam website which releases are still supported. It probably makes sense to drop documentation for releases < 2.0. (I would suggest keeping docs for 2.0.) For the future I can work on updating the Beam website to clarify the state of each release.
>>>>>> >>>>>>>
>>>>>> >>>>>>> On Thu, Aug 2, 2018 at 12:06 PM, Udi Meiri <eh...@google.com> wrote:
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> The older docs are not directly linked to and are in the GitHub commit history.
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> If there are no objections I'm going to delete javadocs and pydocs for releases older than 1 year,
>>>>>> >>>>>>>> meaning 2.0.0 and older (going by the dates here).
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> On Thu, Aug 2, 2018 at 11:51 AM Daniel Oliveira <danolive...@google.com> wrote:
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>> The older docs should be recorded in the commit history of the website repository, right? If they're not currently used in the website and they're in the commit history then I don't see a reason to save them.
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>> On Tue, Jul 31, 2018 at 1:51 PM Udi Meiri <eh...@google.com> wrote:
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>> Hi all,
>>>>>> >>>>>>>>>> I'm writing a PR for apache/beam-site and beam_PreCommit_Website_Stage is timing out after 100 minutes, because it's trying to delete 22k files and then copy 22k files (warning: large file).
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>> It seems that we could save a lot of time by deleting the javadoc and pydoc files for older versions. Is there a good reason to keep around this kind of documentation for older versions (say 1 year back)?