Just to be clear: generated HTML for javadoc and pydoc will go in apache/beam-site, while generated HTML for the .md files will go in apache/beam on the asf-site branch.
On Wed, Sep 26, 2018 at 9:34 AM Thomas Weise <t...@apache.org> wrote:

> Looks like there is agreement that all sources should be in the main beam
> repository; the remaining discussion was where the generated content should
> be served from, specifically the generated docs.
>
> If the setup that Alan found allows us to keep using the beam-site
> repository for the generated stuff and that does not unreasonably
> complicate the CI process, then I'm in favor of that. It looks cleaner to
> not mingle source and generated files in the same repo. Otherwise we can do
> the asf-site branch in the main repo and get rid of docs from it once we
> have found a better solution.
>
>
> On Wed, Sep 26, 2018 at 7:09 AM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> OK, thanks. That link was very helpful. Of the three options we must use,
>> checking into git seems preferable to checking into svn, let alone the
>> CMS. Keeping the same repo means that it's harder to generate the docs for
>> version X while head is checked out.
>>
>> I'm in favor of moving forward with this in the short term, but we should
>> explore other options (like Flink has) for the longer term.
>>
>>
>>
>> On Wed, Sep 26, 2018 at 3:53 PM Scott Wegner <sc...@apache.org> wrote:
>>
>>> Yes. There are a few options for publishing your ASF website, described
>>> here: https://www.apache.org/dev/project-site.html. We can publish from
>>> a Git repo, SVN, or a UI-based CMS interface.
>>>
>>> On Wed, Sep 26, 2018 at 9:45 AM Robert Bradshaw <rober...@google.com>
>>> wrote:
>>>
>>>> I am also definitely in favor of a single repository. Perhaps I'm just
>>>> misunderstanding why the generated files must be put in a git repository at
>>>> all--is it because that's the easiest way to serve them?
>>>> >>>> On Wed, Sep 26, 2018 at 3:39 PM Scott Wegner <sc...@apache.org> wrote: >>>> >>>>> Alan found the place where website publishing is configured [1], which >>>>> has examples of project sites being configured with more than one git >>>>> root. >>>>> This is great for us because it allows us to leave generated >>>>> javadocs/pydocs in the beam-site repository and publish website markdown >>>>> content from the main repo. >>>>> >>>>> Alan has a PR ready to publish generated HTML in a post-commit job >>>>> [2]. Once that goes through the last step is to upgrade the publishing >>>>> config. >>>>> >>>>> [1] >>>>> https://github.com/apache/infrastructure-puppet/blob/deployment/modules/gitwcsub/files/config/gitwcsub.cfg >>>>> [2] https://github.com/apache/beam/pull/6431 >>>>> >>>>> On Mon, Sep 24, 2018 at 4:35 PM Scott Wegner <sweg...@google.com> >>>>> wrote: >>>>> >>>>>> > We could add a new default branch (master?) and keep all the >>>>>> non-generated files (src/) there, and put generated files (content/) in >>>>>> the >>>>>> asf-site branch (like we already do). >>>>>> >>>>>> I'm strongly in favor of having sources in a single repository. We >>>>>> have significant process and infrastructure built up for the apache/beam >>>>>> repo (for build, PR, CI, release, etc.) that we can take advantage of by >>>>>> putting website sources in the same repo. The current beam-site repo PR >>>>>> automation is flaky because it was custom-built and not given the same >>>>>> level of attention as the main repo. >>>>>> >>>>>> The caveat to consolidating website sources in the main repo is that >>>>>> it incentivizes putting the generated sources branch on the same repo. 
>>>>>> I've >>>>>> documented a few of the reasons in the Appendix of the design doc [1]: >>>>>> - It's easier to maintain a single repository; easily apply existing >>>>>> tooling/infrastructure >>>>>> - Jenkins tooling for publishing generated HTML may not work >>>>>> cross-repo [2] >>>>>> >>>>>> My preference is to move forward with the migration of sources to >>>>>> apache/beam [master], and website generated HTML to apache/beam >>>>>> [asf-site]. >>>>>> I like the idea of separating the publishing/hosting of generated >>>>>> javadocs/pydocs since they add so much cruft, but it should not hold up >>>>>> the >>>>>> migration. >>>>>> >>>>>> [1] >>>>>> https://docs.google.com/document/d/1lfbMhdIyDzIaBTgc9OUByhSwR94kfOzS_ozwKWTVl5U/edit#heading=h.wqwi2jpoiiuc >>>>>> >>>>>> [2] >>>>>> https://stackoverflow.com/questions/14843696/checkout-multiple-git-repos-into-same-jenkins-workspace >>>>>> >>>>>> On Mon, Sep 24, 2018 at 2:33 PM Udi Meiri <eh...@google.com> wrote: >>>>>> >>>>>>> Staying on beam-site SGTM. We could add a new default branch >>>>>>> (master?) and keep all the non-generated files (src/) there, and put >>>>>>> generated files (content/) in the asf-site branch (like we already do). >>>>>>> That way there's no confusion as to which files you should update. >>>>>>> (This is of course assuming we still place generated docs in git >>>>>>> repos.) >>>>>>> >>>>>>> On Mon, Sep 24, 2018 at 11:23 AM Thomas Weise <t...@apache.org> >>>>>>> wrote: >>>>>>> >>>>>>>> My thought was to leave the asf-site branch in the beam-site >>>>>>>> repository, add generated docs to that branch (until we have a better >>>>>>>> solution), and have only sources in the beam repo. >>>>>>>> >>>>>>>> Scott had filed https://issues.apache.org/jira/browse/BEAM-5459 - >>>>>>>> it would eliminate the need to place generated docs into git repos. 
>>>>>>>> >>>>>>>> On Mon, Sep 24, 2018 at 11:06 AM Udi Meiri <eh...@google.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> I believe that beam.apache.org is populated from the asf-site >>>>>>>>> branch of the apache/beam-site repo. (gitpubsub: >>>>>>>>> https://www.apache.org/dev/project-site.html#intro) >>>>>>>>> If we move the markdown-based docs to apache/beam, leave generated >>>>>>>>> javadoc and pydoc in apache/beam-site, and point gitpubsub to >>>>>>>>> apache/beam, >>>>>>>>> then javadoc and pydoc will not get pushed to the website. >>>>>>>>> >>>>>>>>> Is there some place where we can push javadoc and pydoc files? Or >>>>>>>>> perhaps there an alternative way to push updates to >>>>>>>>> beam.apache.org? (not requiring the asf-site branch) >>>>>>>>> >>>>>>>>> On Fri, Sep 21, 2018 at 6:40 PM Thomas Weise <t...@apache.org> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi Scott, >>>>>>>>>> >>>>>>>>>> Thanks for bringing the discussion back here. >>>>>>>>>> >>>>>>>>>> I agree that we should separate the changes for hosting of >>>>>>>>>> generated java/pydocs from the rest of website automation so that we >>>>>>>>>> can >>>>>>>>>> make the switch and fix the contributor headache soon. >>>>>>>>>> >>>>>>>>>> But perhaps we can avoid adding 4m lines of generated code to the >>>>>>>>>> main beam repository (and keep on adding with every release) if we >>>>>>>>>> continue >>>>>>>>>> to serve the site from the old beam-site repo? (I left a comment the >>>>>>>>>> doc.) >>>>>>>>>> >>>>>>>>>> About trying buildbot, as mentioned earlier I would be happy to >>>>>>>>>> help with it. I prefer a setup that keeps the docs separate from the >>>>>>>>>> web >>>>>>>>>> site. >>>>>>>>>> >>>>>>>>>> Thomas >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Fri, Sep 21, 2018 at 10:28 AM Scott Wegner <sc...@apache.org> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Re-opening this thread as it came up today in the discussion for >>>>>>>>>>> PR#6458 [1]. 
This PR is part of the work for Beam-Site Automation >>>>>>>>>>> Reliability improvements; design doc here: >>>>>>>>>>> https://s.apache.org/beam-site-automation >>>>>>>>>>> >>>>>>>>>>> The current plan is to keep generated javadoc/pydoc sources only >>>>>>>>>>> on the asf-site branch, which is necessary for the current >>>>>>>>>>> githubpubsub >>>>>>>>>>> publishing mechanism. This maintains our current approach, the only >>>>>>>>>>> change >>>>>>>>>>> being that we're moving the asf-site branch from the retiring >>>>>>>>>>> apache/beam-site repository into a new apache/beam repo branch. >>>>>>>>>>> >>>>>>>>>>> The concern for committing generated content is the extra >>>>>>>>>>> overhead during git fetch. I did some analysis to measure the >>>>>>>>>>> impact [2], >>>>>>>>>>> and found that fetching a week of source + generated content >>>>>>>>>>> history from >>>>>>>>>>> apache/beam-site took 0.39 seconds. >>>>>>>>>>> >>>>>>>>>>> I like the idea of publishing javadoc/pydoc snapshots to an >>>>>>>>>>> external location like Flink does with buildbot, but that work is >>>>>>>>>>> separable >>>>>>>>>>> and shouldn't be a prerequisite for this effort. The goal of this >>>>>>>>>>> work is >>>>>>>>>>> to improve the reliability of automation for contributing website >>>>>>>>>>> changes. >>>>>>>>>>> At last measure, only about half of beam-site PR merges use Mergebot >>>>>>>>>>> without experiencing some reliability issue [3]. >>>>>>>>>>> >>>>>>>>>>> I've opened BEAM-5459 [4] to track moving our generated docs out >>>>>>>>>>> of git. Thomas, would you have bandwidth to look into this? 
>>>>>>>>>>> >>>>>>>>>>> [1] >>>>>>>>>>> https://github.com/apache/beam/pull/6458#issuecomment-423406643 >>>>>>>>>>> [2] >>>>>>>>>>> https://docs.google.com/document/d/1lfbMhdIyDzIaBTgc9OUByhSwR94kfOzS_ozwKWTVl5U/edit#heading=h.uqzivheohd7j >>>>>>>>>>> [3] >>>>>>>>>>> https://docs.google.com/document/d/1lfbMhdIyDzIaBTgc9OUByhSwR94kfOzS_ozwKWTVl5U/edit#heading=h.a208cwi78xmu >>>>>>>>>>> [4] https://issues.apache.org/jira/browse/BEAM-5459 >>>>>>>>>>> >>>>>>>>>>> On Fri, Aug 24, 2018 at 11:48 AM Thomas Weise <t...@apache.org> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Udi, >>>>>>>>>>>> >>>>>>>>>>>> Good to know you will continue this work. >>>>>>>>>>>> >>>>>>>>>>>> Let me know if you want to try the buildbot route (which does >>>>>>>>>>>> not require generated documentation to be checked into the repo). >>>>>>>>>>>> Happy to >>>>>>>>>>>> help with that. >>>>>>>>>>>> >>>>>>>>>>>> Thomas >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Aug 24, 2018 at 11:36 AM Udi Meiri <eh...@google.com> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> I'm picking up the website migration. The plan is to not >>>>>>>>>>>>> include generated files in the master branch. >>>>>>>>>>>>> >>>>>>>>>>>>> However, I've been told that even putting generated files a >>>>>>>>>>>>> separate branch could blow up the git repository for all (e.g. >>>>>>>>>>>>> make git >>>>>>>>>>>>> pulls a lot longer?). >>>>>>>>>>>>> Not sure if this is a real issue or not. >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Aug 20, 2018 at 2:53 AM Robert Bradshaw < >>>>>>>>>>>>> rober...@google.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> On Sun, Aug 5, 2018 at 5:28 AM Thomas Weise <t...@apache.org> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > Yes, I think the separation of generated code will need to >>>>>>>>>>>>>> occur prior to completing the merge and switching the web site >>>>>>>>>>>>>> to the main >>>>>>>>>>>>>> repo. 
>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > There should be no reason to check generated documentation >>>>>>>>>>>>>> into either of the repos/branches. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Huge +1 to this. Thomas, would have time to set something >>>>>>>>>>>>>> like this up >>>>>>>>>>>>>> for Beam? If not, could anyone else pick this up? >>>>>>>>>>>>>> >>>>>>>>>>>>>> > Please see as an example how this was solved in Flink, >>>>>>>>>>>>>> using the ASF buildbot infrastructure. >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > Documentation per version/release, for example: >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/ >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > The buildbot configuration is here (requires committer >>>>>>>>>>>>>> access): >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> https://svn.apache.org/repos/infra/infrastructure/buildbot/aegis/buildmaster/master1/projects/flink.conf >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > Thanks, >>>>>>>>>>>>>> > Thomas >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > On Thu, Aug 2, 2018 at 6:46 PM Mikhail Gryzykhin < >>>>>>>>>>>>>> mig...@google.com> wrote: >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> Last time I talked with Scott I brought this idea in. I >>>>>>>>>>>>>> believe the plan was either to publish compiled site to website >>>>>>>>>>>>>> directly, >>>>>>>>>>>>>> or keep it in separate storage from apache/beam repo. >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> One of the main reasons not to check in compiled version >>>>>>>>>>>>>> of website is that every developer will have to pull all the >>>>>>>>>>>>>> versions of >>>>>>>>>>>>>> website every time they clone repo, which is not that good of an >>>>>>>>>>>>>> idea to do. >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> Regards, >>>>>>>>>>>>>> >> --Mikhail >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> Have feedback? 
>>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> On Thu, Aug 2, 2018 at 6:42 PM Udi Meiri <eh...@google.com> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >>> Pablo, the docs are generated into versioned paths, e.g., >>>>>>>>>>>>>> https://beam.apache.org/documentation/sdks/javadoc/2.5.0/ so >>>>>>>>>>>>>> tags are not necessary? >>>>>>>>>>>>>> >>> Also, once apache/beam-site is merged with apache/beam >>>>>>>>>>>>>> the release branch should have the relevant docs (although >>>>>>>>>>>>>> perhaps it's >>>>>>>>>>>>>> better to put them in a different repo or storage system). >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >>> Thomas, I would very much like to not have javadoc/pydoc >>>>>>>>>>>>>> generation be part of the website review process, as it takes up >>>>>>>>>>>>>> a lot of >>>>>>>>>>>>>> time when changes are staged (10s of thousands of files), >>>>>>>>>>>>>> especially when a >>>>>>>>>>>>>> PR is updated and existing staged files need to be deleted. >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >>> On Thu, Aug 2, 2018 at 1:15 PM Mikhail Gryzykhin < >>>>>>>>>>>>>> mig...@google.com> wrote: >>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>> >>>> +1 For removing old documentation. >>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>> >>>> @Thomas: Migration work is in backlog and will be picked >>>>>>>>>>>>>> up in near time. >>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>> >>>> --Mikhail >>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>> >>>> Have feedback? >>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>> >>>> On Thu, Aug 2, 2018 at 12:54 PM Thomas Weise < >>>>>>>>>>>>>> t...@apache.org> wrote: >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> +1 for removing pre 2.0 documentation (as well as the >>>>>>>>>>>>>> entries from https://beam.apache.org/get-started/downloads/) >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> Isn't it part of the beam-site changes that we will no >>>>>>>>>>>>>> longer check in generated documentation into the repository? 
>>>>>>>>>>>>>> Those can be >>>>>>>>>>>>>> generated and deployed independently (when a commit to a branch >>>>>>>>>>>>>> occurs), >>>>>>>>>>>>>> such as done in the Apex and Flink projects. >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> I was told that Scott who was working in the beam-site >>>>>>>>>>>>>> changes is on leave now and the migration is still pending (see >>>>>>>>>>>>>> note at >>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/website). Is >>>>>>>>>>>>>> anyone else going to pick it up? >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> Thanks, >>>>>>>>>>>>>> >>>>> Thomas >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>> >>>>> On Thu, Aug 2, 2018 at 12:33 PM Pablo Estrada < >>>>>>>>>>>>>> pabl...@google.com> wrote: >>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>> >>>>>> Is it worth adding a tag / branch to the repositories >>>>>>>>>>>>>> every time we make a release, so that people are able to dive in >>>>>>>>>>>>>> and find >>>>>>>>>>>>>> the docs? >>>>>>>>>>>>>> >>>>>> Best >>>>>>>>>>>>>> >>>>>> -P. >>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>> >>>>>> On Thu, Aug 2, 2018 at 12:09 PM Ahmet Altay < >>>>>>>>>>>>>> al...@google.com> wrote: >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> I would guess that users are still using some of >>>>>>>>>>>>>> these old releases. It is unclear from Beam website which >>>>>>>>>>>>>> releases are >>>>>>>>>>>>>> still supported or not. It probably makes sense to drop >>>>>>>>>>>>>> documentation for >>>>>>>>>>>>>> releases < 2.0. (I would suggest keeping docs for 2.0). For the >>>>>>>>>>>>>> future I >>>>>>>>>>>>>> can work on updating the Beam website to clarify the state of >>>>>>>>>>>>>> each release. >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> On Thu, Aug 2, 2018 at 12:06 PM, Udi Meiri < >>>>>>>>>>>>>> eh...@google.com> wrote: >>>>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>>>> >>>>>>>> The older docs are not directly linked to and are in >>>>>>>>>>>>>> Github commit history. 
>>>>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>>>> >>>>>>>> If there are no objections I'm going to delete >>>>>>>>>>>>>> javadocs and pydocs for releases older than 1 year, >>>>>>>>>>>>>> >>>>>>>> meaning 2.0.0 and older (going by the dates here). >>>>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>>>> >>>>>>>> On Thu, Aug 2, 2018 at 11:51 AM Daniel Oliveira < >>>>>>>>>>>>>> danolive...@google.com> wrote: >>>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>> The older docs should be recorded in the commit >>>>>>>>>>>>>> history of the website repository, right? If they're not >>>>>>>>>>>>>> currently used in >>>>>>>>>>>>>> the website and they're in the commit history then I don't see a >>>>>>>>>>>>>> reason to >>>>>>>>>>>>>> save them. >>>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>> On Tue, Jul 31, 2018 at 1:51 PM Udi Meiri < >>>>>>>>>>>>>> eh...@google.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>> Hi all, >>>>>>>>>>>>>> >>>>>>>>>> I'm writing a PR for apache/beam-site and >>>>>>>>>>>>>> beam_PreCommit_Website_Stage is timing out after 100 minutes, >>>>>>>>>>>>>> because it's >>>>>>>>>>>>>> trying to deletes 22k files and then copy 22k files (warning >>>>>>>>>>>>>> large file). >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>> It seems that we could save a lot of time by >>>>>>>>>>>>>> deleting the older javadoc and pydoc files for older versions. >>>>>>>>>>>>>> Is there a >>>>>>>>>>>>>> good reason to keep around this kind of documentation for older >>>>>>>>>>>>>> versions >>>>>>>>>>>>>> (say 1 year back)? >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>> >>>>>> -- >>>>>>>>>>>>>> >>>>>> Got feedback? go/pabloem-feedback >>>>>>>>>>>>>> <https://goto.google.com/pabloem-feedback> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Got feedback? tinyurl.com/swegner-feedback >>>>>>>>>>> >>>>>>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Got feedback? 
tinyurl.com/swegner-feedback >>>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> >>>>> >>>>> >>>>> Got feedback? tinyurl.com/swegner-feedback >>>>> >>>> >>> >>> -- >>> >>> >>> >>> >>> Got feedback? tinyurl.com/swegner-feedback >>> >>