FYI, a tracking issue has been created: https://github.com/apache/pulsar/issues/19064
I plan to finish it by the end of next month.

Best,
tison.


On Wed, Dec 21, 2022 at 11:33, tison <wander4...@gmail.com> wrote:

> Thanks for your feedback!
>
> @Yu
>
> Thanks for sharing the previous thread. I looped in @michaeljmarshall here.
>
> @Jun
>
> It's possible, but it causes a new shortcoming: now you have to tell contributors that the versioned docs are different from the NEXT version docs, lol.
>
> If our developers don't complain about these separated sources, as in @Asaf's comment:
>
> > We can take, let's say, five features and see if they were actually done in the same PR or separate PR. I guess that most documentation is actually updated separately. Thus, from that perspective, maybe it’s not a con.
>
> Then we can do this refactor thoroughly.
>
> Also, if we somehow keep several sources in the main repo, we still have shortcomings:
>
> 1. Duplicated CI workflows.
> 2. Cumbersome preview scaffolding in the main repo.
>
> ... which are what I originally set out to overcome.
>
> Best,
> tison.
>
>
> On Wed, Dec 21, 2022 at 11:19, Jun Ma <momoma...@hotmail.com> wrote:
>
>> Is it possible to come up with a compromise solution that has the pros of both sides but minimizes the side effects? I'm thinking maybe it's not necessary to sacrifice the current contribution process, as long as we can greatly reduce the load of back-end actions and the source size. For example, if we only move the versioned docs out to the site repo but keep the source of the NEXT docs in the pulsar repo, does this win a large proportion of those pros while people can still contribute as usual?
>>
>> ________________________________
>> From: Jiaqi Shen <gleiphir2...@gmail.com>
>> Sent: Tuesday, December 20, 2022 17:15
>> To: dev@pulsar.apache.org <dev@pulsar.apache.org>
>> Subject: Re: [PROPOSAL] Website precommit and move the source of docs to the site repo
>>
>> +1, it makes sense to me.
>>
>> Thanks,
>> Jiaqi Shen
>>
>>
>> On Mon, Dec 19, 2022 at 20:57, Yu <li...@apache.org> wrote:
>>
>> > Hi tison,
>> >
>> > Thanks for raising this!
>> >
>> > Our community had a similar discussion previously and chose to "keep" the docs in the Pulsar main repo at that time.
>> >
>> > [1] lists the pros and cons of the "keep" and "not keep" solutions.
>> >
>> > I'm +0 on this proposal because I think the total scores of these two solutions are almost equal after weighing the pros and cons.
>> >
>> > ~~~~~~~~~~~~~~~~~~~~
>> >
>> > [1] https://lists.apache.org/thread/mf2xwntfgn84dq78ksqv22jk3drq6xb3
>> >
>> >
>> > On Mon, Dec 19, 2022 at 5:40 PM tison <wander4...@gmail.com> wrote:
>> >
>> > > Thanks for your feedback!
>> > >
>> > > @Asaf
>> > >
>> > > > pre-commit
>> > >
>> > > I mean CI checks before merging a patch. Currently, we don't run checks on the content before merging it. This causes a series of syntax errors and broken links. If we hold the docs under the site2 folder in the main repo and then copy them to the site repo, we have two places to build such CI checks. What's worse, the checks for the main repo would be quite cumbersome (you would need some if-else logic in the whole Pulsar CI workflow, and do the sync sequentially in that workflow).
>> > >
>> > > If we hold the source of docs only in the site repo, we can extend the "precommit" workflow[1] I added recently to also check for syntax errors and broken links.
>> > >
>> > > > What does the apache/pulsar-site repo contain today?
>> > >
>> > > It should be covered by the documentation guide page[2]. It holds the source of the official website, and the user docs are synced from the main repo.
>> > >
>> > > > What content do we have today in the pulsar repo related to the site?
>> > >
>> > > After issue-18014[3] is done, we host only user docs and some JSON metadata in the main repo, which are synced by site_syncer.py[4].
>> > >
>> > > > Can you explain that better? Are you saying pulsar source JARs contain the documentation?
>> > >
>> > > No. Source JARs contain only the Java files and the necessary copyright info. The source release is, for example,
>> > >
>> > > https://archive.apache.org/dist/pulsar/pulsar-2.10.2/apache-pulsar-2.10.2-src.tar.gz
>> > >
>> > > which extracts to 173M, of which 129M is occupied by the site2 folder.
>> > >
>> > > This also affects developers when they git clone the repo.
>> > >
>> > > > I mean, if you wish to document a bug fix in 2.9.x, for example, would you do it in the 2.9.x branch under site2/docs or site2/website/versioned_docs/2.9.5?
>> > >
>> > > This is another question. Ideally, we would host the versioned docs for a specific version on that version's branch, like Apache Flink does, as I mentioned[5]. But we do not; in practice, we update the versioned docs on the master branch so that the docs can be synced properly.
>> > >
>> > > See also the "Alternatives" section in the original email.
>> > >
>> > > @All
>> > >
>> > > Since we don't have objections to the possible cons listed above or any new ones, I'm going to create a tracking issue later this week and show what will be changed in PRs for further review.
>> > >
>> > > Best,
>> > > tison.
>> > >
>> > > [1] https://github.com/apache/pulsar-site/blob/f7abc615d57d9846ed093922d24bff952dc0e838/.github/workflows/ci-precommit.yml
>> > > [2] https://pulsar.apache.org/contribute/document-contribution/#source-repositories
>> > > [3] https://github.com/apache/pulsar/issues/18014
>> > > [4] https://github.com/apache/pulsar-site/blob/f7abc615d57d9846ed093922d24bff952dc0e838/tools/pytools/lib/execute/site_syncer.py
>> > > [5] https://github.com/apache/flink/tree/master/docs
>> > >
>> > >
>> > > On Mon, Dec 19, 2022 at 16:26, PengHui Li <peng...@apache.org> wrote:
>> > >
>> > > > +1
>> > > >
>> > > > I support moving them to the website repo.
>> > > >
>> > > > Thanks,
>> > > > Penghui
>> > > >
>> > > > On Mon, Dec 19, 2022 at 12:04 PM Yunze Xu <y...@streamnative.io.invalid> wrote:
>> > > >
>> > > > > +1. The most significant point to me is that we can preview all the content of the website without synchronizing contents from the apache/pulsar repo.
>> > > > >
>> > > > > Thanks,
>> > > > > Yunze
>> > > > >
>> > > > > On Mon, Dec 19, 2022 at 9:53 AM Li Li <urf...@apache.org> wrote:
>> > > > > >
>> > > > > > +1, That’s a good idea.
>> > > > > >
>> > > > > > > On Dec 16, 2022, at 07:07, tison <wander4...@gmail.com> wrote:
>> > > > > > >
>> > > > > > > Hi,
>> > > > > > >
>> > > > > > > After several rounds of work on the build flow of our official website[1][2][3], the content sync and site build flow is now debuggable and reproducible.
>> > > > > > >
>> > > > > > > However, compared to other Apache projects' website layouts and workflows, we still face two challenges on the Pulsar site:
>> > > > > > >
>> > > > > > > 1. We don't have a pre-commit workflow for any website-related changes.
>> > > > > > > Thus, we don't detect broken links or syntax errors when reviewing new patches[4][5][6].
>> > > > > > > 2. The website's content is two levels down in `site2/website-next` for historical reasons, which is confusing for contributors.
>> > > > > > >
>> > > > > > > To overcome these two shortcomings, I propose the following:
>> > > > > > >
>> > > > > > > 1. Move the website's content to the root level, so that we have a first-class Docu&yarn-based JS project layout. It's more convenient and familiar to related developers.
>> > > > > > > 2. Host the source of docs in the site repo (apache/pulsar-site) instead of keeping it under the `site2` folder in the main repo and doing a content sync.
>> > > > > > >
>> > > > > > > Below are the pros and cons:
>> > > > > > >
>> > > > > > > Pros
>> > > > > > >
>> > > > > > > 1. Obviously, we would have the pre-commit workflow. And since we host the source of docs in one repo, we don't have to run the pre-commit workflow in two places, which can be quite cumbersome to implement.
>> > > > > > > 2. The size of the source release of the main repo can be reduced a lot. Currently, 63MB out of 140MB of the sources are taken by the site2 folder, which we can remove totally. In addition, we ship the full versioned docs with every release.
>> > > > > > > 3. We can clean up a large portion of the brittle "integration" used to debug the site in the main repo[7] (etc.) and the redundant contribution guide[8]. This way, when updating docs, we can preview the result in one repo instead of actually doing the sync on the fly.
>> > > > > > > In addition, this integration blocks us from moving the website content to the top level, since it makes strong assumptions about the relative layout.
>> > > > > > >
>> > > > > > > Cons
>> > > > > > >
>> > > > > > > The most significant con is that we can no longer update the code and docs in one patch against apache/pulsar. You must open a new pull request to apache/pulsar-site, cross-reference the two, and manage the merge order (synchronization).
>> > > > > > >
>> > > > > > > Alternatives:
>> > > > > > >
>> > > > > > > To resolve the versioned docs issue, an alternative is to host only the user docs along with each version, like Flink does[9]. But it both detaches from the Docu framework and requires significant development effort.
>> > > > > > >
>> > > > > > > Since it explicitly changes the development flow (that is, you should now update docs separately), I am starting this discussion here to ask for your feedback.
>> > > > > > >
>> > > > > > > You are welcome to leave your comments!
>> > > > > > >
>> > > > > > > Best,
>> > > > > > > tison.
>> > > > > > >
>> > > > > > > [1] https://pulsar.apache.org/
>> > > > > > > [2] https://github.com/apache/pulsar-site
>> > > > > > > [3] https://github.com/apache/pulsar/issues/18014
>> > > > > > > [4] https://github.com/apache/pulsar/issues/17599
>> > > > > > > [5] https://github.com/apache/pulsar/pull/17863#discussion_r990174850
>> > > > > > > [6] https://github.com/apache/pulsar/pull/17853#discussion_r991803704
>> > > > > > > [7] https://github.com/apache/pulsar/blob/b1f9e351fa4d5aba197d33cfc0c536516b55b61f/site2/website/start.sh
>> > > > > > > [8] https://pulsar.apache.org/contribute/document-preview/#preview-documentation-changes
>> > > > > > > [9] https://github.com/apache/flink/tree/master/docs
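
For illustration only: the thread talks about a pre-commit check that catches broken links before a docs patch is merged. A minimal sketch of such a check for relative Markdown links could look like the following. The names `find_broken_links` and `LINK_RE` are hypothetical and are not taken from the apache/pulsar-site workflow; the sketch assumes the docs are plain `.md` files under one directory and it deliberately skips external URLs, in-page anchors, and site-absolute routes.

```python
import re
from pathlib import Path

# Hypothetical helper, not part of apache/pulsar-site.
# Matches inline Markdown links such as [text](target) or [text](target "title").
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")

# Targets with a URL scheme (https:, mailto:, ...) are treated as external.
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_broken_links(docs_root: Path) -> list[tuple[Path, str]]:
    """Return (file, target) pairs whose relative link target does not exist."""
    broken = []
    for md_file in sorted(docs_root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            # Skip external URLs and pure in-page anchors.
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue
            # Drop a trailing "#fragment" before checking the file system.
            path_part = target.split("#", 1)[0]
            # Site-absolute routes need the site's routing rules; skipped here.
            if path_part.startswith("/"):
                continue
            if not (md_file.parent / path_part).exists():
                broken.append((md_file, target))
    return broken
```

A CI job could call `find_broken_links(Path("docs"))` (the directory name is an assumption) and fail when the returned list is non-empty. With the docs living in a single repo, such a check would only need to be wired up once.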