Re: [openstack-dev] [doc][i18n][infra][tc] plan for PDF and translation builds for documentation
Excerpts from Matthew Treinish's message of 2018-09-13 23:21:43 -0400:
> On Fri, Sep 14, 2018 at 10:09:26AM +0900, Ian Y. Choi wrote:
> > When I test PDF builds on current nova repo with master branch, it seems
> > that the rst document is too big
> > (876 pages with error) and more dealing with overcoming memory problems
> > was needed.
> > I would like to think how to overcome this, but it would be also nice if
> > someone shares advice or comments on this.
>
> Hmm, I wasn't able to even get that far. When I tried a vanilla pdf build
> from nova master it only compiled 540 pages before it errored out on capacity
> exceeded. I know that the limit is adjustable in a config file, but I'm not
> sure if there is a more dynamic method for adjusting it.

The content is organized into sections based on audience/purpose
now (install, user, api, etc.). The latex builder supports extracting
different sections of the content to create separate output files
by starting at a different root file. I wonder if that's another
reasonable approach for us to take here?

Doug

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
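
As a rough illustration of the "different root file" idea above, a conf.py
could list one LaTeX document per section through Sphinx's latex_documents
setting. This is only a sketch: the section names come from the message, but
the "<section>/index" root files, the target file names, and the project and
author strings are assumptions, not what nova actually uses.

```python
# Sketch of a conf.py fragment: one LaTeX/PDF document per top-level section.
# The "<section>/index" root files, the "doc-<section>.tex" target names, and
# the project/author strings are assumptions made for illustration only.

SECTIONS = ["install", "user", "api"]


def build_latex_documents(sections, project="nova", author="OpenStack Foundation"):
    """Return a list usable as Sphinx's ``latex_documents`` setting.

    Each entry is (startdocname, targetname, title, author, document class).
    """
    documents = []
    for section in sections:
        documents.append((
            f"{section}/index",      # root file the LaTeX builder starts from
            f"doc-{section}.tex",    # name of the generated .tex file
            f"{project} {section} documentation",
            author,
            "manual",                # one of Sphinx's built-in document classes
        ))
    return documents


latex_documents = build_latex_documents(SECTIONS)
```

Each entry produces its own .tex file, so every PDF covers a smaller document
tree, which might help a build stay under the TeX capacity limit mentioned in
the quoted text.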
Re: [openstack-dev] [doc][i18n][infra][tc] plan for PDF and translation builds for documentation
On Fri, Sep 14, 2018 at 10:09:26AM +0900, Ian Y. Choi wrote:
> First of all, thanks a lot for nice summary - I would like to deeply
> read and put comments later.
>
> And @mtreinish, please see my reply inline:
>
> Matthew Treinish wrote on 9/14/2018 5:09 AM:
> > On Thu, Sep 13, 2018 at 07:23:53AM -0600, Doug Hellmann wrote:
> >> Excerpts from Michel Peterson's message of 2018-09-13 10:04:27 +0300:
> >>> On Thu, Sep 13, 2018 at 1:09 AM, Doug Hellmann wrote:
> >>>
> >>>> The longer version is that we want to continue to use the existing
> >>>> tox environment in each project as the basis for the job, since
> >>>> that allows teams to control the version of python used, the
> >>>> dependencies installed, and add custom steps to their build (such
> >>>> as for pre-processing the documentation). So, the new or updated
> >>>> job will start by running "tox -e docs" as it does today. Then it
> >>>> will run Sphinx again with the instructions to build PDF output,
> >>>> and copy the results into the directory that the publish job will
> >>>> use to sync to the web server. And then it will run the scripts to
> >>>> build translated versions of the documentation as HTML, and copy
> >>>> the results into place for publishing.
> >>>
> >>> Just a question out of curiosity. You mention that we still want to use
> >>> the docs environment because it allows fine grained control over how the
> >>> documentation is created. However, as I understand, the PDF output will
> >>> happen in a more standardized way and outside of that fine grained
> >>> control, right? That couldn't lead to differences in both documentations?
> >>> Do we have to even worry about that?
> >>
> >> Good question. The idea is to run "tox -e docs" to get the regular
> >> HTML, then something like
> >>
> >>    .tox/docs/bin/sphinx-build -b latex doc/build doc/build/latex
> >>    cd doc/build/latex
> >>    make
> >>    cp doc/build/latex/*.pdf doc/build/html
> >
> > To be fair, I've looked at this several times in the past, and sphinx's
> > latex generation is good enough for the simple case, but on more complex
> > documents it doesn't really work too well. For example, on nova I added
> > this a while ago:
> >
> > https://github.com/openstack/nova/blob/master/tools/build_latex_pdf.sh
>
> After seeing what the script is doing, I wanna divide into several parts
> and would like to tell with some generic approach:
>
> - svg -> png
>  : PDF builds ideally convert all svg files into PDF with no problems,
>    but there are some realistic problems
>    such as problems on determining bounding box size on vector svg
>    files, and big memory problems with lots of tags in svg files.
>  : Maybe it would be solved if we check all svg files with correct
>    formatting,
>    or if all svg files are converted to png files with temporary changes
>    on rst file (.svg -> .png), wouldn't it?

Yeah we will have to do either. In my experience just converting to png
images is normally easier.

> - non-latin code problems:
>  : By default, Sphinx uses latex builder, which doesn't support
>    non-latin codes and customized fonts [1].
>    Documentation team tried to make use of xelatex instead of latex in
>    Sphinx configuration and now it is overridden
>    on openstackdocstheme >=1.20. So non-latin code would not generate
>    problems if you use openstackdocstheme >=1.20.

Ok sure, using XeTeX will solve this problem. I typically still just use
pdflatex so back when I pushed that script (which was over 3 years ago) I was
trying to fix it by converting the non-latin characters by using latex symbol
equivalents for those characters. (which is a feature built-in to sphinx, but
it just misses a lot of symbols)

> - other things
>  : I could not capture the background on other changes such as
>    additional packages.
>    If you provide more background on other things, I would like to
>    investigate on how to approach by changing a rst file
>    to make compatible with pdf builds or how to support all pdf builds
>    on many project repos as much as possible.

The extra packages were part of the attempt to fix the non-latin characters
using latex symbols. Those packages are just added there so you can call
\checkmark and \ding{54} instead of ✔ and ✖.

> When I test PDF builds on current nova repo with master branch, it seems
> that the rst document is too big
> (876 pages with error) and more dealing with overcoming memory problems
> was needed.
> I would like to think how to overcome this, but it would be also nice if
> someone shares advice or comments on this.

Hmm, I wasn't able to even get that far. When I tried a vanilla pdf build
from nova master it only compiled 540 pages before it errored out on capacity
exceeded. I know that the limit is adjustable in a config file, but I'm not
sure if there is a more dynamic method for adjusting it.

-Matt Treinish

>
> [1] https://tug.org/pipermail/xetex/2011-September/021324.html
> [2] https://review.openst
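
The symbol substitution described in this message can be sketched roughly as
follows. This is not the nova script itself: the function name and the
two-entry table are illustrative assumptions. The LaTeX side is standard,
though: \checkmark is provided by the amssymb package and \ding by pifont.

```python
# Sketch: swap characters that pdflatex cannot typeset for LaTeX commands.
# The helper name and the table contents are assumptions for illustration;
# a real table would need many more entries.

LATEX_SYMBOLS = {
    "\u2714": r"\checkmark",  # heavy check mark, needs \usepackage{amssymb}
    "\u2716": r"\ding{54}",   # heavy multiplication x, needs \usepackage{pifont}
}

# Preamble lines that would have to be added to the generated document.
REQUIRED_PACKAGES = "\\usepackage{amssymb}\n\\usepackage{pifont}"


def replace_symbols(tex, symbols=LATEX_SYMBOLS):
    """Return ``tex`` with each known character replaced by its command."""
    for char, command in symbols.items():
        # The trailing "{}" ends the command name so that a letter directly
        # after the symbol is not read as part of it.
        tex = tex.replace(char, command + "{}")
    return tex
```

Switching the LaTeX engine to xelatex, as discussed above, avoids the need for
a table like this in the first place.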
Re: [openstack-dev] [doc][i18n][infra][tc] plan for PDF and translation builds for documentation
First of all, thanks a lot for nice summary - I would like to deeply
read and put comments later.

And @mtreinish, please see my reply inline:

Matthew Treinish wrote on 9/14/2018 5:09 AM:
> On Thu, Sep 13, 2018 at 07:23:53AM -0600, Doug Hellmann wrote:
>> Excerpts from Michel Peterson's message of 2018-09-13 10:04:27 +0300:
>>> On Thu, Sep 13, 2018 at 1:09 AM, Doug Hellmann wrote:
>>>
>>>> The longer version is that we want to continue to use the existing
>>>> tox environment in each project as the basis for the job, since
>>>> that allows teams to control the version of python used, the
>>>> dependencies installed, and add custom steps to their build (such
>>>> as for pre-processing the documentation). So, the new or updated
>>>> job will start by running "tox -e docs" as it does today. Then it
>>>> will run Sphinx again with the instructions to build PDF output,
>>>> and copy the results into the directory that the publish job will
>>>> use to sync to the web server. And then it will run the scripts to
>>>> build translated versions of the documentation as HTML, and copy
>>>> the results into place for publishing.
>>>
>>> Just a question out of curiosity. You mention that we still want to use the
>>> docs environment because it allows fine grained control over how the
>>> documentation is created. However, as I understand, the PDF output will
>>> happen in a more standardized way and outside of that fine grained control,
>>> right? That couldn't lead to differences in both documentations? Do we have
>>> to even worry about that?
>>
>> Good question. The idea is to run "tox -e docs" to get the regular
>> HTML, then something like
>>
>>    .tox/docs/bin/sphinx-build -b latex doc/build doc/build/latex
>>    cd doc/build/latex
>>    make
>>    cp doc/build/latex/*.pdf doc/build/html
>
> To be fair, I've looked at this several times in the past, and sphinx's latex
> generation is good enough for the simple case, but on more complex documents
> it doesn't really work too well. For example, on nova I added this a while
> ago:
>
> https://github.com/openstack/nova/blob/master/tools/build_latex_pdf.sh

After seeing what the script is doing, I wanna divide into several parts
and would like to tell with some generic approach:

- svg -> png
 : PDF builds ideally convert all svg files into PDF with no problems,
   but there are some realistic problems
   such as problems on determining bounding box size on vector svg
   files, and big memory problems with lots of tags in svg files.
 : Maybe it would be solved if we check all svg files with correct
   formatting,
   or if all svg files are converted to png files with temporary changes
   on rst file (.svg -> .png), wouldn't it?

- non-latin code problems:
 : By default, Sphinx uses latex builder, which doesn't support
   non-latin codes and customized fonts [1].
   Documentation team tried to make use of xelatex instead of latex in
   Sphinx configuration and now it is overridden
   on openstackdocstheme >=1.20. So non-latin code would not generate
   problems if you use openstackdocstheme >=1.20.

- other things
 : I could not capture the background on other changes such as
   additional packages.
   If you provide more background on other things, I would like to
   investigate on how to approach by changing a rst file
   to make compatible with pdf builds or how to support all pdf builds
   on many project repos as much as possible.

When I test PDF builds on current nova repo with master branch, it seems
that the rst document is too big
(876 pages with error) and more dealing with overcoming memory problems
was needed.
I would like to think how to overcome this, but it would be also nice if
someone shares advice or comments on this.

With many thanks,

/Ian

[1] https://tug.org/pipermail/xetex/2011-September/021324.html
[2] https://review.openstack.org/#/c/552070/5/openstackdocstheme/ext.py@227

> To work around some issues with this workflow. It was enough to get the
> generated latex to actually compile back then. But, that script has bitrotted
> and needs to be updated, because the latex from sphinx for nova's docs no
> longer compiles. (also I submitted a patch to sphinx in the meantime to
> fix the check mark latex output) I'm afraid that it'll be a constant game
> of cat and mouse trying to get everything to build.
>
> I think that we'll find that on most projects' documentation we will need
> to massage the latex output from sphinx to build pdfs.
>
> -Matt Treinish
>
>> We would run the HTML translation builds in a similar way by invoking
>> sphinx-build from the virtualenv repeatedly with different locale
>> settings based on what translations exist.
>>
>> In my earlier comment, I was thinking of the case where a team runs
>> a script to generate rst content files before invoking sphinx to
>> build the HTML. That script would have been run before the PDF
>> generation happens, so the content should be the same. That also
>> applies for anyone using sphinx add-ons, which will be avai
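
The temporary ".svg -> .png" change to the rst files suggested in this message
could look roughly like the sketch below. It only rewrites plain image and
figure directives and assumes a PNG with the same base name has already been
produced by some other tool. The function name is hypothetical.

```python
import re

# Matches ".. image:: path.svg" and ".. figure:: path.svg" directive lines.
# Substitution definitions (".. |name| image:: ...") are deliberately left
# alone in this sketch.
_SVG_DIRECTIVE = re.compile(
    r"^([ \t]*\.\.[ \t]+(?:image|figure)::[ \t]+\S+)\.svg[ \t]*$",
    re.MULTILINE,
)


def svg_refs_to_png(rst_text):
    """Point image/figure directives at .png files instead of .svg files."""
    return _SVG_DIRECTIVE.sub(r"\1.png", rst_text)
```

Running this on a copy of the sources, not on the checked-in files, keeps the
change temporary and leaves the HTML build free to use the original SVG images.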
Re: [openstack-dev] [doc][i18n][infra][tc] plan for PDF and translation builds for documentation
On Thu, Sep 13, 2018 at 07:23:53AM -0600, Doug Hellmann wrote:
> Excerpts from Michel Peterson's message of 2018-09-13 10:04:27 +0300:
> > On Thu, Sep 13, 2018 at 1:09 AM, Doug Hellmann wrote:
> >
> > > The longer version is that we want to continue to use the existing
> > > tox environment in each project as the basis for the job, since
> > > that allows teams to control the version of python used, the
> > > dependencies installed, and add custom steps to their build (such
> > > as for pre-processing the documentation). So, the new or updated
> > > job will start by running "tox -e docs" as it does today. Then it
> > > will run Sphinx again with the instructions to build PDF output,
> > > and copy the results into the directory that the publish job will
> > > use to sync to the web server. And then it will run the scripts to
> > > build translated versions of the documentation as HTML, and copy
> > > the results into place for publishing.
> >
> > Just a question out of curiosity. You mention that we still want to use the
> > docs environment because it allows fine grained control over how the
> > documentation is created. However, as I understand, the PDF output will
> > happen in a more standardized way and outside of that fine grained control,
> > right? That couldn't lead to differences in both documentations? Do we have
> > to even worry about that?
>
> Good question. The idea is to run "tox -e docs" to get the regular
> HTML, then something like
>
>    .tox/docs/bin/sphinx-build -b latex doc/build doc/build/latex
>    cd doc/build/latex
>    make
>    cp doc/build/latex/*.pdf doc/build/html

To be fair, I've looked at this several times in the past, and sphinx's latex
generation is good enough for the simple case, but on more complex documents
it doesn't really work too well. For example, on nova I added this a while
ago:

https://github.com/openstack/nova/blob/master/tools/build_latex_pdf.sh

To work around some issues with this workflow. It was enough to get the
generated latex to actually compile back then. But, that script has bitrotted
and needs to be updated, because the latex from sphinx for nova's docs no
longer compiles. (also I submitted a patch to sphinx in the meantime to
fix the check mark latex output) I'm afraid that it'll be a constant game
of cat and mouse trying to get everything to build.

I think that we'll find that on most projects' documentation we will need
to massage the latex output from sphinx to build pdfs.

-Matt Treinish

>
> We would run the HTML translation builds in a similar way by invoking
> sphinx-build from the virtualenv repeatedly with different locale
> settings based on what translations exist.
>
> In my earlier comment, I was thinking of the case where a team runs
> a script to generate rst content files before invoking sphinx to
> build the HTML. That script would have been run before the PDF
> generation happens, so the content should be the same. That also
> applies for anyone using sphinx add-ons, which will be available
> to the latex builder because we'll be using the version of sphinx
> installed in the virtualenv managed by tox.
Re: [openstack-dev] [doc][i18n][infra][tc] plan for PDF and translation builds for documentation
Excerpts from Michel Peterson's message of 2018-09-13 10:04:27 +0300:
> On Thu, Sep 13, 2018 at 1:09 AM, Doug Hellmann wrote:
>
> > The longer version is that we want to continue to use the existing
> > tox environment in each project as the basis for the job, since
> > that allows teams to control the version of python used, the
> > dependencies installed, and add custom steps to their build (such
> > as for pre-processing the documentation). So, the new or updated
> > job will start by running "tox -e docs" as it does today. Then it
> > will run Sphinx again with the instructions to build PDF output,
> > and copy the results into the directory that the publish job will
> > use to sync to the web server. And then it will run the scripts to
> > build translated versions of the documentation as HTML, and copy
> > the results into place for publishing.
>
> Just a question out of curiosity. You mention that we still want to use the
> docs environment because it allows fine grained control over how the
> documentation is created. However, as I understand, the PDF output will
> happen in a more standardized way and outside of that fine grained control,
> right? That couldn't lead to differences in both documentations? Do we have
> to even worry about that?

Good question. The idea is to run "tox -e docs" to get the regular
HTML, then something like

   .tox/docs/bin/sphinx-build -b latex doc/build doc/build/latex
   cd doc/build/latex
   make
   cp doc/build/latex/*.pdf doc/build/html

We would run the HTML translation builds in a similar way by invoking
sphinx-build from the virtualenv repeatedly with different locale
settings based on what translations exist.

In my earlier comment, I was thinking of the case where a team runs
a script to generate rst content files before invoking sphinx to
build the HTML. That script would have been run before the PDF
generation happens, so the content should be the same. That also
applies for anyone using sphinx add-ons, which will be available
to the latex builder because we'll be using the version of sphinx
installed in the virtualenv managed by tox.

Doug
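
The steps in this message could be scripted roughly as below. This is a sketch
under stated assumptions, not the planned job: the message passes doc/build as
the Sphinx source directory, while the sketch assumes the RST sources live in
doc/source. Publishing each translation under doc/build/html/<language> is
also an assumption. The function names are hypothetical. The -b and -D options
of sphinx-build and the -C option of make are standard.

```python
import glob
import shutil
import subprocess

# Assumed paths, relative to the repository root (see the note above).
VENV_BIN = ".tox/docs/bin"
SOURCE_DIR = "doc/source"
LATEX_DIR = "doc/build/latex"
HTML_DIR = "doc/build/html"


def pdf_build_commands(venv_bin=VENV_BIN, source=SOURCE_DIR, latex_dir=LATEX_DIR):
    """Commands for the regular HTML build followed by the LaTeX/PDF build."""
    return [
        ["tox", "-e", "docs"],
        [f"{venv_bin}/sphinx-build", "-b", "latex", source, latex_dir],
        # "make -C" runs the Makefile Sphinx generates without changing
        # directory, so later paths stay relative to the repository root.
        ["make", "-C", latex_dir],
    ]


def translation_build_command(language, venv_bin=VENV_BIN, source=SOURCE_DIR,
                              html_dir=HTML_DIR):
    """One HTML build for a single locale, selected with -D language=..."""
    return [
        f"{venv_bin}/sphinx-build", "-b", "html",
        "-D", f"language={language}",
        source, f"{html_dir}/{language}",
    ]


def build_pdf_and_publish(html_dir=HTML_DIR, latex_dir=LATEX_DIR):
    """Run the build steps, then copy the PDFs next to the HTML output."""
    for command in pdf_build_commands(latex_dir=latex_dir):
        subprocess.run(command, check=True)
    for pdf in glob.glob(f"{latex_dir}/*.pdf"):
        shutil.copy(pdf, html_dir)
```

Because every sphinx-build call uses the binary inside the tox virtualenv, any
add-ons installed for the HTML build are available to the other builders too,
which is the point made above.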
Re: [openstack-dev] [doc][i18n][infra][tc] plan for PDF and translation builds for documentation
Excerpts from Andreas Jaeger's message of 2018-09-13 08:14:30 +0200:
> I like the plan, thanks! One suggestion below:
>
> The translated documents are built for releasenotes already in a similar
> way as proposed. Here we update the index.rst file on the fly to link to
> all translated versions, see the bottom of e.g.
> https://docs.openstack.org/releasenotes/openstack-manuals/
>
> I suggest that you look also as part of building how to link to PDFs and
> translated documents.
>
> For reference, the logic for releasenotes translation is here:
> http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/build-releasenotes/tasks/main.yaml#n10

Ah, yes, that's a detail I left out. I want to add a directive to
the theme to list those other formats and versions, so we don't
have to edit the content of the file on the fly.

> I would appreciate if you handle releasenotes the same way as other
> documents, so if you want to change releasenotes in the end, please do

Good idea.

Doug
Re: [openstack-dev] [doc][i18n][infra][tc] plan for PDF and translation builds for documentation
On Thu, Sep 13, 2018 at 1:09 AM, Doug Hellmann wrote:

> The longer version is that we want to continue to use the existing
> tox environment in each project as the basis for the job, since
> that allows teams to control the version of python used, the
> dependencies installed, and add custom steps to their build (such
> as for pre-processing the documentation). So, the new or updated
> job will start by running "tox -e docs" as it does today. Then it
> will run Sphinx again with the instructions to build PDF output,
> and copy the results into the directory that the publish job will
> use to sync to the web server. And then it will run the scripts to
> build translated versions of the documentation as HTML, and copy
> the results into place for publishing.

Just a question out of curiosity. You mention that we still want to use the
docs environment because it allows fine grained control over how the
documentation is created. However, as I understand, the PDF output will
happen in a more standardized way and outside of that fine grained control,
right? That couldn't lead to differences in both documentations? Do we have
to even worry about that?
Re: [openstack-dev] [doc][i18n][infra][tc] plan for PDF and translation builds for documentation
I like the plan, thanks! One suggestion below:

The translated documents are built for releasenotes already in a similar
way as proposed. Here we update the index.rst file on the fly to link to
all translated versions, see the bottom of e.g.
https://docs.openstack.org/releasenotes/openstack-manuals/

I suggest that you look also as part of building how to link to PDFs and
translated documents.

For reference, the logic for releasenotes translation is here:
http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/build-releasenotes/tasks/main.yaml#n10

I would appreciate if you handle releasenotes the same way as other
documents, so if you want to change releasenotes in the end, please do

Andreas
--
Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
[openstack-dev] [doc][i18n][infra][tc] plan for PDF and translation builds for documentation
Ian has been working for a while on enabling PDF and translation
support for our documentation build job [1][2]. After exploring a
few different approaches, today at the PTG I think we were able to
agree on a plan that will let us move ahead.

The tl;dr version is that we want to add some extra steps to the
existing openstack-tox-docs job (or make a new job that includes
those steps and change the PTI project template so projects start
using it transparently). The changes to the job will be made so
that if the PDF and translation builds work the results will be
published and if they fail that will not trigger a job failure.

The longer version is that we want to continue to use the existing
tox environment in each project as the basis for the job, since
that allows teams to control the version of python used, the
dependencies installed, and add custom steps to their build (such
as for pre-processing the documentation). So, the new or updated
job will start by running "tox -e docs" as it does today. Then it
will run Sphinx again with the instructions to build PDF output,
and copy the results into the directory that the publish job will
use to sync to the web server. And then it will run the scripts to
build translated versions of the documentation as HTML, and copy
the results into place for publishing.

There are a few settings that can/should be configured via the
Sphinx conf.py file, but rather than trying to push updates into
all of the project repos we will look into passing the options on
the command line or making the openstackdocstheme inject those
settings. This would apply to the setting to control the latex
processor as well as fonts or other settings that control the output.
Those things all relate to the format of the output, so it seems
appropriate to have the theme control them.

To cut down on any delays caused by introducing several consecutive
sphinx-build runs to the documentation job we plan to have the check
and gate jobs only run the translation portion of the build if the
message catalog files for a language are modified.

Since this work will all happen inside the documentation build job,
and it will be enabled for all teams automatically, we do not need
to update the Project Testing Interface, and Ian can abandon his
governance changes.

Monty is going to work on the implementation with Ian using
openstacksdk as a test bed [3].

As usual, please let me know if I've left out or mistaken any
details.

Doug

[1] https://review.openstack.org/572559
[2] https://review.openstack.org/588110
[3] https://review.openstack.org/601659
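
Two parts of the plan above lend themselves to a small sketch: building a
translation only when its message catalog changed, and letting the extra
builds fail without failing the job. Both helpers are hypothetical, and the
catalog layout doc/source/locale/<lang>/LC_MESSAGES/<name>.po is an
assumption, not something the plan specifies.

```python
import re
import subprocess

# Assumed catalog layout: doc/source/locale/<lang>/LC_MESSAGES/<name>.po
_CATALOG = re.compile(r"^doc/source/locale/([^/]+)/LC_MESSAGES/[^/]+\.po$")


def languages_to_build(changed_files):
    """Return the sorted languages whose message catalogs were modified."""
    languages = set()
    for path in changed_files:
        match = _CATALOG.match(path)
        if match:
            languages.add(match.group(1))
    return sorted(languages)


def run_optional(command):
    """Run a build step and report success, without ever raising.

    A False result means the output should not be published, but the
    surrounding job can still succeed.
    """
    try:
        return subprocess.run(command, check=False).returncode == 0
    except OSError:
        # The command is missing or cannot be started.
        return False
```

A check or gate job could call languages_to_build() with the files touched by
the change under test and skip the translation step when the list is empty.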