Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-18 Thread Knut Petersen

On 19.09.2017 at 02:27, Perry Hutchison wrote:

> > There is a tool for using this method of removing duplicate fonts.
> > https://www.ctan.org/pkg/extractpdfmark
> > https://packages.debian.org/stretch/extractpdfmark
> > http://packages.ubuntu.com/zesty/extractpdfmark
>
> As I see it, the availability of a separate tool to do the same thing
> is a reason to _not_ provide a duplicate capability in Ghostscript.
> Those who want that processing (despite the risks that Ken mentioned)
> can use extractpdfmark.


Masamichi described extractpdfmark in a misleading way. extractpdfmark is a
helper program that generates a postscript file from a pdf document. Both the
pdf document and the postscript file generated by extractpdfmark are then
processed by ghostscript to generate a final pdf. extractpdfmark is only used
if the source pdf contains links / hyperrefs that would otherwise be broken
during the final pass of ghostscript.


A typical way to write a musicological document, a collection of songs, a
lilypond manual, etc. is described below:

1. Write your music.
2. Use "lilypond --bigpdfs" to generate pdfs. Internally lilypond generates
postscript files and then runs ghostscript to generate pdfs.
3. Write the pdf(la)tex/lua(la)tex/xe(la)tex document that uses the pdfs
generated in step 2.
4. Use pdf(la)tex/lua(la)tex/xe(la)tex to generate a pdf.
5. If necessary, use extractpdfmark to extract pdfmarks from the pdf generated
in step 4 (extractpdfmark generates a postscript file).
6. Use ghostscript to generate the final pdf from the pdf generated in step 4
and the postscript file generated in step 5.
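
Steps 5 and 6 can be sketched as a small Python helper that only builds the two command lines and runs nothing. This is an illustration under assumptions, not a tested recipe: the function name `build_commands` and the file names are made up, extractpdfmark is assumed to write its postscript to standard output, and the gs switches other than `-dPDFDontUseFontObjectNum` are the usual ones for a quiet batch run through the pdfwrite device.

```python
import shlex


def build_commands(tex_pdf, pdfmark_ps, final_pdf):
    """Return the argument lists for steps 5 and 6 (nothing is executed).

    All three file names are chosen by the caller; they are placeholders.
    """
    # Step 5: extract the pdfmarks from the pdf produced in step 4.
    # Assumption: the postscript goes to stdout, so the caller has to
    # redirect it into pdfmark_ps.
    extract_cmd = ["extractpdfmark", tex_pdf]

    # Step 6: feed the pdf and the extracted pdfmarks to ghostscript and
    # write the final pdf through the pdfwrite device.
    gs_cmd = [
        "gs", "-q", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-dPDFDontUseFontObjectNum",  # the switch removed in 9.22
        f"-sOutputFile={final_pdf}",
        tex_pdf, pdfmark_ps,
    ]
    return [extract_cmd, gs_cmd]


commands = build_commands("manual.pdf", "manual.pdfmark.ps", "manual-final.pdf")
for cmd in commands:
    print(shlex.join(cmd))
```

Returning argument lists instead of running them keeps the sketch harmless on a machine without these tools; with a ghostscript that has dropped the switch, the second command would not unify the fonts.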

This sounds a bit complicated, but the reduction of file size is significant. In 2014 
this was discussed here on the ghostscript bugzilla. 


Without the use of lilypond's "--bigpdfs" option our notation manual had a size
of 26 MB after step 4.
With the introduction of the "--bigpdfs" option and steps 5 and 6 the file size
after step 4 increased to 116 MB, but the size of the pdf generated in step 6 was only
5.9 MB. That means we were able to eliminate more than 20 MB of duplicated fonts.

Another example is gotlandstoner, a collection of folk tunes from Sweden. If
you remove the PDFDontUseFontObjectNum option, book 3 has a file size of
13,706,324 bytes. If a ghostscript with the PDFDontUseFontObjectNum option
enabled is used, that boils down to 2,447,232 bytes.

In an earlier message in this thread Ken Sharp wrote: "Risking incorrect
output for the minimal benefit of a slightly smaller file seems unwise to me."
Yes, the default should be not to enable PDFDontUseFontObjectNum. But as I
pointed out above, the benefit of the PDFDontUseFontObjectNum option is not
merely a "slightly smaller file"; it is a very significant reduction of file
size, often exceeding 80%.
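
As a quick check of the figures quoted above, the relative reduction can be computed directly. The helper name `reduction_percent` is made up for this illustration; the sizes are the ones given in this message.

```python
def reduction_percent(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0


# gotlandstoner book 3, sizes in bytes
gotland = reduction_percent(13_706_324, 2_447_232)

# notation manual, sizes in MB: 26 MB without --bigpdfs, 5.9 MB after step 6
manual = reduction_percent(26.0, 5.9)

print(f"gotlandstoner book 3: {gotland:.1f}% smaller")  # 82.1% smaller
print(f"notation manual:      {manual:.1f}% smaller")   # 77.3% smaller
```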


I understand why the default behavior of ghostscript changed. But could anyone
who advocates removing the PDFDontUseFontObjectNum option be so kind as to give
a clear explanation of why keeping it would be a bad idea?

Knut
___
lilypond-devel mailing list
lilypond-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/lilypond-devel


Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-18 Thread Ken Sharp

At 00:31 19/09/2017 +0900, Masamichi Hosoda wrote:


> When you create a PDF document using something like a TeX system
> you may include many small PDF files in the main PDF file.
> It is common for each of the small PDF files to use the same fonts.
>
> If the small PDF files contain embedded full font sets,
> the TeX system includes all of them in the main PDF.
> The main PDF contains duplicates of the same full sets of fonts.
> Therefore, `PDFDontUseFontObjectNum` can remove the duplicates.
> This may considerably reduce the main PDF-file's size.

And if you have multiple subsets, badly named (e.g. OpenOffice output), then
you get a final PDF file where some of the text is missing or garbled.

There's no real way to tell the difference; at least by using the font
object numbers we guarantee correct output.
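
The trade-off can be illustrated with a toy model in Python. This is not how Ghostscript is implemented; `FontObject`, `unify` and `missing_glyphs` are hypothetical names, and a "font" here is just a name plus the set of glyphs embedded in that copy. Unifying by name merges true duplicates, but it also merges two different subsets that happen to share a name, and the glyphs of the dropped subset go missing. Keying on the object number never loses glyphs, but never merges anything either.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontObject:
    obj_num: int       # object number of the font in the input file
    name: str          # font name as written by the producer
    glyphs: frozenset  # glyphs embedded in this copy


def unify(fonts, key):
    """Keep the first font seen for each key and drop the rest."""
    kept = {}
    for font in fonts:
        kept.setdefault(key(font), font)
    return list(kept.values())


def missing_glyphs(fonts, kept, key):
    """Glyphs present in the input but absent from the font they map to."""
    by_key = {key(f): f for f in kept}
    lost = set()
    for font in fonts:
        lost |= font.glyphs - by_key[key(font)].glyphs
    return lost


def by_name_key(font):
    return font.name


def by_number_key(font):
    return font.obj_num


# Two different subsets that share a badly chosen name.
fonts = [
    FontObject(12, "Sans", frozenset("abc")),
    FontObject(47, "Sans", frozenset("xyz")),
]

by_name = unify(fonts, by_name_key)
by_number = unify(fonts, by_number_key)

print(len(by_name), sorted(missing_glyphs(fonts, by_name, by_name_key)))
# 1 ['x', 'y', 'z']
print(len(by_number), sorted(missing_glyphs(fonts, by_number, by_number_key)))
# 2 []
```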




> There is a tool for using this method of removing duplicate fonts.
> https://www.ctan.org/pkg/extractpdfmark

Not our tool; we don't claim you can do this. Building a tool around an
unintentional side effect seems less than ideal.

> LilyPond has option `--bigpdfs` for unifying duplicate fonts in this method.

And your point is what? That's not what the pdfwrite device is intended
for, and we don't claim you can use it to do that.

As I said, if you think it's that useful, then you can add the switch back
in. In fact, provided you don't change SubsetFonts, the resulting file may
well be smaller anyway, since the pdfwrite device will only embed that
portion of each font (which you say is a complete duplicate) so the
resulting two fonts will be smaller than the original two fonts.


Risking incorrect output for the minimal benefit of a slightly smaller file 
seems unwise to me.


Ken




Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-18 Thread Perry Hutchison
Masamichi Hosoda  wrote:

> >>It seems that `-dPDFDontUseFontObjectNum` option does not work.
...
> There is a tool for using this method of removing duplicate fonts.
> https://www.ctan.org/pkg/extractpdfmark
> https://packages.debian.org/stretch/extractpdfmark
> http://packages.ubuntu.com/zesty/extractpdfmark

As I see it, the availability of a separate tool to do the same thing
is a reason to _not_ provide a duplicate capability in Ghostscript.
Those who want that processing (despite the risks that Ken mentioned)
can use extractpdfmark.



Re: Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-18 Thread David Kastrup
Ken Sharp  writes:

> At 00:31 19/09/2017 +0900, Masamichi Hosoda wrote:
>
>>When you create a PDF document using something like a TeX system
>>you may include many small PDF files in the main PDF file.
>>It is common for each of the small PDF files to use the same fonts.
>>
>>If the small PDF files contain embedded full font sets,
>>the TeX system includes all of them in the main PDF.
>>The main PDF contains duplicates of the same full sets of fonts.
>>Therefore, `PDFDontUseFontObjectNum` can remove the duplicates.
>>This may considerably reduce the main PDF-file's size.
>
> And if you have multiple subsets, badly named (e.g. OpenOffice output),
> then you get a final PDF file where some of the text is missing or
> garbled.

So?  Nobody forces anybody to use that option.

>>LilyPond has option `--bigpdfs` for unifying duplicate fonts in this
>>method.
>
> And your point is what?

That we are talking about functionality that is considered useful?

> That's not what the pdfwrite device is intended for, and we don't
> claim you can use it to do that.
>
> As I said, if you think it's that useful, then you can add the switch
> back in. In fact, provided you don't change SubsetFonts, the resulting
> file may well be smaller anyway, since the pdfwrite device will only
> embed that portion of each font (which you say is a complete
> duplicate) so the resulting two fonts will be smaller than the
> original two fonts.
>
> Risking incorrect output for the minimal benefit of a slightly smaller
> file seems unwise to me.

I think "slightly smaller" was something like a factor of 10.  We are
talking about files including literally thousands, if not tens of
thousands, of graphics (manuals close to a thousand pages with lots of
graphic output included).

-- 
David Kastrup



Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-18 Thread Masamichi Hosoda
>> > Please give them a try on your system if you're interested in helping
>> > test the release-in-progress. Your feedback is appreciated.
>>
>>It seems that `-dPDFDontUseFontObjectNum` option does not work.
> 
> It has been removed, as documented in History9.htm:
> 
> 2017-08-02 13:41:59 +0100
> Ken Sharp 
> ca1ec9b486ddba3f921355fd1d775f27f4871356
> 
> PDF interpreter - remove the PDFDontUseObjectNum switch
> 
> This was implemented to allow us to restore the default behaviour if
> it caused problems. No real problems reported, so lets get rid of
> (yet another) of our many, many command line switches.
> 
> 
>>`-dPDFDontUseFontObjectNum` is useful to unify duplicate fonts.
>>So I would like to use `-dPDFDontUseFontObjectNum`.
> 
> The fonts can't realistically be described as duplicates, if they have
> different object numbers. Or if they are then the original PDF file
> (this only works with PDF input) was poorly constructed. Using this
> option isn't really sensible, it was only ever present in case the
> Font numbering usage went awry. Consider creating the original PDF
> files more efficiently instead.
> 
> I'll discuss it with the other developers but I am not inclined to
> restore this. Obviously if you feel it's important you can revert the
> change locally.

When you create a PDF document using something like a TeX system
you may include many small PDF files in the main PDF file.
It is common for each of the small PDF files to use the same fonts.

If the small PDF files contain embedded full font sets,
the TeX system includes all of them in the main PDF.
The main PDF contains duplicates of the same full sets of fonts.
Therefore, `PDFDontUseFontObjectNum` can remove the duplicates.
This may considerably reduce the main PDF-file's size.

There is a tool for using this method of removing duplicate fonts.
https://www.ctan.org/pkg/extractpdfmark
https://packages.debian.org/stretch/extractpdfmark
http://packages.ubuntu.com/zesty/extractpdfmark

LilyPond has option `--bigpdfs` for unifying duplicate fonts in this method.
http://lilypond.org/doc/v2.19/Documentation/usage/command_002dline-usage#basic-command-line-options-for-lilypond



Re: ghostscript 9.22 will remove PDFDontUseFontObjectNum option

2017-09-18 Thread Masamichi Hosoda
>> I've noticed that ghostscript 9.22
>> will remove `PDFDontUseFontObjectNum` option.
>>
>> http://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=ca1ec9b486ddba3f921355fd1d775f27f4871356
>>
>> So `--bigpdfs` will not work with gs-9.22.
>> It already did not work with gs-9.22rc1.
> 
> Did they give a rationale?  What kind of PostScript code would have a
> chance to work here instead?

I have no idea.
I've sent a request to gs-devel saying that
I'd like to keep using the `PDFDontUseFontObjectNum` option.
https://ghostscript.com/pipermail/gs-devel/2017-September/009987.html



Re: ghostscript 9.22 will remove PDFDontUseFontObjectNum option

2017-09-18 Thread David Kastrup
Masamichi Hosoda  writes:

> I've noticed that ghostscript 9.22
> will remove `PDFDontUseFontObjectNum` option.
>
> http://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=ca1ec9b486ddba3f921355fd1d775f27f4871356
>
> So `--bigpdfs` will not work with gs-9.22.
> It already did not work with gs-9.22rc1.

Did they give a rationale?  What kind of PostScript code would have a
chance to work here instead?

-- 
David Kastrup



ghostscript 9.22 will remove PDFDontUseFontObjectNum option

2017-09-18 Thread Masamichi Hosoda
I've noticed that ghostscript 9.22
will remove the `PDFDontUseFontObjectNum` option.

http://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=ca1ec9b486ddba3f921355fd1d775f27f4871356

So `--bigpdfs` will not work with gs-9.22.
It already did not work with gs-9.22rc1.
