Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-23 Thread suzuki toshiya
Dear Hosoda-san,

Thank you for posting such a detailed package to reproduce the issue;
it was far more detailed than I expected. Please give me some time
to check it, as I will be away from my office next week.

Regards,
mpsuzuki

Masamichi Hosoda wrote:
>> In my impression, if
>>
>> $ gs -dSAFER -dEPSCrop -dCompatibilityLevel=1.4 -dNOPAUSE -dBATCH
>> -r1200
>> -dSubsetFonts=false -sDEVICE=pdfwrite -dAutoRotatePages=/None
>> -sOutputFile=filename.pdf -c.setpdfwrite -ffilename.eps
>>
>> does not embed the completed font into PDF, Ghostscript developers
>> would
>> regard it as a bug and they might be willing to fix it (at least, try
>> to
>> fix it).
>>
>> Could you provide some sample EPS files to reproduce the issue,
>> and the resulted PDF in your environment?
> 
> Here is sample files `20170922_lilypond_eps_pdf_examples.tar.xz`
> https://drive.google.com/file/d/0ByGBX3PDrqjsSFhVdXJfbjFjRlk/view?usp=sharing
> 
> It contains `README.md`.
> Would you read the file?
> 


___
lilypond-devel mailing list
lilypond-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/lilypond-devel


Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-22 Thread Masamichi Hosoda
This discussion spans two mailing lists.
Some messages appear in both archives,
but others seem to exist in only one.
For the convenience of people subscribed to only one list,
here are the URLs of both archives:

http://lists.gnu.org/archive/html/lilypond-devel/2017-09/index.html
https://ghostscript.com/pipermail/gs-devel/2017-September/date.html



Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-22 Thread Masamichi Hosoda
> In my impression, if
> 
> $ gs -dSAFER -dEPSCrop -dCompatibilityLevel=1.4 -dNOPAUSE -dBATCH
> -r1200
> -dSubsetFonts=false -sDEVICE=pdfwrite -dAutoRotatePages=/None
> -sOutputFile=filename.pdf -c.setpdfwrite -ffilename.eps
> 
> does not embed the completed font into PDF, Ghostscript developers
> would
> regard it as a bug and they might be willing to fix it (at least, try
> to
> fix it).
> 
> Could you provide some sample EPS files to reproduce the issue,
> and the resulted PDF in your environment?

Here are sample files: `20170922_lilypond_eps_pdf_examples.tar.xz`
https://drive.google.com/file/d/0ByGBX3PDrqjsSFhVdXJfbjFjRlk/view?usp=sharing

It contains a `README.md`.
Could you please read that file?



Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-19 Thread William Bader
>Then maybe you should complain to the software producing that content.

That is one place where TeX shows its age, and switching to a newer system like
SILE might produce better output:
https://github.com/simoncozens/sile
https://www.youtube.com/watch?v=5BIP_N9qQm4


William



From: gs-devel <gs-devel-boun...@ghostscript.com> on behalf of Ken Sharp 
<ken.sh...@artifex.com>
Sent: Tuesday, September 19, 2017 4:38 AM
To: David Kastrup
Cc: gs-de...@ghostscript.com; lilypond-devel@gnu.org
Subject: Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

At 20:38 18/09/2017 +0200, David Kastrup wrote:


>I think "slightly smaller" was something like a factor of 10.  We are
>talking about files including literally thousands if not ten thousands
>of graphics (manuals close to a thousand pages with lots of graphic
>output included).

Then maybe you should complain to the software producing that content.

I already said I would discuss this further; berating me will not induce me
to make changes.



 Ken



Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-19 Thread Ken Sharp

At 07:48 19/09/2017 +0200, Knut Petersen wrote:

> I understand why the default behavior of ghostscript changed. But could
> anyone who advocates to remove the PDFDontUseFontObjectNum be so kind to
> give a clear explanation why keeping it would be a bad idea?


We have, literally, hundreds of command line options. It's hard to remember 
them all, let alone keep them all straight, and it's quite impossible to 
test any significant fraction of them.


Every time we make any change to Ghostscript it's possible that we break one 
or more pieces of functionality, quite accidentally, simply because we 
aren't aware of the existence of that functionality.


As time goes on and we add more and more complexity, the chance that any 
given change will break some existing feature rises significantly. It takes 
longer for developers who do know of the existence of these features to 
make changes, and the changed code runs at a performance penalty because it 
needs to do extra checking or employ more complex logic.


Note that we have a query from a commercial customer, opened last week, about 
why the latest version of Ghostscript's PDF interpreter runs more slowly 
than two versions back, so this is not a theoretical consideration.


So at every release I look for ways to reduce the clutter. Discarding 
features intended to restore old behaviour in the event of a fault in new 
behaviour is an obvious target. Such code is only ever intended to be 
temporary. The additional checking of flags in multiple places, and 
multiple times in the course of a program, does impact performance, and it's 
very likely that such fallback code will get broken, because it's *never* 
tested.


I have already said I will discuss this with the other developers before 
making a final decision. It is, obviously, useful to know that people are 
using this behaviour and we will take that into consideration.


In that regard I do welcome reasoned emails, such as this one, from 
additional users. I didn't enjoy the follow-up email from the original 
reporter; it's not like I said 'tough' or anything. I said I would discuss 
it internally right from the start.




Ken Sharp




Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-19 Thread Ken Sharp

At 17:27 18/09/2017 -0700, Perry Hutchison wrote:

> Masamichi Hosoda wrote:
>
> > >>It seems that `-dPDFDontUseFontObjectNum` option does not work.
> ...
> > There is a tool for using this method of removing duplicate fonts.
> > https://www.ctan.org/pkg/extractpdfmark
> > https://packages.debian.org/stretch/extractpdfmark
> > http://packages.ubuntu.com/zesty/extractpdfmark
>
> As I see it, the availability of a separate tool to do the same thing
> is a reason to _not_ provide a duplicate capability in Ghostscript.
> Those who want that processing (despite the risks that Ken mentioned)
> can use extractpdfmark.


Despite the wording of the email (which fooled me initially), it's not a 
separate tool; it's just instructions for using Ghostscript and setting the 
removed flag.



Ken




Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-19 Thread Knut Petersen

On 19.09.2017 at 04:12, William Bader wrote:

> pdfsizeopt is another pdf compression tool that can eliminate duplicate fonts
> and can sometimes merge subset fonts.
> https://github.com/pts/pdfsizeopt/blob/master/lib/pdfsizeopt/main.py


I checked pdfsizeopt more than once. Unfortunately it never worked as expected.

Knut



Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-19 Thread William Bader
pdfsizeopt is another pdf compression tool that can eliminate duplicate fonts 
and can sometimes merge subset fonts. 
https://github.com/pts/pdfsizeopt/blob/master/lib/pdfsizeopt/main.py




From: gs-devel <gs-devel-boun...@ghostscript.com> on behalf of Perry Hutchison 
<per...@pluto.rain.com>
Sent: Monday, September 18, 2017 8:27 PM
To: truer...@trueroad.jp
Cc: gs-de...@ghostscript.com; lilypond-devel@gnu.org
Subject: Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

Masamichi Hosoda <truer...@trueroad.jp> wrote:

> >>It seems that `-dPDFDontUseFontObjectNum` option does not work.
...
> There is a tool for using this method of removing duplicate fonts.
> https://www.ctan.org/pkg/extractpdfmark

> https://packages.debian.org/stretch/extractpdfmark

> http://packages.ubuntu.com/zesty/extractpdfmark


As I see it, the availability of a separate tool to do the same thing
is a reason to _not_ provide a duplicate capability in Ghostscript.
Those who want that processing (despite the risks that Ken mentioned)
can use extractpdfmark.


Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-19 Thread Werner LEMBERG
>>> LilyPond has option `--bigpdfs` for unifying duplicate fonts in
>>> this method.
>>
>> And your point is what ? That's not what the pdfwrite device is
>> intended for, and we don't claim you can use it to do that.
>>
>> As I said, if you think its that useful, then you can add the
>> switch back in.
> 
> Ken: You are writing to one of the lilypond developers. So "add the
> switch back" is the logical equivalent of "add a fork of ghostscript
> to the lilypond source tree if you need the PDFDontUseFontObjectNum
> option". Was it really your intention to suggest a fork???

He rather meant that we should add a custom version of `pdf_font.ps'
to lilypond, I guess (or maybe just a custom version of the
`patch_font_XUID' function).


Werner



Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-19 Thread Knut Petersen

On 18.09.2017 at 20:20, Ken Sharp wrote:

>> LilyPond has option `--bigpdfs` for unifying duplicate fonts in this method.
>
> And your point is what ? That's not what the pdfwrite device is intended for,
> and we don't claim you can use it to do that.
>
> As I said, if you think its that useful, then you can add the switch back in.


Ken: You are writing to one of the lilypond developers. So "add the switch back" is the 
logical equivalent of "add a fork of ghostscript to the lilypond source tree if you need the 
PDFDontUseFontObjectNum option". Was it really your intention to suggest a fork???

Knut



Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-18 Thread Knut Petersen

On 19.09.2017 at 02:27, Perry Hutchison wrote:

>> There is a tool for using this method of removing duplicate fonts.
>> https://www.ctan.org/pkg/extractpdfmark
>> https://packages.debian.org/stretch/extractpdfmark
>> http://packages.ubuntu.com/zesty/extractpdfmark
>
> As I see it, the availability of a separate tool to do the same thing
> is a reason to _not_ provide a duplicate capability in Ghostscript.
> Those who want that processing (despite the risks that Ken mentioned)
> can use extractpdfmark.


Masamichi described extractpdfmark in a misleading way. extractpdfmark is a
helper program that generates a postscript file from a pdf document. Both the
pdf document and the ps file generated by extractpdfmark are then processed by
ghostscript to generate a final pdf. extractpdfmark is only needed if the
source pdf contains links / hyperrefs that would otherwise be broken during
the final pass of ghostscript.


A typical way to write a musicological document, a collection of songs, a 
lilypond manual, etc. is described below:

1. Write your music.
2. Use "lilypond --bigpdfs" to generate pdfs. Internally lilypond generates 
postscript files and then runs ghostscript to generate pdfs.
3. Write the pdf(la)tex/lua(la)tex/xe(la)tex document that uses the pdfs 
generated in step 2.
4. Use pdf(la)tex/lua(la)tex/xe(la)tex to generate a pdf.
5. If necessary, use extractpdfmark to extract pdfmarks from the pdf generated 
in step 4 (extractpdfmark generates a postscript file).
6. Use ghostscript to generate the final pdf from the pdf generated in step 4 
and the postscript file generated in step 5.
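
The steps above could be scripted roughly as follows. This is only a sketch
that builds the command lines and does not run anything. The file names are
hypothetical, pdflatex stands in for whichever TeX engine is used, and the
extractpdfmark invocation (the `-o` output option) is an assumption that
should be checked against its own documentation. The Ghostscript options
other than `-dPDFDontUseFontObjectNum` are the usual pdfwrite ones already
seen in this thread.

```python
import shlex
from pathlib import Path


def build_pipeline(score: str, document: str, final_pdf: str) -> list[list[str]]:
    """Return the command lines for steps 2, 4, 5 and 6 without running them."""
    stem = Path(document).stem
    tex_pdf = f"{stem}.pdf"            # pdf written by the TeX run in step 4
    pdfmark_ps = f"{stem}-pdfmark.ps"  # hypothetical name for the step 5 output
    return [
        # Step 2: lilypond writes postscript and calls ghostscript itself.
        ["lilypond", "--bigpdfs", score],
        # Step 4: typeset the document that includes the pdfs from step 2.
        ["pdflatex", document],
        # Step 5 (assumed invocation): write the pdfmarks to a postscript file.
        ["extractpdfmark", "-o", pdfmark_ps, tex_pdf],
        # Step 6: one ghostscript pass over the pdf plus the pdfmark file.
        ["gs", "-q", "-dBATCH", "-dNOPAUSE", "-sDEVICE=pdfwrite",
         "-dPDFDontUseFontObjectNum", f"-sOutputFile={final_pdf}",
         tex_pdf, pdfmark_ps],
    ]


if __name__ == "__main__":
    for command in build_pipeline("song.ly", "book.tex", "final.pdf"):
        print(shlex.join(command))
```

Passing each list to `subprocess.run(command, check=True)` would execute the
steps in order. Step 6 only removes the duplicated fonts with a Ghostscript
version that still accepts `-dPDFDontUseFontObjectNum`.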

This sounds a bit complicated, but the reduction in file size is significant. In 2014 
this was discussed on the ghostscript bugzilla. 


Without the use of lilypond's "--bigpdfs" option our notation manual had a size 
of 26 MB after step 4.
With the introduction of the "--bigpdfs" option and steps 5 and 6 the file size 
after step 4 increased to 116 MB, but the size of the pdf generated in step 6 was only 
5.9 MB. That means we were able to eliminate more than 20 MB of duplicated fonts.

Another example is gotlandstoner, a collection of folk tunes from Sweden. 
Without the PDFDontUseFontObjectNum option, book 3 has a file size of 
13,706,324 bytes. If a ghostscript with the PDFDontUseFontObjectNum option 
enabled is used, that boils down to 2,447,232 bytes.
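
For reference, the relative savings implied by the figures quoted above can
be worked out with a few lines of Python. The helper name is made up for
this illustration; the numbers are the ones given in this message.

```python
def reduction_percent(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


# Notation manual: 26 MB without the option vs. 5.9 MB after step 6.
notation_manual = reduction_percent(26.0, 5.9)
# gotlandstoner book 3, sizes in bytes.
gotlandstoner = reduction_percent(13_706_324, 2_447_232)

print(f"notation manual: {notation_manual:.1f}% smaller")
print(f"gotlandstoner:   {gotlandstoner:.1f}% smaller")
```

This gives roughly 77% for the notation manual and roughly 82% for
gotlandstoner book 3.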

In an earlier message in this thread Ken Sharp wrote: "Risking incorrect output
for the minimal benefit of a slightly smaller file seems unwise to me." Yes,
PDFDontUseFontObjectNum should not be enabled by default. But as I pointed out
above, the benefit of the PDFDontUseFontObjectNum option is not just a
"slightly smaller file"; it is a very significant reduction in file size, often
exceeding 80%.


I understand why the default behavior of ghostscript changed. But could anyone 
who advocates removing the PDFDontUseFontObjectNum option be so kind as to give 
a clear explanation of why keeping it would be a bad idea?

Knut


Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-18 Thread Ken Sharp

At 00:31 19/09/2017 +0900, Masamichi Hosoda wrote:


> When you create a PDF document using something like a TeX system
> you may include many small PDF files in the main PDF file.
> It is common for each of the small PDF files to use the same fonts.
>
> If the small PDF files contain embedded full font sets,
> the TeX system includes all of them in the main PDF.
> The main PDF contains duplicates of the same full sets of fonts.
> Therefore, `PDFDontUseFontObjectNum` can remove the duplicates.
> This may considerably reduce the main PDF-file's size.


And if you have multiple subsets, badly named (e.g. OpenOffice output), then 
you get a final PDF file where some of the text is missing or garbled.


There's no real way to tell the difference; at least by using the font 
object numbers we guarantee correct output.
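
The risk can be illustrated with a toy model. This is not Ghostscript's
actual logic; the `Font` record, the helper functions and the sample fonts
are all invented for the illustration. Fonts are merged either by name or by
object number, and the glyphs that go missing are then listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    obj_num: int            # object number of the font in the input PDF
    name: str               # font name as written by the producer
    glyphs: frozenset[str]  # glyphs actually embedded in this font


def merge_fonts(fonts, key):
    """Keep the first font seen for each key; later ones count as duplicates."""
    kept = {}
    for font in fonts:
        kept.setdefault(key(font), font)
    return list(kept.values())


def missing_glyphs(fonts, kept, key):
    """Glyphs that a font needed but its surviving stand-in does not contain."""
    by_key = {key(font): font for font in kept}
    lost = set()
    for font in fonts:
        lost |= font.glyphs - by_key[key(font)].glyphs
    return lost


FULL = frozenset("ABC")
FONTS = [
    Font(10, "Bar", FULL),            # two genuine full-font duplicates
    Font(20, "Bar", FULL),
    Font(30, "Foo", frozenset("AB")),  # two different subsets, same name
    Font(40, "Foo", frozenset("C")),
]

by_name = lambda font: font.name
by_num = lambda font: font.obj_num

print(len(merge_fonts(FONTS, by_name)),
      missing_glyphs(FONTS, merge_fonts(FONTS, by_name), by_name))
print(len(merge_fonts(FONTS, by_num)),
      missing_glyphs(FONTS, merge_fonts(FONTS, by_num), by_num))
```

In this model, merging by name halves the number of fonts but drops glyph
"C" from the second "Foo" subset, while merging by object number keeps all
four fonts and loses nothing.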




> There is a tool for using this method of removing duplicate fonts.
> https://www.ctan.org/pkg/extractpdfmark


Not our tool, we don't claim you can do this. Building a tool round an 
unintentional side effect seems less than ideal.




> LilyPond has option `--bigpdfs` for unifying duplicate fonts in this method.


And your point is what? That's not what the pdfwrite device is intended 
for, and we don't claim you can use it to do that.


As I said, if you think it's that useful, then you can add the switch back 
in. In fact, provided you don't change SubsetFonts, the resulting file may 
well be smaller anyway, since the pdfwrite device will only embed the 
portion of each font that is actually used (even though you say each is a 
complete duplicate), so the resulting two fonts will be smaller than the 
original two fonts.


Risking incorrect output for the minimal benefit of a slightly smaller file 
seems unwise to me.


Ken




Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-18 Thread Perry Hutchison
Masamichi Hosoda  wrote:

> >>It seems that `-dPDFDontUseFontObjectNum` option does not work.
...
> There is a tool for using this method of removing duplicate fonts.
> https://www.ctan.org/pkg/extractpdfmark
> https://packages.debian.org/stretch/extractpdfmark
> http://packages.ubuntu.com/zesty/extractpdfmark

As I see it, the availability of a separate tool to do the same thing
is a reason to _not_ provide a duplicate capability in Ghostscript.
Those who want that processing (despite the risks that Ken mentioned)
can use extractpdfmark.



Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

2017-09-18 Thread Masamichi Hosoda
>> > Please give them a try on your system if you're interested in helping
>> > test the release-in-progress. Your feedback is appreciated.
>>
>>It seems that `-dPDFDontUseFontObjectNum` option does not work.
> 
> It has been removed, as documented in History9.htm:
> 
> 2017-08-02 13:41:59 +0100
> Ken Sharp 
> ca1ec9b486ddba3f921355fd1d775f27f4871356
> 
> PDF interpreter - remove the PDFDontUseObjectNum switch
> 
> This was implemented to allow us to restore the default behaviour if
> it caused problems. No real problems reported, so lets get rid of
> (yet another) of our many, many command line switches.
> 
> 
>>`-dPDFDontUseFontObjectNum` is useful to unify duplicate fonts.
>>So I would like to use `-dPDFDontUseFontObjectNum`.
> 
> The fonts can't realistically be described as duplicates, if they have
> different object numbers. Or if they are then the original PDF file
> (this only works with PDF input) was poorly constructed. Using this
> option isn't really sensible, it was only ever present in case the
> Font numbering usage went awry. Consider creating the original PDF
> files more efficiently instead.
> 
> I'll discuss it with the other developers but I am not inclined to
> restore this. Obviously if you feel its important you can revert the
> change locally.

When you create a PDF document using something like a TeX system
you may include many small PDF files in the main PDF file.
It is common for each of the small PDF files to use the same fonts.

If the small PDF files contain embedded full font sets,
the TeX system includes all of them in the main PDF.
The main PDF contains duplicates of the same full sets of fonts.
Therefore, `PDFDontUseFontObjectNum` can remove the duplicates.
This may considerably reduce the main PDF-file's size.

There is a tool for using this method of removing duplicate fonts.
https://www.ctan.org/pkg/extractpdfmark
https://packages.debian.org/stretch/extractpdfmark
http://packages.ubuntu.com/zesty/extractpdfmark

LilyPond has option `--bigpdfs` for unifying duplicate fonts in this method.
http://lilypond.org/doc/v2.19/Documentation/usage/command_002dline-usage#basic-command-line-options-for-lilypond
