DO NOT REPLY [Bug 52477] FOP always uses the same prefix for embeded font

2012-01-18 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=52477

--- Comment #4 from quamis quamis+...@gmail.com 2012-01-18 08:16:33 UTC ---
(In reply to comment #3)
> 2) We use the deterministic trait of the prefixes in our testing framework.
> The value of having a comprehensive test suite is far greater than making the
> code change for this scenario.

That's why I was saying that a command-line switch to disable the randomized
behavior should exist. The change seemed trivial enough.
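The kind of switch being asked for can be sketched as follows. This is only
an illustration in Python, not FOP code: the function name make_tag_source and
its parameters are hypothetical, and the thread does not show how FOP actually
generates its prefixes.

```python
import random
import string


def make_tag_source(deterministic=True, seed=0):
    """Return a function that hands out six-uppercase-letter subset tags.

    deterministic=True uses a seeded generator, so every run produces the
    same sequence (convenient for regression tests). deterministic=False
    draws from OS entropy instead, and the seed is ignored.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed) if deterministic else random.SystemRandom()
    used = set()  # tags already handed out for this document

    def next_tag():
        while True:
            tag = "".join(rng.choice(string.ascii_uppercase) for _ in range(6))
            if tag not in used:  # tags must differ within one document
                used.add(tag)
                return tag

    return next_tag
```

With a flag like this, a test suite could keep the reproducible sequence while
production runs opt into random tags.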


> I understand that none of the above particularly helps you, but we can't very
> well go changing FOP to accommodate nuanced bugs in ghostscript.
>
> Mehdi

I understand that, but generating the same sequence over and over just seems to
be a compromise for easier automated testing, not for an actual working, tested
product.

For now we'll carry on using pdftk, which seems to handle the case of multiple
fonts with the same name correctly. It's too bad that one has to use 3
different applications, each with its own quirks, bugs and usage patterns,
simply because the standard isn't very clear on a specific issue, and that
issue could easily be fixed by either of the 2 applications involved in this
chain...

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


DO NOT REPLY [Bug 52477] FOP always uses the same prefix for embeded font

2012-01-18 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=52477

--- Comment #6 from Mehdi Houshmand med1...@gmail.com 2012-01-18 10:51:13 UTC ---
(In reply to comment #5)
/snip
> An alternative approach that will also make it easier for applications to
> extract or de-duplicate font resources when merging multiple PDFs is to allow
> FOP to fully embed the font resources in the PDF, rather than creating a
> subset. I believe this is possible today for a limited use-case, by specifying
> encoding-mode=single-byte on the font element within the fop.xconf file. I
> say limited because that only works if no characters outside the ASCII range
> are required.

That wouldn't necessarily fix the issue here. Fully embedding a font means that
the pseudo-unique prefix isn't used; however, this isn't necessarily a good
thing. A parser like ghostscript could, and apparently does, assume that if 2
fonts have the same name (prefix or not) they are the same font. This is an
assumption that I've made previously, and it has proved manifestly naive. Also,
whatever the implementation, prefixes CANNOT clash within the same document.
Using a glyph subset idea, there could be a scenario in which 2 fonts with the
same glyph subsets produce the same prefix.
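The clash described above can be illustrated with a small Python sketch. All
of it is assumption: the helper names are made up, the hashing scheme is just
one plausible way to derive a tag from a glyph subset, and nothing here is
taken from FOP. A tag derived only from the glyph IDs is identical for two
different fonts that use the same glyphs; mixing the font data into the hash
makes a clash unlikely but, with only six letters, still not impossible.

```python
import hashlib


def tag_from_digest(digest):
    """Map the first 8 bytes of a digest onto six uppercase letters."""
    n = int.from_bytes(digest[:8], "big")
    letters = []
    for _ in range(6):
        n, r = divmod(n, 26)
        letters.append(chr(ord("A") + r))
    return "".join(letters)


def glyph_key(glyph_ids):
    """Canonical byte string for a set of glyph IDs (order-independent)."""
    return ",".join(str(g) for g in sorted(set(glyph_ids))).encode("ascii")


def tag_from_glyphs(glyph_ids):
    """Tag derived ONLY from the glyphs used: clashes across fonts."""
    return tag_from_digest(hashlib.sha256(glyph_key(glyph_ids)).digest())


def tag_from_font_and_glyphs(font_bytes, glyph_ids):
    """Tag derived from the font data plus the glyphs used."""
    h = hashlib.sha256(font_bytes)
    h.update(b"|")
    h.update(glyph_key(glyph_ids))
    return tag_from_digest(h.digest())
```

Here tag_from_glyphs([36, 37, 38]) returns the same tag no matter which font
the glyphs come from, which is exactly the scenario raised in the comment.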

We have to be careful what we're supporting here. There is no standardised
method to identify a font, since anyone can call any font by any name. I don't
agree that making the prefix more unique (not sure there is a scale by which
something can be measured unique, it's binary, it is or it isn't), would help
here, because given time, inevitably you'll get a clash. Then what?

The prefixes are 6 chars long, and the guys at Adobe gave no indication that
they wanted them to be unique in a global sense, only within a document.



DO NOT REPLY [Bug 52477] FOP always uses the same prefix for embeded font

2012-01-18 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=52477

--- Comment #7 from quamis quamis+...@gmail.com 2012-01-18 11:27:19 UTC ---
(In reply to comment #6)

> That wouldn't necessarily fix the issue here. Fully embedding a font means
> that the pseudo-unique prefix isn't used, however this isn't necessarily a
> good thing. A parser like ghostscript, could and apparently does assume that
> if 2 fonts have the same name (prefix or not) that they are the same font.
> This is an assumption that I've made previously and has proved manifestly
> naive. Also, any implementation CANNOT clash within the same document. Using
> a glyph subset idea, there could be a scenario in which the 2 fonts with the
> same glyph subsets produce the same prefix.

But if 2 fonts have the same glyph subsets used within a document, then it
wouldn't be necessary to include them twice, so no clashing would occur. I
think that glyph subsets are a good idea, but I do realize that it would be
more complex to implement.

 
> We have to be careful what we're supporting here. There is no standardised
> method to identify a font, since anyone can call any font by any name. I don't
> agree that making the prefix more unique (not sure there is a scale by which
> something can be measured unique, it's binary, it is or it isn't), would help
> here, because given time, inevitably you'll get a clash. Then what?

Because the prefix is 6 chars long, it's inevitable that one would eventually
get a clash if one uses enough millions of different fonts within the same
file. But this is an acceptable limitation.
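To put rough numbers on this: six uppercase letters give 26^6 = 308,915,776
possible tags, so a generator that never repeats a tag within one file only
runs out after about 309 million subsets. If tags were instead chosen
uniformly at random (an assumption, the thread does not say any tool does
this), the standard birthday approximation applies, and a clash reaches even
odds at roughly 20,000 tags. A Python sketch of that arithmetic:

```python
import math

TAG_SPACE = 26 ** 6  # number of distinct six-uppercase-letter tags


def clash_probability(n, space=TAG_SPACE):
    """Approximate chance that n uniformly random tags contain a repeat.

    Uses the usual birthday approximation 1 - exp(-n(n-1) / (2 * space)).
    """
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * space))


def tags_for_probability(p, space=TAG_SPACE):
    """Approximate number of random tags needed to reach clash probability p."""
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))
```

For example, clash_probability(1000) comes out below one percent, while
tags_for_probability(0.5) is in the low tens of thousands.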

> The prefixes are 6 chars long, the guys at Adobe made no indication that they
> wanted it to be unique in a global sense, only within a document.

Yes, the Adobe guys probably meant that the prefix should be unique within the
same file, and that it would be the PDF reader/writer's job to handle duplicate
font names coming from different files. It makes sense. This is why I think
both fop and gs handle this particular case wrong, as they both assumed things
about that prefix, and it seems that these assumptions are now proven wrong.
gs in particular should warn about merging files with embedded fonts, either
when merging, or at least in the manual or on a known-issues page.



DO NOT REPLY [Bug 52477] FOP always uses the same prefix for embeded font

2012-01-18 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=52477

--- Comment #8 from Pascal Sancho pascal.san...@takoma.fr 2012-01-18 12:11:52 UTC ---
Since font files are versioned, how will this be handled when 2 subsets use
the same glyphs of the same font, but in different versions?
Subset reduction should take care of that.



DO NOT REPLY [Bug 52477] FOP always uses the same prefix for embeded font

2012-01-17 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=52477

Mehdi Houshmand med1...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

--- Comment #1 from Mehdi Houshmand med1...@gmail.com 2012-01-17 16:18:54 UTC ---
Hi,

This isn't a bug, the PDF specification doesn't mandate that the font prefixes
are unique outside the scope of the document. The only mandate is:

"The tag consists of exactly six uppercase letters; the choice of letters is
arbitrary, but different subsets in the same PDF file must have different
tags."
From Section 5.5.3 PDF v1.4 Reference.

As such this isn't a bug. Sorry to be dismissive, but as you said in your post
on the ghostscript bug report, making these unique doesn't solve the issue,
because clashes are still likely with a prefix of only 6 chars.

In my opinion, Ken Sharp is mistaken when he says "If the font has the same
name and prefix then it is the same font"; that isn't what the PDF
specification says (though understandably that's how it could be interpreted).
The spec only says that each subset has to be unique within the scope of a
document, which is what FOP already does.

Mehdi
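The rule quoted from the PDF Reference can be checked mechanically. The
following Python sketch assumes the font names of one document have already
been extracted into a list of strings of the form TAG+FontName; the helper
names are hypothetical and no particular PDF library is implied.

```python
import re
from collections import Counter

# Six uppercase letters, a plus sign, then the base font name.
TAG_RE = re.compile(r"^([A-Z]{6})\+(.+)$")


def split_subset_name(font_name):
    """Split 'ABCDEF+Name' into (tag, base name); tag is None if absent."""
    m = TAG_RE.match(font_name)
    return (m.group(1), m.group(2)) if m else (None, font_name)


def duplicate_tags(font_names):
    """Return the tags shared by more than one subset font in one document.

    font_names is assumed to hold one entry per distinct subset.
    """
    tags = [split_subset_name(name)[0] for name in font_names]
    counts = Counter(t for t in tags if t is not None)
    return sorted(t for t, c in counts.items() if c > 1)
```

An empty result means the document satisfies the within-document requirement;
the check says nothing about tags in other documents, which is the point made
in the comment above.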



DO NOT REPLY [Bug 52477] FOP always uses the same prefix for embeded font

2012-01-17 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=52477

--- Comment #2 from quamis quamis+...@gmail.com 2012-01-18 07:44:52 UTC ---
(In reply to comment #1)
> Hi,
>
> This isn't a bug, the PDF specification doesn't mandate that the font prefixes
> are unique outside scope of the document. The only mandate is:
>
> The tag consists of exactly six uppercase letters; the choice of letters is
> arbitrary, but different subsets in the same PDF file must have different
> tags.
> From Section 5.5.3 PDF v1.4 Reference.
>
> As such this isn't a bug. Sorry to be dismissive, but as you said in your post
> on the ghostscript bug report, making these unique doesn't solve the issue
> since there could likely be clashes since the prefix is only 6 chars.
>
> In my opinion, Ken Sharp is mistaken when he says If the font has the same
> name and prefix then it is the same font, that isn't what the PDF
> specification says (though understandably that's how it could be interpreted).
> The spec only says that each subset has to be unique within the scope of a
> document, which is what FOP already does.
>
> Mehdi

Yes, he might not be exactly wrong, but this doesn't mean that FOP shouldn't
try to be as arbitrary as possible. The algorithm used in the code is
completely predictable.



DO NOT REPLY [Bug 52477] FOP always uses the same prefix for embeded font

2012-01-17 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=52477

--- Comment #3 from Mehdi Houshmand med1...@gmail.com 2012-01-18 07:57:11 UTC ---
(In reply to comment #2)
/snip

There are 2 points to address there:
1) We can't arbitrarily make changes to FOP in order for it to better (not
even fully!!) support the client systems, in this case ghostscript. The bug is
in ghostscript, it should know if it is using a new PDF and change any prefixes
accordingly.

2) We use the deterministic trait of the prefixes in our testing framework. The
value of having a comprehensive test suite is far greater than making the code
change for this scenario.
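Point 1 suggests that the merging tool should change prefixes when it pulls in
fonts from a new PDF. The naming logic for that could look roughly like the
Python sketch below. It is not ghostscript's behaviour and the function names
are hypothetical; a real merger would also have to rewrite every place in the
PDF where the font name appears.

```python
import string
from itertools import product


def all_tags():
    """Yield every six-uppercase-letter tag in alphabetical order."""
    for letters in product(string.ascii_uppercase, repeat=6):
        yield "".join(letters)


def retag_for_merge(documents):
    """Give every subset font across the merged documents a distinct tag.

    documents is a list of lists of font names such as 'AAAAAA+Arial',
    one inner list per source PDF. Names without a valid tag are kept.
    """
    used = set()        # tags already present in the merged output
    fresh = all_tags()  # source of replacement tags
    merged = []
    for names in documents:
        renamed = []
        for name in names:
            tag, sep, base = name.partition("+")
            is_subset = (sep == "+" and len(tag) == 6
                         and all(c in string.ascii_uppercase for c in tag))
            if is_subset:
                if tag in used:  # clash with an earlier font: pick a new tag
                    tag = next(t for t in fresh if t not in used)
                used.add(tag)
                name = tag + "+" + base
            renamed.append(name)
        merged.append(renamed)
    return merged
```

Fonts from the first document keep their tags; later documents only get new
tags where a clash would otherwise occur.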

I understand that none of the above particularly helps you, but we can't very
well go changing FOP to accommodate nuanced bugs in ghostscript.

Mehdi
