Re: [jira] [Assigned] (FOP-2210) [PATCH] Complex script IF to output missing glyphs

Vincent Hennebert Thu, 25 Apr 2013 02:31:47 -0700

On 25/04/13 10:35, Alexios Giotis wrote:
> For our use cases, it would be much better to add new child elements to IF or 
> do other similar extensions, than having to repeat part of the costly layout 
> process. Besides repeating, the FO -> IF step is easily executed by multiple 
> threads, while the IF -> PDF step cannot be parallelised (without big changes).


It doesn’t shock me to store text as text in the IF and to re-do the
glyph mapping when rendering it to the final output format. This is
actually how it is done ATM.

Sure it may become more costly when you start using complex scripts, but
that would have to be confirmed with some profiling first and foremost.
We might be surprised.

We should keep in mind that it’s a perfectly reasonable use case to add
text to the IF as part of a post-processing step. That text will have to
go through the glyph mapping code anyway.

Also, to have copy-paste work properly from PDF the original text must
be present in the IF.

Storing information about the private use area in the IF is exposing
internal implementation details of FOP. When going the direct FO to PDF
route, mapping glyphs to character codes to re-map them again into
glyphs when creating the PDF is sub-optimal. We might as well work with
the glyph indices all the way through.
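
To make the round trip concrete, here is a minimal sketch of what "glyph index to private use code point and back" amounts to. It is an illustration only: the class and method names are hypothetical and are not FOP classes, and it assumes synthesized codes are handed out sequentially from the BMP private use area (U+E000 to U+F8FF). The glyph indices are the ones from Glenn's <pua/> example quoted below.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration; none of these names exist in FOP. */
public class PuaRoundTrip {
    private static final int PUA_START = 0xE000;
    private static final int PUA_END = 0xF8FF;

    private final Map<Integer, Integer> glyphToPua = new LinkedHashMap<>();
    private final Map<Integer, Integer> puaToGlyph = new LinkedHashMap<>();
    private int nextPua = PUA_START;

    /** Assigns (or reuses) a private use code point for a glyph that has no CMAP entry. */
    public int toCodePoint(int glyphIndex) {
        Integer existing = glyphToPua.get(glyphIndex);
        if (existing != null) {
            return existing;
        }
        if (nextPua > PUA_END) {
            throw new IllegalStateException("private use area exhausted");
        }
        int codePoint = nextPua++;
        glyphToPua.put(glyphIndex, codePoint);
        puaToGlyph.put(codePoint, glyphIndex);
        return codePoint;
    }

    /** Maps a synthesized code point back to the glyph index it stands for. */
    public int toGlyphIndex(int codePoint) {
        Integer glyphIndex = puaToGlyph.get(codePoint);
        if (glyphIndex == null) {
            throw new IllegalArgumentException("no synthesized mapping for code point");
        }
        return glyphIndex;
    }

    public static void main(String[] args) {
        PuaRoundTrip font = new PuaRoundTrip();
        int[] shaped = {139, 481, 219, 139};
        for (int gid : shaped) {
            int codePoint = font.toCodePoint(gid);
            int back = font.toGlyphIndex(codePoint);
            // The second lookup only undoes the first one.
            System.out.printf("gid %d -> U+%04X -> gid %d%n", gid, codePoint, back);
        }
    }
}
```

The second lookup always returns the glyph index we started with, which is why carrying the glyph indices through directly would avoid the detour on the FO to PDF route.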


Vincent


> On 25 Apr 2013, at 01:52, Glenn Adams <gl...@skynav.com> wrote:
> 
>> I see no option but to modify IF. We modified IF for 1.1 in the first place. 
>> We have recently made quite a number of backward incompatible changes to 
>> the FOP public APIs. I expect the next release will need to bump the major 
>> version to 2 for FOP due to these changes, so there is little risk in making 
>> a change in IF. If there are other, useful changes to IF that have been 
>> postponed, then perhaps they should be reconsidered now as well.
>>
>>
>> On Wed, Apr 24, 2013 at 3:26 PM, Luis Bernardo <lmpmberna...@gmail.com> 
>> wrote:
>>
>> These are good suggestions. I am fully aware of the shortcomings that you 
>> pointed out, but the only other option seemed to be to codify the mappings 
>> in IF, similar to your first suggestion. However that would mean changing IF 
>> which is not something we are keen to do since that impacts applications 
>> that rely on the current format.
>>
>> Are you saying that with your second approach there is no need to change IF?
>>
>>
>> On 4/24/13 7:38 PM, Glenn Adams wrote:
>>> Sure. One way to do this would be to add child elements to the <font/> 
>>> element in IF output as follows:
>>>
>>> <font family="Lateef" style="normal" ...>
>>>   <pua code="0xE000" gid="139"/>
>>>   <pua code="0xE001" gid="481"/>
>>>   <pua code="0xE002" gid="219"/>
>>> </font>
>>>
>>> where these PUA mappings are collected by iterating over the characters of 
>>> TextAreas governed by the <font/> element. These characters might be 
>>> iterated upon invoking TextArea.add{Word,Space}, and collecting this info 
>>> in text areas.
>>>
>>> Alternatively, MultiByteFont.getUsedGlyphs() could be used to (1) determine 
>>> which glyph codes were referenced by the document, (2) given these used 
>>> codes, iterate over the CMAP mappings to find which PUA codes were 
>>> generated for those glyph codes, then (3) output the <pua/> elements 
>>> (above) as required.
>>>
>>> Finally, when reading an IF file, these <pua/> elements would be used to 
>>> augment the font's CMAP (keeping in mind that when reading the font, 
>>> MultiByteFont.createPrivateUseMappings() may have already been called, and 
>>> thus the mappings in <pua/> elements may need to be replaced or merged).
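
The replace-or-merge step described here could look roughly like the sketch below. It makes several assumptions: the CMAP is modelled as a plain map from code point to glyph index, the class and method names are hypothetical and not FOP API, and the policy chosen (entries read from the IF win over mappings synthesized at font load time) is only one possible answer to the "replaced or merged" question.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration; not part of FOP. */
public class PuaMerge {
    private static final int PUA_START = 0xE000;
    private static final int PUA_END = 0xF8FF;

    /** Parses a code attribute such as "0xE000" into a code point. */
    public static int parseCode(String code) {
        return Integer.decode(code);
    }

    /**
     * Returns a copy of cmap augmented with the entries read from the IF.
     * Assumed policy: an IF entry replaces any existing mapping for the
     * same private use code point; all other mappings are kept unchanged.
     */
    public static Map<Integer, Integer> merge(Map<Integer, Integer> cmap,
                                              Map<Integer, Integer> fromIf) {
        Map<Integer, Integer> merged = new TreeMap<>(cmap);
        for (Map.Entry<Integer, Integer> entry : fromIf.entrySet()) {
            int code = entry.getKey();
            if (code < PUA_START || code > PUA_END) {
                throw new IllegalArgumentException("not a private use code point");
            }
            merged.put(code, entry.getValue());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> cmap = new TreeMap<>();
        cmap.put(0x0041, 36);   // ordinary mapping from the font (made-up glyph index)
        cmap.put(0xE000, 7);    // mapping synthesized when the font was loaded (made up)

        Map<Integer, Integer> fromIf = new TreeMap<>();
        fromIf.put(parseCode("0xE000"), 139);
        fromIf.put(parseCode("0xE001"), 481);

        System.out.println(merge(cmap, fromIf));
    }
}
```

The input map is left untouched, so a caller can still compare the font's own synthesized mappings with what the IF file declared.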
>>>
>>> I can imagine various other optimizations on the above theme to make this 
>>> readily workable.
>>>
>>>
>>>
>>> On Wed, Apr 24, 2013 at 3:18 AM, Chris Bowditch 
>>> <bowditch_ch...@hotmail.com> wrote:
>>> Hi Glenn,
>>>
>>> Can you suggest an alternative approach please?
>>>
>>> Thanks,
>>>
>>> Chris
>>>
>>>
>>> On 24/04/2013 02:41, Glenn Adams wrote:
>>> I don't like this. It negates any additional processing that may have 
>>> occurred, such as letter spacing. It requires the IF to repeat part of the 
>>> layout process. Bad idea.
>>>
>>>
>>> On Tue, Apr 23, 2013 at 3:11 PM, Luis Bernardo <lmpmberna...@gmail.com> 
>>> wrote:
>>>
>>>
>>>     With the approach implemented by Simon what gets written to the IF
>>>     file is the original sequence, not the mapped sequence. Then when
>>>     generating PDF from IF the same code that would generate the
>>>     synthesized mappings when generating PDF straight from FO is
>>>     called to recreate the mappings. So I don't think we can say there
>>>     is information about the mappings in the text nodes.
>>>
>>>
>>>     On 4/23/13 5:50 AM, Glenn Adams wrote:
>>>     Ah, I reread your earlier (private) message. I see the problem
>>>     has to do with the use of synthesized PUA mappings. Here, the
>>>     problem really is that the font should always have a CMAP entry
>>>     that maps to every glyph that can be produced by the GSUB
>>>     process. However, not all fonts do this, so in the case in point,
>>>     we have to synthesize some mapping, from which we have to turn to
>>>     PUA assignments. This works when we generate PDF since we
>>>     generate a subset font that contains the synthesized mappings.
>>>     However, I can see that if this is going to IF instead of PDF/PS,
>>>     then we need to find a way to recreate those synthesized mappings.
>>>
>>>     I think this information is really font-specific, and should not
>>>     be tied to specific text nodes though. So if Simon's fix uses
>>>     text nodes, then that is probably not the best approach.
>>>
>>>
>>>     On Mon, Apr 22, 2013 at 10:45 PM, Glenn Adams <gl...@skynav.com>
>>>     wrote:
>>>
>>>         I'm presently at W3C WG meetings this week, but I'll try to
>>>         get on my schedule. I'm not sure what the IF->PS/PDF problem
>>>         is, since the IF->PDF path is clearly working from my tests.
>>>
>>>
>>>         On Mon, Apr 22, 2013 at 4:27 PM, Luis Bernardo
>>>         <lmpmberna...@gmail.com> wrote:
>>>
>>>
>>>             Glenn,
>>>
>>>             Can you give your opinion about the approach used by
>>>             Simon? As I mentioned before (in a private message), the
>>>             IF -> PS/PDF route does not work in your original CS
>>>             patch (for the languages that CS targets) due to the
>>>             mapped sequences. Simon's approach works but requires
>>>             keeping the original sequences alongside the mapped ones.
>>>             I think it is a good approach but I would like to know if
>>>             you have a better suggestion before we apply the patch.
>>>
>>>             Thanks,
>>>             Luis
>>>
>>>
>>>             On 4/22/13 3:23 PM, Chris Bowditch (JIRA) wrote:
>>>
>>>                 [
>>>                 
>>> https://issues.apache.org/jira/browse/FOP-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>>>                 ]
>>>
>>>                 Chris Bowditch reassigned FOP-2210:
>>>                 -----------------------------------
>>>
>>>                      Assignee: Chris Bowditch
>>>
>>>                     [PATCH] Complex script IF to output missing glyphs
>>>                     --------------------------------------------------
>>>
>>>                                      Key: FOP-2210
>>>                                      URL:
>>>                     https://issues.apache.org/jira/browse/FOP-2210
>>>                                  Project: Fop
>>>                               Issue Type: Bug
>>>                                 Reporter: simon steiner
>>>                                 Assignee: Chris Bowditch
>>>                              Attachments: csspeedtrunk.patch,
>>>                     fop.xconf, test.fo
>>>
>>>
>>>                     fop test.fo -c fop.xconf -if application/pdf expected.if.xml
>>>                     fop -c fop.xconf -ifin expected.if.xml out.pdf
>>>
>>>                 --
>>>                 This message is automatically generated by JIRA.
>>>                 If you think it was sent incorrectly, please contact
>>>                 your JIRA administrators
>>>                 For more information on JIRA, see:
>>>                 http://www.atlassian.com/software/jira
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
