Re: svn commit: r328381 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop: area/inline/ layoutmgr/inline/ render/ render/pdf/ render/xml/
Manuel Mall wrote: I have a question on this. You break in TextArea the text into words based on CharUtilities.isAnySpace. Is this guaranteed to be consistent with the breaking and adjustment calculations in TextLayoutManager? I am concerned we may be using different rules for word breaking in different places. As far as consistency is concerned, I agree with you: the handling of the different kinds of spaces (breaking, non-breaking, fixed width, ...) is still quite incomplete and dispersed over different classes. Just to add another example, the CharacterLM implicitly expects its character to be a non-space character and has its own lines of code concerning the creation of the elements, while it could share the methods already called by the TextLM. Having a single, centralized class taking care of the breaking (be it a Java utility class or a Fop one) and a single, shared method implementing the creation of the elements would surely increase consistency and clarity. Somehow it doesn't feel right to me that TextLayoutManager does all the breaking and calculations and then we give the whole chunk to TextArea and it breaks it again using a possibly different algorithm but still using the adjustment value calculated by TextLayoutManager. 
When I was trying to fix bug 36238 I initially started modifying TextLM#createTextArea(), using the AreaInfo objects to create WordAreas and SpaceAreas, but I then decided to move the string splitting inside TextArea because:
1) if WordAreas and SpaceAreas are not directly created by the LMs, there is no need to change a single line of code inside the classes creating TextAreas; this is not a real reason supporting the choice, just a handy consequence of it;
2) if TextArea still provides a getText() method, the renderers are not forced to render the text word by word and space by space if their word spacing treatment is not affected by multi-byte characters; but once again, this is not a real reason as we could provide this method anyway;
3) although both SpaceArea and WordArea have an offset attribute, it is not used at the moment, so these areas do not carry any formatting information; their only purpose is to highlight spaces, thus allowing some specific renderer to handle them correctly regardless of their encoding; in other words, we are not losing breaking and calculations, we simply do not need them anymore as we already know exactly which text will be placed in each line, and how wide it will be once it's correctly adjusted;
4) the text that will be placed in a line cannot be directly taken from textArray (in the TextLM), and the string str should be used instead anyway, as it may be different from the concatenation of the single pieces of text; at the moment the only difference concerns the hyphenation character, added at the end of the line, but I suspect that in different languages there could be other differences; so, we cannot simply create a WordArea for each AreaInfo object.
So, if you find it strange to break the text, put it together and split it again, me too! :-) But this initial feeling disappeared when I realized that the final splitting does not involve breaking in its proper sense, but just classification of characters.
This is why I did what I did; if I did not manage to convince you ... you can try and convince me! :-) Regards Luca
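For readers following the thread, the "classification of characters" described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the committed FOP code: RunSplitter, Run and isSpace are hypothetical names, and isSpace tests just two code points, whereas CharUtilities.isAnySpace covers more.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the word/space splitting discussed above; not FOP code. */
final class RunSplitter {

    /** One run of text: either all spaces or all non-spaces. */
    static final class Run {
        final boolean space;
        final String text;

        Run(boolean space, String text) {
            this.space = space;
            this.text = text;
        }
    }

    /** Assumed space test (plain space and no-break space only). */
    static boolean isSpace(char c) {
        return c == ' ' || c == '\u00A0';
    }

    /** Classifies characters only: no break decisions, no width calculations. */
    static List<Run> split(String lineText) {
        List<Run> runs = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= lineText.length(); i++) {
            boolean atEnd = i == lineText.length();
            // Close the current run at the end of the text or when the class changes.
            if (atEnd || isSpace(lineText.charAt(i)) != isSpace(lineText.charAt(start))) {
                runs.add(new Run(isSpace(lineText.charAt(start)), lineText.substring(start, i)));
                start = i;
            }
        }
        return runs;
    }
}
```

The input is the already-composed line string (including any hyphenation character appended at the end of the line), which is why the sketch works on a String rather than on per-AreaInfo pieces.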
DO NOT REPLY [Bug 36977] - [PATCH]TextLayoutManager CJK line break
http://issues.apache.org/bugzilla/show_bug.cgi?id=36977 --- Additional Comments From [EMAIL PROTECTED] 2005-10-26 12:31 --- It seems that the new method createElementsForLineBoundary() is called and appends elements even if there are no CJK characters, and I think this should not happen. When I tried applying the patch some days ago, the testcases concerning hyphenation failed too: the output had both missing and repeated pieces of text.
Re: svn commit: r328381 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop: area/inline/ layoutmgr/inline/ render/ render/pdf/ render/xml/
On Wed, 26 Oct 2005 05:15 pm, Luca Furini wrote: snip/
This is why I did what I did; if I did not manage to convince you ... you can try and convince me! :-)
I must admit you haven't convinced me. The basic premise still is that TextLayoutManager does all the calculations, including determining the number of word spaces and the resulting adjustment, which means it must know where the word spaces are. Why should TextArea recalculate the positions (and wrongly as well, because isAnySpace() tests for 7 different Unicode space values, not all of them adjustable spaces, while TextLayoutManager uses a much smaller set to calculate the adjustment values)? There is no need to expose creation of the Space/Word areas directly to TextLayoutManager either. TextArea could easily expose an addWord and an addSpace method instead of the monolithic setText. In the end it probably boils down to me arguing that the setText logic currently in TextArea IMO should be in TextLayoutManager (and probably based on its data structures) because it is an operation closely coupled to layout and not to areas.
BTW, it would also be really nice to have test cases for this new feature, even if just expanding existing test cases to test for the new areas created. It would make catching regressions down the track much easier. Cheers Manuel
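The addWord/addSpace alternative proposed above could look roughly like the sketch below. Everything here is hypothetical: SketchTextArea is not FOP's TextArea, and the "adjustable" flag is an assumption added to show how the layout manager, which already knows which spaces it counted for the adjustment, could pass that knowledge on instead of having the area re-derive it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical data-only text area; the method names mirror the proposal, not FOP's classes. */
final class SketchTextArea {
    private final List<String> children = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private int adjustableSpaces;

    /** Called by the layout side for each word it has already identified. */
    void addWord(String word) {
        children.add("word:" + word);
        text.append(word);
    }

    /** Called by the layout side for each space; it decides whether the space is adjustable. */
    void addSpace(char space, boolean adjustable) {
        children.add((adjustable ? "adjustable-space:" : "fixed-space:") + space);
        text.append(space);
        if (adjustable) {
            adjustableSpaces++;
        }
    }

    /** Renderers that do not need per-word output can still ask for the whole string. */
    String getText() {
        return text.toString();
    }

    int getAdjustableSpaceCount() {
        return adjustableSpaces;
    }

    List<String> getChildren() {
        return children;
    }
}
```

With this shape the area only records what it is told, so the number of adjustable spaces it holds always matches the number the layout side used when computing the adjustment.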
DO NOT REPLY [Bug 37253] - At present rendering to TXT is unimplemented.
http://issues.apache.org/bugzilla/show_bug.cgi?id=37253 --- Additional Comments From [EMAIL PROTECTED] 2005-10-26 13:33 --- Created an attachment (id=16812) -- (http://issues.apache.org/bugzilla/attachment.cgi?id=16812&action=view) [PATCH] TXT rendering is supported
Re: RTF output
The intention is to use the wrapper classes that are already used by the rest of FOP. Jeremias made a similar suggestion on 4th Oct. I will see if I can invest some time in that task this weekend. Kind regards, Peter Herweg
-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Andreas L Delmelle Sent: Tuesday, 25 October 2005 21:10 To: fop-users@xmlgraphics.apache.org Cc: fop-dev@xmlgraphics.apache.org Subject: Re: RTF output
On Oct 25, 2005, at 00:20, Tony Morris wrote: I don't have my test case with me, since I am at work at the moment. Otherwise, what I recall is setting the size of an external-graphic to the exact number of pixels (I think if I didn't, the RTF renderer wasn't happy), the image appeared scaled down, but if I set the image size to say, 10x the number of pixels, it would not appear 10x bigger than the scaled down image, but about the size I would expect normally. Granted, I was using MS Word 2003 for verification, which may well be the culprit.
(cc'ing fop-dev, since the message contains pointers on the causes of this problem, and may help someone devise a solution for it) Well, we shouldn't be blaming M$ for everything, however tempting it may be ;-) All I can say is that the other renderers all use the same set of image library wrappers. The RTF renderer currently is the only exception (support for external-graphics was reintroduced for RTF about a month ago). AFAICT, in the long run, the intention is to switch the RTF renderer to the same set of wrappers. Doing so could mean that your problem disappears; I'm not sure. What is more than certain is that the current code in the RTF lib is not 100% correct, and even seems to make the same mistake in interpretation of the related properties (height/width) that FOP 0.20.5 made, namely interpreting the value of these properties as the dimensions of the image itself instead of taking them to be the dimensions of the image's surrounding box.
Looking at the related code in the RTF library, it seems the 'height' and 'width' of the external-graphic are interpreted as 'desired height' and 'desired width', which is wrong if neither content-height nor content-width was specified as 'scale-to-fit'. One can define an external-graphic with height=10cm and still have the content take up only 3cm. Roughly, it seems line 952 in the RTFHandler:
    newGraphic.setWidth(eg.getWidth().getValue() / 1000f + "pt");
is too simplistic, and should at least become something like:
    if (eg.getWidth().getEnum() != Constants.EN_AUTO) {
        if (eg.getContentWidth().getEnum() == Constants.EN_SCALE_TO_FIT) {
            newGraphic.setWidth(eg.getWidth().getValue() / 1000f + "pt");
        }
        ...
        ...
    }
So, only if width is not specified as auto *and* content-width is specified as scale-to-fit (or is of length equal to the non-auto width) does the external-graphic's width become the desired width for the image. If, for instance, width=auto *and* content-width=auto, the following could be used (intrinsic width of the image):
    newGraphic.setWidth("100%");
I don't think it's all that difficult to tweak the RTFHandler into handling these properties correctly, but then again, the question can be asked whether it's all worth it. If the RTF renderer is going to switch to the default image lib wrappers anyway, this effort would perhaps be completely in vain. Anyone? Cheers, Andreas
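The decision described in this message can be restated as a small self-contained sketch. It is only an assumption-laden illustration: GraphicWidthSketch, ContentWidth and desiredWidth are hypothetical names rather than RTFHandler or RTF library API, lengths are taken to be in millipoints as the division by 1000f suggests, and two branches go beyond what the message states (an explicit content-width length is used directly, and every remaining case falls back to the intrinsic size).

```java
/** Hypothetical restatement of the width decision; not the RTFHandler code. */
final class GraphicWidthSketch {

    /** Simplified stand-in for the possible content-width values. */
    enum ContentWidth { AUTO, SCALE_TO_FIT, LENGTH }

    /**
     * @param widthMpt        the box width in millipoints, or null for width="auto"
     * @param contentWidth    which kind of content-width was specified
     * @param contentWidthMpt the content-width in millipoints when it is a LENGTH, else null
     * @return the width string to hand to the image, e.g. "72.0pt" or "100%"
     */
    static String desiredWidth(Integer widthMpt, ContentWidth contentWidth, Integer contentWidthMpt) {
        // Assumption beyond the message: an explicit content-width length wins.
        if (contentWidth == ContentWidth.LENGTH && contentWidthMpt != null) {
            return contentWidthMpt / 1000f + "pt";
        }
        // The case named in the message: non-auto width plus scale-to-fit.
        if (widthMpt != null && contentWidth == ContentWidth.SCALE_TO_FIT) {
            return widthMpt / 1000f + "pt";
        }
        // Otherwise (including width=auto with content-width=auto): intrinsic size.
        return "100%";
    }
}
```

The point of the sketch is the separation of concerns: the box width only determines the image size when the content is asked to scale to fit it.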
Re: svn commit: r328381 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop: area/inline/ layoutmgr/inline/ render/ render/pdf/ render/xml/
On 26.10.2005 13:05:26 Manuel Mall wrote: snip/ There is no need to expose creation of the Space/Word areas directly to TextLayoutManager either. TextArea could easily expose an addWord and an addSpace method instead of the monolithic setText. In the end it probably boils down to me arguing that the setText logic currently in TextArea IMO should be in TextLayoutManager (and probably based on its data structures) because it is an operation closely coupled to layout and not to areas.
FWIW, I agree with Manuel that the new logic in TextArea shouldn't be there. The area tree should simply be a data structure, nothing more. Splitting functionality in too many places is dangerous.
BTW, it would also be really nice to have test cases for this new feature even if just expanding existing test cases to test for the new areas created. It would make catching regressions down the track much easier.
+1 to that. The test cases are very important to document what we can do besides checking for regressions. I know this is additional work, especially hard for those who have very little time available to them, but the tests are something that is extremely valuable to improve the quality of our package. Jeremias Maerki
Fixed block-containers. Was [Bug 37236] - Fix gradients and patterns
[Jeremias] ... the transformation list is still necessary to recreate the same state after a break out as needed when painting fixed block-containers. I haven't found a better way to handle this case, yet.
Is there a reason for keeping areas from absolute and fixed block-containers in the flow of normal areas? IIUC absolute and fixed block-containers generate areas with an area-class of xsl-fixed, and it is hinted that such areas are taken out of the flow and placed under the page-area: [7.5.1] Absolutely positioned areas are taken out of the normal flow. ... The area generated is a descendant of the page-area. If I'm right about this, the break-out code can be avoided by placing the absolute and fixed block-containers differently in the area tree. regards, finn
Re: Fixed block-containers. Was [Bug 37236] - Fix gradients and patterns
You're right, but it didn't occur to me at that time. Another thing to look at when we talk about this would be z-index. I assume this would play into the same corner. Well, another thing to keep in mind while we make progress. I'll make a note on the Wiki.
On 26.10.2005 20:26:40 Finn Bock wrote: [Jeremias] ... the transformation list is still necessary to recreate the same state after a break out as needed when painting fixed block-containers. I haven't found a better way to handle this case, yet. Is there a reason for keeping areas from absolute and fixed block-containers in the flow of normal areas? IIUC absolute and fixed block-containers generate areas with an area-class of xsl-fixed, and it is hinted that such areas are taken out of the flow and placed under the page-area: [7.5.1] Absolutely positioned areas are taken out of the normal flow. ... The area generated is a descendant of the page-area. If I'm right about this, the break-out code can be avoided by placing the absolute and fixed block-containers differently in the area tree. regards, finn
Jeremias Maerki
Re: White space handling Wiki page
On Wed, 26 Oct 2005 06:22 am, Andreas L Delmelle wrote: On Oct 25, 2005, at 10:57, Manuel Mall wrote: /snip No, it talks about 'character flow objects', which makes me wonder... Are all characters to be considered 'character flow objects' or only those that were specified using fo:character? Not that it would make a big difference, I think.
See bottom of page 3 (PDF version) and top of page 4 of the spec. There it talks about 'objectifying' the XML elements and attributes, which includes converting characters into character FOs. From then on the spec always means the value of the character property of a fo:character object when talking about characters and their values. So the answer to your above question is: YES, all characters are 'character flow objects'.
Side note: FOP doesn't quite do the same internally, i.e. a character explicitly specified using <fo:character .../> is handled separately from 'plain text'. If someone wrote a style sheet that transformed every character into a <fo:character/> object and fed the output to FOP, the formatting results would be, let's say, VERY DISAPPOINTING. Actually something like:
    <fo:block background-color="yellow">word1<fo:character character="&#10;"/><fo:character character=" "/>word2<fo:character character=" "/>word3<fo:character character="&#10;"/></fo:block>
currently causes an exception!
Cheers, Andreas Cheers Manuel
Re: White space handling Wiki page
On Wed, 26 Oct 2005 06:22 am, Andreas L Delmelle wrote: On Oct 25, 2005, at 10:57, Manuel Mall wrote: snip/ The right order in which the related properties should be dealt with seems to be:
1. white-space-treatment (property refinement)
2. linefeed-treatment (property refinement)
3. white-space-collapse (layout/area tree construction)
4. suppress-at-line-break (layout/area tree construction)
We are very close here in our mutual opinions. If you look at my revised algorithm on the Wiki page, it is nearly the same as your 4 steps above. THAT'S GOOD !!! And what do they say: Great minds think alike :-) Cheers, Andreas Cheers Manuel
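The ordering of the first three steps can be illustrated with a deliberately simplified sketch. It is not the Wiki algorithm and not FOP code: WhiteSpaceSketch and its methods are hypothetical, only one value per property is modelled (ignore-if-surrounding-linefeed, treat-as-space, and collapse=true), only space and tab are treated as white space, the collapse is applied to a plain string rather than during area tree construction, and step 4 (suppress-at-line-break) is left out because it depends on where the line breaks fall.

```java
/** Simplified illustration of the processing order; each step models one property value only. */
final class WhiteSpaceSketch {

    /** Step 1, white-space-treatment="ignore-if-surrounding-linefeed": drop spaces/tabs next to a linefeed. */
    static String treatWhiteSpace(String s) {
        return s.replaceAll("[ \\t]*\\n[ \\t]*", "\n");
    }

    /** Step 2, linefeed-treatment="treat-as-space": each linefeed becomes a space. */
    static String treatLinefeeds(String s) {
        return s.replace('\n', ' ');
    }

    /** Step 3, white-space-collapse="true": runs of two or more spaces become one. */
    static String collapse(String s) {
        return s.replaceAll(" {2,}", " ");
    }

    /** Applies the three steps in the order listed above. */
    static String refine(String s) {
        return collapse(treatLinefeeds(treatWhiteSpace(s)));
    }
}
```

Running step 2 before step 1 would turn the linefeeds into spaces first, leaving step 1 with no linefeeds to anchor on, which is one way to see why the order matters.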