Re: svn commit: r328381 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop: area/inline/ layoutmgr/inline/ render/ render/pdf/ render/xml/

2005-10-26 Thread Luca Furini

Manuel Mall wrote:

> I have a question on this. You break in TextArea the text into words
> based on CharUtilities.isAnySpace. Is this guaranteed to be consistent
> with the breaking and adjustment calculations in TextLayoutManager? I am
> concerned we may be using different rules for word breaking in different
> places.


As far as consistency is concerned, I agree with you: the handling of the 
different kinds of spaces (breaking, non-breaking, fixed width, ...) is 
still quite incomplete and dispersed over different classes. Just to add 
another example, the CharacterLM implicitly expects its character to be 
a non-space character and has its own lines of code concerning the 
creation of the elements, while it could share the methods already called 
by the TextLM.


Having a single, centralized class taking care of the breaking (be it a 
Java utility class or a Fop one) and a single, shared method implementing 
the creation of the elements would surely increase consistency and 
clarity.


> Somehow it doesn't feel right to me that TextLayoutManager does all the
> breaking and calculations and then we give the whole chunk to TextArea
> and it breaks it again using a possibly different algorithm but still
> using the adjustment value calculated by TextLayoutManager.


When I was trying to fix bug 36238 I initially started modifying 
TextLM#createTextArea(), using the AreaInfo objects to create WordAreas 
and SpaceAreas, but I then decided to move the string splitting inside 
TextArea because:


1) if WordAreas and SpaceAreas are not directly created by the LMs, there 
is no need to change a single line of code inside the classes creating 
TextAreas; this is not a real reason supporting the choice, just a 
handy consequence of it;


2) if TextArea still provides a getText() method, the renderers are not 
forced to render the text word by word and space by space if their word 
spacing treatment is not affected by multi-byte characters; but once 
again, this is not a real reason as we could provide this method anyway;


3) although both SpaceArea and WordArea have an offset attribute, it is 
not used ATM, so these areas do not carry any formatting information; 
their only purpose is to highlight spaces, thus allowing some specific 
renderer to handle them correctly regardless of their encoding; in other 
words, we are not losing breaking and calculations, we simply do not need 
them anymore as we already know exactly which text will be placed in each 
line, and how wide it will be once it's correctly adjusted;


4) the text that will be placed in a line cannot be directly taken from 
textArray (in the TextLM), and the string str should be used instead 
anyway, as it may be different from the concatenation of the single pieces 
of text; at the moment the only difference concerns the hyphenation 
character, added at the end of the line, but I suspect that in 
different languages there could be other differences; so, we cannot simply 
create a WordArea for each AreaInfo object.


So, if you find it strange to break the text, put it together and split it 
again, me too! :-) But this initial feeling disappeared when I realized 
that the final splitting does not involve breaking in its proper sense, 
but just classification of characters.
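
To make the "classification, not breaking" point concrete, here is a minimal sketch of such a final split. It is not the code committed in r328381: the class name RunSplitter and the isSpace() test are hypothetical stand-ins (the real check discussed above is CharUtilities.isAnySpace, whose exact character set is not reproduced here). The input is the already-decided text of one line; the method only groups adjacent characters of the same class.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: splits an already laid-out line into word and space runs. */
class RunSplitter {

    /** Hypothetical space test; NOT the real CharUtilities.isAnySpace(). */
    static boolean isSpace(char c) {
        return c == ' ' || c == '\u00A0';
    }

    /** Returns consecutive runs; each run is either all spaces or all non-spaces. */
    static List<String> split(String line) {
        List<String> runs = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= line.length(); i++) {
            // Close the current run at the end of the line or when the character class changes.
            if (i == line.length()
                    || isSpace(line.charAt(i)) != isSpace(line.charAt(start))) {
                runs.add(line.substring(start, i));
                start = i;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // The trailing hyphen stands for a hyphenation character added by layout.
        System.out.println(split("two  words-"));
    }
}
```

No line-breaking decision is taken here; whether the split agrees with layout depends entirely on isSpace() matching the rule the layout manager used.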


This is why I did what I did; if I did not manage to convince you ... you 
can try and convince me! :-)


Regards
Luca




DO NOT REPLY [Bug 36977] - [PATCH]TextLayoutManager CJK line break

2005-10-26 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://issues.apache.org/bugzilla/show_bug.cgi?id=36977.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=36977





--- Additional Comments From [EMAIL PROTECTED]  2005-10-26 12:31 ---
It seems that the new method createElementsForLineBoundary() is called and
appends elements even if there are no CJK characters, and I think this should
not happen.

When I tried applying the patch some days ago, the testcases concerning
hyphenation failed too: the output had both missing and repeated pieces of 
text. 

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


Re: svn commit: r328381 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop: area/inline/ layoutmgr/inline/ render/ render/pdf/ render/xml/

2005-10-26 Thread Manuel Mall
On Wed, 26 Oct 2005 05:15 pm, Luca Furini wrote:
> Manuel Mall wrote:
> > I have a question on this. You break in TextArea the text into
> > words based on CharUtilities.isAnySpace. Is this guaranteed to be
> > consistent with the breaking and adjustment calculations in
> > TextLayoutManager? I am concerned we may be using different rules
> > for word breaking in different places.
>
> As far as consistency is concerned, I agree with you: the handling of
> the different kinds of spaces (breaking, non-breaking, fixed width,
> ...) is still quite incomplete and dispersed over different
> classes. Just to add another example, the CharacterLM implicitly
> expects its character to be a non-space character and has its own
> lines of code concerning the creation of the elements, while it could
> share the methods already called by the TextLM.
>
> Having a single, centralized class taking care of the breaking (be it
> a Java utility class or a Fop one) and a single, shared method
> implementing the creation of the elements would surely increase
> consistency and clarity.
>
> > Somehow it doesn't feel right to me that TextLayoutManager does all
> > the breaking and calculations and then we give the whole chunk to
> > TextArea and it breaks it again using a possibly different
> > algorithm but still using the adjustment value calculated by
> > TextLayoutManager.
>
> When I was trying to fix bug 36238 I initially started modifying
> TextLM#createTextArea(), using the AreaInfo objects to create
> WordAreas and SpaceAreas, but I then decided to move the string
> splitting inside TextArea because:
>
> 1) if WordAreas and SpaceAreas are not directly created by the LMs,
> there is no need to change a single line of code inside the classes
> creating TextAreas; this is not a real reason supporting the
> choice, just a handy consequence of it;
>
> 2) if TextArea still provides a getText() method, the renderers are
> not forced to render the text word by word and space by space if
> their word spacing treatment is not affected by multi-byte
> characters; but once again, this is not a real reason as we could
> provide this method anyway;
>
> 3) although both SpaceArea and WordArea have an offset attribute, it
> is not used ATM, so these areas do not carry any formatting
> information; their only purpose is to highlight spaces, thus
> allowing some specific renderer to handle them correctly regardless
> of their encoding; in other words, we are not losing breaking and
> calculations, we simply do not need them anymore as we already know
> exactly which text will be placed in each line, and how wide it will
> be once it's correctly adjusted;
>
> 4) the text that will be placed in a line cannot be directly taken
> from textArray (in the TextLM), and the string str should be used
> instead anyway, as it may be different from the concatenation of the
> single pieces of text; at the moment the only difference concerns the
> hyphenation character, added at the end of the line, but I suspect
> that in different languages there could be other differences; so, we
> cannot simply create a WordArea for each AreaInfo object.
>
> So, if you find it strange to break the text, put it together and
> split it again, me too! :-) But this initial feeling disappeared when
> I realized that the final splitting does not involve breaking in
> its proper sense, but just classification of characters.
>
> This is why I did what I did; if I did not manage to convince you ...
> you can try and convince me! :-)

I must admit you haven't convinced me. The basic premise is still that 
TextLayoutManager does all the calculations, including determining the 
number of word spaces and the resulting adjustment, which means it must 
know where the word spaces are. Why should TextArea recalculate the 
positions (and wrongly at that, because isAnySpace() tests for 7 different 
Unicode space values, not all of them adjustable spaces, while 
TextLayoutManager uses a much smaller set to calculate the adjustment 
values)?
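
A small sketch of why two different space tests are a problem. Both character sets below are assumptions chosen for illustration; they are not the actual lists used by CharUtilities.isAnySpace() or by TextLayoutManager. The per-space adjustment is computed from the number of adjustable spaces, so applying it to every "any space" character changes the total width of the line.

```java
/** Sketch only: counts spaces in one line under two different rules. */
class SpaceCountMismatch {

    /** Assumed wide test (illustrative set, not FOP's actual list). */
    static boolean isAnySpace(char c) {
        return c == ' ' || c == '\u00A0' || (c >= '\u2000' && c <= '\u200B');
    }

    /** Assumed narrow test: only the plain space is adjustable. */
    static boolean isAdjustableSpace(char c) {
        return c == ' ';
    }

    /** Counts the characters accepted by the wide or the narrow test. */
    static int count(String s, boolean anySpace) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (anySpace ? isAnySpace(c) : isAdjustableSpace(c)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // One plain space, one em space, one no-break space.
        String line = "a b\u2003c\u00A0d";
        int perSpace = 1500; // adjustment per space from layout, in millipoints
        System.out.println("layout distributed: " + count(line, false) * perSpace);
        System.out.println("area would apply:   " + count(line, true) * perSpace);
    }
}
```

With these assumed sets the layout side accounts for one space (1500 mpt) while the area side would stretch three (4500 mpt), so the rendered line no longer has the width layout planned for.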

There is no need to expose creation of the Space/Word areas directly to 
TextLayoutManager either. TextArea could easily expose an addWord and 
an addSpace method instead of the monolithic setText. In the end it 
probably boils down to me arguing that the setText logic currently in 
TextArea IMO should be in TextLayoutManager (and probably based on its 
data structures) because it is an operation closely coupled to layout 
and not to areas.
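
A minimal sketch of that alternative, assuming a TextArea that only stores what it is given. The method names addWord, addSpace and getText come from this thread; the class name TextAreaSketch, the Child type and the millipoint adjust field are hypothetical and do not mirror the real FOP classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: a text area that stores words and spaces handed over by layout. */
class TextAreaSketch {

    /** One child, either a word or a space (illustrative type). */
    static final class Child {
        final String text;
        final boolean space;
        final int adjust; // extra width in millipoints, 0 for words

        Child(String text, boolean space, int adjust) {
            this.text = text;
            this.space = space;
            this.adjust = adjust;
        }
    }

    private final List<Child> children = new ArrayList<>();

    /** Called by the layout manager for each word it has already identified. */
    void addWord(String word) {
        children.add(new Child(word, false, 0));
    }

    /** Called by the layout manager for each space, with the adjustment it computed. */
    void addSpace(char space, int adjust) {
        children.add(new Child(String.valueOf(space), true, adjust));
    }

    /** Renderers that do not need the split can still read the whole string. */
    String getText() {
        StringBuilder sb = new StringBuilder();
        for (Child c : children) {
            sb.append(c.text);
        }
        return sb.toString();
    }

    List<Child> getChildren() {
        return children;
    }

    public static void main(String[] args) {
        TextAreaSketch area = new TextAreaSketch();
        area.addWord("two");
        area.addSpace(' ', 1500);
        area.addWord("words");
        System.out.println(area.getText());
    }
}
```

The area makes no decision of its own here: which characters count as spaces, and how much each one is stretched, is decided once by the caller.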

> Regards
>  Luca

BTW, it would also be really nice to have test cases for this new 
feature even if just expanding existing test cases to test for the new 
areas created. It would make catching regressions down the track much 
easier.

Cheers

Manuel


DO NOT REPLY [Bug 37253] - At present rendering to TXT is unimplemented.

2005-10-26 Thread bugzilla

http://issues.apache.org/bugzilla/show_bug.cgi?id=37253





--- Additional Comments From [EMAIL PROTECTED]  2005-10-26 13:33 ---
Created an attachment (id=16812)
 -- (http://issues.apache.org/bugzilla/attachment.cgi?id=16812&action=view)
[PATCH] TXT rendering is supported




AW: RTF output

2005-10-26 Thread Peter Herweg
The intention is to use the wrapper classes that are already used by the
rest of FOP.
Jeremias made a similar suggestion on 4th Oct.

I will see if I can invest some time in that task this weekend.

Kind regards,
Peter Herweg

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
On behalf of Andreas L Delmelle
Sent: Tuesday, 25 October 2005 21:10
To: fop-users@xmlgraphics.apache.org
Cc: fop-dev@xmlgraphics.apache.org
Subject: Re: RTF output



On Oct 25, 2005, at 00:20, Tony Morris wrote:

> I don't have my test case with me, since I am at work at the moment.
> Otherwise, what I recall is setting the size of an external-graphic to the
> exact number of pixels (I think if I didn't, the RTF renderer wasn't happy),
> the image appeared scaled down, but if I set the image size to say, 10x the
> number of pixels, it would not appear 10x bigger than the scaled down image,
> but about the size I would expect normally. Granted, I was using MS Word
> 2003 for verification, which may well be the culprit.

(cc'ing fop-dev, since the message contains pointers on the causes of
this problem, and may help someone devise a solution for it)

Well, we shouldn't be blaming M$ for everything --however tempting it
may be ;-)
All I can say is that the other renderers all use the same set of
image library wrappers. The RTF renderer currently is the only
exception (support for external-graphics was reintroduced for RTF
about a month ago).
AFAICT, in the long run, the intention is to switch to the same
set of wrappers for the RTF renderer. Doing so could mean that your
problem disappears, I'm not sure. What is more than certain is that
the current code in the RTF lib is not 100% correct, and even seems
to make the same mistake in interpretation of the related properties
(height/width) that FOP 0.20.5 made, namely interpreting the value of
these properties as the dimensions of the image itself instead of
taking them to be the dimensions of the image's surrounding box.
Looking at the related code in the RTF library, it seems the 'height'
and 'width' of the external-graphic are interpreted as 'desired
height' and 'desired width', which is wrong if neither content-height
nor content-width were specified as 'scale-to-fit'. One can define an
external-graphic with height="10cm" and still have the content take
up only 3cm.

Roughly, it seems line 952 in the RTFHandler:

newGraphic.setWidth(eg.getWidth().getValue() / 1000f + "pt");

is too simplistic, and should at least become something like:

if (eg.getWidth().getEnum() != Constants.EN_AUTO) {
    if (eg.getContentWidth().getEnum() == Constants.EN_SCALE_TO_FIT) {
        newGraphic.setWidth(eg.getWidth().getValue() / 1000f + "pt");
    } ...
    ...
}

So, only if width is not specified as auto *and* content-width is
specified as scale-to-fit (or is of length equal to the non-auto
width) does the external-graphic's width become the desired width for
the image.

If, for instance, width="auto" *and* content-width="auto", the
following could be used (intrinsic width of the image):

newGraphic.setWidth("100%");
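
The decision described above can be sketched in isolation. This is a simplified, hypothetical model and not the RTFHandler code: the class GraphicWidthSketch, the ContentWidth enum and resolveImageWidth() are made up for illustration, all lengths are plain millipoint integers, and only the width axis is handled (a real scale-to-fit also has to consider the height and the aspect ratio).

```java
/** Sketch only: picks the width the image content should be drawn with. */
class GraphicWidthSketch {

    /** Hypothetical model of the content-width values discussed above. */
    enum ContentWidth { AUTO, SCALE_TO_FIT, LENGTH }

    /**
     * @param widthMpt        value of 'width' in millipoints, or -1 for "auto"
     * @param contentWidth    kind of 'content-width' value
     * @param contentWidthMpt length of 'content-width' in millipoints (LENGTH only)
     * @param intrinsicMpt    intrinsic width of the image in millipoints
     * @return width of the image content in millipoints
     */
    static int resolveImageWidth(int widthMpt, ContentWidth contentWidth,
                                 int contentWidthMpt, int intrinsicMpt) {
        switch (contentWidth) {
            case LENGTH:
                // An explicit content-width wins over the box width.
                return contentWidthMpt;
            case SCALE_TO_FIT:
                // Only meaningful when the surrounding box has a definite width.
                return widthMpt >= 0 ? widthMpt : intrinsicMpt;
            default:
                // content-width="auto": the box width does not scale the image.
                return intrinsicMpt;
        }
    }

    public static void main(String[] args) {
        int box = 283465;       // roughly 10cm
        int intrinsic = 85039;  // roughly 3cm
        System.out.println(resolveImageWidth(box, ContentWidth.AUTO, 0, intrinsic));
        System.out.println(resolveImageWidth(box, ContentWidth.SCALE_TO_FIT, 0, intrinsic));
    }
}
```

The first call models the box that is 10cm wide while the content still takes up only its intrinsic 3cm; the second models the only case in which the box width becomes the desired image width.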

I don't think it's all that difficult to tweak the RTFHandler into
handling these properties correctly, but then again, the question can
be asked whether it's all worth it. If the RTF renderer is going to
switch to the default image lib wrappers anyway, this effort would
perhaps be completely in vain.

Anyone?

Cheers,

Andreas



Re: svn commit: r328381 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop: area/inline/ layoutmgr/inline/ render/ render/pdf/ render/xml/

2005-10-26 Thread Jeremias Maerki

On 26.10.2005 13:05:26 Manuel Mall wrote:
<snip/>
> There is no need to expose creation of the Space/Word areas directly to
> TextLayoutManager either. TextArea could easily expose an addWord and
> an addSpace method instead of the monolithic setText. In the end it
> probably boils down to me arguing that the setText logic currently in
> TextArea IMO should be in TextLayoutManager (and probably based on its
> data structures) because it is an operation closely coupled to layout
> and not to areas.

FWIW, I agree with Manuel that the new logic in TextArea shouldn't be
there. The area tree should simply be a data structure, nothing more.
Splitting functionality in too many places is dangerous.

> > Regards
> >  Luca
>
> BTW, it would also be really nice to have test cases for this new
> feature even if just expanding existing test cases to test for the new
> areas created. It would make catching regressions down the track much
> easier.

+1 to that. The test cases are very important to document what we can do
besides checking for regressions. I know this is additional work,
especially hard for those who have very little time available to them,
but the tests are something that is extremely valuable to improve the
quality of our package.

Jeremias Maerki



Fixed block-containers. Was [Bug 37236] - Fix gradients and patterns

2005-10-26 Thread Finn Bock

[Jeremias]


... the
transformation list is still necessary to recreate the same state after a break
out as needed when painting fixed block-containers. I haven't found a better
way to handle this case, yet.


Is there a reason for keeping areas from absolute and fixed 
block-containers in the flow of normal areas?


IIUC absolute and fixed block-containers generate areas with an 
area-class of xsl-fixed and it is hinted that such areas are taken out of 
the flow and placed under the page-area:


 [7.5.1]
Absolutely positioned areas are taken out of the normal flow.
...
The area generated is a descendant of the page-area


If I'm right about this, the break-out code can be avoided by placing 
the absolute and fixed block-container differently in the area tree.
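
One way to picture this placement is the sketch below. It is purely illustrative: PageAreaSketch and its nested Area type are hypothetical and are not FOP's area tree classes. The only point is that out-of-flow areas hang directly off the page, so a renderer can start them from the page's coordinate system instead of undoing the transforms of whatever flow area they were declared in.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: a page that keeps positioned areas apart from normal-flow areas. */
class PageAreaSketch {

    /** Minimal stand-in for an area; not FOP's Area class. */
    static final class Area {
        final String name;
        final boolean outOfFlow; // true for absolute/fixed block-containers

        Area(String name, boolean outOfFlow) {
            this.name = name;
            this.outOfFlow = outOfFlow;
        }
    }

    private final List<Area> flowAreas = new ArrayList<>();
    private final List<Area> positionedAreas = new ArrayList<>();

    /** Routes the area to the list matching its class. */
    void add(Area area) {
        if (area.outOfFlow) {
            positionedAreas.add(area); // direct child of the page
        } else {
            flowAreas.add(area);
        }
    }

    List<Area> getFlowAreas() {
        return flowAreas;
    }

    List<Area> getPositionedAreas() {
        return positionedAreas;
    }

    public static void main(String[] args) {
        PageAreaSketch page = new PageAreaSketch();
        page.add(new Area("block", false));
        page.add(new Area("fixed block-container", true));
        System.out.println(page.getFlowAreas().size() + " in flow, "
                + page.getPositionedAreas().size() + " positioned");
    }
}
```

Paint order between the two lists is deliberately left open here.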


regards,
finn


Re: Fixed block-containers. Was [Bug 37236] - Fix gradients and patterns

2005-10-26 Thread Jeremias Maerki
You're right, but it didn't occur to me at that time. Another thing to
look at when we talk about this would be z-index. I assume this would
play into the same corner.

Well, another thing to keep in mind while we make progress. I'll make a
note on the Wiki.

On 26.10.2005 20:26:40 Finn Bock wrote:
> [Jeremias]
>
> > ... the
> > transformation list is still necessary to recreate the same state after a
> > break out as needed when painting fixed block-containers. I haven't found a
> > better way to handle this case, yet.
>
> Is there a reason for keeping areas from absolute and fixed
> block-containers in the flow of normal areas?
>
> IIUC absolute and fixed block-containers generate areas with an
> area-class of xsl-fixed and it is hinted that such areas are taken out of
> the flow and placed under the page-area:
>
>  [7.5.1]
> Absolutely positioned areas are taken out of the normal flow.
> ...
> The area generated is a descendant of the page-area
>
> If I'm right about this, the break-out code can be avoided by placing
> the absolute and fixed block-container differently in the area tree.
>
> regards,
> finn



Jeremias Maerki



Re: White space handling Wiki page

2005-10-26 Thread Manuel Mall
On Wed, 26 Oct 2005 06:22 am, Andreas L Delmelle wrote:
> On Oct 25, 2005, at 10:57, Manuel Mall wrote:
</snip>
> No, it talks about 'character flow objects', which makes me wonder...
> Are all characters to be considered 'character flow objects' or only
> those that were specified using fo:character? Not that it would make
> a big difference, I think.

See bottom of page 3 (PDF version) and top of page 4 of the spec. There 
it talks about 'objectifying' the XML elements and attributes which 
includes converting characters into character FO's. From then on the 
spec always means the value of the character property of a 
fo:character object when talking about characters and their values. 
So the answer to your above question is: YES - all characters are 
'character flow objects'.

Side note: FOP doesn't quite do the same internally, i.e. a character 
explicitly specified using <fo:character .../> is handled separately 
from 'plain text'. If someone wrote a style sheet which transforms 
every character into a <fo:character/> object and fed the output to 
FOP, the formatting results would be, let's say, VERY 
DISAPPOINTING. Actually something like:
<fo:block background-color="yellow">word1<fo:character
character="&#10;"/><fo:character character=" "/>word2<fo:character
character=" "/>word3<fo:character character="&#10;"/></fo:block>
currently causes an exception!

> Cheers,
>
> Andreas

Cheers

Manuel


Re: White space handling Wiki page

2005-10-26 Thread Manuel Mall
On Wed, 26 Oct 2005 06:22 am, Andreas L Delmelle wrote:
> On Oct 25, 2005, at 10:57, Manuel Mall wrote:

<snip/>
> The right order in which the related properties should be dealt with
> seems to be:
> 1. white-space-treatment (property refinement)
> 2. linefeed-treatment (property refinement)
> 3. white-space-collapse (layout/area tree construction)
> 4. suppress-at-line-break (layout/area tree construction)
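
The first three steps of that ordering can be sketched as a small pipeline. This is a simplification under stated assumptions, not the algorithm from the Wiki page: it assumes white-space-treatment="ignore-if-surrounding-linefeed", linefeed-treatment="treat-as-space" and white-space-collapse="true", treats only space, tab and carriage return as white space, and leaves out step 4 because suppress-at-line-break needs the actual line breaks. The class and method names are hypothetical.

```java
/** Sketch only: applies the first three white space steps in the stated order. */
class WhiteSpaceOrderSketch {

    /** Step 1, assuming white-space-treatment="ignore-if-surrounding-linefeed". */
    static String treatWhiteSpace(String s) {
        // Drop spaces, tabs and carriage returns that touch a linefeed.
        return s.replaceAll("[ \\t\\r]*\\n[ \\t\\r]*", "\n");
    }

    /** Step 2, assuming linefeed-treatment="treat-as-space". */
    static String treatLinefeeds(String s) {
        return s.replace('\n', ' ');
    }

    /** Step 3, assuming white-space-collapse="true"; simplified to one space per run. */
    static String collapse(String s) {
        return s.replaceAll("[ \\t\\r]+", " ");
    }

    /** Runs steps 1 to 3 in order. */
    static String normalize(String s) {
        return collapse(treatLinefeeds(treatWhiteSpace(s)));
    }

    public static void main(String[] args) {
        System.out.println("[" + normalize("a  \n  b   c") + "]");
    }
}
```

The order matters: step 1 has to see the linefeeds before step 2 turns them into spaces, otherwise it can no longer tell which white space surrounded a linefeed.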

We are very close here in our mutual opinions - if you look at my 
revised algorithm on the Wiki page it is nearly the same as your 4 
steps above. THAT'S GOOD !!!

And what do they say: Great minds think alike :-)

> Cheers,
>
> Andreas

Cheers

Manuel