Re: Remove fop.bat? [was: Re: JEuclid and FOP]

2010-11-16 Thread Vincent Hennebert
Thanks for your feedback. Hearing no objection, I’ll remove it shortly.

Vincent


On 10/11/10 09:42, Jeremias Maerki wrote:
> I think that's a good idea. I've locally removed the fop.bat and just
> called "fop". Worked fine on XP despite the ".cmd" extension. Less
> redundancy, improved usability. +1
> 
> On 04.11.2010 11:25:10 Vincent Hennebert wrote:
>> Hi,
>>
>> We've been bitten several times in the past already by the fact that
>> fop.bat doesn’t automatically pick jars in the lib directory. Even users
>> stumble upon that kind of issue.
>>
>> My question is: is it safe to remove it? Would one of the other two
>> scripts be a sane default on today's Windows platforms?
>>
>> Thanks,
>> Vincent
>>
>>
>> On 02/11/10 11:12, Simon Pepping wrote:
>>> On Mon, Nov 01, 2010 at 09:31:23PM +0100, J.Pietschmann wrote:
>>>> On 01.11.2010 13:20, Peter Hancock wrote:
>>>>> I am not a windows user and so there may be an environmental reason
>>>>> you are not having success, but I do not think that is likely.
>>>>
>>>> The fop.bat for windows isn't nearly as intelligent as the fop
>>>> shell script used on Linux/Unix. In particular, on windows each
>>>> jar which has to be included into the classpath gets an explicit
>>>> line in the fop.bat, while the shell script automatically includes
>>>> every jar it finds in the lib subdirectory. The fop.cmd command
>>>> file should also automatically include every jar in the lib subdir,
>>>> but usually the fop.bat command takes precedence.
>>>
>>> I created fop.js for the same purpose. It should also include all jar
>>> files in the lib subdirectory automatically. Since I do not use MS
>>> Windows myself, I did not use or test it recently.
>>>
>>> Simon
>>>
>>>> So in order to get FOP with JEuclid working on Windows
>>>> with the fop.bat command, the fop.bat file has to be modified to
>>>> add the JEuclid jars to LOCALCLASSPATH (this shouldn't be too hard).
>>>> Or just call fop.cmd explicitly:
>>>>  fop.cmd mathml.fo mathml.pdf
>>>
>>> -
>>> To unsubscribe, e-mail: fop-users-unsubscr...@xmlgraphics.apache.org
>>> For additional commands, e-mail: fop-users-h...@xmlgraphics.apache.org
>>>
> 
> 
> 
> 
> Jeremias Maerki
> 


Re: TrueType Font Embedding

2010-11-09 Thread Vincent Hennebert
There may be an interest in fully embedding a font for PostScript
output. IIUC there may be a print manager that pre-processes PostScript
files, extracts embedded fonts to store them somewhere and re-use them
whenever needed. It can then strip the font off subsequent files and
substantially lighten them, speeding up the printing process.

What’s the purpose of the ‘encoding’ parameter? It looks to me like
users don’t care about what encoding is used in the PDF or PostScript
file. All they want to have is properly printed documents that use their
own fonts. I think that parameter should be removed in favour of Mehdi’s
proposal, which IMO makes much more sense from a user perspective.

Granted, there would be some redundancy with the referenced-fonts
element. But is the additional flexibility of regexp really useful in
the first place? I’m not too sure. Maybe that could be removed too.

Vincent
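
To make the three-valued "embedding" setting from Mehdi's proposal
(quoted below) concrete, here is a minimal sketch of how it could be
modelled. Everything in it is hypothetical: the enum name, the
fromConfig helper and the choice of "subset" as the fallback when no
value is given are assumptions for illustration, not FOP's API.

```java
import java.util.Locale;

/** Hypothetical model of a per-font "embedding" setting; not FOP's actual API. */
enum EmbeddingModeSketch {
    NONE,    // font is only referenced, no font data in the output
    SUBSET,  // only the glyphs actually used are embedded
    FULL;    // the complete font is embedded

    /**
     * Maps a configuration value ("none", "subset", "full") to a mode.
     * A missing or blank value falls back to SUBSET (an assumed default).
     * Unknown values raise IllegalArgumentException via valueOf().
     */
    static EmbeddingModeSketch fromConfig(String value) {
        if (value == null || value.trim().isEmpty()) {
            return SUBSET;
        }
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    /** True if any font data ends up in the output file. */
    boolean embedsFontData() {
        return this != NONE;
    }
}
```

A single enum like this would keep the user-facing choice in one place,
which is the usability argument made above; whether it can fully replace
referenced-fonts and encoding-mode is exactly what is under discussion.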


On 09/11/10 12:45, Jeremias Maerki wrote:
> Hi Mehdi,
> I'm against that since we already have mechanisms to control some of
> these traits and this would overlap with them. For example, we have the
> referenced-fonts element
> (http://xmlgraphics.apache.org/fop/trunk/fonts.html#embedding)
> which controls whether we embed or not. And we have the encoding-mode
> attribute on the font element to control if single-byte or cid mode
> should be used. Granted, that's not exactly what you're after, but I
> believe this already covers 95% of the use cases if not more.
> 
> The only thing you can't currently do is embed a full font in CID mode
> (or reference it). The problem here is the character map that should be
> used when in CID mode. I think that would require some research first so
> we know how best to handle this. For example, referencing only makes
> sense if a TrueType font can be installed directly on the printer. But
> then, the question is in which mode the characters can be addressed.
> Single-byte (like we currently fall back to) is probably not a problem
> unless you need to print Asian documents. Please note that we also don't
> support full TTF embedding/referencing in CID mode in PDF documents. So
> I'm not sure if we really need that at the moment.
> 
> If we do, I believe it would generally suffice to extend encoding-mode
> from (auto|single-byte|cid) to (auto|single-byte|cid|cid-full). We may
> need a "cmap" parameter then to change the default CMap (currently
> "Identity-H" like in PDF) since our subsetting code uses custom mappings,
> not Unicode or any other encoding scheme (like "90ms-RKSJ-H").
> 
> On 09.11.2010 12:08:36 mehdi houshmand wrote:
>> Hi,
>>
>> I'm working on making TTF subset embedding configurable such that a
>> user can opt for either full font embedding, subset embedding or just
>> referencing, this would be extending the work Jeremias submitted. I
>> was considering adding a parameter to the font configuration file
>> called "embedding" with 3 possible values "none", "subset" and "full".
>> This would allow the user to configure the embedding mode on a font by
>> font basis. What do people think about this proposal?
>>
>> Thanks
>>
>> Mehdi
> 
> 
> 
> 
> Jeremias Maerki
> 


Remove fop.bat? [was: Re: JEuclid and FOP]

2010-11-04 Thread Vincent Hennebert
Hi,

We've been bitten several times in the past already by the fact that
fop.bat doesn’t automatically pick jars in the lib directory. Even users
stumble upon that kind of issue.

My question is: is it safe to remove it? Would one of the other two
scripts be a sane default on today's Windows platforms?

Thanks,
Vincent
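
For reference, the behaviour that fop.bat lacks (picking up every jar in
the lib directory instead of listing each one) can be sketched in a few
lines. This is only an illustration written in Java; the class and
method names are hypothetical and it is not how the fop shell script,
fop.cmd or fop.js are actually implemented.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of FOP: builds a classpath from every jar in a directory. */
final class LibClasspathSketch {

    private LibClasspathSketch() { }

    /** Returns the paths of all *.jar files directly inside libDir, sorted by name. */
    static List<String> jarsIn(File libDir) {
        List<String> jars = new ArrayList<String>();
        File[] entries = libDir.listFiles();
        if (entries == null) {
            return jars; // directory missing or unreadable
        }
        Arrays.sort(entries);
        for (File entry : entries) {
            // Only names ending in ".jar" are picked up, so a renamed
            // file such as "foo.jar.disabled" is skipped.
            if (entry.isFile() && entry.getName().endsWith(".jar")) {
                jars.add(entry.getPath());
            }
        }
        return jars;
    }

    /** Joins the jar paths with the platform path separator (";" on Windows, ":" on Unix). */
    static String classpath(File libDir) {
        StringBuilder sb = new StringBuilder();
        for (String jar : jarsIn(libDir)) {
            if (sb.length() > 0) {
                sb.append(File.pathSeparator);
            }
            sb.append(jar);
        }
        return sb.toString();
    }
}
```

With a scan like this, dropping an extra jar into lib is enough; no
script has to be edited to extend the classpath.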


On 02/11/10 11:12, Simon Pepping wrote:
> On Mon, Nov 01, 2010 at 09:31:23PM +0100, J.Pietschmann wrote:
>> On 01.11.2010 13:20, Peter Hancock wrote:
>>> I am not a windows user and so there may be an environmental reason
>>> you are not having success, but I do not think that is likely.
>>
>> The fop.bat for windows isn't nearly as intelligent as the fop
>> shell script used on Linux/Unix. In particular, on windows each
>> jar which has to be included into the classpath gets an explicit
>> line in the fop.bat, while the shell script automatically includes
>> every jar it finds in the lib subdirectory. The fop.cmd command
>> file should also automatically include every jar in the lib subdir,
>> but usually the fop.bat command takes precedence.
> 
> I created fop.js for the same purpose. It should also include all jar
> files in the lib subdirectory automatically. Since I do not use MS
> Windows myself, I did not use or test it recently.
> 
> Simon
> 
>> So in order to get FOP with JEuclid working on Windows
>> with the fop.bat command, the fop.bat file has to be modified to
>> add the JEuclid jars to LOCALCLASSPATH (this shouldn't be too hard).
>> Or just call fop.cmd explicitly:
>>  fop.cmd mathml.fo mathml.pdf
> 
> 


Re: svn commit: r1003845 - /xmlgraphics/fop/trunk/src/java/org/apache/fop/hyphenation/Hyphenator.java

2010-10-04 Thread Vincent Hennebert
On 02/10/10 19:37, Simon Pepping wrote:
> On Sat, Oct 02, 2010 at 05:48:07PM -, spepp...@apache.org wrote:
>> Author: spepping
>> Date: Sat Oct  2 17:48:07 2010
>> New Revision: 1003845
>>
>> URL: http://svn.apache.org/viewvc?rev=1003845&view=rev
>> Log:
>> Remove unused methods from Hyphenator; this leaves a utility class
>>
>> Modified:
>> xmlgraphics/fop/trunk/src/java/org/apache/fop/hyphenation/Hyphenator.java
>>
> 
> When I wanted to add configurability for hyphenation pattern file
> names, I had to analyse the callers of a lot of Hyphenator methods, to
> see if they needed access to the configuration. It turned out that
> several methods were never called from within FOP. Even the Hyphenator
> constructor is not called from within FOP, so that there never is a
> Hyphenator object in FOP. I removed these methods because it
> facilitates working on FOP's code, and saves a lot of time. Of course,
> there is a remote possibility that these public methods are used by an
> external application, but that is so remote as to be beyond my
> horizon.

... and I couldn’t agree more. Thanks for this clean-up work :-)

Vincent


Re: DO NOT REPLY [Bug 49881] New: [PATCH] add maven build support

2010-09-10 Thread Vincent Hennebert

I’ve been following this discussion with interest. Thanks to Benson,
Craig and Glen for demystifying Maven a bit.

I wanted to share my thoughts about that before going offline for 10
days, but it looks like it’s going to have to wait.


Vincent


On 06/09/2010 15:17, Jeremias Maerki wrote:

Personally, I'd not be happy if we added a parallel build system. The fact
that so much Ant code is necessary to handle some details shows how
inflexible Maven is. I haven't checked how much Ant code is duplicated
between the root-level build.xml and the files in the "maven"
subdirectory. IMO, this would be a maintenance headache since the two
always have to be kept in sync. If build.xml were split into
re-usable sub-files (Ant is quite flexible), some duplication could maybe
be avoided. But that would still impose some level of redundancy. At
any rate, you probably can't count me in to help maintain the Maven side
due to my very bad experiences with it.

Also, I'm not sure if the  task will work as expected on
Windows.

On 04.09.2010 12:41:14 bugzilla wrote:

https://issues.apache.org/bugzilla/show_bug.cgi?id=49881

Summary: [PATCH] add maven build support
Product: Fop
Version: 1.1dev
   Platform: All
 OS/Version: All
 Status: NEW
   Severity: enhancement
   Priority: P2
  Component: general
 AssignedTo: fop-dev@xmlgraphics.apache.org
 ReportedBy: gl...@skynav.com


This patch adds support for building with maven 2.2.X or later. I have tested
it with the current version (2.2.1) on a JDK 1.6 platform.

There are no direct dependencies on JDK 1.4 or 1.5 features, but I have not
verified this yet.

The patch creates a new top-level directory 'maven' in the FOP trunk directory.
See the file README-MAVEN.txt there for configuration and usage information.

Once downloaded to your home directory, this patch may be applied as follows:

cd ${FOP}/trunk
gzcat ~/patch-maven-build.diff.gz | patch -p0
svn add maven

Note that only the core fop.jar artifact is built at this time. In particular,
the fop-transcoder and fop-sandbox jar artifacts are not yet built.

This patch has been verified against repository version 992575.

Regards,
Glenn

--
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.





Jeremias Maerki





Re: DO NOT REPLY [Bug 49881] New: [PATCH] add maven build support

2010-09-10 Thread Vincent Hennebert
Hi,

On 06/09/10 23:07, Benson Margulies wrote:
> Glenn,
> 
> FBOFW, it's clear that a number of core contributors (including the PMC
> chair!) in fop-land are exceedingly Maven-averse. It's not that rare of a
> viewpoint in the FOSS community.
> 
> All that dependency stuff can be done by borrowing maven dependencies in
> ant, either via the maven ant tools or via ivy. For CI, in my opinion you're
> completely correct, but you and I are completely outnumbered.

Just curious: What does Maven offer in terms of continuous integration
that is not available with other tools?



Thanks,
Vincent


Re: DO NOT REPLY [Bug 49881] New: [PATCH] add maven build support

2010-09-09 Thread Vincent Hennebert
On 07/09/10 10:10, Craig Ringer wrote:
> On 7/09/2010 4:40 PM, Jeremias Maerki wrote:

>> Anyway, I won't stand in the way
>> if something is added to FOP that can help some users. [snip] just
>> because Maven
>> can't include a simple JAR that is not in a repository.
> 
> Not strictly true. One option is to use <scope>system</scope> with an
> explicit path to the jar.
> 
> Maven doesn't have a wild-card "include everything under lib/" though,
> and using system scope to fudge in local dependencies is a bit of a hack.
> 
> http://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#System_Dependencies
> 
> Usually what you'd do if you have a jar you want to use - but no repo or
> pom for it - is drop the jar you want to use into your local ~/.m2/ (or
> wherever you keep your local repository, i.e. download cache) then declare
> a dependency on it in your pom.xml. This is within "a repository" but
> it's only your local repo, it doesn't involve any network access or
> anything except putting a file in a particular place. Maven will look
> for the dependency in a location defined by the repo layout. So if I
> declared
> 
> <dependency>
>   <groupId>local</groupId>
>   <artifactId>somejar</artifactId>
>   <version>2.2</version>
> </dependency>
> 
> ... then it'd look for local/somejar-2.2.jar within my local repository.
> If I put the jar where it should be found, no problem.

Is that also how one would handle optional dependencies? For example,
JEuclid is an optional plug-in; what would I do if I wanted to
periodically enable/disable it?

At the moment I have a jeuclid.jar in my lib/ directory that I can just
rename to jeuclid.jar.disabled if I want to exclude it.



Thanks,
Vincent


Re: DO NOT REPLY [Bug 49881] [PATCH] add maven build support

2010-09-09 Thread Vincent Hennebert
On 09/09/10 04:39, Glenn Adams wrote:
> Ah, ok. Off hand, I see three ways to handle this, one of which you mention:
> 
> (1) deploy xmlgraphics-commons-1.5-SNAPSHOT.jar to a public maven repo and
> update the maven/pom.xml to refer to this version;
> (2) install xmlgraphics-commons-1.5-SNAPSHOT.jar in your local maven repo
> and update the maven/pom.xml to refer to this version;
> (3) modify maven/pom.xml to exclude the dependency from the class path, but
> then add a reference to the local XGC jar to the classpath (for compiles and
> tests);
> 
> I would probably choose option (2), since that puts the onus on the user of
> the maven build config rather than on the updater of XGC (who may not be
> familiar with deploying a snapshot). The change to maven/pom.xml to use the
> snapshot version could be committed, and not just a local copy; instructions
> to set up the local repo copy of the snapshot would then be added to the
> maven readme file I created.

Can one have several local Maven repositories? What if I’m working on
several branches of FOP that all require different, snapshot versions of
XML Graphics Commons?

With option (2), how can we make sure that all the developers work with
the same snapshot jar of XGC?


Thanks,
Vincent


> G.
> 
> On Thu, Sep 9, 2010 at 1:30 AM, Simon Pepping wrote:
> 
>> I think I mean something different. When XGC adds something new and
>> FOP uses that, a new XGC jar file must be used by builds. We do that
>> by having a new jar file in /lib, typically called
>> xmlgraphics-commons-1.5-svn.jar (which may be updated a few times
>> during development of the next release). How would that be handled by
>> the maven build? Would it require the deployment of a snapshot to
>> Maven central? And would the version number in the pom file be
>> updated?
>>
>> Simon
>>
>> On Wed, Sep 08, 2010 at 05:13:07PM +0800, Glenn Adams wrote:
>>> If I understand you correctly, the answer is no. The file maven/pom.xml in
>>> the patch explicitly references revision 1.7 of the batik artifacts. So any
>>> use of upward revisions of those artifacts would require updating the
>>> pom.xml file to reflect use of a newer revision.
>>>
>>> At present, I worked around the headless problem (testWMF) by specifying
>>> java.awt.headless as false in the pom.xml configuration for those test
>>> suites that invoke testWMF. Of course, that also means that this patch
>>> will fail those tests when invoked on a truly headless platform.
>>>
>>> Does that answer your query? Or are you asking if I can adjust the
>>> configuration to make automatic use of snapshot updates?
>>>
>>> Regards,
>>> Glenn
>>>
>>> On Wed, Sep 8, 2010 at 3:47 PM, Simon Pepping wrote:
>>>
>>>> Does this build system require us to deploy snapshots of
>>>> xmlgraphics-commons and batik to the maven repository, whenever we use
>>>> snapshot versions in our lib directory? We do routinely for xgc, and
>>>> we may need to do so for batik if the headless problem is fixed (see
>>>> https://issues.apache.org/bugzilla/show_bug.cgi?id=42408#c13).
>>
>> --
>> Simon Pepping
>> home page: http://www.leverkruid.eu
>>
> 


Re: base 14 font kerning

2010-09-08 Thread Vincent Hennebert
Hi,

I just remembered this bug, which may affect you:
https://issues.apache.org/bugzilla/show_bug.cgi?id=48766

Vincent


On 06/09/10 06:58, Glenn Adams wrote:
> Is there a reason that kerning of the base 14 fonts is disabled by default?
> 
> Furthermore, there does not seem to be a way to enable it except by
> programmatic means, using FontManager.setBase14KerningEnabled() or the
> deprecated method FopFactory.setBase14KerningEnabled(). This technique is
> used to enable it during testing in one test case:
> layoutengine/standard-testcases/kerning_1_on.xml, by means of special code
> in org.apache.fop.layoutengine.TestEnvironment.
> 
> However, there appears to be no way for a user to enable it via non-programmatic
> means. To support this (which I need in testing the new generalized position
> adjustments for text drawing), I'm adding a base14-kerning element to be
> placed in the top-level fop element in the FOP configuration file, e.g.,
> 
> <fop>
>   ...
>   <base14-kerning>true</base14-kerning>
>   ...
> </fop>
> 
> The rationale for making this a child of the top-level fop element is that
> the enable/disable state is presently maintained in the singleton
> FontManager instance, which is configured (in FontManagerConfigurator) from
> other top-level children of the fop element.
> 
> For consistency, it may be better to enable base14 kerning by default, then
> allow a user to disable it using the above mechanism. However, I have not
> made this latter change (yet).
> 
> Comments?
> 
> G.
> 


Re: TODO tag [was: Re: svn commit: r990148 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop: area/ fo/ fo/flow/ fo/flow/table/ fo/pagination/ fo/properties/ hyphenation/ layoutmgr/ layoutmgr/inline

2010-09-08 Thread Vincent Hennebert
Ok, let me summarise this:

• a @[asf.]todo tag marginally improves the formatting of a javadoc
  comment
• nobody really likes the idea of using a namespaced version of todo
  (@asf.todo)
• it is possible to tweak Checkstyle and the javadoc command to enable
  the use of @todo

That said:
• todo statements generally have little to do (sic) in a javadoc comment
  anyway
• TODO keywords are easily indexable by modern IDEs

Jeremias recommends the Felix way: using //TODO comments below the
javadoc. I’m also strongly in favour of this convention. OTOH, if I’m
correct nobody strongly feels that @todo tags are necessary.

So I think we have a consensus:
• from now on we stop using @todo in favour of the Felix convention;
• we will progressively remove TODO statements from javadoc comments and
  move them below in their own Java // comments
• I remove the definition of the custom tag from build.xml

Let me know if I missed anything.

Thanks,
Vincent
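
As an illustration of the convention summarised above, a method would
carry its TODO in a plain // comment placed below the javadoc rather
than in a @todo tag inside it. The class and method here are made up
for the example and do not exist in FOP.

```java
/** Hypothetical class used only to illustrate the comment convention. */
final class TodoConventionSketch {

    private TodoConventionSketch() { }

    /**
     * Returns the given width clamped to a non-negative value.
     *
     * @param width the requested width
     * @return the width, or 0 if it was negative
     */
    //TODO decide whether negative widths should be rejected instead of clamped
    static int clampWidth(int width) {
        return Math.max(0, width);
    }
}
```

The javadoc comment then only describes the API, while the TODO remains
a plain source comment that an IDE can still index by its keyword.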


On 31/08/10 12:33, Vincent Hennebert wrote:
> Hi,
> 
> I just thought I would homogenize our usage of todo tags and match what
> seems to be the de facto standard (“TODO”) among current committers.
> Most @todo indeed come from very old commits. I didn’t realise that
> javadoc could do something with them, which is why that looked to me
> like a minor change that didn’t need prior discussion. Sorry about
> that.
> 
> Ok, so there is something that can be done with @todo tags in javadoc
> comments. Now, having to use our own namespaced version is unfortunate
> and looks like overkill to me. Just to have a slightly better formatted
> javadoc? Are such comments of any use to users of the API anyway? Most
> of them rather look like pure internal development issues and should
> probably not even appear in the javadoc.
> 
> Also, while @todo tags can be indexed, modern IDEs can index plain TODO
> tokens as well, so that reduces the advantage of @asf.todo IMO.
> 
> If there are strong feelings against the removal of @asf.todo, I’ll
> revert the change. Otherwise, I’ll actually complete it by removing the
> definition of the custom tag in build.xml, which I hadn’t spotted.
> 
> Vincent
> 
> 
> Simon Pepping wrote:
>> It would indeed have been better to first have a discussion and then
>> make the change. @asf.todo is specific enough that we could have
>> changed it at any time. That said, Glenn's change was also made
>> without a discussion. My javadoc does not complain about the @todo
>> tag, and I had not understood that this was a motivation.
>>
>> The javadoc documentation (of my sun-java6-jdk) is not clear about
>> this topic, and uses @todo liberally in its section about the -tag
>> option. Its most informative paragraph is this:
>>
>> "Avoiding Conflicts - If you want to slice out your own namespace, you
>> can use a dot-separated naming convention similar to that used for
>> packages: com.mycompany.todo. Sun will continue to create standard
>> tags whose names do not contain dots. Any tag you create will override
>> the behavior of a tag by the same name defined by Sun. In other words,
>> if you create a tag or taglet @todo, it will always have the same
>> behavior you define, even if Sun later creates a standard tag of the
>> same name."
>>
>> which does not even go so far as to discourage the @todo tag. It is
>> also not clear how a todo tag would be a specific asf tag, different
>> from the todo tag of any other organization. Everybody uses todo and
>> means the same by it.
>>
>> Using the widely recognized TODO keyword circumvents the tag question
>> altogether, but is outdated since the advent of tags.
>>
>> Let us discuss this and not waste effort on undoing each other's
>> expression of their point of view. Let us also not forget that working
>> in a team requires compromises; the code will never match your own
>> conventions and preferences as precisely as code in your very own
>> project. This is more so in an open project with a long history and a
>> large set of authors.
>>
>> Simon
>>
>> On Sat, Aug 28, 2010 at 09:28:06AM +0800, Glenn Adams wrote:
>>> Vincent,
>>>
>>> Could you explain your rationale for this change? Originally, these were all
>>> marked with a non-standard '@todo' javadoc tag, which javadoc complained
>>> about, indicating that for "non-standard" tags, there should be at least one
>>> '.' present in the tag name. I had fixed this by adding the "asf." prefix,
>>> which still allowed tracking these in javadoc more easily. However, your
>>> change now removes the utility of the tag.
>>>

Re: base 14 font kerning

2010-09-08 Thread Vincent Hennebert
Hi,

On 07/09/10 10:00, Chris Bowditch wrote:
> Glenn Adams wrote:
> 
> Hi Glenn/Jeremias,
> 
>> I've already implemented this in my complex scripts work, so it will make
>> it into trunk in due time. However, I think I'll leave the default
>> setting as it is for the time being. Users can explicitly enable it
>> via their config. We can take up the issue of whether to change the
>> default at a future time.
> 
> I do not like the idea of changing the default value of Kerning from off
> to on. The reason being that users who decide to upgrade their FOP
> version will suddenly find the appearance of their documents changing.
> Better to let users who are unsatisfied with the default inter-character
> spacing go and enable kerning than to force users to regression test
> every document to make sure the changes to appearance are acceptable.

I disagree. New users don’t care whether kerning was enabled in previous
versions or not. They just want their documents to look good and don’t
want to be told about some obscure configuration option. In fact, they
may not even know what kerning is and they don’t want to be bothered
with that.

Kerning is something that font designers spend time to define in order
for their fonts to look good, and not handling kerning is a bug. Kerning
should be enabled by default.

For users who are upgrading their version of FOP, a warning in the
release notes should be enough. They /will/ have to regression test
their documents anyway.


Vincent
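
The disagreement above is really about what happens when the
configuration says nothing. A small sketch makes that explicit: a reader
for a base14-kerning child of the top-level fop element, where the
caller supplies the default. This is an assumption-laden illustration
using the JDK's DOM parser; the class and method names are hypothetical
and this is not the code in FontManagerConfigurator.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

/** Hypothetical reader for a base14-kerning setting; not FOP's configurator. */
final class Base14KerningConfigSketch {

    private Base14KerningConfigSketch() { }

    /**
     * Looks for a base14-kerning element directly under the root element.
     * If it is absent, defaultValue decides, which is the point under debate.
     */
    static boolean isBase14KerningEnabled(String configXml, boolean defaultValue)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(configXml.getBytes(StandardCharsets.UTF_8)));
        Node child = doc.getDocumentElement().getFirstChild();
        for (; child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && "base14-kerning".equals(child.getNodeName())) {
                return Boolean.parseBoolean(child.getTextContent().trim());
            }
        }
        return defaultValue;
    }
}
```

Switching the default then amounts to changing one argument, and users
who want the old rendering can pin it by adding the element with the
value false.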


> Thanks,
> 
> Chris
> 
>>
>> G.
>>
>> On Mon, Sep 6, 2010 at 10:07 PM, Jeremias Maerki
>> <d...@jeremias-maerki.ch> wrote:
>>
>> I think that is for historical reasons. When this was implemented (I
>> think it was me) I guess we didn't want to change the layout behaviour
>> for existing users. For a long time, kerning for base 14 fonts was not
>> supported.
>>
>> http://svn.apache.org/viewvc?view=revision&revision=389086
>> 
>>
>> You're right: this setting doesn't seem to be tied into the
>> FontManagerConfigurator. It would be great if you added that.
>>
>> That said, I'm not sure if enabling that would be so bad. I guess I'm
>> not opposed to it.
>>
>> On 06.09.2010 07:58:41 Glenn Adams wrote:
>>  > Is there a reason that kerning of the base 14 fonts is disabled by default?
>>  >
>>  > Furthermore, there does not seem to be a way to enable it except by
>>  > programmatic means, using FontManager.setBase14KerningEnabled() or the
>>  > deprecated method FopFactory.setBase14KerningEnabled(). This technique is
>>  > used to enable it during testing in one test case:
>>  > layoutengine/standard-testcases/kerning_1_on.xml, by means of special code
>>  > in org.apache.fop.layoutengine.TestEnvironment.
>>  >
>>  > However, there appears to be no way for a user to enable it via non-programmatic
>>  > means. To support this (which I need in testing the new generalized position
>>  > adjustments for text drawing), I'm adding a base14-kerning element to be
>>  > placed in the top-level fop element in the FOP configuration file, e.g.,
>>  >
>>  > <fop>
>>  >   ...
>>  >   <base14-kerning>true</base14-kerning>
>>  >   ...
>>  > </fop>
>>  >
>>  > The rationale for making this a child of the top-level fop element is that
>>  > the enable/disable state is presently maintained in the singleton
>>  > FontManager instance, which is configured (in FontManagerConfigurator) from
>>  > other top-level children of the fop element.
>>  >
>>  > For consistency, it may be better to enable base14 kerning by default, then
>>  > allow a user to disable it using the above mechanism. However, I have not
>>  > made this latter change (yet).
>>  >
>>  > Comments?
>>  >
>>  > G.
>>
>>
>>
>>
>> Jeremias Maerki
>>
>>
> 


Re: upcoming change to IFPainter.drawText()

2010-08-31 Thread Vincent Hennebert
Hi Glenn,

A dedicated class with meaningful fields (e.g., xPlacement, xAdvance)
would probably be preferable to an array of 4 int. This would be safer
and easier to understand and use.

For the rest, that sounds good.

Vincent
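
To show what the dedicated class could look like, here is a minimal
sketch. The class name and the glyphOrigins helper are hypothetical; the
loop follows the algorithm in Glenn's message quoted below, simplified
by assuming that the default y advance is always 0 and that the default
x advances are passed in as an array instead of being read from a font.

```java
/** Hypothetical value class; field names follow the suggestion above and are not FOP API. */
final class GlyphAdjustmentSketch {
    final int xPlacement; // added to the current point to get the glyph origin (x)
    final int yPlacement; // same, for y
    final int xAdvance;   // added to the current point after the default advance (x)
    final int yAdvance;   // same, for y

    GlyphAdjustmentSketch(int xPlacement, int yPlacement, int xAdvance, int yAdvance) {
        this.xPlacement = xPlacement;
        this.yPlacement = yPlacement;
        this.xAdvance = xAdvance;
        this.yAdvance = yAdvance;
    }

    /**
     * Returns the {x, y} origin of each glyph. The adjustments array may be
     * null, and so may any of its entries, meaning "no adjustment".
     */
    static int[][] glyphOrigins(int x, int y, int[] defaultAdvancesX,
                                GlyphAdjustmentSketch[] adjustments) {
        int[][] origins = new int[defaultAdvancesX.length][2];
        int curX = x;
        int curY = y;
        for (int i = 0; i < defaultAdvancesX.length; i++) {
            GlyphAdjustmentSketch a = (adjustments != null) ? adjustments[i] : null;
            origins[i][0] = curX + (a != null ? a.xPlacement : 0);
            origins[i][1] = curY + (a != null ? a.yPlacement : 0);
            curX += defaultAdvancesX[i]; // default advance (y assumed to be 0)
            if (a != null) {
                curX += a.xAdvance;
                curY += a.yAdvance;
            }
        }
        return origins;
    }
}
```

Compared with int[4] rows, named fields rule out index mix-ups such as
reading a[2] where a[1] was meant, at the cost of one small object per
adjusted glyph.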


Glenn Adams wrote:
> Folks,
> 
> I'd like to mention a change I will implement on IFPainter#drawText method
> in order to accommodate complex scripts (as well as non-complex script usage
> in a variety of written languages). I'm bringing this up now so there can be
> discussion ahead of time if needed.
> 
> Basically, the change is to generalize the int[] dx parameter to be a two
> dimensional array of glyph placement/advancement adjustments.
> 
> The current interface is:
> 
> void drawText(int x, int y, int letterSpacing, int wordSpacing, int[] dx,
> String text) throws IFException;
> 
> The modified method interface would read as follows:
> 
> void drawText(int x, int y, int letterSpacing, int wordSpacing, int[][]
> adjustments, String text) throws IFException;
> 
> The adjustments array is optional (in which case it is null). If non-null,
> it is effectively typed as int[][4], i.e., an array of int[4] arrays, where
> the four elements of each row are:
> 
> a[0] = x placement adjustment
> a[1] = y placement adjustment
> a[2] = x advance adjustment
> a[3] = y advance adjustment
> 
> The [x,y] placement adjustments are added to the current point to determine
> the effective glyph (char) origin, and the [x,y] advance adjustments are
> applied to the current point after rendering the glyph (char) and performing
> the default (implicit) advance.
> 
> To be more explicit, the algorithm using these adjustments is effectively as
> follows (ignoring word and letter spacing for the moment):
> 
> int curPointX = x;
> int curPointY = y;
> for ( int i = 0, n = glyphs.length; i < n; i++ ) {
>   int g = glyphs [ i ];
>   int gx = curPointX;
>   int gy = curPointY;
>   int[] a = ( adjustments != null ) ? adjustments[i] : null;
>   if ( a != null ) {
>     gx += a[0];
>     gy += a[1];
>   }
>   drawGlyph ( g, gx, gy );
>   curPointX += font.getGlyphAdvanceX ( g );
>   curPointY += font.getGlyphAdvanceY ( g );
>   if ( a != null ) {
>     curPointX += a[2];
>     curPointY += a[3];
>   }
> }
> 
> It is mandatory to provide this generality in order to support not only
> complex scripts, but also non-complex scripts (e.g., Latin, Greek, Cyrillic,
> CJK, etc) when used with non-spacing marks (in many written languages) and
> also for other advanced typographic effects.
> 
> Attached is a simple example of the use of this feature in order to adjust
> the placement (and advance) of U+064E ARABIC FATHA and U+0650 ARABIC KASRA,
> respectively, the upper and lower non-spacing marks shown in this example.
> 
> Regards,
> Glenn
> 


Re: TODO tag [was: Re: svn commit: r990148 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop: area/ fo/ fo/flow/ fo/flow/table/ fo/pagination/ fo/properties/ hyphenation/ layoutmgr/ layoutmgr/inline

2010-08-31 Thread Vincent Hennebert
Hi,

I just thought I would homogenize our usage of todo tags and match what
seems to be the de facto standard (“TODO”) among current committers.
Most @todo indeed come from very old commits. I didn’t realise that
javadoc could do something with them, which is why that looked to me
like a minor change that didn’t need prior discussion. Sorry about
that.

Ok, so there is something that can be done with @todo tags in javadoc
comments. Now, having to use our own namespaced version is unfortunate
and looks like overkill to me. Just to have a slightly better formatted
javadoc? Are such comments of any use to users of the API anyway? Most
of them rather look like pure internal development issues and should
probably not even appear in the javadoc.

Also, while @todo tags can be indexed, modern IDEs can index plain TODO
tokens as well, so that reduces the advantage of @asf.todo IMO.

If there are strong feelings against the removal of @asf.todo, I’ll
revert the change. Otherwise, I’ll actually complete it by removing the
definition of the custom tag in build.xml, which I hadn’t spotted.

Vincent


Simon Pepping wrote:
> It would indeed have been better to first have a discussion and then
> make the change. @asf.todo is specific enough that we could have
> changed it at any time. That said, Glenn's change was also made
> without a discussion. My javadoc does not complain about the @todo
> tag, and I had not understood that this was a motivation.
> 
> The javadoc documentation (of my sun-java6-jdk) is not clear about
> this topic, and uses @todo liberally in its section about the -tag
> option. Its most informative paragraph is this:
> 
> "Avoiding Conflicts - If you want to slice out your own namespace, you
> can use a dot-separated naming convention similar to that used for
> packages: com.mycompany.todo. Sun will continue to create standard
> tags whose names do not contain dots. Any tag you create will override
> the behavior of a tag by the same name defined by Sun. In other words,
> if you create a tag or taglet @todo, it will always have the same
> behavior you define, even if Sun later creates a standard tag of the
> same name."
> 
> which does not even go so far as to discourage the @todo tag. It is
> also not clear how a todo tag would be a specific asf tag, different
> from the todo tag of any other organization. Everybody uses todo and
> means the same thing by it.
> 
> Using the widely recognized TODO keyword circumvents the tag question
> altogether, but is outdated since the advent of tags.
> 
> Let us discuss this and not waste effort on undoing each other's
> expression of their point of view. Let us also not forget that working
> in a team requires compromises; the code will never match your own
> conventions and preferences as precisely as code in your very own
> project. This is more so in an open project with a long history and a
> large set of authors.
> 
> Simon
> 
> On Sat, Aug 28, 2010 at 09:28:06AM +0800, Glenn Adams wrote:
>> Vincent,
>>
>> Could you explain your rationale for this change? Originally, these were all
>> marked with a non-standard '@todo' javadoc tag, which javadoc complained
>> about, indicating that for "non-standard" tags, there should be at least one
>> '.' present in the tag name. I had fixed this by adding the "asf." prefix,
>> which still allowed tracking these in javadoc more easily. However, your
>> change now removes the utility of the tag.
>>
>> On a more general point, wouldn't it be more useful to have a discussion
>> about stylistic changes prior to implementing them? Just so we can get on
>> the same page?
>>
>> Regards,
>> Glenn
>>
>> On Fri, Aug 27, 2010 at 9:31 PM,  wrote:
>>
>>> Author: vhennebert
>>> Date: Fri Aug 27 13:31:41 2010
>>> New Revision: 990148
>>>
>>> URL: http://svn.apache.org/viewvc?rev=990148&view=rev
>>> Log:
>>> Replaced @asf.todo with normal TODO comment
>>>
>>>
> 


Re: svn commit: r990144 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop: layoutmgr/ layoutmgr/inline/ layoutmgr/table/ pdf/ render/ render/afp/ render/intermediate/ render/java2d/ render/pcl/ rend

2010-08-31 Thread Vincent Hennebert
Simon Pepping wrote:
> On Fri, Aug 27, 2010 at 01:23:12PM -, vhenneb...@apache.org wrote:
>> Author: vhennebert
>> Date: Fri Aug 27 13:23:11 2010
>> New Revision: 990144
>>
>> URL: http://svn.apache.org/viewvc?rev=990144&view=rev
>> Log:
>> Fixed indentation
> In the output or in the source files?

In the source files. I’m basically going through commit #985537 and
fixing style issues I can spot.

Vincent


Re: DO NOT REPLY [Bug 49379] [PATCH] Enhancement to the include page segment functionality for AFP rendering

2010-08-25 Thread Vincent Hennebert
If a MO:DCA parser is added, then that should definitely be in
a separate sub-project of XML Graphics.

For the rest, knowing next to nothing about AFP, I have no opinion.

Vincent


Peter Hancock wrote:
> Hi Jeremias,
> 
> I totally agree with you here.  Time constraints did not allow me to
> create a proper parser/object model for the AFP resource but it is the
> only sensible way to read them safely - as your error reinforces.
> It would be great to use your MO:DCA parser to improve this feature,
> when you are ready to integrate it.
> 
> Thanks for your comments
> 
> Peter
> 
> On Fri, Aug 20, 2010 at 8:47 AM,   wrote:
>> https://issues.apache.org/bugzilla/show_bug.cgi?id=49379
>>
>> --- Comment #2 from Jeremias Maerki  2010-08-20 03:46:59 EDT ---
>> Peter, I've taken a look at your patch. I found that I get an IOException
>> when referencing the page segment "s1islogo.psg" that comes with IBM AFP
>> Workbench:
>>
>> java.io.IOException: Malformed AFP resource with name 's1islogo':No Begin structured field
>> at org.apache.fop.afp.util.AFPResourceUtil.copyNamedResource(AFPResourceUtil.java:123)
>>
>> I have the impression that the method AFPResourceUtil.findStart() may not be
>> ideal to parse an MO:DCA file. I haven't investigated more closely why the
>> above file fails, but stepping through findStart() feels a bit weird in terms
>> of how that method looks for the requested resource. Some time ago I started
>> a rudimentary AFP parser I used to dump the Type 1 data from an outline font,
>> or to simply dump the basic structure of an AFP file. I could include that in
>> FOP and we build from there. It can return an object for each structured
>> field encountered. A generic MO:DCA parser would also allow future
>> functionality that involves parsing an AFP file. WDYT?
>>
>> --
>> Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
>> --- You are receiving this mail because: ---
>> You are the assignee for the bug.
>>


Re: findbugs results

2010-08-25 Thread Vincent Hennebert
Simon Pepping wrote:
> On Mon, Aug 16, 2010 at 05:29:10PM -0400, Benson Margulies wrote:
>> The people who make Sonar host Apache projects for free. Many Apache
>> projects have Sonar set up there, and can get findbugs and all sorts of
>> other useful data without individual contributors running these tools.
> 
> Quite a beast, that measures everything and some. It would be useful
> to have our code X-rayed, measured, inspected etc. at least once. But
> I do not think that we will rush in to fix the reported problems.
> 
> Is this analysis run often? I did not see how one can apply for
> inclusion of a project.
> 
> FOP committers, interested in having our code analysed there? See
> http://nemo.sonarsource.org/ for the reports about the included open
> source projects, among which quite a few ASF projects. See
> http://www.sonarsource.com/ for the analysis tool itself.

Just had a look. This looks interesting, although the results are
unlikely to be encouraging. But if that can be done, I’m all for it.

Vincent


Re: Dropping Support for Old Renderers

2010-08-25 Thread Vincent Hennebert
Thanks for that, that helped.

This is now done.

Thanks,
Vincent


Jeremias Maerki wrote:
> Just to have an overview over all output formats:
> 
> AFP: Painter available
> AWT: Renderer only at the moment (depends on Java2D)
> PNG/TIFF: Painter available (depends on Java2D)
> Java2D: Painter available
> PCL: Painter available
> PDF: Painter available
> Print: Renderer only at the moment (depends on Java2D)
> PS: Painter available
> RTF: n/a (Does not use layout engine)
> TXT: Renderer only
> 
> AT XML: Renderer only but that has to be like that
> IF XML: Renderer to Painter adapter present (other painter impls rely on
> that)
> 
> Essentially, this means that the renderers for AFP, PCL, PDF and PS can
> be removed because they are fully replaced by their painter counterparts.
> All the other renderers have to stay in place: Java2D/AWT/Print, TXT,
> AT XML, IF XML. AWT, Print and TXT are candidates to be ported to the
> painter infrastructure.
> 
> On 17.08.2010 19:57:08 Vincent Hennebert wrote:
>> Ok, no objection to this so I'll proceed with the removal.
>>
>> Thanks,
>> Vincent
>>
>>
>> Vincent Hennebert wrote:
>>> Hi,
>>>
>>> The new rendering architecture based on a streamlined intermediate
>>> format and painters has been in place for more than a year now and
>>> hasn’t caused any big issue.
>>>
>>> New features are being added to the painters
>>> (PDFDocumentHandler/PDFPainter, PSDocumentHandler/PSPainter, etc.) and
>>> usually not backported to the renderers (PDFRenderer, PSRenderer, etc.).
>>> Also, the old renderers get in the way when changes and refactorings
>>> must be made to the libraries.
>>>
>>> Therefore, I propose to drop support for the old renderers and remove
>>> them from version control.
>>>
>>> Does anyone have any objection to that?
>>>
>>> Thanks,
>>> Vincent
> 
> 
> 
> 
> Jeremias Maerki
> 


Trunk Broken

2010-08-18 Thread Vincent Hennebert
Am I the only one to have a failing test suite?

junit-area-tree-xml-format:
 [echo] Running area tree XML format tests...
[junit] Testsuite: org.apache.fop.intermediate.AreaTreeXMLFormatTestSuite
[junit] Tests run: 501, Failures: 0, Errors: 1, Time elapsed: 32.263 sec
[junit]
[junit] - Standard Error -
[junit] 18-Aug-2010 11:57:06 org.apache.fop.intermediate.AreaTreeXMLFormatTestSuite$1 runTest
[junit] SEVERE: Error on fox_destination_1.xml
[junit] 18-Aug-2010 11:57:17 org.apache.fop.fonts.Typeface warnMissingGlyph
[junit] WARNING: Glyph 8721 (0x2211, summation) not available in font Helvetica
[junit] 18-Aug-2010 11:57:17 org.apache.fop.fonts.Typeface warnMissingGlyph
[junit] WARNING: Glyph 115 (0x73, s) not available in font Symbol
[junit] 18-Aug-2010 11:57:17 org.apache.fop.fonts.Typeface warnMissingGlyph
[junit] WARNING: Glyph 121 (0x79, y) not available in font Symbol
[junit] 18-Aug-2010 11:57:17 org.apache.fop.fonts.Typeface warnMissingGlyph
[junit] WARNING: Glyph 109 (0x6d, m) not available in font Symbol
[junit] 18-Aug-2010 11:57:22 org.apache.fop.util.ColorSpaceCache get
[junit] WARNING: Color profile 'nonexistent.icc' not found.
[junit] 18-Aug-2010 11:57:22 org.apache.fop.util.ColorUtil parseAsFopRgbIcc
[junit] WARNING: Color profile 'nonexistent.icc' not found. Using rgb replacement values.
[junit] 18-Aug-2010 11:57:22 org.apache.fop.pdf.PDFColor
[junit] INFO: Adding PDFICCStream sRGB for ../../../src/java/org/apache/fop/pdf/sRGB Color Space Profile.icm
[junit] 18-Aug-2010 11:57:23 org.apache.fop.layoutmgr.table.ColumnSetup computeTableUnit
[junit] WARNING: No space remaining to distribute over columns.
[junit] -  ---
[junit] Testcase: fox_destination_1.xml(org.apache.fop.intermediate.AreaTreeXMLFormatTestSuite$1): Caused an ERROR
[junit] Index: 27, Size: 27
[junit] java.lang.IndexOutOfBoundsException: Index: 27, Size: 27
[junit] at java.util.ArrayList.RangeCheck(ArrayList.java:547)
[junit] at java.util.ArrayList.set(ArrayList.java:337)
[junit] at org.apache.fop.pdf.PDFDocument.outputTrailer(PDFDocument.java:1020)
[junit] at org.apache.fop.render.pdf.PDFDocumentHandler.endDocument(PDFDocumentHandler.java:166)
[junit] at org.apache.fop.render.intermediate.IFRenderer.stopRenderer(IFRenderer.java:285)
[junit] at org.apache.fop.area.RenderPagesModel.endDocument(RenderPagesModel.java:256)
[junit] at org.apache.fop.intermediate.AreaTreeParserTestCase.parseAndRender(AreaTreeParserTestCase.java:108)
[junit] at org.apache.fop.intermediate.AbstractIntermediateTestCase.testParserToPDF(AbstractIntermediateTestCase.java:208)
[junit] at org.apache.fop.intermediate.AreaTreeXMLFormatTestSuite$1.runTest(AreaTreeXMLFormatTestSuite.java:61)
[junit]
[junit]
[junit] Test org.apache.fop.intermediate.AreaTreeXMLFormatTestSuite FAILED
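
For what it's worth, the exception itself is easy to reproduce in isolation.
The sketch below is not the code in PDFDocument.outputTrailer (I haven't
checked what happens at line 1020); it only shows that ArrayList.set can
replace an existing element but never appends, so an index equal to the
list size fails exactly as in the trace. Class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo class, unrelated to the real PDFDocument code. */
final class SetVersusAdd {

    private SetVersusAdd() { }

    /** Tries list.set(index, value); returns false if the index is out of range. */
    static boolean trySet(List<String> list, int index, String value) {
        try {
            list.set(index, value);
            return true;
        } catch (IndexOutOfBoundsException e) {
            // Same exception type as in the test log: set() never grows the list.
            return false;
        }
    }

    /** Builds a list of the given size filled with placeholder entries. */
    static List<String> listOfSize(int size) {
        List<String> list = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            list.add("entry " + i);
        }
        return list;
    }
}
```

With a list of 27 entries, trySet succeeds for index 26 and fails for
index 27; the list has to be grown with add() before slot 27 can be set.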

junit-intermediate-format:
 [echo] Running intermediate format tests...
[junit] Testsuite: org.apache.fop.intermediate.IntermediateFormatTestSuite
[junit] Tests run: 501, Failures: 0, Errors: 1, Time elapsed: 20.518 sec
[junit]
[junit] - Standard Error -
[junit] 18-Aug-2010 11:57:41 org.apache.fop.intermediate.IntermediateFormatTestSuite$1 runTest
[junit] SEVERE: Error on fox_destination_1.xml
[junit] 18-Aug-2010 11:57:53 org.apache.fop.util.ColorSpaceCache get
[junit] WARNING: Color profile 'nonexistent.icc' not found.
[junit] 18-Aug-2010 11:57:53 org.apache.fop.util.ColorUtil parseAsFopRgbIcc
[junit] WARNING: Color profile 'nonexistent.icc' not found. Using rgb replacement values.
[junit] 18-Aug-2010 11:57:53 org.apache.fop.pdf.PDFColor
[junit] INFO: Adding PDFICCStream sRGB for ../../../src/java/org/apache/fop/pdf/sRGB Color Space Profile.icm
[junit] 18-Aug-2010 11:57:53 org.apache.fop.layoutmgr.table.ColumnSetup computeTableUnit
[junit] WARNING: No space remaining to distribute over columns.
[junit] -  ---
[junit] Testcase: fox_destination_1.xml(org.apache.fop.intermediate.IntermediateFormatTestSuite$1): Caused an ERROR
[junit] Index: 27, Size: 27
[junit] java.lang.IndexOutOfBoundsException: Index: 27, Size: 27
[junit] at java.util.ArrayList.RangeCheck(ArrayList.java:547)
[junit] at java.util.ArrayList.set(ArrayList.java:337)
[junit] at org.apache.fop.pdf.PDFDocument.outputTrailer(PDFDocument.java:1020)
[junit] at org.apache.fop.render.pdf.PDFDocumentHandler.endDocument(PDFDocumentHandler.java:166)
[junit] at org.apache.fop.render.intermediate.IFParser$Handler$DocumentHandler.endElement(IFParser.java:397)
[junit] at org.apache.fop.render.intermediate.IFParser$Handler.endElement(IFParser.java:352)
[junit] at org.apache.xalan.transf

Re: Build errors

2010-08-17 Thread Vincent Hennebert
Hi Eric,

You have to add the build/gensrc folder to your Eclipse build path. That
folder is created when running ant.

HTH,
Vincent


Eric Douglas wrote:
> What am I missing now on this Java build?
> 
> Running the ant script shows me a "Build Successful" message, though the
> Problems tab in the Eclipse IDE shows missing classes for all the font
> references, on the CodePointMapping and the Courier, Helvetica, etc.
> These classes have no source in the IDE.  They have class files in the
> jar.  I try to call the jar methods and I get errors.
> A call to PDFRenderer.setupFontInfo(new FontInfo()) gives me a
> compilation error message on the invalid class reference.
> A call to setFontBaseURL("Fonts\") gives me an invalid path error
> (trying to find syntax to point to custom font files I put in a Fonts
> folder in another jar on the classpath).
> I take those out to see if it just works with no reference to my custom
> fonts, if it can just automatically find them in the classpath, and my
> program just hangs on the FopFactory.newFop(FOUserAgent) statement. 
> 
> 
> -----Original Message-----
> From: Jeremias Maerki [mailto:d...@jeremias-maerki.ch] 
> Sent: Thursday, August 12, 2010 8:35 AM
> To: fop-dev@xmlgraphics.apache.org
> Subject: Re: Build errors
> 
> linkmap.xml? I don't think we have a file with that name in FOP. Could
> that be coming from Apache Forrest somehow, maybe due to a buggy XML
> parser? Putting a current Xerces and Xalan in the JRE's lib/endorsed
> directory may change something. Otherwise, please provide a snippet
> from the output log.
> 
> On 10.08.2010 17:27:00 Eric Douglas wrote:
>> When I download the source for fop 1.0, the ant build shows
>> successful, but if I try a regular build just to check for errors
>> before running the ant build I get a bunch of error messages such as
>> "the content of element type "li" must match..." (on linkmap.xml).
>> Is this normal or am I missing something?
> 
> 
> 
> 
> Jeremias Maerki
> 


Re: Dropping Support for Old Renderers

2010-08-17 Thread Vincent Hennebert
Ok, no objection to this so I'll proceed with the removal.

Thanks,
Vincent


Vincent Hennebert wrote:
> Hi,
> 
> The new rendering architecture based on a streamlined intermediate
> format and painters has been in place for more than a year now and
> hasn’t caused any big issue.
> 
> New features are being added to the painters
> (PDFDocumentHandler/PDFPainter, PSDocumentHandler/PSPainter, etc.) and
> usually not backported to the renderers (PDFRenderer, PSRenderer, etc.).
> Also, the old renderers get in the way when changes and refactorings
> must be made to the libraries.
> 
> Therefore, I propose to drop support for the old renderers and remove
> them from version control.
> 
> Does anyone have any objection to that?
> 
> Thanks,
> Vincent


Re: [Bug 49733] [PATCH] resolve compilation, checkstyle, javadoc warnings (a proposal for next steps)

2010-08-12 Thread Vincent Hennebert
Hi,

Jeremias Maerki wrote:
> I've now applied the patch locally and done a detailed review. I'm
> posting this a bit outside the context of recent discussions to simply
> state my present opinion after looking into the patch.
> 
> Generally, this is a big improvement. So thanks, Glenn, for your work
> here!
>
> I'm also not particularly happy about the //CS* comments. To a certain
> degree I think I could live with them. A count shows 279 usages. I think
> that may be a tad too much. Maybe we can find something in between, like
> making more use of the "error" severity. Most checks are just warnings
> right now. So using errors will make it easier to enforce at least the
> rules most important to us. I've also experimented with the regular
> expressions:
> 
> 
>   
>   
> 
> 
> This should already make several such //CS comments unnecessary. There
> are other comments referencing ConstantNameCheck where we should rather
> convert the name to upper case. That will cut down on these even further.
> Like Chris suggested, we could then even decide to live with a few
> warnings as long as we increase the severity of the most important rules
> and set up a no-go policy for "errors".

(This is precisely why I suggested that we agree on an improved
Checkstyle first, to avoid introducing unnecessary //CS comments.)

I don’t really have an opinion about that. Since zero-warning won’t be
achieved in the short term anyway, I suppose we could remove them for
now. Once we decide to enforce a zero-warning policy they will
probably have to be used, along with a TODO comment indicating that this
is old code that needs refactoring, which would distinguish them from
new CSOK comments introduced later on with due care.


> I saw some changes in LineBreakPairTable.txt and LineBreakUtils.java.
> Glenn, was that an accidental overflow from your work on the new
> features?
> 
> I have no problem with the sometimes rather generic Javadoc comments.
> Every committer is invited to improve on those as he's passing over
> particular code parts. I know that we have quite a bit of outdated
> documentation already in our Javadocs. So these comments don't make the
> situation worse IMO. The only thing we can do is gradually improve. But
> at least the generic javadocs let us cut down on the number of warnings
> so we can really focus on improving there rather than capitulate before
> thousands of warnings.

I’m ok with that. Some less generic comments will need to be
double-checked. Not that I don’t trust Glenn on that matter, but some
parts of the code (especially layout) are tricky and it’s very easy to
be mistaken. And I think a wrong Javadoc comment does more harm than no
comment at all.


> Finally a nit: some files have got method signatures with whitespace
> before and after the parentheses. We don't traditionally do that but the
> Checkstyle profile doesn't seem to catch that. I guess it would be safe
> to add that rule so we can fix those occurrences.

+1


> I would suggest the following as our next steps:
> 
> 1. Clarify the thing with LineBreak*.
> 2. Decide (quickly, please) whether to remove the //CS comments or to
> allow them for now and optionally do something about them later. (I'm
> tending towards removing them but I don't have a problem if we do it the
> other way.)

+1 for removing them for now.


> 3. Commit the patch to Trunk more or less as is (pending //CS decision).

-1, among other things there are deprecated flags/methods that were
removed and I feel that that must be discussed first (mainly
Graphics2DAdapter.paintImage).


> 3. Adjust the Checkstyle profile to allow "log" and disallow whitespace
> before and after parentheses. Then remove "log"-related //CS constants
> and excessive whitespace.

+0, I would just put log in uppercase but I don’t really mind.
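
To make the two options concrete, here is a small sketch. The first pattern
is Checkstyle's documented default format for the ConstantName check; the
second, which additionally accepts the literal name "log", is only my guess
at what a relaxed rule could look like and is not necessarily the expression
Jeremias experimented with. The class name is made up for the example.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of a constant-name rule; not FOP's actual Checkstyle setup. */
final class ConstantNameRule {

    /** Checkstyle's default format for the ConstantName check. */
    static final Pattern DEFAULT =
            Pattern.compile("^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$");

    /** Assumed relaxed variant: same as above, but also accepts exactly "log". */
    static final Pattern ALLOWING_LOG =
            Pattern.compile("^(log|[A-Z][A-Z0-9]*(_[A-Z0-9]+)*)$");

    private ConstantNameRule() { }

    /** Returns true if the whole name matches the given rule. */
    static boolean accepts(Pattern rule, String name) {
        return rule.matcher(name).matches();
    }
}
```

Renaming the field to LOG satisfies the default rule as it stands, whereas
keeping "log" needs something like the relaxed pattern. Either way the
related //CS comments become unnecessary.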


> 4. Merge the changes into the Temp_ComplexScripts parts.
> 5. Glenn could then provide a new patch against the branch which we
> could do a cursory review on. We apply that and experiment with what
> he's built. He can continue his work.
> 6. We continue to incrementally improve our coding standards.
> 
> I'm happy to do the grunt work. Like Glenn, I don't like to hold
> principle discussions right now because that holds up several people
> from doing day-to-day work.
> That doesn't mean we can't hold them, but I
> don't see why we have to do it as a precondition to processing this
> patch. The patch gets us further but doesn't preclude any further
> improvements later.
> 
> Please, let's get this done.

I’m not happy with that approach. When this topic was first mentioned
[1] I did say that the Checkstyle file needed improvement and that until
then it would be premature to work on that. My advice was not followed
and now we should apply this patch ASAP without discussion? I’m not sure
that trying to force things is a good way to get involved. The
consensus-based approach inherent to any Apache project is not being
followed here.

[

Re: [Bug 49733] [PATCH] resolve compilation, checkstyle, javadoc warnings

2010-08-12 Thread Vincent Hennebert
This message lacks courtesy, therefore I do not wish to continue the
discussion.

I’ll proceed as I explained in my previous message.

Or maybe it’s just me not being a native English speaker...


Vincent


Glenn Adams wrote:
> Inline below.
> 
> On Wed, Aug 11, 2010 at 7:45 PM, Vincent Hennebert 
> wrote:
> 
>> Suppressing all the warnings at build time is a great goal that I would
>> love to see achieved eventually. This gives us an automatic way to spot
>> violations introduced in new code, which is better than the informal
>> check that developers do (or not...) before committing. But as I said
>> trying to achieve that goal now is premature.
>>
>>
> once again, i disagree with your reasoning; i heard unanimous support for
> this patch from other commenters, your reticence does not seem warranted;
> Jeremias and Simon have both stated their support for taking action to clean
> up the code base;
> 
> it is not premature to rid the codebase of warnings; in fact, one might
> argue that it was premature to release FOP 1.0 with the existing warnings;
> 
> 
>> More or less everyone agrees that the current checkstyle file is not
>> satisfying. Jeremias says that he doesn’t apply some rules sometimes.
>> I’ve done the same myself in a few occasions. So new warnings are bound
>> to appear shortly after this patch is applied.
>>
>>
> to translate lack of satisfaction with the current checkstyles to mean lack
> of acceptance is unwarranted; there have been no objections to it as far as
> I can tell, so it is effectively accepted; I haven't heard you or others
> proposing any concrete changes to it, so it is accepted by lazy consensus;
> moreover, you appear to believe (wrongly in my opinion), that there could
> exist some future checkstyle rules set that was uniformly satisfactory to
> all; that will never happen, and for you to claim it should occur before
> taking action is nothing more than an excuse to delay taking action;
> 
> 
>> Once we agree on a new checkstyle file two things will happen: Some
>> rules may be removed and that may result in CSOK comments cluttering
>> the code; Are you happy to re-visit the code and remove them afterwards?
>> Some new rules may be put in place and that will result in a whole
>> bunch of new warnings, and we’re back to square one.
>>
>> Globally disabling some Checkstyle rules by using CSOFF comments is not
>> an option to me. This kills the very purpose of a Checkstyle file, which
>> is to have a consistent coding style within the project and no
>> distracting variations.
>>
> 
> who said anything about using CSOFF to *globally* disable options? warning
> suppression is a reasonable tool when used appropriately, and
> developers should be able to override rules as needed; the fact that the
> comment remains in the code means it is easy to audit for these, and use
> that information to evaluate divergence from norm and practice;
> 
> 
>> We’ve been living with loads of Checkstyle warnings for years, now what
>> is this sudden urge to wipe them all out? If the goal is to achieve and
>> enforce zero warnings, then I don’t think this is doable in the short
>> term. If the goal is to improve the quality of the software, then
>> I don’t see how putting in unhelpful javadoc comments or even disabling
>> Checkstyle in some places will help achieve that.
>>
>>
> You say it is not doable in the short term, but it would take you no more
> than five minutes to apply and commit this patch. Instead of offering
> excuses, why don't you actually do something about it to help matters.
> 
> As for improving quality, if you walk into a house that is infested with
> fleas, do you stop to wonder at the quality of the furniture? FOP is
> infested with fleas. Let's exterminate them and move on to other matters. Or
> would you rather sit with them and scratch all day?
> 
> 
>> Anyway, from the quick look I’ve had at the patch, there are a few
>> things I don’t agree with:
>> • some methods marked deprecated were removed: this can’t be done
>>  arbitrarily and must follow some policy. Maybe this is fine in the
>>  present case but that must be discussed first.
>>
> 
> why? what policy was followed to deprecate them in the first place? why were
> methods marked deprecated and then no alternative provided? why were
> deprecated methods left in place that are no longer referenced? if there are
> deprecated methods that are no longer referenced or the code that references
> them is dead code, then they can and should be removed? how is this
> different than removing old unused renderers? is there a "

Re: [Bug 49733] [PATCH] resolve compilation, checkstyle, javadoc warnings

2010-08-11 Thread Vincent Hennebert
all is to eliminate all warnings. Period. As fast
>> as possible. My patch does that, so please commit it without delay. We can
>> then, over time, decide if the existing rules are overly conservative or
>> overly liberal. But that is not going to be a useful way to spend our time,
>> it is much better to just use what is there, and when something goes outside
>> of that set, there are adequate mechanisms to deal with it, which I
>> described in my patch.
>>
>> The alternative is to merely continue to propagate the current warnings.
>> Frankly, I was and am very surprised at the apparent lack of particularity
>> with respect to treatment of warnings. One of the six principles of "The
>> Apache Way" is "consistently high quality software". For me, every warning
>> is a black mark against quality. Let's not continue to propagate this state
>> of affairs. Now that FOP 1.0 has been released is the best time to move
>> forward, so why delay now?
>>
>> Regards,
>> Glenn
>>
>>
>> On Tue, Aug 10, 2010 at 6:34 PM,  wrote:
>>
>>> https://issues.apache.org/bugzilla/show_bug.cgi?id=49733
>>>
>>> --- Comment #5 from Vincent Hennebert  2010-08-10
>>> 06:34:29 EDT ---
>>> Hi Glenn,
>>>
>>> Thanks for your patch. However, as I said we need to agree on a
>>> project-wide
>>> Checkstyle configuration first. Before enforcing a no-warning policy it is
>>> necessary to reach consensus among all the developers on a set of rules
>>> that
>>> everyone is happy to follow.
>>>
>>> We'll have a look at your patch once this is done. Meanwhile, I'll look at
>>> the
>>> parts that fix compilation warnings.
>>>
>>> Thanks,
>>> Vincent
>>>
>>> --
>>> Configure bugmail:
>>> https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
>>> --- You are receiving this mail because: ---
>>> You reported the bug.
>>>
> 
> 
> 
> 
> Jeremias Maerki
> 


Dropping Support for Old Renderers

2010-08-06 Thread Vincent Hennebert
Hi,

The new rendering architecture based on a streamlined intermediate
format and painters has been in place for more than a year now and
hasn’t caused any big issue.

New features are being added to the painters
(PDFDocumentHandler/PDFPainter, PSDocumentHandler/PSPainter, etc.) and
usually not backported to the renderers (PDFRenderer, PSRenderer, etc.).
Also, the old renderers get in the way when changes and refactorings
must be made to the libraries.

Therefore, I propose to drop support for the old renderers and remove
them from version control.

Does anyone have any objection to that?

Thanks,
Vincent


Re: Complex Script Support - Trac Site Access

2010-08-06 Thread Vincent Hennebert
Hi Glenn,

Glenn Adams wrote:
> more
> 
> On Thu, Aug 5, 2010 at 8:20 PM, Glenn Adams  wrote:
> 
>> On Thu, Aug 5, 2010 at 7:18 PM, Vincent Hennebert 
>> wrote:
>>
>>> Hi Glenn,
>>>
>>> From your first message I had the impression that you wanted to keep
>>> using that tool for future documentation
>>>
>> as i've said, i intend to transition the documentation to the FOP wiki over
>> a period of time, so that it will all end up on the latter
>>
> 
> we've pretty much exhausted this thread, but perhaps it would be clearer to
> you if you view the trac wiki site I am using as a "working copy", "sandbox"
> or as a "staging area" as I compose my ideas and documentation; i could have
> merely done it on my local drive or on a private server, but i preferred
> instead to make it visible early, even though it is in preliminary stage
> which does not meet the standards I would expect for documentation that is
> promoted to the FOP wiki; i very well may rip up, rewrite, replace,
> restructure, etc., the content i am composing on my trac wiki before i feel
> it is ready for the FOP wiki; as I say, i could be doing that privately, but
> then interested viewers would not have early access to my preliminary work;
> overall, it is to your and the community's benefit that i choose to expose
> this preliminary work, with all its defects plainly visible;

That’s fair enough, but you’re describing the very purpose of the FOP
wiki here :-)
Moreover, notification is sent to the fop-commits mailing list whenever
a change is made to the wiki, which allows us to easily follow people’s
progress.


Vincent


Re: Complex Script Support - Trac Site Access

2010-08-05 Thread Vincent Hennebert
Hi Glenn,

Thanks for the background information ;-)

From your first message I had the impression that you wanted to keep
using that tool for future documentation, which I don’t think is the way
to go, if only because we don’t have write access to that site and can’t
update your doc with our own comments and suggestions.

While your patch is being processed, it would be good to progressively
transfer what’s already there to the FOP wiki, and to write new
documentation directly on the FOP wiki. There’s no rush, it’s
going to take some time to process your patch anyway.

You said you also used git; now that would be something useful to us if
we had access to it (or at least the history of commits). That would
allow us to follow the progression you undertook, which isn’t as easily
done with a patch.

Thanks,
Vincent


Glenn Adams wrote:
> on a personal note, perhaps I should add that I've been writing code for 40
> years, and am nearing 60 myself, so i'm not quite as nimble as some in
> making transitions, especially with dev tool chains... :)
> 
> g.
> 
> On Wed, Aug 4, 2010 at 6:56 PM, Glenn Adams  wrote:
> 
>> i am using a combination of tools from codesion on this project...
> 


Re: Complex Script Support - Trac Site Access

2010-08-04 Thread Vincent Hennebert
Hi Glenn,

Well, this is hardly different from the previous situation. I’m curious:
what does this site provide that FOP’s wiki and Bugzilla don’t? The kind
of documentation that is there at the moment would perfectly match the
purpose of the DeveloperPages area on FOP’s wiki.

It would really be preferable to work in the ASF area...

Thanks,
Vincent


Glenn Adams wrote:
> Vincent, Chris, et al.
> 
> to clarify my earlier message, lest it be misread, i was not asking for
> (premature) commit access to FOP; i understand the ASF process, and have no
> issue with the fact that it will take time and contributions on my part to
> be deemed a candidate as committer; what i was doing was merely citing the
> fact that i am not a committer, and assumed (wrongly as it turns out) that
> one had to be a committer to add documentation to the FOP wiki;
> 
> my intention in setting up a trac database was to satisfy my immediate need
> for documenting my work, and further, to integrate this documentation with
> my ongoing issue resolution process, which is handled nicely by the system I
> refer to below; for me, it will be easier and more productive to use this
> system; in the interim, i've created a simple FOP wiki page which refers to
> this site;
> 
> as the work is completed, stabilized, and merged into the FOP trunk, i can
> easily migrate my design and other useful documentation into the FOP wiki
> directly;
> 
> is that acceptable?
> 
> G.
> 
> On Wed, Aug 4, 2010 at 9:09 AM, Glenn Adams  wrote:
> 
>> I added a Complex Scripts link under "Design Documents" heading in the
>> FOP's wiki.
>>
>> G.
>>
>> On Tue, Aug 3, 2010 at 11:14 PM, Chris Bowditch <
>> bowditch_ch...@hotmail.com> wrote:
>>
>>> Glenn Adams wrote:
>>>
>>> Hi Glenn,
>>>
>>>
>>>  i hear you, but I prefer to use the mechanism I described, since I have
>>>> control over it; until I am granted committer status, I don't have that
>>>> control within ASF; i would be happy to transition when I am granted
>>>> committer status; in the mean time, i will use the mechanism i described;
>>>>
>>> I don't want to discourage you from documenting your work but I agree with
>>> Vincent, using the Wiki is the preferred place to document your progress. I
>>> don't understand why committership status would matter? As Vincent already
>>> indicated anyone can view/modify the Wiki.
>>>
>>> Thanks,
>>>
>>> Chris
>>>
>>>
>>>> regards,
>>>> glenn
>>>>
>>>>
>>>> On Tue, Aug 3, 2010 at 6:30 PM, Vincent Hennebert <vhenneb...@gmail.com> wrote:
>>>>
>>>>Hi Glenn,
>>>>
>>>>That sounds good but it would be preferable to work in the ASF area,
>>>>mainly use FOP’s wiki:
>>>>http://wiki.apache.org/xmlgraphics-fop/DeveloperPages
>>>>
>>>>Anyone can add content to the wiki, you just need to create an account.
>>>>You can create a new page related to your work on complex scripts and
>>>>add a link to it on the above page.
>>>>
>>>>For issues, Bugzilla should cover your needs. If you deem necessary you
>>>>can use bug #49687 as a meta-bug and create individual issues on which
>>>>that bug would depend.
>>>>
>>>>Thanks,
>>>>Vincent
>>>>
>>>>
>>>>Glenn Adams wrote:
>>>> > In order to better communicate with those interested in this work, I have
>>>> > enabled anonymous read access to the following Trac site, with wiki,
>>>> > ticket, and report views enabled.
>>>> >
>>>> > https://skynav.trac.cvsdude.com/fop/wiki
>>>> >
>>>> > I will be using this site during the development process to aid in
>>>> > documentation of this feature and to track related issues.
>>>> >
>>>> > I intend to also use it for other FOP related work I plan or may accomplish.
>>>> >
>>>> > Regards,
>>>> > Glenn
>>>> >
>>>>
>>>>
>>>>
> 


Re: The complex script patch per se

2010-08-04 Thread Vincent Hennebert
Hi Benson,

Benson Margulies wrote:
> I'm a bit confused at this point. Is there a barrier to committing the
> patch-so-far to the designated interim branch?

Even if there is a dedicated branch the patch needs to be reviewed first
(or, rather, sanity-checked). I’m on it but can only dedicate a limited
amount of time at the moment.


Vincent


Re: fixing and maintaining zero reported warnings policy?

2010-08-03 Thread Vincent Hennebert
Hi Glenn,

(Moving to general@ as maybe this is something we want to do at the XML
Graphics project level. Please continue discussion there.)

Thanks for bringing up this topic. I personally agree that
a zero-warning policy would be A Good Thing. In theory newly committed
code should have no Checkstyle warning, but I’m not sure that policy is
thoroughly followed.

Before enforcing such a policy it is necessary to come up with
a Checkstyle file on which everyone agrees. The current one is not
properly customized IMO. I started to create a new one from scratch
a long time ago but never got round to finishing and testing it.

Feel free to submit such a file. Once everyone is happy with it then you
can start removing all the warnings on the current code if you feel like
doing it. But doing it now would be a bit premature.

I can’t really comment on findbugs, I must admit that I’ve never used it
(me blushing with shame). Enforcing its usage would probably also be
a good thing, but I suppose it also needs some customization.

Thanks,
Vincent


Glenn Adams wrote:
> Would anyone mind if I submit a patch that fixes all the outstanding
> warnings, etc., reported during the build process and by checkstyles and
> findbugs on the trunk? More importantly, if I do this, is it possible to
> adhere to a zero tolerance policy on warnings for future commits?
> 
> I find the 3000 or so warnings currently produced to be a rather significant
> impediment to doing work on this code base, or at least, in preventing an
> avalanche of new warnings upon future commits, given the trouble required to
> determine the diffs between new warnings and old warnings. Perhaps this
> isn't a problem for changes to one file, but for changes to a hundred files,
> it is a major headache. Anyway, some of these 3000 are actually real,
> lurking bugs.
> 
> I'm willing to do the cleanup work if others will help maintain cleanliness
> going forward.
> 
> Regards,
> Glenn
> 
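
The difficulty Glenn mentions above (telling new warnings apart from the
existing ones) can be illustrated with a naive set difference over two
warning reports. This is only a sketch under assumptions: the class name
WarningDiff and the idea of comparing reports line by line are
hypothetical, and neither Checkstyle nor findbugs is invoked here. Real
warning lines usually embed line numbers that shift between commits, so
a practical tool would have to normalize them first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of FOP, Checkstyle or findbugs.
final class WarningDiff {

    /** Returns the warnings present in 'current' but absent from 'baseline', in order. */
    static List<String> newWarnings(Collection<String> baseline, Collection<String> current) {
        Set<String> known = new HashSet<String>(baseline);
        List<String> added = new ArrayList<String>();
        for (String warning : current) {
            if (!known.contains(warning)) {
                added.add(warning);
            }
        }
        return added;
    }

    public static void main(String[] args) {
        // Made-up warning texts, for illustration only.
        List<String> before = Arrays.asList("Foo.java: unused import", "Bar.java: missing javadoc");
        List<String> after = Arrays.asList("Foo.java: unused import", "Baz.java: empty catch block");
        System.out.println(newWarnings(before, after));
    }
}
```

With a zero-warning baseline the comparison becomes trivial: any warning
at all is a new one, which is the point of the proposed policy.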


Re: Complex Script Support - Trac Site Access

2010-08-03 Thread Vincent Hennebert
Hi Glenn,

That sounds good but it would be preferable to work in the ASF area,
mainly use FOP’s wiki:
http://wiki.apache.org/xmlgraphics-fop/DeveloperPages

Anyone can add content to the wiki, you just need to create an account.
You can create a new page related to your work on complex scripts and
add a link to it on the above page.

For issues, Bugzilla should cover your needs. If you deem necessary you
can use bug #49687 as a meta-bug and create individual issues on which
that bug would depend.

Thanks,
Vincent


Glenn Adams wrote:
> In order to better communicate with those interested in this work, I have
> enabled anonymous read access to the following Trac site, with wiki,
> ticket, and report views enabled.
> 
> https://skynav.trac.cvsdude.com/fop/wiki
> 
> I will be using this site during the development process to aid in
> documentation of this feature and to track related issues.
> 
> I intend to also use it for other FOP related work I plan or may accomplish.
> 
> Regards,
> Glenn
> 


Re: Chicken-and-egging a big patch

2010-08-02 Thread Vincent Hennebert
Hi,

I am also interested in seeing support for complex scripts implemented
in FOP, although I am a bit doubtful that any change can be made to the
layout engine in its current state.

I’ll try to support Simon in that task. I have some knowledge of font
formats.

Thanks,
Vincent


Simon Pepping wrote:
> Benson,
> 
> I am very interested in FOP acquiring good capabilities in the area of
> complex scripts and fonts. It is clear that FOP's current development
> team does not have any expertise in this area, and therefore I am also
> very interested in FOP getting better support for this area in its
> user community, in the form of contributions of code, tests,
> sponsoring etc.
> 
> I am very happy that your company Basis has been and is willing to
> sponsor this development. FOP is definitely in need of corporate
> support.
> 
> Therefore I will do whatever I can to make this a success. I have only
> little expertise in foreign scripts, in Tamil, which I can read a bit
> and of which I understand its organizational principles. But I have no
> expertise in Arabic. I do not have much time for FOP, but perhaps that
> can change. I am not sure I will understand Glenn's work sufficiently;
> especially I have little knowledge about font organization and OT
> capabilities; but I want to contribute to its review as much as
> possible.
> 
> Note that I will read my email only intermittently until 11 August.
> 
> Simon
> 
> On Thu, Jul 29, 2010 at 09:35:18AM -0400, Benson Margulies wrote:
>> Dear FOP dev,
>>
>> Some of you might recall that, some months back, I sent some email
>> into this list looking for a FOP contributor who was interested in
>> being paid by us (Basis Technology) to take on some of the issues
>> related to complex scripts and fonts.
>>
>> As it happened, none of the committers had the necessary combination
>> of time and interest. However, Glenn Adams, an old hand in this
>> technology area and someone we at Basis have known for a long time,
>> was available.
>>
>> So now he has, in effect, a private branch of the code base that
>> addresses these issues, and he has been, as I understand it, in
>> correspondence with this group.
>>
>> Basis remains very enthusiastic about seeing FOP become capable in the
>> areas addressed by Glenn's patch, and so we are, still, more than
>> willing to do any reasonable thing to advance this process.
>>
>> What we need here is one or more committers with necessary time to
>> review, critique, and iterate until this functionality can be
>> committed. As before, we are more than willing to pay consulting rates
>> if that would facilitate this.
>>
>> Basis is a sponsor of the foundation, and I'm a member, and we see
>> this as just another way in which we as a company can contribute to
>> the work of the ASF in an area where we have a particularly acute
>> interest.
> 


Re: Font Glyph?

2010-07-26 Thread Vincent Hennebert
I can only repeat the following: either the user is advanced enough to
know how to configure custom fonts that contain all the glyphs they
need, and then a configuration option for a last-resort font will be of
no use to them; or they are not confident enough yet to create their own
configuration file (or for some reason don’t want to use one), and then
the configuration of a last-resort font will be out of their reach anyway.

In both cases I believe that the possibility of configuring
a last-resort font will not help. Improving the user friendliness of
FOP’s behaviour in problematic situations is always welcome, but only if
it remains transparent to the user.

At the moment a warning is issued when glyphs are missing, listing the
affected code points. Along with using the .notdef glyph, I think that’s
user-friendly enough.

Vincent


Glenn Adams wrote:
> I agree that one should not simply add new configuration specifications
> willy-nilly. As I've said previously, the ideal situation would be to
> include a last resort font as a Base14 font as part of the FOP built-in font
> set, and I will investigate this possibility. However, in the absence of a
> built-in last-resort font, there seem to be four options:
> 
>1. add information to the FOP config file, which could be as simple as
>adding an attribute as follows ;
>2. add a command line option (even less desirable in my opinion);
>3. require user to specify a last resort font as last element of
>font-family attribute; however, note that this will not actually work in the
>current implementation, since FontSelector.selectFontForCharactersInText
>always selects the font that has the most mappings in the context of a
>"word";
>   - for example, if 'A' is an Arabic character and 'L' is a Latin
>   character, then one would expect:
>   - ALA
>   - to produce three glyphs [glyph from Arabic font] [glyph from
>   LastResort font] [glyph from Arabic font]
>   - however, this will not happen because selectFontForCharactersInText
>   finds that two characters in the "word" have a mapping in the Arabic font
>   and one character has a mapping in the LastResort font, so it chooses the
>   Arabic font to process the entire word
>   - which results in the following glyphs: [glyph from Arabic font]
>   [default 'no-mapping' glyph from Arabic font] [glyph from Arabic font]
>4. require user to create their own "aggregate" fonts or modify their
>fonts to include last resort glyphs for all unsupported mappings.
> 
> The last solution above is so onerous that it is effectively a
> non-solution, so we can drop that from consideration, but note that this
> "non-solution" is the only one that would work now. All of the other three
> require some modifications to FOP, even the third solution which requires
> the author to insert LastResort font into font family specifications.
> 
> Regards,
> Glenn
> 
> On Wed, Jul 21, 2010 at 7:46 PM, Eric Douglas wrote:
> 
>>  I like your idea of the 'last resort' font, though I didn't like the
>> configuration file to begin with.
>> You could add an option to the configuration file also if you like the
>> configuration file, but I think when the program allows integration using
>> embedded code, there should be an option for all custom font setup in the
>> API.
>>
>>  --
>> *From:* Glenn Adams [mailto:gl...@skynav.com]
>> *Sent:* Tuesday, July 20, 2010 8:59 PM
>>
>> *To:* fop-dev@xmlgraphics.apache.org
>> *Subject:* Re: Font Glyph?
>>
>> Comment inline. Note that I have assigned the new bug to myself, so I will
>> undertake the work to satisfy this.
>>
>> On Wed, Jul 21, 2010 at 1:25 AM, Vincent Hennebert 
>> wrote:
>>
>>> Hi,
>>>
>>> I’m not keen on adding Yet Another configuration option to the config
>>> file, there are more than enough already.
>>>
>> What's the purpose in having a configuration file if it isn't used for
>> configuration information?
>>
>>
> 
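
The word-level selection problem Glenn describes in option 3 above can
be sketched as follows. This is a simplified model under stated
assumptions, not the code of FontSelector.selectFontForCharactersInText:
the class WordFontSelection, its methods and the representation of a
font as a set of supported characters are all hypothetical. As in the
message, 'A' stands for an Arabic character and 'L' for a Latin one.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of "pick the font with the most mappings in the word".
final class WordFontSelection {

    /** Returns the name of the font covering the most characters of the word (first one wins ties). */
    static String selectForWord(String word, Map<String, Set<Character>> fonts) {
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Set<Character>> font : fonts.entrySet()) {
            int count = 0;
            for (int i = 0; i < word.length(); i++) {
                if (font.getValue().contains(word.charAt(i))) {
                    count++;
                }
            }
            if (count > bestCount) {
                best = font.getKey();
                bestCount = count;
            }
        }
        return best;
    }

    /** Font family in priority order, with the last-resort font listed last. */
    static Map<String, Set<Character>> demoFonts() {
        Map<String, Set<Character>> fonts = new LinkedHashMap<String, Set<Character>>();
        fonts.put("Arabic", new HashSet<Character>(Arrays.asList('A')));
        fonts.put("LastResort", new HashSet<Character>(Arrays.asList('L')));
        return fonts;
    }

    public static void main(String[] args) {
        // The whole word "ALA" goes to one font, so 'L' cannot get a last-resort glyph.
        System.out.println(selectForWord("ALA", demoFonts()));
    }
}
```

Because one font is chosen for the whole word, the middle character is
rendered with the chosen font's no-mapping glyph. Getting the outcome
Glenn expects would require selecting a font per character (or per run
of characters) instead.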


Re: Font Glyph?

2010-07-20 Thread Vincent Hennebert
Hi,

I’m not keen on adding Yet Another configuration option to the config
file, there are more than enough already.

I believe that if a user is advanced enough to be aware that a last
resort font can be configured, then they are also able to configure
custom fonts so as to avoid any glyph missing warning.

Moreover this last resort font is a TrueType font, which is not
supported by all output formats yet.

Both Type1 and TrueType (OpenType) fonts have a dedicated glyph for
unsupported characters. I think this is what FOP should fall back to in
case of a missing glyph.

Vincent
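
The fallback suggested above can be sketched like this: characters
without a mapping get the font's dedicated "missing" glyph, and the
affected code points are collected so that a warning can list them.
This is a minimal illustration, not FOP's font code. The class
NotdefFallback and the Map-based character map are assumptions made for
the example; the only fact relied upon is that glyph index 0 is the
.notdef glyph in TrueType/OpenType fonts.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a .notdef fallback with a missing-glyph report.
final class NotdefFallback {

    /** Glyph index 0 is the .notdef glyph in TrueType/OpenType fonts. */
    static final int NOTDEF_GLYPH = 0;

    /**
     * Maps each code point of the text to a glyph index. Unmapped code points
     * get NOTDEF_GLYPH and are recorded in 'missing' for a later warning.
     */
    static int[] mapToGlyphs(String text, Map<Integer, Integer> cmap, Set<Integer> missing) {
        int[] glyphs = new int[text.codePointCount(0, text.length())];
        int n = 0;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            Integer glyph = cmap.get(codePoint);
            if (glyph == null) {
                missing.add(codePoint);
                glyphs[n++] = NOTDEF_GLYPH;
            } else {
                glyphs[n++] = glyph;
            }
            i += Character.charCount(codePoint);
        }
        return glyphs;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> cmap = new HashMap<Integer, Integer>();
        cmap.put(0x41, 36); // made-up glyph index for 'A'
        Set<Integer> missing = new TreeSet<Integer>();
        mapToGlyphs("A\u2611", cmap, missing);
        for (int codePoint : missing) {
            System.out.println("Missing glyph for U+" + Integer.toHexString(codePoint).toUpperCase());
        }
    }
}
```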


Glenn Adams wrote:
> Unicode does not prescribe how to render characters for which the assigned
> font(s) have no corresponding glyph(s). It does, however, make
> recommendations on how an application or system should handle this case,
> about which see Unicode 5.1 Section 5.3 Unknown and Missing Characters,
> under the sub-heading of *Interpretable but Unrenderable Characters*. See
> also the following FAQ:
> 
> http://unicode.org/faq/unsup_char.html?PHPSESSID=a05ee80b0f30ee349b9851a929e4e4e6
> 
> What FOP should be doing, rather than map an unrenderable character to '#',
> is to employ a so called Last Resort font, where each defined character is
> associated with some glyph, e.g., one that indicates the script of the
> character. In the absence of such a Last Resort font, it is customary to map
> the character to a glyph depicting an empty box.
> 
> Unicode has published such a Last Resort font; see:
> 
> http://www.unicode.org/policies/lastresortfont_eula.html
> 
> A reasonable strategy for FOP might be to allow the user to specify (in the
> FOP configuration file) a font mapping to a last resort font to be used in
> such cases. The user would still have to download and install the last
> resort font on their system, due to licensing reasons.
> 
> I will post a bug to this effect, and suggesting this solution, if there is
> not already one present. Some minor modifications to FOP would be required
> to make use of the configuration information specifying a last resort font,
> and then using that font when no mapping is present in the assigned font.
> 
> Regards,
> Glenn
> 
> On Mon, Jul 19, 2010 at 11:50 PM, Eric Douglas wrote:
> 
>> I don't understand what unicode.org is saying if it's just referring to
>> what characters the codes should reference if they have to be in the
>> font.  Fontforge says U2610 and U2611 are not in the font.
>>
>> Fontforge is an ugly program.  It runs within Cygwin, where it displays
>> a window showing the characters in the font, but it doesn't show them
>> all and doesn't have a scrollbar.
>> I would like an easy way to view the characters in the font to see if I
>> have something available that looks like a square/checkbox.
>> I can only assume the square I'm getting is a default in FOP 0.95 for
>> all missing glyphs.
>>
>> -Original Message-
>> From: J.Pietschmann [mailto:j3322...@yahoo.de]
>> Sent: Saturday, July 17, 2010 11:20 AM
>> To: fop-dev@xmlgraphics.apache.org
>> Subject: Re: Font Glyph?
>>
>> On 15.07.2010 22:44, Eric Douglas wrote:
>>> Then I pass a text value of "☑" in my XML.  When the
>>> transformer uses FOP to translate the XML into output, this prints a
>> square.
>> Have a look at http://www.unicode.org/charts/charindex.html
>> U2611 is "BALLOT BOX WITH CHECK", i.e. not a square (U2610 should be a
>> square, are you sure about the entity?) If FOP couldn't find the glyph,
>> it would have printed a # instead.
>> You could use one of the font editors to check whether your font
>> actually has a glyph for the U2611 character (try
>> http://fontforge.sourceforge.net/)
>>
>>
>>> I tried replacing my fop.jar with one that I compiled from the Trunk,
>>> and instead of printing the square it printed an error message to the
>>> Java Console that the font doesn't contain the specified glyph.
>> That's mildly odd, I'd guess your method for telling FOP about your font
>> doesn't work as in Trunk.
>>
>> J.Pietschmann
>>
> 


Re: [VOTE] Release files for fop 1.0

2010-07-15 Thread Vincent Hennebert
+1

Vincent


Simon Pepping wrote:
> The release files for fop 1.0 are now ready for review and the release
> vote.
> 
> The release files were built from:
> https://svn.apache.org/repos/asf/xmlgraphics/fop/tags/fop-1_0
> (created in revision 963413).
> 
> The release files are found here: http://people.apache.org/~spepping/fop-1_0:
> 
> 3186f93a314bdcb710bd7cb02d80404c  fop-1.0-bin.tar.gz
> 262da85d77fbca68556bc74e44ecca27  fop-1.0-bin.zip
> b47043cea49a9291bc0ed369a4150dd3  fop-1.0-bundle.jar
> 95dcc4c2dd08b4bc88ce9ce1ee88c439  fop-1.0-src.tar.gz
> 8693ed0f4586d394e547a23625a64d34  fop-1.0-src.zip
> 
> As partially mentioned earlier, I used the following workaround:
> 
> jdk6:  ant distclean, docs
> jdk14 without junit: ant dist (which runs the docs target again), 
> maven-artifacts
> jdk6: junit
> 
> This uses the trick that forrest is run each time, but does not clean
> out its target directory.
> 
> With jdk14 all junit tests fail. I removed junit from the classpath
> for jdk14, and ran junit after the build with jdk6.
> 
> forrest also gives problems with jdk6:
> 
> 0.95/images/update.jpg: No pipeline matched request: 0.95/images/update.jpg
>   at  - 
> file:/fsc/source/apache-forrest-0.8/main/webapp/./sitemap.xmap:600:76
> 0.95/images/fix.jpg: No pipeline matched request: 0.95/images/fix.jpg
>   at  - 
> file:/fsc/source/apache-forrest-0.8/main/webapp/./sitemap.xmap:600:76
> 0.95/images/add.jpg: No pipeline matched request: 0.95/images/add.jpg
>   at  - 
> file:/fsc/source/apache-forrest-0.8/main/webapp/./sitemap.xmap:600:76
> 1.0/images/update.jpg: No pipeline matched request: 1.0/images/update.jpg
>   at  - 
> file:/fsc/source/apache-forrest-0.8/main/webapp/./sitemap.xmap:600:76
> 1.0/images/fix.jpg: No pipeline matched request: 1.0/images/fix.jpg
>   at  - 
> file:/fsc/source/apache-forrest-0.8/main/webapp/./sitemap.xmap:600:76
> 1.0/images/add.jpg: No pipeline matched request: 1.0/images/add.jpg
>   at  - 
> file:/fsc/source/apache-forrest-0.8/main/webapp/./sitemap.xmap:600:76
> 
> These images are in /images, and the problem does not seem to matter.
> 
> compliance.pdf: internal-destination or external-destination must be
> specified in basic-link. As a consequence no compliance.pdf was built.
> 
> For your perusal I uploaded the build log: 
> build-2010-07-12T22:19:58+02:00.log.
> 
> Please, review and cast your votes before Thu 15 July 19:00h UTC.
> 
> +1 from me.
> 
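
For anyone reviewing the release files, the MD5 sums listed above can
be checked with the JDK alone. The following is a small sketch, not an
official ASF or FOP tool: the class name Md5Check and its command-line
usage are made up for this example, and it relies only on the standard
java.security.MessageDigest API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing a file against a published MD5 sum.
final class Md5Check {

    /** Returns the MD5 digest of the data as 32 lowercase hex digits. */
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Usage: java Md5Check <file> <expected-md5>
    public static void main(String[] args) throws IOException {
        String actual = md5Hex(Files.readAllBytes(Paths.get(args[0])));
        System.out.println(actual.equalsIgnoreCase(args[1]) ? "OK" : "MISMATCH: " + actual);
    }
}
```

Reading the whole file into memory is acceptable for archives of this
size; a streaming variant would feed the digest chunk by chunk.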


Purpose of IFRenderer.TextUtil.combined?

2010-07-07 Thread Vincent Hennebert
Hi,

what was that boolean supposed to do, given that it’s set to false by
default, never set to true and results into dead code in renderSpace and
renderText?

Thanks,
Vincent


Re: Switching to DocBook [was: svn commit: r960618 [1/3] - in /xmlgraphics/fop/branches/fop-1_0]

2010-07-06 Thread Vincent Hennebert
Hi,

Jeremias Maerki wrote:
> On 05.07.2010 17:13:32 Simon Pepping wrote:

>> In compliance, I kept only 0.95, 1.0 and trunk. This caused extensive
>> changes to comments.
> 
> I guess keeping track of various versions on the website is one of the
> biggest issues why doing FOP releases is so hard. I keep wondering if we
> should not transform the actual product information to DocBook. But that,
> too, takes a lot of (initial) work.

Interesting. Do you mean completely replacing Forrest by a DocBook-based
framework? Because otherwise that would only add up to the complexity
IMO.

From my experience, I see the following pros and cons of using DocBook:
Pros:
• stable, well-known, well supported format;
• very well documented: http://www.docbook.org/tdg/en/html/docbook.html
• geared towards technical documentation which exactly matches our
  needs;
• HTML output easily customizable by CSS;
• PDF output easily customizable by XSLT;
• well supported, excellently documented official stylesheets:
  http://www.sagehill.net/docbookxsl/
• I like it ;-)

Cons:
• horribly verbose;
• some work would be needed to turn the HTML output into a proper
  website; A website extension is available but I think it tends to lag
  behind;
• some currently automatically generated pages (like status.xml) would
  have to be re-created.

From a personal point of view, I would be rather excited to work on
a DocBook-based website rather than a Forrest-based one. Mainly because
I’m more familiar with DocBook than Forrest, which still looks a bit like
a black box to me. For example, I have already customized the PDF output
produced from a DocBook document, whereas I wouldn’t know where to start
with Forrest. The customization of the HTML output also looks easier to
me.

> 
> 
> Jeremias Maerki


That was my 2 cents,
Vincent


Re: OPen Type Korean (Hangul) Fonts

2010-07-06 Thread Vincent Hennebert
Hi Tom,

(FWIW, I think this list is appropriate for this discussion as it has to
do with enhancing FOP.)

Tom Browder wrote:
> I would like to be able to use a Hangul (Hangeul: Korean) Open Type
> font with fop.
> 
> I have found the Unifoundry and it has Hangul fonts in BDF format.
> The licensing statement from the web site
> (http://unifoundry.com/index.html):
> 
> 
> 
> My software and is released under the terms of the GNU General Public
> License (GNU GPL) version 2.0, or (at your option) a later version.
> The precompiled fonts are released under the terms of the GNU GPL
> version 2, with the exception that embedding the font in a document
> does not in itself bind that document to the terms of the GPL.
> 
> 
> 
> It seems to me that fontforge could be used to convert the BDF fonts
> to a vector Open Type format.  If so, would their license permit them
> to be used for fop use and testing?

No, GPL is not compatible with the Apache license. We wouldn’t be able
to ship those fonts with FOP.

Anyway, I’m not sure that that bitmap font is what you want. I don’t
know how FontForge converts a bitmap font into a vector font, but the
result is likely to be unsatisfying.

I think you want to find a font directly available in a vector
format, like Type 1, TrueType or OpenType. Linux distributions usually
come with loads of fonts for many languages. For example, check out this
one for Hangul:
http://packages.debian.org/sid/ttf-alee
The license seems to be MIT, which would allow us to store it in our
code repository for testing purposes.

On my system, I have an “UnBatang” font that is a TrueType font and
should be supported by FOP.


> Next question, could anyone working on the Open Type area in fop use
> some help to move toward being able to use such fonts?

We certainly welcome help in improving the font system. OpenType fonts
containing TrueType glyphs are supported the same way as normal TrueType
fonts are. OpenType fonts based on CFF glyphs are not supported at all.

It remains to be seen whether advanced typography is needed to properly
typeset Korean (for example, glyph shaping like in Arabic). Advanced
typographic tables are not used by FOP’s layout engine at the moment.
That may make the issue much more complicated.


> Thanks.
> 
> -Tom
> 
> Thomas M. Browder, Jr.
> Niceville, Florida
> USA

Vincent
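
Whether a given OpenType file falls into the supported (TrueType
outlines) or unsupported (CFF outlines) category can be read from the
first four bytes of the file. The sketch below is not FOP code; the
class SfntFlavour and its return strings are made up for illustration.
It relies on the sfnt version field defined by the OpenType format:
0x00010000 for TrueType outlines and the tag 'OTTO' for CFF outlines.

```java
import java.nio.ByteBuffer;

// Hypothetical helper that classifies a font by its sfnt version field.
final class SfntFlavour {

    /** Classifies a font from the first four bytes of the file (big-endian sfnt version). */
    static String detect(byte[] header) {
        if (header == null || header.length < 4) {
            return "unknown";
        }
        int version = ByteBuffer.wrap(header).getInt(); // ByteBuffer is big-endian by default
        switch (version) {
            case 0x00010000:
                return "TrueType outlines";
            case 0x4F54544F: // the tag 'OTTO'
                return "CFF outlines";
            default:
                return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(detect(new byte[] {0, 1, 0, 0}));
        System.out.println(detect(new byte[] {'O', 'T', 'T', 'O'}));
    }
}
```

Other container variants exist (font collections, for instance) and
would simply be reported as "unknown" by this sketch.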


Re: PostScript output: missing %%DocumentNeededResources comment

2010-06-28 Thread Vincent Hennebert
Vincent Hennebert wrote:
> Hi Jeremias,
> 
> Jeremias Maerki wrote:
>> Hi Vincent,
>>
>> hmmyes, that's tricky. An (atend) requires a corresponding comment in
>> the end, but  is defined to provide at least one item. An
>> ugly work-around would be to always list "Helvetica" as needed resources
>> and to generate a corresponding %%IncludeResource although it might
>> never be used.
> 
> FWIW, when optimization mode is off, /all/ of the base 14 fonts are
> %%IncludeResource:’d in the setup section of the document, whether they
> are actually used or not; And they aren’t listed in
> %%DocumentNeededResources:. I don’t know whether it’s another violation
> of the DSC specification or not.

Having just re-checked, it is.


>> At any rate, this happens only with the resource optimization disabled.
>> I think I'd add the missing (atend) but omit the trailer comment (when
>> there are no needed resources) in the hope that any consumer can deal
>> with it. We've never had any complaints about DSC comments that caused
>> trouble AFAICR.
> 
> I guess using a document manager goes hand in hand with optimizing the
> PostScript output anyway.
> 
> 
> Thanks,
> Vincent
> 
> 
>> On 25.06.2010 13:07:51 Vincent Hennebert wrote:
>>> Hi,
>>>
>>> The PostScript Document Structuring Conventions Specification states
>>> that the %%DocumentNeededResources: comment can be specified in the
>>> %%Trailer section, but if this is the case it must also be present in
>>> the header with an (atend) value.
>>> http://www.adobe.com/devnet/postscript/pdfs/5001.DSC_Spec.pdf
>>>
>>> This is not what FOP does. I suppose that that’s because external
>>> resources aren’t always needed (mainly, the base 14 fonts aren’t being
>>> used). But if they are then the document violates the DSC specification.
>>>
>>> There doesn’t seem to be any easy fix for that problem. We can’t
>>> systematically put it in the header because then it /must/ appear in the
>>> %%Trailer section as well. But if no base 14 font is used then it’s not
>>> needed. But if a base 14 font is used then I guess it’s too late when we
>>> know it, the header has already been produced. That kills a bit the
>>> utility of the (atend) feature.
>>>
>>> So... WDYT?
>>>
>>> Thanks,
>>> Vincent
>>
>>
>>
>> Jeremias Maerki
>>


Re: PostScript output: missing %%DocumentNeededResources comment

2010-06-25 Thread Vincent Hennebert
Hi Jeremias,

Jeremias Maerki wrote:
> Hi Vincent,
> 
> hmmyes, that's tricky. An (atend) requires a corresponding comment in
> the end, but  is defined to provide at least one item. An
> ugly work-around would be to always list "Helvetica" as needed resources
> and to generate a corresponding %%IncludeResource although it might
> never be used.

FWIW, when optimization mode is off, /all/ of the base 14 fonts are
%%IncludeResource:’d in the setup section of the document, whether they
are actually used or not; And they aren’t listed in
%%DocumentNeededResources:. I don’t know whether it’s another violation
of the DSC specification or not.


> At any rate, this happens only with the resource optimization disabled.
> I think I'd add the missing (atend) but omit the trailer comment (when
> there are no needed resources) in the hope that any consumer can deal
> with it. We've never had any complaints about DSC comments that caused
> trouble AFAICR.

I guess using a document manager goes hand in hand with optimizing the
PostScript output anyway.


Thanks,
Vincent


> On 25.06.2010 13:07:51 Vincent Hennebert wrote:
>> Hi,
>>
>> The PostScript Document Structuring Conventions Specification states
>> that the %%DocumentNeededResources: comment can be specified in the
>> %%Trailer section, but if this is the case it must also be present in
>> the header with an (atend) value.
>> http://www.adobe.com/devnet/postscript/pdfs/5001.DSC_Spec.pdf
>>
>> This is not what FOP does. I suppose that that’s because external
>> resources aren’t always needed (mainly, the base 14 fonts aren’t being
>> used). But if they are then the document violates the DSC specification.
>>
>> There doesn’t seem to be any easy fix for that problem. We can’t
>> systematically put it in the header because then it /must/ appear in the
>> %%Trailer section as well. But if no base 14 font is used then it’s not
>> needed. But if a base 14 font is used then I guess it’s too late when we
>> know it, the header has already been produced. That kills a bit the
>> utility of the (atend) feature.
>>
>> So... WDYT?
>>
>> Thanks,
>> Vincent
> 
> 
> 
> 
> Jeremias Maerki
> 


PostScript output: missing %%DocumentNeededResources comment

2010-06-25 Thread Vincent Hennebert
Hi,

The PostScript Document Structuring Conventions Specification states
that the %%DocumentNeededResources: comment can be specified in the
%%Trailer section, but if this is the case it must also be present in
the header with an (atend) value.
http://www.adobe.com/devnet/postscript/pdfs/5001.DSC_Spec.pdf

This is not what FOP does. I suppose that that’s because external
resources aren’t always needed (mainly, the base 14 fonts aren’t being
used). But if they are then the document violates the DSC specification.

There doesn’t seem to be any easy fix for that problem. We can’t
systematically put it in the header because then it /must/ appear in the
%%Trailer section as well. But if no base 14 font is used then it’s not
needed. But if a base 14 font is used then I guess it’s too late when we
know it, the header has already been produced. That kills a bit the
utility of the (atend) feature.

So... WDYT?

Thanks,
Vincent
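
To make the (atend) mechanism concrete, here is a rough sketch of the
two places where the comment has to appear. It is not FOP's PostScript
generator: the class DscNeededResources and its methods are
hypothetical. Only the comment syntax comes from the DSC specification
("(atend)" in the header, the resource list in the trailer, "%%+" for
continuation lines).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of deferring %%DocumentNeededResources to the trailer.
final class DscNeededResources {

    /** Header comment announcing that the resource list follows in the trailer. */
    static String headerComment() {
        return "%%DocumentNeededResources: (atend)";
    }

    /**
     * Trailer comments for the fonts that turned out to be needed.
     * Returns no lines when nothing is needed. Omitting the trailer comment
     * in that case is the pragmatic option; strictly speaking it leaves the
     * header's (atend) without its counterpart.
     */
    static List<String> trailerComments(List<String> neededFonts) {
        List<String> lines = new ArrayList<String>();
        for (int i = 0; i < neededFonts.size(); i++) {
            String prefix = (i == 0) ? "%%DocumentNeededResources: " : "%%+ ";
            lines.add(prefix + "font " + neededFonts.get(i));
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(headerComment());
        System.out.println("% ... header, setup and pages ...");
        System.out.println("%%Trailer");
        for (String line : trailerComments(Arrays.asList("Helvetica", "Times-Roman"))) {
            System.out.println(line);
        }
    }
}
```

The header line can be written before any page is produced, which is
why the empty case is the awkward one: by the time it is known that no
base 14 font was used, the (atend) promise has already been made.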


TrueType Fonts in PostScript

2010-05-28 Thread Vincent Hennebert
Hi,

Following Jeremias’ notes about implementing support for TrueType fonts
in PostScript:
http://wiki.apache.org/xmlgraphics-fop/TrueTypeInPostScript
I’d like to give it a go. I will create a branch shortly and work from
there. Any help or comments would be most welcome, as I’m mostly
inexperienced in that area.


Thanks,
Vincent


Re: svn commit: r946585 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop/afp/fonts: AFPFont.java AbstractOutlineFont.java CharacterSet.java CharacterSetBuilder.java CharacterSetOrientation.java Dou

2010-05-26 Thread Vincent Hennebert
Ok.

Thanks,
Vincent


Jeremias Maerki wrote:
> Hi Vincent,
> 
> in the long term, I agree with you. But as long as so many other parts
> of FOP (like Font.mapChar()) use "char", there's no point to use "int"
> in the backend. There will never be any characters outside the basic
> plane until the whole process from input through layout engine to
> rendering components are prepared for these characters. In the short
> term, my change really improves understandability which is usually one
> of your major concerns. It helped me a lot identifying a problem. The
> changes can easily be reverted once there is a concerted effort to
> make the whole of FOP compatible with the full range of Unicode
> characters. It's also important to note that the AFP part will need some
> special attention when these characters need to be used as some of the
> data structures in there will get insanely large if we start supporting
> characters beyond the basic plane. So unless there is a sustained veto
> against rev 946585 I'm inclined to leave it like it is.
> 
> On 21.05.2010 11:46:42 Vincent Hennebert wrote:
>> Hi,
>>
>>> Author: jeremias
>>> Date: Thu May 20 09:52:27 2010
>>> New Revision: 946585
>>>
>>> URL: http://svn.apache.org/viewvc?rev=946585&view=rev
>>> Log:
>>> Changed many variables and parameters from "int" to "char" because AFP font 
>>> support mostly uses Unicode code points unlike Type 1 and TrueType support 
>>> which use internal character code points (the result of Font.mapChar()). 
>>> This should improve code readability.
>> Not sure this is a desirable change. char can only address characters
>> from the Basic Multilingual Plane. Java 1.5 has started to use int to
>> overcome that issue actually. So unless there is a fundamental
>> limitation in AFP such that characters beyond the BMP will never be
>> usable, I think we want to stick to int.
>>
>> 
>>
>> Vincent
> 
> 
> 
> 
> Jeremias Maerki
> 


Re: svn commit: r946585 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop/afp/fonts: AFPFont.java AbstractOutlineFont.java CharacterSet.java CharacterSetBuilder.java CharacterSetOrientation.java Dou

2010-05-21 Thread Vincent Hennebert
Hi,

> Author: jeremias
> Date: Thu May 20 09:52:27 2010
> New Revision: 946585
> 
> URL: http://svn.apache.org/viewvc?rev=946585&view=rev
> Log:
> Changed many variables and parameters from "int" to "char" because AFP font 
> support mostly uses Unicode code points unlike Type 1 and TrueType support 
> which use internal character code points (the result of Font.mapChar()). This 
> should improve code readability.

Not sure this is a desirable change. char can only address characters
from the Basic Multilingual Plane. Java 1.5 has started to use int to
overcome that issue actually. So unless there is a fundamental
limitation in AFP such that characters beyond the BMP will never be
usable, I think we want to stick to int.
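To make the limitation concrete, here is a minimal sketch (not FOP code; the class and method names are made up for illustration) showing that a character outside the BMP needs two chars but is a single int code point:

```java
public class CodePointDemo {
    /** Counts Unicode code points, treating a surrogate pair as one character. */
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Returns true if the code point fits in a single Java char, i.e. lies in the BMP. */
    static boolean fitsInChar(int codePoint) {
        return !Character.isSupplementaryCodePoint(codePoint);
    }

    public static void main(String[] args) {
        String bmp = "\u00E9"; // U+00E9, inside the BMP
        // U+1D11E MUSICAL SYMBOL G CLEF, outside the BMP
        String supplementary = new String(Character.toChars(0x1D11E));

        // prints "1 char(s), 1 code point(s)"
        System.out.println(bmp.length() + " char(s), "
                + countCodePoints(bmp) + " code point(s)");
        // prints "2 char(s), 1 code point(s)"
        System.out.println(supplementary.length() + " char(s), "
                + countCodePoints(supplementary) + " code point(s)");
        // prints "false": the code point cannot be stored in a char
        System.out.println(fitsInChar(supplementary.codePointAt(0)));
    }
}
```

An API that takes a char parameter can only ever receive one half of such a surrogate pair, which is why the code-point methods added in Java 1.5 take int.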



Vincent


Re: Table layout = auto functionality needed: bounty.

2010-03-26 Thread Vincent Hennebert
Hi,

Adrian Cumiskey wrote:
> Hi Simon,
> 
> I'm not sure it would, a very complex subject that would require a lot of
> time just to understand all the considerations involved.  There is a good
> reason why it has not been implemented up to now.

Agreed. The table code is too complicated and would require too much
time to dive in. Even I have difficulties debugging my own code, and
I have built up quite some experience now...

The underlying data model is not appropriate and has been pushed beyond
its limits. I don’t think anything more can be done on tables because
that would inevitably break something.

Of course, that’s bad news for Peter... The only possibility IMO is to
implement some limited, ad hoc functionality that would work in some
specific use case. Maybe adapting the patch from Bugzilla #47347 to the
current Trunk is doable in a reasonable amount of time.

The other possibility is to refactor the whole layout engine...


Vincent


> I think providing a more
> automated build/release process would be a far more suitable and achievable
> project for someone completely new to the project.
> 
> Adrian.
> 
> On 25 March 2010 07:24, Simon Pepping  wrote:
> 
>> FOP devs,
>>
>> Would this be suitable for a GSoC project? It is certainly not
>> trivial, and the candidate should have a reasonable chance of success.
>>
>> Simon
>>
>> On Wed, Mar 24, 2010 at 04:57:47PM +, Peterdk wrote:
>>> Hi,
>>>
>>> I am wondering, I need a basic version of table-layout=auto. It's not yet
>>> implemented with FOP.
>>> I am willing to set a bounty of max 250$ for it, if it's implemented to a
>>> level that I can use it for my project.
>>> Are there any devs interested and willing to work on this? For the bounty
>>> it would be needed to be ready in about 3 months.
>>>
>>> I know there is a patch in Bugzilla for an older rev. that gives basic
>>> functionality, but it fails to work when margins are applied to the
>>> parent block or the table itself. I have contacted the author of this
>>> patch, but I would rather have a FOP dev work on auto-table-layout so
>>> the functionality will be included in the trunk version so other users
>>> also benefit, and I prefer to support some FOP dev with some money
>>> rather than another programmer.
>>>
>>> Anybody interested?
>>>
>>> Peter, NL
>>>
>> --
>> Simon Pepping
>> home page: http://www.leverkruid.eu


Re: Redesigning the web site [was: Google Summer of Code: Bring out your projects]

2010-03-26 Thread Vincent Hennebert
Simon Pepping wrote:
> On Wed, Mar 24, 2010 at 07:37:12PM +0000, Vincent Hennebert wrote:
> 
>> Speaking of the release, many parts of the website are largely outdated
>> and need a serious re-work (the Development tab, mainly). Also, any
>> reference to 0.20.5 should IMO be removed before releasing 1.0. 0.20.5
>> is a thing of the past now.
> 
> That makes a release even more difficult. I am in favour of an early
> release, rather than working on the website.
>  
>> Finally, the website could really do with a new look. ATM it’s looking
>> so... 1990’s. I started to work on that some time ago (based on the
>> Batik skin), but never got to finishing it. Forrest sometimes gets in
>> the way, I must say. Maybe switching to an alternative framework could
>> be investigated. Especially if it can also provide higher automation.
> 
> Again, I am in favour of focusing on an early release as our most
> important requirement.

Then someone is going to spend 3 days doing the 1.0 release, and when
it’s time to do the next release the exact same issue will show up
again.

Releasing 1.0 is the perfect opportunity to do some re-branding and
refactor the website, IMO.

Vincent


Re: Google Summer of Code: Bring out your projects

2010-03-24 Thread Vincent Hennebert
Hi Simon,

Simon Pepping wrote:
> Thanks to Adrian and Vincent who want to be mentor. We also need some
> ideas for projects.
> 
> 1. Our releases require too much work, which has resulted in no
> release for much too long a time. How can the work related to a
> release be minimized? Can we develop tools to automate much of the
> work?

Certainly. A while ago I mentioned the possibility of using Ant’s
variable substitution mechanism:
http://markmail.org/message/mgoxf2ptvoffaok7
Putting such variables (latest FOP version, copyright year, etc.) at all
appropriate places would already be of great help.

Then it’s mainly a matter of streamlining the whole process and removing
as much duplication as possible. Just as an example, do we really need
to duplicate the release notes in the README file?

Speaking of the release, many parts of the website are largely outdated
and need a serious re-work (the Development tab, mainly). Also, any
reference to 0.20.5 should IMO be removed before releasing 1.0. 0.20.5
is a thing of the past now.

Finally, the website could really do with a new look. ATM it’s looking
so... 1990’s. I started to work on that some time ago (based on the
Batik skin), but never got to finishing it. Forrest sometimes gets in
the way, I must say. Maybe switching to an alternative framework could
be investigated. Especially if it can also provide higher automation.


> 2. Implementing features of the XSL-FO 1.1 spec which have remained
> unimplemented.
> 
> 3. Implementing proposed features of the XSL-FO 2.0 spec. For XSL-FO
> 1.0 FOP was the reference implementation. We could host reference
> implementations of newly proposed features.
> 4 ... n. Do we have open issues that would make up a GSoC project?
> 
> On Fri, Mar 12, 2010 at 01:07:21PM +0100, Simon Pepping wrote:
>> Ross Gardler of ASF announced that it is time for our projects to
>> start preparing for Google Summer of Code (GSoC). Do we have ideas for
>> GSoC projects? Are committers willing to be a mentor?
> 
> Simon

Vincent




Re: Google Summer of Code: Bring out your projects

2010-03-16 Thread Vincent Hennebert
Simon Pepping wrote:
> Ross Gardler of ASF announced that it is time for our projects to
> start preparing for Google Summer of Code (GSoC). Do we have ideas for
> GSoC projects? Are committers willing to be a mentor?

If any student comes up with a good project idea in the FOP area, I’d
be happy to be a mentor, as I myself benefited from that a few years
ago.

Vincent


Re: svn commit: r911800 - in /xmlgraphics/fop/trunk/src/documentation/content/xdocs: 0.94/upgrading.xml 0.95/upgrading.xml trunk/upgrading.xml

2010-02-24 Thread Vincent Hennebert
Hi,

Pascal Sancho wrote:
> Hi,
> 
> I partially agree, but...:
>  - both releases (0.94: 2007, 0.95: 2008) are posterior to REC 1.1 (2006);
>  - both releases implement some REC 1.1 new features (Cf. bookmarks).
> 
> I can revert the change, but that will not reflect the above things, IMHO.
> WDYT?

I too agree with the principle, but in the present case the move to
XSL-FO 1.1 was made before the release of FOP 0.94:
http://markmail.org/thread/ku643qmlfklrloya
So it’s rather the website that was lagging way behind, and Pascal’s
change was actually a fix IMO.


Vincent


> --
> Pascal
> 
> Simon Pepping a écrit :
>> Note that pages 0.94/upgrading.xml and 0.95/upgrading.xml talk about
>> versions 0.94 and 0.95, even though the page calls it the latest
>> version. I think it is not correct to talk about FO 1.1 here, because
>> at the time of the releases FO 1.1 was not really considered. These
>> pages are as they were included in those releases, and therefore I
>> think they should not be updated at all except for glaring errors.
>>
>> Simon
>>
>> On Fri, Feb 19, 2010 at 12:37:51PM -, psan...@apache.org wrote:
>>   
>>> Author: psancho
>>> Date: Fri Feb 19 12:37:51 2010
>>> New Revision: 911800
>>>
>>> URL: http://svn.apache.org/viewvc?rev=911800&view=rev
>>> Log:
>>> Fop WebSite: XSL-FO 1.0 references were still there; fixed now
>>>
>>> Modified:
>>> xmlgraphics/fop/trunk/src/documentation/content/xdocs/0.94/upgrading.xml
>>> xmlgraphics/fop/trunk/src/documentation/content/xdocs/0.95/upgrading.xml
>>> 
>>> xmlgraphics/fop/trunk/src/documentation/content/xdocs/trunk/upgrading.xml
>>>
>>> Modified: 
>>> xmlgraphics/fop/trunk/src/documentation/content/xdocs/0.94/upgrading.xml
>>> URL: 
>>> http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/documentation/content/xdocs/0.94/upgrading.xml?rev=911800&r1=911799&r2=911800&view=diff
>>> ==
>>> --- 
>>> xmlgraphics/fop/trunk/src/documentation/content/xdocs/0.94/upgrading.xml 
>>> (original)
>>> +++ 
>>> xmlgraphics/fop/trunk/src/documentation/content/xdocs/0.94/upgrading.xml 
>>> Fri Feb 19 12:37:51 2010
>>> @@ -26,7 +26,7 @@
>>>  
>>>Important!
>>>
>>> -If you're planning to upgrade to the latest FOP version there are 
>>> a few very important things 
>>> +If you're planning to upgrade to the latest FOP version there are 
>>> a few very important things
>>>  to consider:
>>>
>>>
>>> @@ -63,21 +63,21 @@
>>>  
>>>  
>>>
>>> -The new code is much more strict about the interpretation of 
>>> the XSL-FO 1.0 specification.
>>> +The new code is much more strict about the interpretation of 
>>> the XSL-FO 1.1 specification.
>>>  Things that worked fine in version 0.20.5 might start to 
>>> produce warnings or even errors
>>>  now. FOP 0.20.5 contains many bugs which have been corrected 
>>> in the new code.
>>>


Re: java.lang.NullPointerException: org.apache.fop.layoutmgr.inline.InlineStackingLayoutManager.getChangedKnuthElements(InlineStackingLayoutManager.java:375)

2010-02-18 Thread Vincent Hennebert
Hi Mathieu,

Mathieu Malaterre wrote:
> [previously sent in fop-user mailing list]
> 
> hi there,
> 
> This is my first post to fop-dev, it is suggested to report bug here.

Actually bug reports should be made on Bugzilla:
https://issues.apache.org/bugzilla/enter_bug.cgi?product=Fop

Please could you open a new issue and post your sample FO file there?
That will be easier to keep track of the problem.

Thanks,
Vincent


> Here is my input docbook file:
> 
> 
>  "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"; []>
> 
> 
>  title
> 
> 
> The  id="example.anchor.1"/>anchor element is empty and
> contributes
> nothing to the flow of the content in which it occurs.  It is only useful
> as a target.
> 
> 
> 
> 
> 
> which I process with:
> 
> /usr/bin/xsltproc --stringparam fop1.extensions 1 --stringparam
> ulink.show 0 --xinclude -o test2.fo
> /usr/share/xml/docbook/stylesheet/nwalsh/fo/docbook.xsl test2.xml
> 
> and lead to:
> 

> 
> 
> I am using fop from today's trunk. I have attached the .fo file as
> compressed gzip.
> 
> Let me know if you need more info.
> 
> Thanks !


Re: svn commit: r910239 [1/2] - /xmlgraphics/fop/trunk/src/documentation/content/xdocs/compliance.ihtml

2010-02-18 Thread Vincent Hennebert
Pascal Sancho wrote:
> You are right, Simon.
> I've tried to commit deployment by hand, and it's OK.
> I have to investigate why ForrestBot did not.

FWIW, ForrestBot has never worked for me. From the quick investigation
I made (it was a long time ago), the parsing of the ‘svn st’ it was
doing to figure out which files had to be added was not working. Since
then I’ve always published the website by hand. At least I know what I’m
doing, plus I can remove files that are no longer necessary.


> Pascal
> 
> Simon Pepping a écrit :
>> On Wed, Feb 17, 2010 at 01:29:19PM +0100, Pascal Sancho wrote:
>>   
>>> Hi Vincent,
>>>
>>> I cannot deploy using ForrestBot; I think I do not have write access to
>>> [https://svn.apache.org/repos/asf/xmlgraphics/site/deploy/fop/]
>>> 
>> You should have write access to
>> https://svn.apache.org/repos/asf/xmlgraphics/site/.
>>
>> Simon


Vincent


Re: svn commit: r910239 [1/2] - /xmlgraphics/fop/trunk/src/documentation/content/xdocs/compliance.ihtml

2010-02-17 Thread Vincent Hennebert
Hi Pascal,

Pascal Sancho wrote:
> Hi Vincent,
> 
> I missed this bug, thanks for the link.
> For now, I'm fighting with Forrest...
> Once I understand it, I'll make the change.

Great. The next step is to upload the updated website. See here for more
information:
http://xmlgraphics.apache.org/fop/dev/doc.html#web
Feel free to ask here if you have any question.


Thanks,
Vincent


> Pascal
> 
> Vincent Hennebert a écrit :
>> Hi Pascal,
>>
>>   
>>> Author: psancho
>>> Date: Mon Feb 15 15:43:32 2010
>>> New Revision: 910239
>>>
>>> URL: http://svn.apache.org/viewvc?rev=910239&view=rev
>>> Log:
>>> Added complete list of 1.1 item in compliance page.
>>>  - Updated information for page-number-citation-last, content-width, 
>>> content-height, and page-position
>>>  - Updated links for bookmarks
>>>  - retrieve-table-marker now related to fox:outline
>>> 
>> Good work. This is something that had been needed for a long time.
>> Did you notice the following bug?
>> https://issues.apache.org/bugzilla/show_bug.cgi?id=46565
>> I think any reference to XSL-FO 1.0 can be removed now. So if you feel
>> like giving it another update...
>>
>> Thanks,
>> Vincent


Re: svn commit: r910239 [1/2] - /xmlgraphics/fop/trunk/src/documentation/content/xdocs/compliance.ihtml

2010-02-16 Thread Vincent Hennebert
Hi Pascal,

> Author: psancho
> Date: Mon Feb 15 15:43:32 2010
> New Revision: 910239
> 
> URL: http://svn.apache.org/viewvc?rev=910239&view=rev
> Log:
> Added complete list of 1.1 item in compliance page.
>  - Updated information for page-number-citation-last, content-width, 
> content-height, and page-position
>  - Updated links for bookmarks
>  - retrieve-table-marker now related to fox:outline

Good work. This is something that had been needed for a long time.
Did you notice the following bug?
https://issues.apache.org/bugzilla/show_bug.cgi?id=46565
I think any reference to XSL-FO 1.0 can be removed now. So if you feel
like giving it another update...

Thanks,
Vincent


Re: svn commit: r908543 - /xmlgraphics/fop/trunk/src/java/org/apache/fop/fonts/type1/Type1FontLoader.java

2010-02-12 Thread Vincent Hennebert
Hi Jeremias,

> Author: jeremias
> Date: Wed Feb 10 15:37:04 2010
> New Revision: 908543
> 
> URL: http://svn.apache.org/viewvc?rev=908543&view=rev
> Log:
> Bugzilla #48512:
> Bugfix: Don't map AdobeStandardEncoding to StandardEncoding. They are not the 
> same. Fixes problem with invalid character widths on PostScript output and 
> missing umlauts.

What makes you think that they are not the same? What is
AdobeStandardEncoding then, if not the Adobe Standard Encoding [1]
itself?

[1] http://www.adobe.com/devnet/opentype/archives/std_enc.html

Thanks,
Vincent


> Modified:
> 
> xmlgraphics/fop/trunk/src/java/org/apache/fop/fonts/type1/Type1FontLoader.java
> 
> Modified: 
> xmlgraphics/fop/trunk/src/java/org/apache/fop/fonts/type1/Type1FontLoader.java
> URL: 
> http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/fonts/type1/Type1FontLoader.java?rev=908543&r1=908542&r2=908543&view=diff
> ==
> --- 
> xmlgraphics/fop/trunk/src/java/org/apache/fop/fonts/type1/Type1FontLoader.java
>  (original)
> +++ 
> xmlgraphics/fop/trunk/src/java/org/apache/fop/fonts/type1/Type1FontLoader.java
>  Wed Feb 10 15:37:04 2010
> @@ -141,7 +141,7 @@
>  if (afm != null) {
>  String encoding = afm.getEncodingScheme();
>  singleFont.setUseNativeEncoding(true);
> -if ("AdobeStandardEncoding".equals(encoding)) {
> +if ("StandardEncoding".equals(encoding)) {
>  singleFont.setEncoding(CodePointMapping.STANDARD_ENCODING);
>  } else {
>  String effEncodingName;


Re: trying to debug using eclipse

2010-02-08 Thread Vincent Hennebert
Hi Martin,

Martin Edge wrote:
> Within eclipse it says it's within the 'build path' or do you mean within
> Java's class path? 

You need to refresh the directory for Eclipse to take into account new
files created by the build process.

Select the project in the Package Explorer view and press ‘F5’, or
right-click and select ‘Refresh’.

HTH,
Vincent


> From: Peter Hancock
> Sent: Sunday, 7 February 2010 7:08 PM
> To: fop-dev@xmlgraphics.apache.org; martin.e...@intellimail.com.au
> Subject: Re: trying to debug using eclipse
> 
>  
> 
> Hi Martin
> 
> Is build/gensrc on your classpath?  This gets generated during the default
> ant task.
> 
> Pete
> 
> On Sat, Feb 6, 2010 at 3:28 AM,  wrote:
> 
> Hi Guys,
> 
> Wondering if there are any tips of what i'm doing wrong - have built the
> application using ant, and it says it was built successfully. Can see the
> event-models.xml in the accessibility section, however, when running the
> application from eclipse, I get:
> 
> INFO: Default page-width set to: 210mm
> Exception in thread "main" java.lang.ExceptionInInitializerError
>     at org.apache.fop.apps.FOUserAgent.(FOUserAgent.java:102)
>     at org.apache.fop.apps.FopFactory.newFOUserAgent(FopFactory.java:188)
>     at org.apache.fop.cli.CommandLineOptions.parse(CommandLineOptions.java:171)
>     at org.apache.fop.cli.Main.startFOP(Main.java:158)
>     at org.apache.fop.cli.Main.main(Main.java:205)
> Caused by: java.util.MissingResourceException: File event-model.xml not found
>     at org.apache.fop.events.model.AbstractEventModelFactory.loadModel(AbstractEventModelFactory.java:46)
>     at org.apache.fop.accessibility.AccessibilityEventProducer$EventModelFactory.createEventModel(AccessibilityEventProducer.java:54)
>     at org.apache.fop.events.DefaultEventBroadcaster.(DefaultEventBroadcaster.java:73)
>     ... 5 more
> 
> 
> Any suggestions on where I should start looking?
> 
> Thanks
> Martin


Re: ConcurrentModificationException error

2010-01-29 Thread Vincent Hennebert
Hi,

FYI, I’ve just committed a fix for this bug:
http://svn.apache.org/viewvc?rev=904467&view=rev

If you ever need to you can apply the patch to FOP 0.95’s source code as
is.

As a side note, introducing a new object just for synchronization
purpose is not necessary. The class object (PDFICCBasedColorSpace.class)
can be used for that.
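A minimal sketch of what I mean, outside of FOP (ClassLockDemo and its methods are hypothetical names; unsafeLoad() merely stands in for a library call that is not thread-safe, such as ICC_Profile.getInstance()):

```java
import java.util.ArrayList;
import java.util.List;

public class ClassLockDemo {
    // Shared state that is not safe for concurrent modification.
    private static final List<String> LOADED = new ArrayList<>();

    /** Stand-in for a call into a library that is not thread-safe. */
    private static void unsafeLoad(String name) {
        LOADED.add(name);
    }

    static void load(String name) {
        // Lock on the class object itself; no dedicated lock field is needed.
        synchronized (ClassLockDemo.class) {
            unsafeLoad(name);
        }
    }

    static int loadedCount() {
        synchronized (ClassLockDemo.class) {
            return LOADED.size();
        }
    }

    /** Runs threadCount threads that each call load() perThread times. */
    static int loadConcurrently(int threadCount, final int perThread) {
        synchronized (ClassLockDemo.class) {
            LOADED.clear();
        }
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    load("profile-" + id + "-" + j);
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return loadedCount();
    }

    public static void main(String[] args) {
        System.out.println(loadConcurrently(8, 1000)); // prints 8000
    }
}
```

One thing to keep in mind: the class object is visible to any other code, so unrelated code could in principle synchronize on the same lock. For a single short call like this one that is usually an acceptable trade-off.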

Regards,
Vincent


Anil Pinto wrote:
> Andreas,
> 
> I have not encountered the problem since, thankfully, so you are right the
> chances of it happening seem to be quite slim ;-)
> 
> Though I am glad you replied, I was getting a feeling the question was
> ignored for some reason unknown to me.
> 
> Appreciate the time taken to describe in detail the cause and possible fix.
> I guess I can wait until the patch gets applied and is available in the next
> stable release.
> 
> Thank you very much for your response. Have a great weekend ahead.
> Anil Pinto.
> 
> Lobo Technologies, Inc.
> 16980 Via Tazon, Suite 120, San Diego, CA 92127
> Voice : 858-485-9033 x 103
> Fax   : 858-485-9152
> 
> 
> -Original Message-
> From: Andreas Delmelle [mailto:andreas.delme...@telenet.be]
> Sent: Thursday, January 14, 2010 10:36 AM
> To: Anil Pinto
> Cc: fop-dev@xmlgraphics.apache.org
> Subject: Re: ConcurrentModificationException error
> 
> 
> On 10 Dec 2009, at 03:24, Anil Pinto wrote:
> 
> Hi Anil
> 
> (Didn't see a response for this one come in, so far on fop-us...@...
> Apologies if the reply comes a bit late.)
> 
>> We have FOP (0.95) embedded in a multithreaded environment to create many
>> PDFs almost simultaneously.
>> We have been using this configuration for 6 months plus now. I noticed the
>> following trace for the first time and it caught my attention, as I thought
>> we have followed all the multithreaded requirements required by FOP.
> 
> It is pointing to a bug in FOP, due to a slight oversight in making use of
> java.awt.color.ICC_Profile, IIC.
> 
>> java.util.ConcurrentModificationException
>>  at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
>>  at java.util.AbstractList$Itr.next(AbstractList.java:343)
>>  at sun.awt.color.ProfileDeferralMgr.activateProfiles(ProfileDeferralMgr.java:75)
>>  at java.awt.color.ICC_Profile.getInstance(ICC_Profile.java:756)
>>  at java.awt.color.ICC_Profile.getInstance(ICC_Profile.java:996)
> 
> Checking the Javadocs, there is no mention anywhere of the multi-thread
> (un)safety of ICC_Profile or the call to getInstance(). So, I think we can
> only safely assume that this means it is unsafe.
> 
>>  at org.apache.fop.pdf.PDFICCBasedColorSpace.setupsRGBColorProfile(PDFICCBasedColorSpace.java:140)
> 
> Given that it is a static method calling another static method, the chances
> of anything bad happening are very slim, but so you stumbled upon it. :(
> Seems a perfect example of a race condition, though: you mean this is the
> first time in all those 6 months that this error occurred? Very slim indeed,
> then!
> 
> As for the good news (I hope I am correct about this):
> FOP can solve this easily, either by making setupsRGBColorProfile() a
> synchronized method, or by performing only the call to
> ICC_Profile.getInstance() in a synchronized block. My preference goes in the
> direction of the latter, as that limits the synchronization overhead to the
> single call into the AWT library, which is causing the issue. The rest of
> the method appears safe for concurrent runs, at first glance.
> The (minor) downside is that we would have to introduce a new static final
> to synchronize the calls on.
> 
> Very quick patch below (vs current trunk; don't know if it can be applied to
> 0.95 without small changes...).
> 
> 
> HTH!
> 
> Regards,
> 
> Andreas
> 
> ---
> Index: src/java/org/apache/fop/pdf/PDFICCBasedColorSpace.java
> ===
> --- src/java/org/apache/fop/pdf/PDFICCBasedColorSpace.java(revision 
> 679326)
> +++ src/java/org/apache/fop/pdf/PDFICCBasedColorSpace.javaWed Jan 13
> 20:29:07 CET 2010
> @@ -34,6 +34,8 @@
>  private PDFICCStream iccStream;
>  private String explicitName;
> 
> +private static final Object _S = new Object();
> +
>  /**
>   * Constructs a the ICCBased color space with an explicit name (ex.
> "DefaultRGB").
>   * @param explicitName an explicit name or null if a name should be
> generated
> @@ -137,7 +139,9 @@
>  InputStream in = PDFDocument.class.getResourceAsStream("sRGB Color
> Space Profile.icm");
>  if (in != null) {
>  try {
> +synchronized (_S) {
> -profile = ICC_Profile.getInstance(in);
> +profile = ICC_Profile.getInstance(in);
> +}
>  } catch (IOException ioe) {
>  throw new RuntimeException(
>  "Unexpected IOException loading the sRGB profile: "
> + ioe.getMessage());
> ---
> 


Re: Sharing font code with FontBox

2010-01-19 Thread Vincent Hennebert
Hi Simon,

Simon Pepping wrote:
> From a FontBox report: We are still working on integrating Villu's
> patch which adds support for Adobe CFF/Type2 fonts to FontBox.
> 
> This raises my question: Can we share font handling code with FontBox?

A priori yes. Our own font code definitely needs improvements, so the
question is whether it is better to drop it altogether and switch to
FontBox, or if we should keep it because FontBox doesn’t match our needs
closely enough.

I’d be interested to hear from Alexander Kiel. He seems to have gone
a different route for implementing OpenType support and I’d be curious
to know why.

Vincent


Re: svn commit: r898845 - /xmlgraphics/fop/trunk/test/accessibility/pdf/

2010-01-14 Thread Vincent Hennebert
Hi Simon,

Simon Pepping wrote:
> On Wed, Jan 13, 2010 at 05:17:03PM -, vhenneb...@apache.org wrote:
>> Author: vhennebert
>> Date: Wed Jan 13 17:17:01 2010
>> New Revision: 898845
>>
>> URL: http://svn.apache.org/viewvc?rev=898845&view=rev
>> Log:
>> Updated reference accessible PDF files. Old ones had "Apache FOP Version SVN 
>> branches/Temp_Accessibility" as Creator and Producer values. New ones have 
>> "Apache FOP Version SVN trunk". This was causing spurious differences when 
>> testing PDF accessibility.
>  
> Where is the version property set?

In FOUserAgent, producer field is set to "Apache FOP Version
" + Version.getVersion() and returned by the getProducer() method; That
method is called, among others, in PDFRenderingUtil.setupPDFDocument.
Something similar is done (I suppose, haven’t checked) for Creator.

Is that what you asked for?

Vincent


Re: FOP and large documents: out of memory

2010-01-14 Thread Vincent Hennebert
Hi Stephan,

I’m not sure I would invest any energy into improving the
CachedRenderPagesModel (-conserve option). It doesn’t look like the
right approach to me, and like you noticed it doesn’t even work out of
the box currently.

Why store the Area Tree on disk? Why not directly render it into the
final output format? If that latter supports out-of-order pages, then
that’s great; Otherwise we may as well store the final pages and order
them later on when the document is complete, instead of storing them in
a half-finished area tree format.

As to pages that hold unresolved references, so can’t obviously be
rendered yet: there usually aren’t that many of them that would make the
area tree solution vastly superior to a final format one in terms of
memory consumption. Those ones could be kept in memory until all the
references they hold are resolved.

Also, the handling of forward references is currently less than optimal.
The resolution is made in the area tree instead of looping back to the
layout engine. ATM, a page-reference is rendered using a placeholder
string (‘MMM’), and that placeholder is later replaced with the actual
value (e.g., ‘5’). This is fine for constructs like tables of content,
but may produce ugly results if the page-number-citation is inside
a paragraph, ruining the even spacing. What’s the point of implementing
a high-quality line-breaking algorithm if its output is spoiled by
a poor handling of page citations?

I think the two-pass approach is the best long-term solution, although
obviously less trivial. One challenge is to detect a possible infinite
loop. For example: referenced item is at the beginning of page IX,
reference is updated to IX, which takes less room than MMM, so the
document is re-laid out and referenced item is moved to page VIII;
Reference must be updated again, document is laid out again and
referenced item ends up on page IX again. And again, and again...
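A rough sketch of how such a loop could be detected, assuming the whole layout can be reduced to a function from "page number currently printed in the citation" to "page on which the referenced item ends up". That function (relayout below) is purely hypothetical, as are the class and method names; nothing like this exists in FOP today:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntUnaryOperator;

public class ReferenceResolutionDemo {
    /**
     * Re-runs the layout until the page number fed in equals the page number
     * that comes out. Returns the stable page number, or -1 if the passes
     * start cycling or maxPasses is exhausted.
     */
    static int resolve(int initialGuess, IntUnaryOperator relayout, int maxPasses) {
        Set<Integer> seen = new HashSet<>();
        int current = initialGuess;
        for (int pass = 0; pass < maxPasses; pass++) {
            if (!seen.add(current)) {
                return -1; // this value was already tried: the layout oscillates
            }
            int next = relayout.applyAsInt(current);
            if (next == current) {
                return current; // fixed point: citation and actual page agree
            }
            current = next;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Well-behaved document: the item always lands on page 5.
        System.out.println(resolve(0, page -> 5, 10)); // prints 5
        // The IX/VIII example: citing 9 moves the item to 8, citing 8 moves it to 9.
        System.out.println(resolve(9, page -> page == 9 ? 8 : 9, 10)); // prints -1
    }
}
```

When a cycle is detected, a real implementation would have to pick one of the candidate layouts, for example by reserving the width of the widest page number seen so far.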


One possible workaround for your use case is to generate your document
once with a dummy TOC and just “Page X” into the intermediate format;
Parse it to get the total number of pages and the page numbers for each
element of the TOC; Re-generate it with hardcoded values for page
references.

HTH,
Vincent


Stephan Thesing wrote:
> Hello,
> 
> as is well-known, FOP can run out of heap memory, when large documents
> are processed (http://xmlgraphics.apache.org/fop/0.95/running.html#memory).
> 
> I have the situation that the documents I have to process mandate a footer on 
> each page that contains a "page X of Y" element and a TOC at the
> beginning of the document, i.e. FOP cannot layout the pages until all
> referenced page-citations are known, which is after the last page of the 
> document.
> 
> When page content is quite complicated (e.g. 2000 pages mostly full with 
> tables), the heap space does not suffice to hold all pages until all 
> references can be resolved, thus FOP aborts with out-of-memory.
> 
> Since increasing the heap space does not always work (3 GB heap space was 
> required in one example), I need a better solution for this.
> 
> 1. "-conserve" option
> One alternative would be the "-conserve" option, which serializes the pages 
> to disk and reloads them as needed.
> Although slow, this definitely would be a solution, if it worked, which it 
> doesn't:
>  Our documents include graphics (SVG, PNG), and the serialization with 
> "-conserve" throws an exception, because some class in Batik is not 
> serializable (e.g. "SVGOMAnimatedString" IIRC), thus the page is missing, 
> causing FOP to abort later.
> Thus, Batik would have to be fixed for this.
> 
> 2. Two passes
> Since the pages are kept because of unresolved references, one could do the
> same as e.g. LaTeX always did: process the document twice.
> In a first run, pages are discarded after layout, only the references for 
> page-citations are kept and at the end reused for the second pass
> (when all pages for the citations are finally known).
> For the second run, these id-refs are initially loaded and no pages have
> to be kept.
> This would require more changes in FOP (and should definitely be made 
> optional obviously).
> 
> 
> 
> I would appreciate any comments or other suggestions !
> 
> 
> Best regards
>   Stephan


Implementing More Elaborate PDF Tagging

2010-01-07 Thread Vincent Hennebert
Hi,

Some time ago basic PDF accessibility was implemented in FOP [1]. Part
of the job is to store the document’s logical structure into the PDF
output [2]. Basically, store the information “This content was in
a block”, “that content comes from a table-cell”, etc.

PDF defines a set of standard structure elements and FOP implements
a default mapping of FOs to those structure elements. For example,
fo:root is mapped to Document, fo:block to P (Paragraph), fo:table to
Table, etc.

There is a need to do more fine-grain mapping, and be able to tag
certain fo:block as headers (H1 to H6) instead of simply Paragraphs.
That way the structure of the source document would be more accurately
represented in the PDF.

The role property [3] has been defined pretty much for that purpose;
Its value should be the name of the element from which the FO comes, or
if it’s not enough the URI of an RDF resource describing some structure
type.

Nothing is enforced, though (‘should’, not ‘must’) and I think we can
get away with directly putting a PDF standard structure type (Document,
Part, P, H1, Table, etc.). If a non-standard type is specified, we would
fall back to the default mapping and a warning could possibly be issued.
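To illustrate the intended behaviour, here is a small stand-alone sketch. The class name, the method, and the tables are made up for this message and are not existing FOP code; the set of standard types is only the subset mentioned above, and the default mapping only contains the three pairs given earlier:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class StructureTypeMapper {
    // Subset of the PDF standard structure types, for illustration only.
    private static final Set<String> STANDARD_TYPES = new HashSet<>(Arrays.asList(
            "Document", "Part", "P", "H1", "H2", "H3", "H4", "H5", "H6", "Table"));

    // Default FO-to-structure-type mapping (only the pairs named in this message).
    private static final Map<String, String> DEFAULT_MAPPING = new HashMap<>();
    static {
        DEFAULT_MAPPING.put("fo:root", "Document");
        DEFAULT_MAPPING.put("fo:block", "P");
        DEFAULT_MAPPING.put("fo:table", "Table");
    }

    /**
     * Returns the structure type for an FO: the role if it names a standard
     * type, otherwise the default mapping for that FO (with a warning).
     */
    static String structureType(String foName, String role) {
        if (role != null && STANDARD_TYPES.contains(role)) {
            return role;
        }
        if (role != null) {
            System.err.println("Warning: role '" + role
                    + "' is not a standard structure type; using default for " + foName);
        }
        return DEFAULT_MAPPING.get(foName);
    }

    public static void main(String[] args) {
        System.out.println(structureType("fo:block", "H1"));      // prints H1
        System.out.println(structureType("fo:block", null));      // prints P
        System.out.println(structureType("fo:block", "chapter")); // warns, prints P
    }
}
```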

Since PDF is the only output format that supports logical structure at
the moment, that should be enough for now.

I’m going to implement this enhancement in the next few days. Any
comments or suggestions are welcome.

[1] http://markmail.org/thread/mjskmien2ha6agzb
[2] http://wiki.apache.org/xmlgraphics-fop/LogicalStructure
[3] http://www.w3.org/TR/xsl11/#role


Thanks,
Vincent


Re: Adding support for Arabic to FOP

2010-01-07 Thread Vincent Hennebert
Hi,

Adrian Cumiskey wrote:
> Hi Jonathan,
> 
> I took a stab at applying a patch by Richard Wheeldon some time ago now
> but the solution was not complete (see
> https://issues.apache.org/bugzilla/show_bug.cgi?id=42307).
> 
> Without spending too much time looking into this, I am of the impression
> that there is not a singular place where you could implement this in the
> current architecture as each painter and renderer provides its own
> implementation for text handling.

And this is only half of the story. Changes also need to be made to the
layout engine: character re-ordering, BIDI implementation, glyph
shaping, etc.
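To give an idea of what the BIDI part alone involves, here is a small sketch using the JDK's own java.text.Bidi class. It only resolves directional runs; it does no glyph shaping, and I am not suggesting it is what FOP should use (BidiRunsDemo and describeRuns are names made up for this example):

```java
import java.text.Bidi;

public class BidiRunsDemo {
    /** Lists the directional runs of the text as "LTR[start,limit)" or "RTL[start,limit)". */
    static String describeRuns(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bidi.getRunCount(); i++) {
            // Even embedding levels are left-to-right, odd levels right-to-left.
            String direction = (bidi.getRunLevel(i) % 2 == 0) ? "LTR" : "RTL";
            sb.append(direction)
              .append('[').append(bidi.getRunStart(i))
              .append(',').append(bidi.getRunLimit(i)).append(")\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Latin text with an embedded Arabic word (written with escapes).
        String mixed = "abc \u0645\u0631\u062D\u0628\u0627 def";
        System.out.print(describeRuns(mixed));
    }
}
```

Each run would then have to be re-ordered for display and, for Arabic, passed through glyph shaping, which is where a library like ICU4J comes in.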

You will need to familiarize yourself with the Knuth approach. Some
information is available on the wiki:
http://wiki.apache.org/xmlgraphics-fop/DeveloperPages
But you will most probably also have to read Donald Knuth’s article
“Breaking Paragraphs into Lines”.

The corresponding code is in the org.apache.fop.layout package.

Using the ICU4J library will probably be necessary. Search the fop-dev
archive for ICU4J and you will already find some information on the
topic.

Also, I recently stumbled upon the following article:
http://behdad.org/text/
which might be worth reading. The whole idea being that we should avoid
re-inventing the wheel as much as possible.


> From what I can see, it would currently involve changes to :-
> 
> 1. renderText(TextArea text, Graphics2D g2d, Font font) method in
> Java2DRenderer.
> 
> 2. The renderText() methods in all the Renderer implementations.
> 
> 3. The drawText() method in each Painter implementation.
> 
> The situation appears to be far from ideal.  I'm sorry but I do not have
> any time to help you with this due to other off-project development
> commitments that I have at the moment.  Hope this information is of help
>  to you.
> 
> Best of luck,
> 
> Adrian Cumiskey.
> 
> Jonathan Levinson wrote:
>> Is there a way the software team at InterSystems could work with you
>> (the FOP team) to add support for Arabic to FOP?
>>
>> We do have in-house expertise on Arabic.
>>
>> We are not expert in FOP internals and would need help knowing what
>> areas of the code need to be worked on.
>>
>> I’ve been trying to send you an e-mail about the issues involved but
>> it is being rejected as spam.

Try sending your messages as plain text instead of HTML. That should
reduce the spam rating. Plain text messages on public mailing lists are
preferred anyway.


>> Best Regards,
>>
>> Jonathan Levinson

HTH,
Vincent


Re: Patching with GIT/SVN (Re: Making MinOptMax Immutable)

2009-11-04 Thread Vincent Hennebert
Hi,

Looks like Max is busy with more urgent things :-)

As this patch will affect my future work on the layout engine, I’d like
to take over the patch review.

Your suggestion to use Skype sounds good. That will ease the job a bit.
I’ll contact you off-line to exchange details and arrange a time.

More below:

Alexander Kiel wrote:
> Hi Max,
> 
> you are right. It's always better to have small patches focused on one
> thing. I don't get my MinOptMax patch focused only on the refactoring of
> making MinOptMax immutable.
> 
> In the last half-an-hour I walked myself through all the diffs,
> file-by-file. I must say - except from TextLayoutManager - it is
> possible to understand all changes.
> 
> There are two other things done:
> 
>  - changing the signature of
>    InlineLevelLayoutManager#getWordChars from
>    void getWordChars(StringBuffer sbChars, Position pos) to
>    String getWordChars(Position pos)

What’s the reason for that change?


>  - moving the adjustment enum constants from BlockLevelLayoutManager
>    into its own class.
> 
> All other things are renamings (okay mostly unrelated to MinOptMax) and
> reformattings. The problem with the reformatting is, that I mechanically
> type Ctl + Alt + L in Intellij after each crappy written piece of code.
> I even tried to reformat only selected lines. But one unattended Ctl +
> Alt + L is sufficient :| I mean, my code style options in Intellij
> conform to the FOP coding styles. Mostly the reformatting corrects
> things historically not conforming to the coding styles before.

I’m strongly against reformatting a whole file in one go. At least, as
long as code formatters don’t do a better job at formatting multi-line
statements. They break the line as near to the length limit (100 in FOP)
as possible, instead of breaking as high as possible in the statement’s
hierarchy. For example, in o.a.f.layoutmgr.table.ActiveCell.java:
    elementList.add(iter.nextIndex() - 1,
            new FillerPenalty(minBPD - cumulateLength));

is automatically formatted into:
    elementList.add(iter.nextIndex() - 1, new FillerPenalty(minBPD
            - cumulateLength));

which I find is less readable. Also, sometimes you break a line where
it’s most logical (e.g., keeping variables of similar semantics on one
line), and a code formatter will never be able to do that.

So, please try and ban that Ctl-Alt-L shortcut :-)


Also, from the quick look I had at your patch, many of your
reformattings affect Javadoc comments. I don’t think we have any
enforced convention about Javadoc. Agreeing upon one would probably ease
everyone’s lives.


> Now, I could rewind all the not related refactorings from the patch. But
> I fear that this would be much work.
> 
> So I have one suggestion: Max - maybe we could use Skype and walk
> through the code together. If we both see the same diff and I can answer
> your questions, I think it would be faster than if I removed all the
> unrelated stuff. Maybe if we both came to the conclusion that it would
> be better to remove some aspect entirely - I would do this of course. A
> nice side effect from this Skype session would be that we become more
> familiar with one another.
> 
> If I think about my OpenType patch or topics like refactoring the font
> subsystem and advanced OpenType layout features in text processing, some
> Skype sessions would be very useful.
> 
> This weekend, I'm a bit offside in Brandenburg without internet. So if
> the Skype option is an option I'm happy to talk on Monday - Thursday
> evening.
> 
> 
> Best Regards
> Alex 
> 
> On Thu, 2009-10-29 at 14:45 +0100, Max Berger wrote:
>> Hi Alex,
>> Hi *,
>>
>> if you do not yet have FOP developer access, and you are working on a
>> larger set of problems, please do not submit one large patch - current
>> committers will not have the time to go through every single change.
>> Instead, it is much nicer to have a series of small patches.
>>
>> One option is to use git. There is a current git clone of the FOP source
>> tree available [1][2]. It also provides help to untangle tangled working
>> copies [3]. Git lets you produce patches between different individual
>> changesets [4], and detects if the patches were applied by someone else.
>>
>> References:
>> [1] http://wiki.apache.org/general/GitAtApache
>> [2] git://git.apache.org/fop.git
>> [3] http://tomayko.com/writings/the-thing-about-git
>> [4]
>> http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#sharing-your-changes
>>
>> hth
>>
>> Max
>>


Thanks,
Vincent


Re: svn commit: r830476 - in /xmlgraphics/fop/trunk/src/java: META-INF/services/org.apache.fop.events.model.EventModelFactory org/apache/fop/accessibility/AccessibilityEventProducer.java

2009-10-28 Thread Vincent Hennebert
Jeremias Maerki wrote:
> http://xmlgraphics.apache.org/fop/trunk/events.html#event-model
> 
> "For a given application, there can be multiple event models active at
> the same time. In FOP, each renderer is considered to be a plug-in and
> provides its own specific event model. The individual event models are
> provided through an EventModelFactory. This interface is implemented for
> each event model and registered through the service provider mechanism
> (see the plug-ins section for details)."

Ok. Maybe that’s just me, but after reading that section, I still fail
to understand what an ‘event object model’ is. And since that section
starts by explaining implementation details (names of parameters not
provided by the JVM, need of QDox), I’m inclined to skip it all. After
all, I’m not a developer, but a user of the library.


> And http://xmlgraphics.apache.org/fop/trunk/events.html#plug-ins

That section starts with “The event subsystem is extensible.” I don’t
want to extend it, I just want to use it!

And nowhere is it written that something must also be added to the build
file. This is all confusing.

But I guess I shall feel free to improve it as I see fit.


Sorry for the rant, but I had to get that off my chest.
Vincent


> On 28.10.2009 12:09:24 Vincent Hennebert wrote:
>>> Author: jeremias
>>> Date: Wed Oct 28 09:09:55 2009
>>> New Revision: 830476
>>>
>>> URL: http://svn.apache.org/viewvc?rev=830476&view=rev
>>> Log:
>>> Added missing EventModelFactory to avoid error:
>>>   java.lang.IllegalStateException: Event model doesn't contain the 
>>> definition for org.apache.fop.accessibility.AccessibilityEventProducer
>>>
>>> Modified:
>>> 
>>> xmlgraphics/fop/trunk/src/java/META-INF/services/org.apache.fop.events.model.EventModelFactory
>>> 
>>> xmlgraphics/fop/trunk/src/java/org/apache/fop/accessibility/AccessibilityEventProducer.java
>> How was I supposed to know that those files needed to be modified? Where
>> is the documentation for that? I took the FOValidationEventProducer
>> interface as a model and it didn’t have that EventModelFactory
>> sub-class. And when I tested my change by crafting an input file that
>> would trigger the event I didn’t get that IllegalStateException. I just
>> got the error message relating to the event as expected. The reason why
>> is still a mystery to me.
>>
>>
>> Vincent
> 
> 
> 
> 
> Jeremias Maerki
> 


Re: svn commit: r830476 - in /xmlgraphics/fop/trunk/src/java: META-INF/services/org.apache.fop.events.model.EventModelFactory org/apache/fop/accessibility/AccessibilityEventProducer.java

2009-10-28 Thread Vincent Hennebert
> Author: jeremias
> Date: Wed Oct 28 09:09:55 2009
> New Revision: 830476
> 
> URL: http://svn.apache.org/viewvc?rev=830476&view=rev
> Log:
> Added missing EventModelFactory to avoid error:
>   java.lang.IllegalStateException: Event model doesn't contain the definition 
> for org.apache.fop.accessibility.AccessibilityEventProducer
> 
> Modified:
> 
> xmlgraphics/fop/trunk/src/java/META-INF/services/org.apache.fop.events.model.EventModelFactory
> 
> xmlgraphics/fop/trunk/src/java/org/apache/fop/accessibility/AccessibilityEventProducer.java

How was I supposed to know that those files needed to be modified? Where
is the documentation for that? I took the FOValidationEventProducer
interface as a model and it didn’t have that EventModelFactory
sub-class. And when I tested my change by crafting an input file that
would trigger the event I didn’t get that IllegalStateException. I just
got the error message relating to the event as expected. The reason why
is still a mystery to me.


Vincent


Re: [VOTE][RESULT] Merge the Temp_Accessibility Branch Back to Trunk

2009-10-27 Thread Vincent Hennebert
Time to sum up. So we have 6 +1, no other vote. The vote passes.

I’ll proceed with the merge shortly. If you could refrain from
committing anything to the Trunk until then, I would be grateful.


Thanks,
Vincent


Vincent Hennebert wrote:
> Hi,
> 
> Work on PDF accessibility is basically done. There are still some tests
> to perform and maybe a few tweaks here and there, but the main
> functionality is in place.
> 
> So I’d like to start a vote for merging the branch back to the Trunk:
> https://svn.eu.apache.org/repos/asf/xmlgraphics/fop/branches/Temp_Accessibility
> 
> The vote will last the usual 3 days but, since it’s a non-trivial new
> feature, if any committer would like more time to review it, feel free
> to say so and we can extend the vote to 1 week.
> 
> Attached is the diff between the branch and the Trunk, if this is of any
> help.
> 
> +1 from me.
> 
> Thanks,
> Vincent
> 


Re: Event system

2009-10-27 Thread Vincent Hennebert
Hi,

Jeremias Maerki wrote:
> http://wiki.apache.org/xmlgraphics-fop/ProcessingFeedback is from the
> design phase and lists the reasons:
> - type safety
> - check mechanism to detect missing translations
> - check mechanism to make sure all necessary parameters are really dealt
> with, especially when calling the same event from multiple places.
> 
> I concede that overall the whole thing might look complex, but the end
> result makes for quite clean code on the message production side.
> 
> Adding a new event isn't a big deal at all:
> 1. Add a new method to the EventProducer interface
> 2. Run the "resourcegen" task to update the model and translations
> 3. Fill in the translation
> 
> I believe the whole thing worked out quite nicely. Only recently did I
> have a chance to make use of the event subsystem on a project where I
> needed to detect certain layout problems. That was easily done and works
> nicely. I sometimes think this whole mechanism would even warrant its own
> Apache Commons subproject.
> 
> Design discussions:
> http://markmail.org/thread/bkfrub4334pcmrjd

There is a fundamental flaw in the current design IMO: it’s not well
integrated with Java’s exception-handling mechanism. It’s not the job of
the event broadcaster to throw the exception, it’s the job of the client
code, using the normal exception mechanism. And it’s up to some
higher-level object to catch the exception (or not), route it to the
event notification system, and stop the application with an error code.

All that needs to be ensured is that a localized message can be
associated to the exception. Whether the exception should be an unchecked
exception or not is a design decision pertaining to the client code.
Also, whether there should be a dedicated severity level (fatal) for
exceptions is IMO debatable. I’d say that it’s not needed. The fact that
the application stops or not should be clear enough.
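
To illustrate the division of responsibilities I have in mind, here is
a rough sketch. None of these types exist in FOP: LocalizableException,
Listener, layout and run are hypothetical names, and the message key is
made up. The client code simply throws; a higher-level object catches the
exception, routes it to the notification system, and returns an error code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types exist in FOP. */
final class ExceptionRoutingSketch {

    /** An exception carrying a key under which a localized message can be looked up. */
    static class LocalizableException extends Exception {
        private final String messageKey;

        LocalizableException(String messageKey, String fallbackMessage) {
            super(fallbackMessage);
            this.messageKey = messageKey;
        }

        String getMessageKey() {
            return messageKey;
        }
    }

    /** Stand-in for the event notification system. */
    interface Listener {
        void processEvent(String messageKey, String message);
    }

    /** Client code: throws using the normal mechanism and knows nothing about events. */
    static void layout(boolean overflow) throws LocalizableException {
        if (overflow) {
            throw new LocalizableException("layout.overflow", "Content overflows the page");
        }
    }

    /** Higher-level object: catches, routes to the listener, returns an exit code. */
    static int run(boolean overflow, Listener listener) {
        try {
            layout(overflow);
            return 0;
        } catch (LocalizableException e) {
            listener.processEvent(e.getMessageKey(), e.getMessage());
            return 1;
        }
    }

    public static void main(String[] args) {
        final List<String> seen = new ArrayList<String>();
        Listener listener = new Listener() {
            public void processEvent(String key, String message) {
                seen.add(key + ": " + message);
            }
        };
        System.out.println(run(false, listener) + " " + seen);
        System.out.println(run(true, listener) + " " + seen);
    }
}
```

Whether LocalizableException should be checked or unchecked is left to the
client code, as said above; the sketch uses a checked one only to keep the
control flow visible.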

At the time of that discussion that was not entirely clear to me yet.


Vincent


Re: Generation of *EventProducer.xml [was: Re: svn commit: r828747...]

2009-10-27 Thread Vincent Hennebert
Hi,

Jeremias Maerki wrote:
> On 22.10.2009 19:36:14 Vincent Hennebert wrote:
>> Hi,
>>
>>> Log:
>>> Issue an error when attempting to render an intermediate XML file in 
>>> accessibility mode, but that file wasn't generated with accessibility 
>>> (i.e., does not contain the structure tree)
>>>
>>> Added:
>>> 
>>> xmlgraphics/fop/branches/Temp_Accessibility/src/java/org/apache/fop/accessibility/AccessibilityEventProducer.java
>>>(with props)
>>> 
>>> xmlgraphics/fop/branches/Temp_Accessibility/src/java/org/apache/fop/accessibility/AccessibilityEventProducer.xml
>>>(with props)
>> After creating the AccessibilityEventProducer.xml file and running ‘ant
>> resourcegen’ I discovered that an empty message had been added to
>> src/java/org/apache/fop/events/EventFormatter.xml. Why?
> 
> Because the new file wasn't reflected in the build. All events not
> specifically directed into a special file go into the catch-all file in
> the events package. I've updated the build accordingly:
> http://svn.apache.org/viewvc?rev=828805&view=rev

I have trouble seeing the necessity of adding something to the build
file I must say. (That build file, BTW, is already 1441 lines long. We
should think twice before adding anything to it IMO.)

All eventResourceGenerator tasks are exactly the same. Couldn’t we set
the convention that the translation file corresponding to a certain
EventProducer interface must have the same name and be in the same
directory as the interface itself?

Example: a PDFEventProducer.java file is found in the
org/apache/fop/render/pdf directory; a PDFEventProducer.xml file
containing the translation is expected in that same directory.
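
Such a convention would need no per-producer configuration, since the
location of the translation file can be derived from the interface name
alone. A minimal sketch (the TranslationFileConvention class and its
method are hypothetical, not existing FOP code):

```java
/** Hypothetical helper illustrating the proposed convention; not FOP code. */
final class TranslationFileConvention {

    private TranslationFileConvention() { }

    /** Maps a fully qualified interface name to the resource path of its translation file. */
    static String translationResourceFor(String interfaceName) {
        return interfaceName.replace('.', '/') + ".xml";
    }

    public static void main(String[] args) {
        // prints org/apache/fop/render/pdf/PDFEventProducer.xml
        System.out.println(
                translationResourceFor("org.apache.fop.render.pdf.PDFEventProducer"));
    }
}
```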

Using a catch-all file kills the modularity of the thing IMO. Also, the
individual translation files are called *EventProducer.xml but the
catch-all file is called EventFormatter.xml!?


>> Also, after re-building FOP I regularly find myself with modified
>> *EventProducer.xml files, where the sole modification is an
>> added/removed line break. This is annoying. How can that be avoided?
> 
> These are small differences in behaviour of XML serializers. I guess if
> that is so annoying, we'd have to make sure we always use the same
> serializer (make & version) somehow. We could also experiment with
> removing the XML declaration [1] at the beginning of the file. That
> might get rid of the problem but that's not for sure. I've stumbled over
> this myself a number of times but found it to be only a minor nuisance
> which is why I didn't do anything about it.
> 
> [1] 
> http://java.sun.com/j2se/1.4.2/docs/api/javax/xml/transform/OutputKeys.html#OMIT_XML_DECLARATION

I can see the interest of filling the translation file with empty
messages having correct keys (those are not exactly trivial —although
necessary, I guess). However, there is IMO a non-negligible danger that
the user then forgets to fill in those messages appropriately.

Also, I’m not sure I like having a file that is both manually edited and
automatically generated. That usually doesn’t go well together, as the
automatic generation usually messes up any manual formatting. The above
is an illustration.

If there were the convention that the translation file must be put in
the same package as the EventProducer interface, the key wouldn’t need
to be fully qualified, only the method name would be necessary. Then
I think it’s reasonable to expect the user to fill in the translation
file accordingly, and just check at build time that both the interface
and the translation file are consistent.

Is there anything wrong with that?
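
For what it's worth, the build-time check could be as simple as the
following sketch. Everything here is hypothetical: SampleEventProducer
stands in for a real EventProducer interface, and the set of keys is
assumed to have been read from the translation file by some other code.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical consistency check between an interface and its translation keys. */
final class TranslationConsistencyCheck {

    /** Stand-in for a real EventProducer interface. */
    interface SampleEventProducer {
        void missingStructureTree(Object source);
        void unknownRole(Object source, String role);
    }

    private TranslationConsistencyCheck() { }

    /** Returns the names of interface methods that have no translation key. */
    static Set<String> missingKeys(Class<?> producer, Set<String> translationKeys) {
        Set<String> missing = new TreeSet<String>();
        for (Method m : producer.getDeclaredMethods()) {
            if (!translationKeys.contains(m.getName())) {
                missing.add(m.getName());
            }
        }
        return missing;
    }

    /** Returns the translation keys that match no method of the interface. */
    static Set<String> staleKeys(Class<?> producer, Set<String> translationKeys) {
        Set<String> stale = new TreeSet<String>(translationKeys);
        for (Method m : producer.getDeclaredMethods()) {
            stale.remove(m.getName());
        }
        return stale;
    }

    public static void main(String[] args) {
        Set<String> keys = new TreeSet<String>(
                Arrays.asList("missingStructureTree", "obsoleteEvent"));
        System.out.println("missing: " + missingKeys(SampleEventProducer.class, keys));
        System.out.println("stale:   " + staleKeys(SampleEventProducer.class, keys));
    }
}
```

The build would fail if either set is non-empty, so a forgotten message
is caught without generating anything into a manually edited file.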

Vincent


Re: Bug report - reading config xml

2009-10-26 Thread Vincent Hennebert
Hi Domján,

Domján Gergő wrote:
> Hi Chris,
> 
> Thank You for Your answer.
> I would like to make clear, that it was not me, who added this "pageSize" 
> element arbitrarily to the config file, it was in the sample config xml I 
> downloaded with the FOP package 0.95 from the Apache FOP website.
> So it is definitely a mistake.
> 
> There are two solutions for the problem:
> 1. to delete the "pageSize" element from the sample config xml on the Apache 
> web site
> 2. to add this functionality really to the FOP application.
> 
> As You wrote, the second is also not a big deal (or: "fairly easy"), to 
> someone, who has the development environment for Java installed and who has 
> some experience with Java programming.
> 
> Sadly, I have none of them, so I was wondering, if I could ask someone from 
> the Apache FOP community and who would be so kind to do this (maybe not only) 
> for me.

I’m afraid there is no interest (to my knowledge) in the current
development team in maintaining the text renderer. Our resources are
very sparse and we prefer to concentrate on parts that are more core to
the FOP project. If the text renderer is ever broken by improvements to
the code, it’s probable that it’s not going to be fixed.

Sorry, if you really need this functionality you will have to give it
a go yourself. But I’m not even sure that would be helpful anyway. What
do you want to do? Maybe you could describe your issue in more detail
on the fop-users list, and we could give you some help or suggest
alternative ways to produce your text output. Using the well-supported
PDF output and converting it to text with some PDF post-processing tool
is probably going to give better results, for example. Have a look at
the following thread:
http://markmail.org/thread/y7k2awwwxuihyw2y


HTH,
Vincent


> Thank You in advance
> 
> Gergo
> 
> 
> 
> Chris Bowditch  írta: 
> 
> 
>> Domján Gergő wrote:
>>> Hi Everyone,
>> Hi Domjan,
>>
>>>
>>> I think, I have found a bug.
>>> I want to use FOP to convert xsl-fo files into txt files with layout.
>>>
>>> I experienced, that the layout is distorted because the max column 
>>> number is set to 80 by default. I wanted to change this value using a 
>>> config xml ( fop -c cfg.xml),
>>> so I took the example config xml (distribution 0.95, 
>>> {fop-dir}/conf/fop.xconf).
>>>
>>> It contains this section:
>>>
>>> 
>>> 
>>> 
>>> 
>>>
>>> 
>>>
>>>
>>> This would be exactly what I need, and because this section, I think, 
>>> FOP is supposed to read this information from the configuration xml and 
>>> not only to work with the default value.
>>> The problem is, that the application is not reading this config 
>>> information from the config file.
>> That's right. It is not sufficient to simply add new XML elements into 
>> the config and hope that FOP somehow applies this value to the output. 
>> You have to make FOP read the element by changing the code. This is done 
>> fairly easily in the configure method of class TXTRendererConfigurator. 
>> Then you have to make sure the value extracted there is applied to the 
>> output which in some cases is less trivial.
>>
>>> Then I validated the config file I built, and according to the latest 
>>> .xsd 
>>> (http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/foschema/fop-configuration.xsd,
>>>  
>>> No 731248),
>>> it is not valid, the "pageSize" element does not exist.
>>>
>>> Can You please advise, how to set default column number to higher or 
>>> could anyone please add this functionality/fix this bug?
>> Chris
>>
>>>
>>> Thank You
>>>
>>> doger


Re: Problem in PageBreakingAlgorithm Constructor

2009-10-26 Thread Vincent Hennebert
Hi Alexander,

That piece of code didn’t make sense, in addition to being wrong. It’s
up to the user to add stretching to the footnote separator if they ever
want to. Actually that code could even have led to unexpected results in
some cases.

Since it wasn’t breaking any layout test case, I removed it altogether.
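
For the record, the aliasing problem in the constructor quoted below can
be reproduced in isolation. The following is only a simplified sketch:
Length is a hypothetical mutable stand-in for MinOptMax and Algorithm
a stand-in for PageBreakingAlgorithm; neither is the real class.

```java
/** Simplified reproduction of the aliasing problem; not the actual FOP classes. */
final class AliasingSketch {

    /** Hypothetical mutable stand-in for MinOptMax. */
    static class Length implements Cloneable {
        int min;
        int max;

        Length(int min, int max) {
            this.min = min;
            this.max = max;
        }

        public Object clone() {
            try {
                return super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }
    }

    /** Stand-in for the constructor under discussion. */
    static class Algorithm {
        final Length separatorLength;

        Algorithm(Length separatorLength) {
            this.separatorLength = (Length) separatorLength.clone();
            // The unqualified name refers to the parameter, so it is the
            // caller's object that gets modified, not the private copy
            if (separatorLength.min == separatorLength.max) {
                separatorLength.max += 1;
            }
        }
    }

    public static void main(String[] args) {
        Length callerLength = new Length(10, 10);
        Algorithm algorithm = new Algorithm(callerLength);
        System.out.println("caller's max: " + callerLength.max);
        System.out.println("copy's max:   " + algorithm.separatorLength.max);
    }
}
```

The caller's object ends up with max = 11 while the private copy keeps
max = 10, which is the opposite of what the comment in the original code
intended.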

Thanks,
Vincent


Alexander Kiel wrote:
> Hi,
> 
> the constructor of the class PageBreakingAlgorithm looks like this:
> 
> public PageBreakingAlgorithm(LayoutManager topLevelLM,
>         PageProvider pageProvider,
>         PageBreakingLayoutListener layoutListener,
>         int alignment, int alignmentLast,
>         MinOptMax footnoteSeparatorLength,
>         boolean partOverflowRecovery, boolean autoHeight,
>         boolean favorSinglePart) {
>     super(alignment, alignmentLast, true, partOverflowRecovery, 0);
>     this.topLevelLM = topLevelLM;
>     this.pageProvider = pageProvider;
>     this.layoutListener = layoutListener;
>     best = new BestPageRecords();
>     this.footnoteSeparatorLength = (MinOptMax) footnoteSeparatorLength.clone();
>     // add some stretch, to avoid a restart for every page containing footnotes
>     if (footnoteSeparatorLength.min == footnoteSeparatorLength.max) {
>         footnoteSeparatorLength.max += 1;
>     }
>     this.autoHeight = autoHeight;
>     this.favorSinglePart = favorSinglePart;
> }
> 
> The problem is the line:
> 
> footnoteSeparatorLength.max += 1;
> 
> I think it should read rather:
> 
> this.footnoteSeparatorLength.max += 1;
> 
> Clients calling the constructor shouldn't be happy about this situation.
> 
> I discovered this statement while refactoring the MinOptMax class into
> an immutable one. I think this refactoring project should be another
> mail. But this example shows how valuable an immutable MinOptMax would
> be.
> 
> Can someone familiar with this part of FOP write a test which fails
> against this current behavior? I could then use this test to verify that
> my immutable MinOptMax works with this part.
> 
> 
> Thanks
> Alex
> 


Re: Update qdox-1.6.3.jar to qdox-1.10.jar

2009-10-26 Thread Vincent Hennebert
Hi Alexander,

Alexander Kiel wrote:
> Hi,
> 
> can we update QDox from 1.6.3 to 1.10? 
> 
> I have an issue with QDox 1.6.3: It can't parse my Fixed16 class (which
> I have attached to this mail). The problem is in line 35:
> 
> private static final float DENOMINATOR = (float) (1 << 14);
> 
> It can't parse the shift operator.
> 
> QDox 1.10 works fine.

I’d suggest that you update locally, and we will include the update when
integrating your patch. We must make sure that no backwards
incompatibility has been introduced in newer versions.


> Best Regards
> Alex

Thanks,
Vincent


Inserting Empty Elements Into the Structure Tree [was: Re: [VOTE] Merge the Temp_Accessibility Branch Back to Trunk]

2009-10-23 Thread Vincent Hennebert
Vincent Hennebert wrote:
> Hi,
> 

>> There's another side-effect to tagged PDF: It allows for better text
>> extraction from the document. PDF even describes ways to make
>> round-trips from XML -> PDF -> XML -> PDF if certain conditions were met.
>> However, we don't do that.
> 
> Speaking of that, the current code doesn’t insert empty elements (like
> <fo:block/>) into the structure tree. The corresponding StructElem
> object /is/ created, but is not linked to its parent. Actually it’s
> present in the PDF without being referred to by any other object.
> I think this is inconsistent, and actually wrong since that would cause
> a loss of information possibly needed by a round-trip transformation.
> I’m going to change that.

I mean, /at some point/ I’m going to change that...

This is not as easily done as it is said. Take the following example:

<fo:block>
  Before the empty block.
  <fo:block/>
  After the empty block.
</fo:block>


What basically happens currently is that two text drawing requests are
made to the PDF renderer. The renderer creates the appropriate PDF
stream and registers the pieces of text as children of the structure
element corresponding to the outer block. But nothing happens regarding
the inner empty block, since obviously there’s nothing to do.

The structure element for the inner empty block can’t be added to the
outer block’s children at creation time, otherwise the logical order
wouldn’t be followed.

From the quick look I had this is a fundamental limitation of the
current approach. There’s no way to know at which place an empty element
must be inserted into the children list of its parent.
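
The ordering problem can be shown with a toy model. This is a deliberate
simplification, not the actual PDF library code: the children list is
reduced to a list of strings and buildKids is a hypothetical method.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the children list of the outer block's structure element. */
final class StructureOrderSketch {

    private StructureOrderSketch() { }

    static List<String> buildKids(boolean linkEmptyAtCreation) {
        List<String> kids = new ArrayList<String>();
        // Structure elements are created before any text is drawn, so linking
        // the empty block to its parent at that point puts it first
        if (linkEmptyAtCreation) {
            kids.add("<empty block>");
        }
        // Later, each text drawing request registers its piece of text
        kids.add("Before the empty block.");
        kids.add("After the empty block.");
        return kids;
    }

    public static void main(String[] args) {
        System.out.println(buildKids(false)); // empty block absent: information lost
        System.out.println(buildKids(true));  // empty block present, but out of order
    }
}
```

Neither variant produces the logical order, in which the empty block
sits between the two pieces of text.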

The only way to solve this issue probably is to integrate the handling
of the logical structure into the whole processing chain, passing the
suitable information from the FO tree to the layout engine to the area
tree to the renderer. Probably something that should have been done from
the beginning but this is all but trivial.

Vincent


Re: [VOTE] Merge the Temp_Accessibility Branch Back to Trunk

2009-10-23 Thread Vincent Hennebert
Hi,

Just a few clarifications:

Jeremias Maerki wrote:
> On 22.10.2009 21:15:40 Simon Pepping wrote:

>> Can you summarize what the branch tries to achieve?
> 
> I'll try. In short: it provides the Tagged PDF feature that some people
> have always wanted.
> 
> Long story: Without the accessibility/document structure feature, FOP
> simply produces pages with visual content. Visually impaired people need
> tools like a screen reader to read documents to them. For that the reader
> needs to know which parts of a page are important and which are not, and
> in which order the elements should be read. It needs to know that a
> sentence continues on the next page without stumbling over the page
> footer in the middle of the sentence.

This is something that the branch doesn’t actually do yet... The
header/footer will be read at every new page, in the middle of the
sentence.
I don’t know yet how to fix that, and I’m not sure if that should be
done blindly anyway. It could be imagined that in some elaborate layouts
the side-regions have content that the author wants to be read aloud.



> There's another side-effect to tagged PDF: It allows for better text
> extraction from the document. PDF even describes ways to make
> round-trips from XML -> PDF -> XML -> PDF if certain conditions were met.
> However, we don't do that.

Speaking of that, the current code doesn’t insert empty elements (like
<fo:block/>) into the structure tree. The corresponding StructElem
object /is/ created, but is not linked to its parent. Actually it’s
present in the PDF without being referred to by any other object.
I think this is inconsistent, and actually wrong since that would cause
a loss of information possibly needed by a round-trip transformation.
I’m going to change that.



>>> The vote will last the usual 3 days but, since it’s a non-trivial new
>>> feature, if any committer would like more time to review it, feel free
>>> to say so and we can extend the vote to 1 week.
>> Can you make that 3 working days?
> 
> Does that imply you don't work 7 days a week? ;-) Working days are what
> we usually apply here, don't we?

Errr... no. At least it’s just by chance if all the votes I’ve launched
so far turned out to last 3 working days. I usually just wait until most
active committers have voted. Speaking of working days doesn’t make much
sense to me anyway since not all committers work on FOP in their day
jobs. Some of them may actually be more active at week-ends.

All that said, I’m happy to make the vote last longer as Simon
requested. And to ensure that it lasts at least 3 working days from now
on.


Vincent


Generation of *EventProducer.xml [was: Re: svn commit: r828747...]

2009-10-22 Thread Vincent Hennebert
Hi,

> Log:
> Issue an error when attempting to render an intermediate XML file in 
> accessibility mode, but that file wasn't generated with accessibility (i.e., 
> does not contain the structure tree)
> 
> Added:
> 
> xmlgraphics/fop/branches/Temp_Accessibility/src/java/org/apache/fop/accessibility/AccessibilityEventProducer.java
>(with props)
> 
> xmlgraphics/fop/branches/Temp_Accessibility/src/java/org/apache/fop/accessibility/AccessibilityEventProducer.xml
>(with props)

After creating the AccessibilityEventProducer.xml file and running ‘ant
resourcegen’ I discovered that an empty message had been added to
src/java/org/apache/fop/events/EventFormatter.xml. Why?

Also, after re-building FOP I regularly find myself with modified
*EventProducer.xml files, where the sole modification is an
added/removed line break. This is annoying. How can that be avoided?


Thanks,
Vincent


Re: svn commit: r827023 - /xmlgraphics/fop/trunk/src/java/org/apache/fop/cli/InputHandler.java

2009-10-21 Thread Vincent Hennebert
Hi Simon,

Simon Pepping wrote:
> On Tue, Oct 20, 2009 at 10:04:09AM -, vhenneb...@apache.org wrote:
>> Author: vhennebert
>> Date: Tue Oct 20 10:04:09 2009
>> New Revision: 827023
>>
>> URL: http://svn.apache.org/viewvc?rev=827023&view=rev
>> Log:
>> Fixed checkstyle issues.
>> Factorized duplicated code into getXMLReader method.
>>
>> Modified:
>> xmlgraphics/fop/trunk/src/java/org/apache/fop/cli/InputHandler.java
>>
>>  /**
>> - * Create a catalog resolver and use it for XML parsing and XSLT URI 
>> resolution
>> + * Creates a catalog resolver and use it for XML parsing and XSLT URI 
>> resolution.
>>   * Try the Apache Commons Resolver, and if unsuccessful,
>>   * try the same built into Java 6
>>   */
> 
> The above text uses inconsistent language: creates, use, try.

Right. The original purpose of the change was to add the missing period
at the end of the first sentence, then I saw the missing ‘s’ [1] and
mechanically added it without checking the rest.

[1] http://java.sun.com/j2se/javadoc/writingdoccomments/index.html#styleguide


Thanks for double-checking,
Vincent


License Headers in Test Files? [was: Re: svn commit: r827725 - in /xmlgraphics/fop/branches/Temp_Accessibility/test: accessibility/ accessibility/pdf/ resources/images/]

2009-10-20 Thread Vincent Hennebert
Hi,

In doubt, I added them, but is it actually necessary to put license
headers in such test files? The header is bigger than the actual
content...

Vincent


> Added: 
> xmlgraphics/fop/branches/Temp_Accessibility/test/accessibility/background-image_jpg_repeat.fo
> URL: 
> http://svn.apache.org/viewvc/xmlgraphics/fop/branches/Temp_Accessibility/test/accessibility/background-image_jpg_repeat.fo?rev=827725&view=auto
> ==
> --- 
> xmlgraphics/fop/branches/Temp_Accessibility/test/accessibility/background-image_jpg_repeat.fo
>  (added)
> +++ 
> xmlgraphics/fop/branches/Temp_Accessibility/test/accessibility/background-image_jpg_repeat.fo
>  Tue Oct 20 16:24:44 2009
> @@ -0,0 +1,34 @@
> +
> +
> +
> +http://www.w3.org/1999/XSL/Format";>
> +  
> + +  page-height="220pt" page-width="320pt" margin="10pt">
> +  
> +
> +  
> +  
> + text-align="justify">
> +  Apache FOP (Formatting Objects Processor) is a print 
> formatter driven by XSL 
> +formatting objects (XSL-FO) and an output independent formatter. It 
> is a Java application 
> +that reads a formatting object (FO) tree and renders the resulting 
> pages to a specified 
> +output.
> +
> +  
> +


Initial Values of Variables [was: Re: svn commit: r825875 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop/cli: CommandLineOptions.java InputHandler.java]

2009-10-20 Thread Vincent Hennebert
Hi,

Just a nit:

> +private boolean useCatalogResolver = false;
> +private EntityResolver entityResolver = null;
> +private URIResolver uriResolver = null;

Those fields are being initialized to their default values. The Java
Language Specification states [1] that every field is initialized with
a default value when it is created: basically 0 for numbers, false for
booleans, and null for objects. So explicitly initializing them with
their default values is just noise.
I’d like to suggest that everyone avoid such unnecessary initializations
in the future.

[1] 
http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.12.5
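
To illustrate the point, here is a minimal sketch. The class and its
members are hypothetical, made up only for this example; they are not
the actual fields of CommandLineOptions. Fields left without an
initializer already hold their default values:

```java
/** Hypothetical class, used only to show default field values. */
final class DefaultValuesDemo {
    // No explicit initializers: the defaults are assigned automatically.
    private boolean useCatalogResolver;   // false
    private Object entityResolver;        // null
    private int count;                    // 0

    boolean isUseCatalogResolver() { return useCatalogResolver; }
    Object getEntityResolver() { return entityResolver; }
    int getCount() { return count; }

    public static void main(String[] args) {
        DefaultValuesDemo demo = new DefaultValuesDemo();
        System.out.println(demo.isUseCatalogResolver());
        System.out.println(demo.getEntityResolver());
        System.out.println(demo.getCount());
    }
}
```

Note that this applies to fields only; local variables get no default
value and must be assigned before use.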

Thanks,
Vincent


Re: When is PDFImageHandlerGraphics2D used?

2009-10-16 Thread Vincent Hennebert
Hi Jeremias,

Thanks for the hint. I tried with a WMF image and found out that this
class is used by both the renderer and the painter. I’ve just
implemented accessibility support in it.

Vincent


Jeremias Maerki wrote:
> And you can use a WMF (Windows Metafile) file. See
> org.apache.fop.image.loader.batik.ImageConverterWMF2G2D.java
> 
> On 15.10.2009 18:30:46 Jeremias Maerki wrote:
>> Hi Vincent
>>
>> That would be Barcode4J, for example. If the latest code from
>> Barcode4J's CVS HEAD is used, ImageConverters get registered. One of
>> those can convert Barcode XML to Graphics2D which should be picked over
>> the Barcode XML -> SVG implementation in the PDF case. But that will
>> only be triggered with the new IF. With the renderer, I assume (without
>> checking) the XMLHandler interface will be used.
>>
>> On 15.10.2009 17:51:01 Vincent Hennebert wrote:
>>> Hi,
>>>
>>> I tried every image format I could think of, none triggers the use of
>>> PDFImageHandlerGraphics2D. AFAIU this is the only image handler that
>>> doesn’t call PDFRenderer.placeImage or renderDocument in return, in
>>> which accessibility is handled. So if that handler is used the generated
>>> PDF will be invalid.
>>>
>>> What type of images is that handler used for?
>>>
>>> Thanks,
>>> Vincent
>>
>>
>>
>> Jeremias Maerki
> 
> 
> 
> 
> Jeremias Maerki
> 


When is PDFImageHandlerGraphics2D used?

2009-10-15 Thread Vincent Hennebert
Hi,

I tried every image format I could think of, none triggers the use of
PDFImageHandlerGraphics2D. AFAIU this is the only image handler that
doesn’t call PDFRenderer.placeImage or renderDocument in return, in
which accessibility is handled. So if that handler is used the generated
PDF will be invalid.

What type of images is that handler used for?

Thanks,
Vincent


Re: Regular expression use

2009-10-08 Thread Vincent Hennebert
Hi Jonathan,

Jonathan Levinson wrote:
> I'm sure someone has mentioned it already but what about the lexer support in 
> ANTLR?
> 
> http://www.antlr.org/wiki/display/ANTLR3/FAQ+-+Lexical+analysis
> 
> ANTLR is available under the BSD license, which seems to be one with no 
> strings attached:
> 
> http://www.antlr.org/license.html

Basically we’re back to the same discussion as about the parser
generator, this time at the lexer level.
http://markmail.org/thread/64rmyl7x4nyoxhh3

Among the tools mentioned in the above thread, it would be good to know
which ones allow the lexer to be used independently of the parser. Unless
we decide to use both the lexer and parser anyway...


Vincent


> Best Regards,
> Jonathan S. Levinson
> 
> -Original Message-
> From: Vincent Hennebert [mailto:vhenneb...@gmail.com] 
> Sent: Wednesday, October 07, 2009 6:51 AM
> To: fop-dev@xmlgraphics.apache.org
> Subject: Re: Regular expression use
> 
> Hi Jonathan,
> 
> Jonathan Levinson wrote:
>> I noticed that if one is not careful in one's regular expression use,
>> the compilation for a regular expression can take minutes.  I'm not
>> talking about applying the pattern just compiling it!
>>
>>  
>>
>> Should regular expressions be avoided altogether and should one use
>> hand-crafted state machines for parsing, and tokenizing, or can regular
>> expressions be used as long as one is careful?  
> 
> I’d say, use regular expressions as long as they are not too complex.
> But I guess you’re mentioning that in the context of property parsing,
> in which case I don’t think regular expressions are the ultimate answer.
> A proper lexer is likely to be needed, either generated or written by
> hand. As the latter solution quickly becomes a maintenance nightmare,
> some lexer generator will probably be needed. Question remains, which
> one, and I’m not even sure there’s one that exists whose license is
> ASLv2-compatible. Plus there are some issues specific to property
> parsing, like shorthands (which should ideally re-use the parsers of the
> individual properties), sub-properties, etc.
> 
> 
> Vincent


Re: NPE when using non-base14 font via IF XML

2009-10-07 Thread Vincent Hennebert
Hi Alexander,

Alexander Kiel wrote:
> Hi Vincent,
> 
>> To reproduce: put the config file at the root of a FOP local copy, then
>> run the following:
>> fop -c config.xconf test.fo -if if.xml
>> fop -c config.xconf -ifin if.xml test.pdf
> 
> I would like to run your example this way, but there is no fop.sh. Is
> there such a thing for the Linux guys or should I write one?

The script is called just fop. Look at the root of the project, it’s
actually a shell script.
http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/fop?view=log


> Best Regards
> Alex

Vincent


Mimic a specific output with the intermediate format [was: Re: NPE when using non-base14 font via IF XML]

2009-10-07 Thread Vincent Hennebert
While I'm at it, and taking the same FO file and config file: specifying
a mime type for the intermediate format doesn’t seem to work, contrary
to the area tree. Take the config file, remove the part corresponding to
the intermediate format (

NPE when using non-base14 font via IF XML

2009-10-07 Thread Vincent Hennebert
Hi,

If I render the attached FO file into IF XML with the attached
configuration file, then render the xml file into PDF, then I get the
following error:
SEVERE: Exception
java.lang.NullPointerException: fontName must not be null
at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:239)
at org.apache.fop.cli.IFInputHandler.renderTo(IFInputHandler.java:77)
at org.apache.fop.cli.Main.startFOP(Main.java:174)
at org.apache.fop.cli.Main.main(Main.java:205)
Caused by: java.lang.NullPointerException: fontName must not be null
at org.apache.fop.render.pdf.PDFPainter.getTypeface(PDFPainter.java:246)
at org.apache.fop.render.pdf.PDFPainter.drawText(PDFPainter.java:269)
at org.apache.fop.render.intermediate.IFParser$Handler$TextHandler.endElement(IFParser.java:487)
at org.apache.fop.render.intermediate.IFParser$Handler.endElement(IFParser.java:277)
at org.apache.xalan.transformer.TransformerIdentityImpl.endElement(TransformerIdentityImpl.java:1101)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at org.apache.xerces.xinclude.XIncludeHandler.endElement(Unknown Source)
at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:484)
at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:236)
... 3 more

To reproduce: put the config file at the root of a FOP local copy, then
run the following:
fop -c config.xconf test.fo -if if.xml
fop -c config.xconf -ifin if.xml test.pdf

Did I miss anything?

Thanks,
Vincent

[Attachment test.fo: XML markup lost in the archive. Namespace
http://www.w3.org/1999/XSL/Format; surviving text content:]

  Apache FOP (Formatting Objects Processor) is a print formatter driven by XSL
formatting objects (XSL-FO) and an output independent formatter. It is a Java application
that reads a formatting object (FO) tree and renders the resulting pages to a specified
output.

[Attachment config.xconf: XML markup lost in the archive. Surviving values:]

  false
  test/resources/fonts/
  null
  flate
  ascii-85


Re: Regular expression use

2009-10-07 Thread Vincent Hennebert
Hi Jonathan,

Jonathan Levinson wrote:
> I noticed that if one is not careful in one's regular expression use,
> the compilation for a regular expression can take minutes.  I'm not
> talking about applying the pattern just compiling it!
> 
>  
> 
> Should regular expressions be avoided altogether and should one use
> hand-crafted state machines for parsing, and tokenizing, or can regular
> expressions be used as long as one is careful?  

I’d say, use regular expressions as long as they are not too complex.
But I guess you’re mentioning that in the context of property parsing,
in which case I don’t think regular expressions are the ultimate answer.
A proper lexer is likely to be needed, either generated or written by
hand. As the latter solution quickly becomes a maintenance nightmare,
some lexer generator will probably be needed. Question remains, which
one, and I’m not even sure there’s one that exists whose license is
ASLv2-compatible. Plus there are some issues specific to property
parsing, like shorthands (which should ideally re-use the parsers of the
individual properties), sub-properties, etc.


Vincent


Re: Writing PDF Documents and other source code parts

2009-10-02 Thread Vincent Hennebert
Hi Alexander,

I can’t really help I’m afraid, as I personally don’t have the necessary
knowledge. It’s probably time to submit what you already have as a patch
attached to a Bugzilla entry:
https://issues.apache.org/bugzilla/enter_bug.cgi?product=Fop
That will allow us to have a look and maybe provide some additional
guidance.

How feasible would it be to write a thin layer on top of your library
that would bridge the gap between it and the current one? That would be
a temporary layer until the PDF code is in turn refactored, allowing you
to keep the new library clean (do we really want write support for
OpenType files??). Refactoring the PDF code now would take you too far
afield. Stay focused on fonts (as much as possible) for now.

BTW, have you submitted your ICLA? 155 new classes... We’re gonna need
one :-)

Thanks,
Vincent


Alexander Kiel wrote:
> Hi,
> 
> I know my goal is to implement basic OpenType support for FOP. But from
> font subsetting/embedding my eyes touched the actual PDF output
> routines.
> 
> I think that this module needs refactoring. If you have a look at the
> PDFWritable interface, there is a really ugly method. The method
> outputInline takes an OutputStream and a Writer, which are related to
> each other. The comment says that the writer is buffered and every time
> you want to write something to the OutputStream, you have to flush the
> Writer first. That's crude.
> 
> What is really needed is some output interface which is able to do both,
> write chars and write bytes.
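
A rough sketch of what such an output abstraction might look like. The
class name PdfOutput and its methods are hypothetical and not part of
FOP, and it assumes that the text portions of the file are plain ASCII.
Text and raw bytes go through the same sink, so nothing needs flushing
between the two:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sink that accepts both text and raw bytes, in call order. */
final class PdfOutput {
    private final OutputStream out;
    private long position;

    PdfOutput(OutputStream out) {
        this.out = out;
    }

    /** Writes text encoded as US-ASCII (an assumption of this sketch). */
    PdfOutput writeText(String text) throws IOException {
        return writeBytes(text.getBytes(StandardCharsets.US_ASCII));
    }

    /** Writes raw bytes and keeps track of the number of bytes written. */
    PdfOutput writeBytes(byte[] data) throws IOException {
        out.write(data);
        position += data.length;
        return this;
    }

    /** Returns the number of bytes written so far. */
    long getPosition() {
        return position;
    }
}
```

Usage would be along the lines of
new PdfOutput(stream).writeText("stream\n").writeBytes(data), with the
byte position available for building a cross-reference table.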
> 
> I had also a look at PDFBox regarding writing PDF's. Maybe we shouldn't
> refactor FOP's own, maybe a bit legacy PDF code. But I don't like PDFBox
> code either.
> 
> So I'm a bit helpless now. The problem is, regardless of what code I
> see, let it be:
> 
> TTFSubSetFile 
> 
> Which is all about, reading a TrueType file, taking account of
> some glyph mapping (the glyphs used) and returning a byte array,
> which contains the bytes of a TrueType file with the subset of
> glyphs. This thing extends TTFFile which is about representing a
> TrueType file mixed with all the reading stuff. Here, reading,
> writing and representing some real world object is mixed in a
> really ugly way.
> 
> PDFFactory
> 
> This class does two things: creating and registering PDF objects.
> A factory should only create objects. Then this class has nearly
> 1800 lines of code. Maybe it is a factory of too many things?
> 
> If I look at the method which interests me "makeFontFile" the
> comment says: "Embeds a font.", but the method name is
> "makeFontFile". "makeFontFile" makes sense in a factory. But
> "Embeds a font." hints that this created font file is actually
> embedded in the PDF document. Then this method has nearly 100
> lines of code, which does all sorts of things that I can't
> understand fast. In some line the TTFSubSetFile is created and
> the resulting bytes go into some PDFTTFStream - okay.
> 
> So do not wonder about memory problems. Here you have whole
> 300 kb+ fonts sitting in arrays.
> 
> MultiByteFont
> 
> It seems to me that the MultiByteFont tracks the glyph usage.
> "getUsedGlyphs", "mapChar", "subSet". I always thought that
> fonts are immutable objects, representing a font program which
> can be shared all over the application. Enjoy building
> a common font source in FOP!
> 
> I don't know how I should integrate my own code into it. I think
> a lot of refactoring is necessary here in order to get the FOP parts
> into some state where I can integrate new code.
> 
> But I'm not sure where to start, not sure if here are enough tests. I
> don't know the overall structure. I'm simply a bit helpless.
> 
> I have a nice fonts.opentype package here with 155 classes and 279 tests
> covering 93 % of the classes and 80 % of the lines. I can already read
> all of the TrueType metrics and OpenType kerning info. I have a class of
> every entity of the OpenType spec and a Reader for every such class.
> That means you can test reading every substructure alone. I think that
> this is a really nice API for reading OpenType files.
> 
> So now as I saw what TTFSubSetFile really has to do, I will start adding
> write support for OpenType files. Then I will write some manipulation
> routine which can build a subset of a file. But I don't like to get the
> glyph mapping info for this manipulation from a MultiByteFont which
> should be really immutable.
> 
> I found it sufficient to write a KerningMapBuilder which stuffs kerning
> pairs into a really nice double nested Map construction. As the comment
> on CustomFont#replaceKerningMap says:
> 
> the kerning map (Map, 
> the integers are character codes)
> 
> Such a highly specialized, self-explaining, problem-oriented data
> structure is spread all over the font system. Know your tools!
> 
> So where to start?
> 
> Best Regards
> Alex
> 


[PDF] Are all images rendered as XObjects?

2009-10-01 Thread Vincent Hennebert
Hi,

Am I right in thinking that images are always rendered into PDF as Image
XObjects, and never as inline images (section 4.8.6 of the PDF
Reference, Third Edition)?

Thanks,
Vincent


Re: Javadoc Codestyle

2009-10-01 Thread Vincent Hennebert
Hi Jeremias,

Jeremias Maerki wrote:
> On 01.10.2009 15:08:55 Vincent Hennebert wrote:
>> Hi Alexander,
>>
>> Alexander Kiel wrote:
>>> Hi,
>>>
>>> do we use ,  or {@code}? I found all three versions. Is there a
>>> Checkstyle for that?
>> Use {@code}. HTML tags should be avoided as much as possible.
>>
>>
>>> Do we introduce a newline between the Javadoc body and the @param,
>>> @return or @throws clause?
>> Yes.
> 
> I'm sure Vincent wanted to write "Yes, that would be my preference.".
> "We", the project as a whole, have no such rule.

I was under the impression that this was an unwritten convention applied
by every active committer. Having re-checked, this is actually not the
case. Sorry.

I indeed find that a blank line makes the comment clearer. But that’s no
big deal. A helpful javadoc is what really counts.


> Our code conventions are here:
> http://xmlgraphics.apache.org/fop/dev/conventions.html
> plus the Checkstyle configuration which has become a de-facto standard,
> you could say. Everything beyond that is personal preference.
> 
> That said, I'm against over-regulating. Can you actually check that
> blank line in Checkstyle? I don't think so. Going beyond what we already
> have in terms of conventions doesn't make much sense as long as no one
> fixes each and every Checkstyle violation in FOP.
> 
>>> Again I see both:
>>>
>>> /**
>>>  * create the /Font object
>>>  *
>>>  * @param fontname the internal name for the font
>>>  * @param subtype the font's subtype
>>>  * @param basefont the base font name
>>>  * @param encoding the character encoding schema used by the font
>>>  */
>>>
>>> /**
>>>  * Sets the Encoding value of the font.
>>>  * @param encoding the encoding
>>>  */
>>>
>>>
>>> Best Regards
>>> Alex
>> Vincent
> 
> 
> 
> 
> Jeremias Maerki

Vincent


Re: Javadoc Codestyle

2009-10-01 Thread Vincent Hennebert
Hi Alexander,

Alexander Kiel wrote:
> Hi,
> 
> do we use ,  or {@code}? I found all three versions. Is there a
> Checkstyle for that?

Use {@code}. HTML tags should be avoided as much as possible.
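
For instance, a comment written that way could look like the sketch
below. The class and method are made up for the example and are not
taken from FOP. The blank line before the block tags is a matter of
personal preference, not a project rule:

```java
/** Hypothetical helper, used only to illustrate the Javadoc style. */
final class FontNames {

    /**
     * Returns the given font name as a PDF name, as in {@code /Helvetica}.
     * Returns {@code null} if {@code name} is {@code null}.
     *
     * @param name the font name, may be {@code null}
     * @return the PDF name, or {@code null}
     */
    static String toPdfName(String name) {
        return name == null ? null : "/" + name;
    }
}
```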


> Do we introduce a newline between the Javadoc body and the @param,
> @return or @throws clause?

Yes.


> Again I see both:
> 
> /**
>  * create the /Font object
>  *
>  * @param fontname the internal name for the font
>  * @param subtype the font's subtype
>  * @param basefont the base font name
>  * @param encoding the character encoding schema used by the font
>  */
> 
> /**
>  * Sets the Encoding value of the font.
>  * @param encoding the encoding
>  */
> 
> 
> Best Regards
> Alex

Vincent


Re: Checkstyle RedundantThrowsCheck

2009-10-01 Thread Vincent Hennebert
Hi Max,

Thanks for your explanation.

Max Berger wrote:
> Hi *,
> 
> this rule is useful in the case where you use common names for
> attributes (such as x, or y), and accidentally "overwrite" them as
> parameters. This again comes back to the point of readability.
> 
> The same variable name should ALWAYS refer to the same variable / value.
> 
> For setters and constructors this makes sense -> after all, in each of
> these you have a simple assignment, and both variables will carry the
> same value.
> 
> But in most other methods, the parameter you pass is NOT assigned to the
> internal variable, so they actually refer to a different thing, and
> that's where the confusion starts.

You definitely have a point here. But I’ve found that in a majority of
cases, warnings raised by this Checkstyle rule are “false positives”,
i.e., they correspond to setters that don’t match the simple setField
pattern (Alexander’s examples are good ones). Insofar as this rule
creates more noise than useful warnings, I’d remove it.
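
To make both sides of the argument concrete, here is a small sketch.
The class is hypothetical and not taken from FOP. The first method is
the kind of “false positive” mentioned above: it acts as a setter but is
not named setEncoding, so, as far as I know, the rule’s ignoreSetter
option would not exempt it. The second method shows the kind of slip the
rule is meant to catch:

```java
/** Hypothetical class illustrating hidden fields. */
final class FontSettings {
    private String encoding;
    private int size;

    // Acts as a setter, but does not follow the setField naming pattern.
    void useEncoding(String encoding) {
        this.encoding = encoding;
    }

    // The parameter hides the field: only the parameter is modified,
    // the field keeps its previous value.
    void resize(int size) {
        size = size * 2;
    }

    String getEncoding() { return encoding; }
    int getSize() { return size; }
}
```

After resize(5) the size field is still 0, which is exactly the
confusion Max describes.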


> I know modern IDEs can show you which variable you actually refer to,
> but this usually requires selecting the variable or hovering over it,
> which you will not do if you are just reading the code trying to
> understand it.
> 
> However, since we cannot agree to keep the rule, I'll have to be content
> with removing it (which is already done).

Yeah, at least waiting for Max’ explanation before applying the majority
rule would have been good.



Vincent


Re: Confused about checkstyle use

2009-10-01 Thread Vincent Hennebert
Hi Alexander,

Alexander Kiel wrote:
> Hi Vincent,
> 
>> Should the rule be disabled because of that? Having proper javadoc on at
>> least public methods is very important. OTOH, this is actually not
>> something Checkstyle can verify. How many methods in the code base have
>> totally useless comments that are there just to avoid a Checkstyle
>> warning...
>>
>> I think I’d prefer to keep the rule, but wouldn’t veto its removal.
> 
> I don't vote for removal too, I only vote for the right to violate it in
> cases one can't add any useful information in the comment.

Hmmm, I think that once we’ve agreed on a Checkstyle config we really
want to follow, we won’t accept any warning at all. It was my intent to
propose that anyway. I think it’s more annoying to have little yellow
exclamation marks attached to every file that contains Checkstyle
warnings (in Eclipse, at least), than have dull javadoc comments.


Vincent


Re: Confused about checkstyle use

2009-09-30 Thread Vincent Hennebert
Hi Alexander,

Alexander Kiel wrote:
> Hi Max,
> 
> First, I will respect every code style of FOP. It's just a matter of
> discussion.
> 
>>> Really? That means commenting every public method even simple Getters
>>> and Setters?
>> Yes. Simple Getter and Setters are the only place where you can
>> publicly document private variables. (in most cases, comment in the
>> getter and link from the setter)
> 
> Yes, that's right. But is this Javadoc better than no Javadoc?
> 
> public class Person {
> 
>     private String firstName;
> 
>     /**
>      * Returns the first name of this person.
>      *
>      * @return the first name of this person.
>      */
>     public String getFirstName() {
>         return firstName;
>     }
> }

Except in the simplest cases like that one, there is always a bit of
additional information that can be added about the variable or its
usage.


>>> Commenting equals(), hashCode() and toString()? I think,
>>> this would be only clutter.
>> /** {@inheritDoc} */
> 
> In my eyes this is enough clutter. I saw classes in FOP with maybe 10
> methods using this /** {@inheritDoc} */. It just distracts the eye from
> reading the actual method name. And it adds absolutely no information for
> the source code reader.

That one is indeed there only to make Checkstyle happy. The Javadoc tool
is able to retrieve by itself the javadoc from the redefined method
(Eclipse as well). I wish Checkstyle could do that too. We will be able
to partially solve that when switching to Java 1.5, by using the
@Override annotation.
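
As a sketch of what that could look like once Java 1.5 is available
(the class below is hypothetical): the overriding methods carry
@Override and no comment of their own, and the Javadoc tool picks up the
documentation from Object. Whether Checkstyle then stops asking for
a comment depends on the configuration, so take that part as an
assumption:

```java
/** Hypothetical value class, used only to illustrate {@code @Override}. */
final class GlyphName {
    private final String name;

    GlyphName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof GlyphName && name.equals(((GlyphName) other).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    @Override
    public String toString() {
        return name;
    }
}
```

The compiler also rejects @Override on a method that does not actually
override anything, which catches mistakes such as an equals method with
the wrong parameter type.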

Should the rule be disabled because of that? Having proper javadoc on at
least public methods is very important. OTOH, this is actually not
something Checkstyle can verify. How many methods in the code base have
totally useless comments that are there just to avoid a Checkstyle
warning...

I think I’d prefer to keep the rule, but wouldn’t veto its removal.


>> would do the trick on those,  UNLESS they implement something which is
>> unexpected (such as the equals methods I recently renamed which did
>> not implement equals) or special (a toString which creates a
>> guaranteed parsable result for example)
> 
> Hmmm. An equals method shouldn't do anything unexpected. But your
> toString() example is a good one. If such standard methods do something
> more than the comment in Object says, then a comment is useful.
> 
> I think it's the same as on simple public methods like the getter from
> above. If your comment doesn't say anything more than the method name
> says already, I don't want to read it.
> 
> Best Regards
> Alex

Vincent


Re: Best Interface for reading OpenType Files

2009-09-30 Thread Vincent Hennebert
Hi Alexander,

Alexander Kiel wrote:
> Hi Vincent,
> 
>> I see. I had in mind to use OpenTypeDataInputStream as the common
>> interface. It actually makes sense to use ImageInputStream instead.
>> Simpler and just as flexible. That will add a direct dependency on
>> a class in the javax.imageio package, but this is not a problem as it is
>> part of the standard library. That ImageInputStream interface is
>> unfortunately named really.
> 
> What did you mean with your last sentence? That ImageInputStream isn't
> well named?

Yes. AFAICT its methods have nothing to do with images. This interface
should probably have been given a more neutral name.
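
As an illustration of how neutral the interface really is, here is
a sketch that reads the first two fields of an sfnt (TrueType/OpenType)
file through it. The class SfntHeaderReader is made up for this example
and is not part of Alexander’s library. ImageInputStream reads in
big-endian order by default, which is what font files use:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;

/** Hypothetical reader for the start of an sfnt offset table. */
final class SfntHeaderReader {

    /** Returns {sfntVersion, numTables} read from the current position. */
    static long[] readHeader(ImageInputStream in) throws IOException {
        long sfntVersion = in.readUnsignedInt();   // uint32
        int numTables = in.readUnsignedShort();    // uint16
        return new long[] {sfntVersion, numTables};
    }

    /** Convenience overload that reads from an in-memory copy of the file. */
    static long[] readHeader(byte[] data) throws IOException {
        try (ImageInputStream in =
                new MemoryCacheImageInputStream(new ByteArrayInputStream(data))) {
            return readHeader(in);
        }
    }
}
```

The same interface also offers seek(), which is handy for jumping to
the individual tables.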


>>>>>> - does the use of serializable objects make sense? What would be more
>>>>>>   efficient: re-parsing font data all the time or re-loading
>>>>>>   serializable object representation of them?
>>>>> You mean the font metrics XML files? I've always asked myself what
>>>>> purpose they serve. No, I don't think we need this. I really don't
>>>>> want to serialize the Advanced OpenType Features! It took me already a
>>>>> good amount of code to parse just a bit of it.
>>>> What I meant was to use the java.io.Serializable interface. I don’t
>>>> indeed think XML representations are any useful, apart maybe for
>>>> debugging purpose or to have a more human-readable version of the font
>>>> file.
>>>> IIC there would be next to nothing to do to cache Serializable objects
>>>> on the hard drive and retrieve them?
>>> Hmmm. Ok. But if we want to use Serializable for that, your classes have
>>> to be very stable. Versioning the Serializable stuff is a real burden in
>>> my opinion. So we will need a cache which detects version changes and
>>> invalidate the objects if so. Do you know such a lib?
>> I was thinking that just catching the InvalidClassException when reading
>> the object would be enough to conclude that the cache is no longer valid
>> and must be re-created. Maybe I’m wrong? I must confess that I have no
>> experience with serialization.
> 
> Yes this could work. But I find it always difficult and time consuming
> to design classes for serialization. And reading the serialized version
> is most likely not much faster than reading the actual OpenType file. So
> I would really want to wait until we have a real performance problem.

Sure. Nothing wrong with that.
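
For the record, the InvalidClassException idea could be sketched roughly
as follows. The class FontCache is hypothetical, nothing like it exists
in FOP under that name. An incompatible or unreadable cache file is
simply treated as a cache miss:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Optional;

/** Hypothetical on-disk cache for serializable font data. */
final class FontCache {

    static void store(File file, Serializable value) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(value);
        }
    }

    /** Returns the cached object, or an empty Optional if the cache is unusable. */
    static Optional<Object> load(File file) {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return Optional.ofNullable(in.readObject());
        } catch (InvalidClassException | ClassNotFoundException e) {
            // The class has changed or disappeared since the cache was written.
            return Optional.empty();
        } catch (IOException e) {
            // Missing or corrupt file: also a cache miss.
            return Optional.empty();
        }
    }
}
```

The caller would re-parse the font file and call store() again whenever
load() comes back empty.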


Thanks,
Vincent


Re: Questionable whether font-shorthand grammar LL(1)

2009-09-30 Thread Vincent Hennebert
Hi Jonathan,

Jonathan Levinson wrote:
> Hi Vincent,
> 
> Excellent ideas!  
> 
> The diagram you drew is extremely useful!
> 
> If the font shorthand sub-language has a grammar that is regular then it also 
> has a grammar that is LL(1).  So recursive descent parsing will work, if 
> there is a regular grammar.
> 
> I think the best way of getting font shorthand to work would proceed in 
> stages:
> 
> 1) First get the current code to properly parse and accept valid font 
> shorthand expressions.  This should be very easy.  The one remaining problem 
> (AFAIK) is the parsing of font-size/line-height where /line-height is 
> optional.   Currently spaces are not allowed around the slash "/" and they 
> should be.  I'm going to try to get to this problem as soon as I have time, 
> probably in a day or so.

The current code predates the switch to Java 1.4 as a minimum
requirement, so couldn’t use the java.util.regex package. Feel free to
make use of regular expressions if you think that will make the job
easier.
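
As a starting point, here is a sketch of how a regular expression could
accept optional spaces around the slash. It only handles the
font-size/line-height part, and assumes that this part has already been
isolated from the rest of the shorthand and that neither value contains
whitespace or a slash. The class name is made up:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for "font-size [ / line-height ]". */
final class SizeAndLineHeight {

    // A font-size, optionally followed by "/" and a line-height;
    // whitespace is allowed on either side of the slash.
    private static final Pattern PATTERN =
            Pattern.compile("([^\\s/]+)(?:\\s*/\\s*([^\\s/]+))?");

    /** Returns {fontSize, lineHeight}; lineHeight is null when absent. */
    static String[] parse(String value) {
        Matcher matcher = PATTERN.matcher(value.trim());
        if (!matcher.matches()) {
            throw new IllegalArgumentException(
                    "Not a font-size[/line-height] value: " + value);
        }
        return new String[] {matcher.group(1), matcher.group(2)};
    }
}
```

Both "12pt/14pt" and "12pt / 14pt" give the same result, and a dangling
slash such as "12pt /" is rejected.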


> 2) Evaluate which parser or automaton approach is the simplest and produces 
> better error states than the current approach.  
> 3) Implement the approach one has chosen in (2).

Good luck!



Vincent


Re: Questionable whether font-shorthand grammar LL(1)

2009-09-30 Thread Vincent Hennebert
Thanks everyone for your parser suggestions. I believe we should be able
to do without one for the font shorthand, but this is definitely
something to keep in mind if we want to improve the parsing of other
properties.

I’m starting to realise that the most difficult part is probably not so
much the grammar parsing as the lexical analysis. To be continued,
I guess...

Vincent


Laurent Caillette wrote:
> Hi all,
> 
> I've never used SableCC or JavaCC so I cannot compare, but I'm using ANTLR a 
> lot. ANTLR is highly customizable and has a very strong community. Its
> integrated development environment offers a debugger and visualization of
> grammar ambiguities. It's not only simple to setup and use, it also offers
> all the comfort you can reasonably dream of when developing grammars.
> 
> Maybe a tool like JarJar could reduce the pain of depending on one more
> library (with all possible conflicts that could happen to FOP users).
> 
> Because code generation has some drawbacks (at least in terms of build 
> complexity) you may be interested by JParsec, which creates parsers 
> dynamically from pure Java code. Disclaimer: never used it.
> http://jparsec.codehaus.org
> 
> Hope this will help you to make a reasonable choice.
> 
> c.
> 
> 
> -Original Message-
> From: berger@gmail.com [mailto:berger@gmail.com] On behalf of Max
> Berger
> Sent: Tuesday, 29 September 2009 13:00
> To: fop-dev@xmlgraphics.apache.org
> Subject: Re: Questionable whether font-shorthand grammar LL(1)
> 
> Hi Vincent,
> 
> 
> 2009/9/29 Vincent Hennebert :
>>> How about specifying the grammar and using a tool such as JavaCC to
>>> generate the actual parser? This way you could focus more on a complete
>>> grammar and have to spend less time writing the parser.
>> That would be the same as using ANTLR. I feel that this is a bit
>> overkill for just parsing the font shorthand property, although that may
>> prove to be useful for other properties that can accept complex
>> expressions.
>> That said, JavaCC is an interesting suggestion, I didn’t think of it. If
>> a choice had to be made between ANTLR and JavaCC, which one would win?
> 
> ANTLR:
> - easy to use
> - requires runtime linking of jar [1] (a *huge* disadvantage imo)
> 
> JavaCC:
> - very sparse documentation
> - generates standalone java classes
> 
> SableCC:
> - better documentation
> - LGPL (And therefore maybe not feasible, although it would only be
> used at compile time and not runtime)
> 
> [1] http://beust.com/weblog/archives/000145.html
> 
> 
> Max


Re: Checkstyle RedundantThrowsCheck

2009-09-29 Thread Vincent Hennebert
Hi Max,

Max Berger wrote:
> Vincent,
> 
> 
> 2009/9/29 Vincent Hennebert:
>> I started to write my own checkstyle configuration from scratch some
>> time ago, enabling everything that looked important to me. But I’d like
>> to test it a bit more before submitting it.
> 
> Same here. See the checkstyle file for JEuclid as an example.
> 
> http://jeuclid.hg.sourceforge.net/hgweb/jeuclid/jeuclid/file/tip/support/build-tools/src/main/resources/jeuclid/checkstyle.xml
> 
>> Speaking of that, there’s a rule that I would suggest to disable: the
>> HiddenFieldCheck. I don’t really see its benefit. It forces to find
>> somewhat artificial names for variables, where the field name is exactly
>> what I want. Sometimes a method doesn’t have a name following the
>> setField pattern, yet still acts as a setter for Field. This rule would
>> make sense if we were using a Hungarian-like notation for variables
>> (mMember, pParam, etc.), but that’s not the case in FOP.
>> WDYT?
> 
> I like the rule, BUT I am ok with an exception for setters and
> constructors (this is IMO a new option in checkstyle 5):
> http://checkstyle.sourceforge.net/config_coding.html#HiddenField

(Actually this option is available in checkstyle 4.)

But what is the benefit of that rule? I find it annoying, so unless I am
convinced of its usefulness I’d rather disable it.


Vincent


[PDF] Entries in number tree not specified as indirect references

2009-09-29 Thread Vincent Hennebert
Hi,

The StructTreeRoot dictionary must have a ParentTree entry whose type is
a number tree. As explained in Section 3.8.5, “Number Trees” of the PDF
Reference, Third Edition, the Nums entry of a number tree node must be
an array of key-value pairs where value is an indirect reference to the
object associated with the key.

This is not what is done in the current implementation of Logical
Structure in FOP (Temp_Accessibility branch). The value (an array) is
directly stored in the array of key-value pairs instead of being
referenced. So technically the PDF produced is invalid. Acrobat doesn’t
seem to complain, though.
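
To make the difference concrete: the current output has the form
/Nums [0 [...] 1 [...]], whereas the form described in the Reference
would be /Nums [0 12 0 R 1 13 0 R]. The sketch below only formats the
latter; the class is hypothetical and has nothing to do with FOP’s PDF
classes, and it assumes generation number 0 for every referenced
object:

```java
/** Hypothetical formatter for the Nums array of a number tree node. */
final class NumberTreeSketch {

    /** Formats a Nums array whose values are indirect references "n 0 R". */
    static String numsWithIndirectValues(int[] keys, int[] objectNumbers) {
        StringBuilder sb = new StringBuilder("/Nums [");
        for (int i = 0; i < keys.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // key, then the indirect reference to the associated object
            sb.append(keys[i]).append(' ').append(objectNumbers[i]).append(" 0 R");
        }
        return sb.append(']').toString();
    }
}
```

Each array that is currently written inline would then become an
indirect object of its own.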

Did I miss anything?
Thanks,
Vincent


Re: Checkstyle RedundantThrowsCheck

2009-09-29 Thread Vincent Hennebert
Hi Max,

Max Berger wrote:
> Alex,
> 
> The checkstyle checks are historically grown, and are therefore
> incomplete. I personally would turn on much more checks for certain
> style issues I like. IMO every option set helps deciding a certain
> factor. So the more checks the better :)

If you think that the current checkstyle could be improved, then by all
means, do suggest changes.

I started to write my own checkstyle configuration from scratch some
time ago, enabling everything that looked important to me. But I’d like
to test it a bit more before submitting it.

Speaking of that, there’s a rule that I would suggest to disable: the
HiddenFieldCheck. I don’t really see its benefit. It forces to find
somewhat artificial names for variables, where the field name is exactly
what I want. Sometimes a method doesn’t have a name following the
setField pattern, yet still acts as a setter for Field. This rule would
make sense if we were using a Hungarian-like notation for variables
(mMember, pParam, etc.), but that’s not the case in FOP.

WDYT?


> (in short: +1 to your changes).
> 
> Right now we have 3 checkstyle files: 3.5, 4.0, and 5.0, which also
> means the checks would need to be added in all of them (if possible).
> Can we remove any of them? I'd volunteer to modify the ant buildfile
> to support 5.0.
> 
> I'd also vote for dropping 3.5 support, and potentially dropping checkstyle 4.

+1. Let’s avoid redundancy. Checkstyle 5.0 still looks a bit on the
bleeding edge to me, but I’m happy to update my checkstyle plug-in
accordingly.


Vincent


> Max
> 
> 
> 
> 2009/9/26 Alexander Kiel :
>> Hi,
>>
>> why doesn't our code style allow unchecked exceptions or subclasses of
>> thrown exceptions in Javadoc?
>>
>> From checkstyle-5.0.xml:
>>
>> 
>>
>>
>>
>> 
>>
>> From "J. Bloch: Effective Java, Second Edition" [1] page 252:
>>
>>> Use the Javadoc @throws tag to document each unchecked exception
>>> that a method can throw, but do not use the throws keyword to
>>> include unchecked exceptions in the method declaration.
>> All the good code I know documents unchecked exceptions. Take the Java
>> Collections API. Every possible ClassCastException or
>> NullPointerException is documented.
>>
>> Another quote from J. Bloch:
>>
>>> A well-documented list of unchecked exceptions that a method
>>> can throw effectively describes the preconditions for its
>>> successful execution. It is essential that each method's
>>> documentation describe its preconditions [...]
>> I think that everyone can agree with the statements J. Bloch made. So I
>> would strongly vote to allow documenting unchecked exceptions.
>>
>>
>> The second point is not allowing subclasses of exceptions in Javadoc. I
>> don't use this very often, but I have just one example in my mind where
>> this makes sense. If you have a look into
>> java.io.DataInputStream#readByte(), there are both IOException and
>> EOFException documented. EOFException is a subclass of IOException. As
>> you know a normal InputStream.read() returns -1 at EOF but readByte()
>> doesn't. So it's worth documenting that readByte() throws an
>> EOFException instead.
>>
>> So I would also vote allowing subclasses.
>>
>>
>> Best Regards
>> Alex
>>
>> [1]: 
>>
>> --
>> e-mail: alexanderk...@gmx.net
>> web:www.alexanderkiel.net


Re: Best Interface for reading OpenType Files

2009-09-29 Thread Vincent Hennebert
Hi Alexander,

Alexander Kiel wrote:
> Hi Vincent,
> 
>>>> Here are my two cents: if you make use of classes in javax.imageio at
>>>> only one place in your font library, then there’s no need to worry about
>>>> creating a more neutral layer. If OTOH you need to use those classes
>>>> everywhere, then it makes sense to use a simplified abstraction layer.
>>>> That abstraction layer could be shipped as a separate module and evolve
>>>> separately. An implementation could be based on imageIO, Apache Commons
>>>> IO (?), your own implementation based on byte arrays for testing
>>>> purposes, etc.
>>> Thanks for that. I think I will write an OpenTypeDataInputStream which
>>> is not a FilterInputStream, but takes an ImageInputStream as constructor
>>> argument like a FilterInputStream would take an InputStream. This
>>> OpenTypeDataInputStream will be the API for all the Streams on top of
>>> it. So I would have only one point which depends on ImageInputStream.
>> You may want to use a factory a la SAXParserFactory. Although that may
>> go a bit far.
> 
> Hmmm. I don't see the benefit of such a factory here. The
> OpenTypeDataInputStream would look like this:
> 
> public class OpenTypeDataInputStream {
> <snip/>
> }
> 
> This is the common FilterInputStream pattern. OpenTypeDataInputStream
> only depends on ImageInputStream which is an interface.
> OpenTypeDataInputStream is really simple and straightforward, so that I
> can't imagine different implementations. Except implementations on top
> of things other than ImageInputStream. But then we are at the question
> of whether we want ImageInputStream to be the common interface for
> different implementations (on top of files, streams, byte arrays) or
> whether we want OpenTypeDataInputStream to do that. I think that
> ImageInputStream is the right place, because it abstracts getting bytes
> and being able to seek. OpenTypeDataInputStream on the other hand
> implements the semantics of the common OpenType data types, which are
> well defined in the specification.

I see. I had in mind to use OpenTypeDataInputStream as the common
interface. It actually makes sense to use ImageInputStream instead.
Simpler and just as flexible. That will add a direct dependency on
a class in the javax.imageio package, but this is not a problem as it is
part of the standard library. The ImageInputStream interface is really
unfortunately named.
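
For reference, a rough sketch of what such a wrapper might look like. This is not Alexander's actual class (its body was snipped in the quote above); the method names and the static over() helper are assumptions made for the example. It relies on ImageInputStream being big-endian by default, which matches the byte order of OpenType files.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;

final class OpenTypeDataInputStream {
    private final ImageInputStream in;

    // The only place that depends on ImageInputStream.
    OpenTypeDataInputStream(ImageInputStream in) {
        this.in = in;
    }

    /** Hypothetical convenience for tests: wraps an in-memory byte array. */
    static OpenTypeDataInputStream over(byte[] data) {
        return new OpenTypeDataInputStream(
                new MemoryCacheImageInputStream(new ByteArrayInputStream(data)));
    }

    /** Unsigned 16-bit integer. */
    int readUShort() throws IOException {
        return in.readUnsignedShort();
    }

    /** Unsigned 32-bit integer, returned as a long. */
    long readULong() throws IOException {
        return in.readUnsignedInt();
    }

    /** Signed 16.16 fixed-point number. */
    double readFixed() throws IOException {
        return in.readInt() / 65536.0;
    }

    /** Four-byte ASCII tag such as "cmap". */
    String readTag() throws IOException {
        byte[] raw = new byte[4];
        in.readFully(raw);
        return new String(raw, StandardCharsets.US_ASCII);
    }

    /** Absolute positioning, delegated to the underlying stream. */
    void seek(long offset) throws IOException {
        in.seek(offset);
    }

    public static void main(String[] args) throws IOException {
        // Fixed 1.0, then a USHORT 2, then the tag "cmap".
        byte[] data = {0, 1, 0, 0, 0, 2, 'c', 'm', 'a', 'p'};
        OpenTypeDataInputStream stream = over(data);
        System.out.println(stream.readFixed() + " " + stream.readUShort()
                + " " + stream.readTag());
    }
}
```

A byte-array-backed stream like the one in over() is also what would make the parser easy to unit-test without font files on disk.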


>> There’s no such thing as IoC container in FOP. I’m not sure how easy it
>> would be to introduce one. Although that would probably be A Good Thing.
>> So do design your font library with IoC in mind.
> 
> Yes, I will. We can use IoC even without a container. And if we want to
> choose one, I have plenty experience with spring.

Good!


> So if I had to vote, I would probably vote for spring.

Well I’m not sure I like the abundance of XML in spring actually. POJOs
powaaa! Also, spring may be overkill to just deploy FOP. Anyway, this is
probably a bit early to discuss that. (What do you think of the
following though: http://code.google.com/p/google-guice/ ?)


>>>> - does the use of serializable objects make sense? What would be more
>>>>   efficient: re-parsing font data all the time or re-loading
>>>>   serializable object representation of them?
>>> You mean the font metrics XML files? I've always asked myself what
>>> purpose they serve. No, I don't think we need this. I really don't
>>> want to serialize the Advanced OpenType Features! It already took me a
>>> good amount of code to parse just a bit of it.
>> What I meant was to use the java.io.Serializable interface. Indeed,
>> I don’t think XML representations are of any use, apart maybe for
>> debugging purposes or to have a more human-readable version of the font
>> file.
>> IIUC there would be next to nothing to do to cache Serializable objects
>> on the hard drive and retrieve them?
> 
> Hmmm. Ok. But if we want to use Serializable for that, your classes have
> to be very stable. Versioning the Serializable stuff is a real burden in
> my opinion. So we will need a cache which detects version changes and
> invalidates the objects if so. Do you know such a lib?

I was thinking that just catching the InvalidClassException when reading
the object would be enough to conclude that the cache is no longer valid
and must be re-created. Maybe I’m wrong? I must confess that I have no
experience with serialization.
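
A sketch of that idea, under stated assumptions: CachedFont is a made-up stand-in for whatever the font library would cache, and the load/store helpers are illustrative only. Any failure to read the file is treated as "cache invalid, rebuild it". Note that InvalidClassException is raised for a serialVersionUID mismatch among other things; with an explicitly fixed serialVersionUID, a newly added field would simply come back with its default value rather than trigger the exception.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class FontCacheSketch {

    /** Hypothetical stand-in for parsed font data. */
    static final class CachedFont implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int unitsPerEm;

        CachedFont(String name, int unitsPerEm) {
            this.name = name;
            this.unitsPerEm = unitsPerEm;
        }
    }

    static void store(File cacheFile, CachedFont font) throws IOException {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new FileOutputStream(cacheFile))) {
            out.writeObject(font);
        }
    }

    /** Returns the cached object, or null if the caller must re-parse. */
    static CachedFont load(File cacheFile) {
        if (!cacheFile.isFile()) {
            return null; // nothing cached yet
        }
        try (InputStream fileIn = new FileInputStream(cacheFile);
                ObjectInputStream in = new ObjectInputStream(fileIn)) {
            return (CachedFont) in.readObject();
        } catch (InvalidClassException e) {
            // The class changed incompatibly since the file was written.
            cacheFile.delete();
            return null;
        } catch (IOException | ClassNotFoundException | ClassCastException e) {
            // Corrupt or truncated file, or an unexpected type: also rebuild.
            cacheFile.delete();
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("fontcache", ".ser");
        store(file, new CachedFont("Demo Sans", 1000));
        CachedFont font = load(file);
        System.out.println(font == null ? "cache miss" : font.name);
        file.delete();
    }
}
```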


HTH,
Vincent


Re: Questionable whether font-shorthand grammar LL(1)

2009-09-29 Thread Vincent Hennebert
Hi Max,

Max Berger wrote:
> Hi *,
> 
> I just want to throw in a different idea (you may ignore it if you like):
> 
> How about specifying the grammar and using a tool such as JavaCC to
> generate the actual parser? This way you could focus more on a complete
> grammar and have to spend less time writing the parser.

That would be the same as using ANTLR. I feel that this is a bit
overkill for just parsing the font shorthand property, although that may
prove to be useful for other properties that can accept complex
expressions.
That said, JavaCC is an interesting suggestion, I didn’t think of it. If
a choice had to be made between ANTLR and JavaCC, which one would win?


> JavaCC is BSD-licensed, so we could easily integrate it in the fop build.
> 
> Max

Thanks,
Vincent



> 2009/9/28 Vincent Hennebert:
>> Hi Jonathan,
>>
>> Interesting stuff!
>>
>> Jonathan Levinson wrote:
>>> Hi Vincent,
>>>
>> 
>>> Because font-variant font-style and font-weight can occur in any order,
>>> I could not (currently) come up with a grammar in which the directing
>>> sets were disjoint for each non-terminal.  So I was unable to come up
>>> with an LL(1) grammar.
>>>
>>> For instance, here are two productions of my attempt at a grammar:
>>>
>>>  -> 
>>>
>>>  -> 
>>>
>>> In each case, the first set of  shares a common
>>> element in two different productions, the literal values for variant.
>>> One needs to look ahead one more token to see if one has a
>>>  or a .
>> (I’ll call “modifier” any of the three style, variant, weight
>> properties.)
>> Setting the ‘normal’ case aside, and since ‘inherit’ is not allowed in
>> the shorthand, I think the values for all modifiers are distinct:
>> ‘italic’, ‘oblique’, ‘backslant’ for font-style, ‘small-caps’ for
>> font-variant, and the various weight values for font-weight.
>>
>> Since all modifiers are set to their initial values prior to the
>> shorthand parsing, which is ‘normal’ for all three of them, I think we
>> can simply ignore any ‘normal’ value found in the string. That is,
>> accept it as a legal terminal but not do anything.
>>
>> So I don’t think there is any ambiguity any more. What remains to be
>> done is to check that the same modifier is not specified more than once
>> (that includes checking that ‘normal’ is not specified more than
>> 3 times). And it’s probably easier to check that at the semantic level
>> instead of crafting special grammar rules.
>>
>>
>> 
>>> The books and web articles I read only discussed using recursive descent
>>> when the grammar is LL(1).  I have the feeling that despite the
>>> ambiguities in the grammar it is almost LL(k) because font-variant and
>>> font-style and font-weight almost have disjoint values.   It is at least
>>> LL(3) and I suspect it is LL(6).
>> The font-size property has the good idea of not allowing ‘normal’ as
>> a value. The ‘normal’ case for modifiers can be ignored as explained
>> above. So I think the grammar is still LL(1).
>>
>>
>> 
>>> I'm not as convinced as you are that recursive descent parsing or a
>>> formal bottom-up-parser will make the code simpler rather than more
>>> complex because of the complexities of a formal grammar.   Of course,
>>> however complex the grammar, a table-generating tool - like ANTLR - will
>>> generate code, however complex, which will faithfully reflect the
>>> inputted grammar.  However, none of the other properties in FOP use a
>>> table-generating tool like ANTLR - and I'm not sure what the
>>> consequences would be to FOP of introducing such a tool.  Given the
>>> complexities of the grammar, I'm sure that a recursive descent parser
>>> will be quite complex, and if we are going to use a grammar driven
>>> approach we would be better off with a tool that generates parsers from
>>> grammars rather than the recursive descent approach.  Also an advantage
>>> of parser generators is that one doesn't have to rewrite so much code to
>>> correct a mistake in one's grammar, if one makes a mistake, or if the
>>> grammar changes.  Recursive descent parsing can pose its own maintenance
>>> nightmares.
>> Using a grammar tool like ANTLR is probably overkill to parse just
>> a shorthand property. Moreover the grammar is not likely to change, so
>> that reduces its usefulness even more. That said, most properties can
>> accept expressions, where such a tool might actually be interesting.

Re: When must the structure tree be output in the PDF file?

2009-09-28 Thread Vincent Hennebert
Hi Jeremias,

Jeremias Maerki wrote:

> IFParser is also still missing the parse code for the structure tree. I
> guess I would defer the call to startPageSequence in the IFParser, then
> parse the reduced FO tree using a ContentHandler delegate, set that on
> the user agent and then call startPageSequence when the first page tag
> is encountered.

That’s what I ended up doing.
Thanks for your input,
Vincent


> On 24.09.2009 13:07:11 Vincent Hennebert wrote:
>> Jeremias Maerki wrote:
>>> Not just like that (if at all). The content items being produced inside
>>> the page-sequence have to be linked into the structure tree. There are
>>> links (MCIDs) back and forth between the structure tree and the content
>>> streams. You have to have the structure tree available while you create
>>> the page contents to build up the links. You could probably move the
>>> generation to endPageSequence but you'd end up duplicating some of the
>>> data structures for establishing the links in the process which you'd
>>> then have to map to the PDF library in the end. Not sure if that's what
>>> you want. I don't have this stuff present as much as back when I helped
>>> Jost, so I may be missing something.
>> Ok, then there’s the following problem: when creating the PDF document
>> out of an IF XML file, the structure tree is not yet available at the
>> time PDFDocumentHandler.startPageSequence is called. Indeed in the IF
>> the structure tree is stored as a child of the page-sequence element.
>>
>> Any idea of how to handle this, other than putting an ugly boolean at
>> the beginning of PDFDocumentHandler.startPage, “if structure tree not
>> yet built, then build structure tree”?
>>
>>
>>> On 23.09.2009 13:44:11 Vincent Hennebert wrote:
>>>> To those PDF specialists around here: am I right that the structure tree
>>>> could as well be converted into PDF at the end of a page sequence, as at
>>>> the beginning?
>>>>
>>>> In other words: could the piece of code dealing with the structure tree
>>>> be moved from PDFDocumentHandler.startPageSequence to
>>>> PDFDocumentHandler.endPageSequence?
>>>>
>>>> Thanks,
>>>> Vincent
>>>
>>>
>>>
>>> Jeremias Maerki
>>
>> Thanks,
>> Vincent
> 
> 
> 
> 
> Jeremias Maerki
> 


Re: Questionable whether font-shorthand grammar LL(1)

2009-09-28 Thread Vincent Hennebert
Hi Jonathan,

Interesting stuff!

Jonathan Levinson wrote:
> Hi Vincent,
> 

> 
> Because font-variant font-style and font-weight can occur in any order,
> I could not (currently) come up with a grammar in which the directing
> sets were disjoint for each non-terminal.  So I was unable to come up
> with an LL(1) grammar.
> 
> For instance, here are two productions of my attempt at a grammar: 
> 
>  -> 
> 
>  -> 
> 
> In each case, the first set of  shares a common
> element in two different productions, the literal values for variant.
> One needs to look ahead one more token to see if one has a
>  or a .

(I’ll call “modifier” any of the three style, variant, weight
properties.)
Setting the ‘normal’ case aside, and since ‘inherit’ is not allowed in
the shorthand, I think the values for all modifiers are distinct:
‘italic’, ‘oblique’, ‘backslant’ for font-style, ‘small-caps’ for
font-variant, and the various weight values for font-weight.

Since all modifiers are set to their initial values prior to the
shorthand parsing, which is ‘normal’ for all three of them, I think we
can simply ignore any ‘normal’ value found in the string. That is,
accept it as a legal terminal but not do anything.

So I don’t think there is any ambiguity any more. What remains to be
done is to check that the same modifier is not specified more than once
(that includes checking that ‘normal’ is not specified more than
3 times). And it’s probably easier to check that at the semantic level
instead of crafting special grammar rules.
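
A rough sketch of that approach, to show that one token of lookahead is enough. The class, enum and method names are invented for the example and the token list is assumed to be already split; this is not the existing FOP property-maker code. It implements exactly the two semantic checks mentioned above and nothing stricter.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FontModifierSketch {

    enum Modifier { STYLE, VARIANT, WEIGHT }

    private static final Set<String> STYLES = Set.of("italic", "oblique", "backslant");
    private static final Set<String> VARIANTS = Set.of("small-caps");
    private static final Set<String> WEIGHTS = Set.of("bold", "bolder", "lighter",
            "100", "200", "300", "400", "500", "600", "700", "800", "900");

    /** The value sets are disjoint, so a single token decides the modifier. */
    static Modifier classify(String token) {
        if (STYLES.contains(token)) {
            return Modifier.STYLE;
        }
        if (VARIANTS.contains(token)) {
            return Modifier.VARIANT;
        }
        if (WEIGHTS.contains(token)) {
            return Modifier.WEIGHT;
        }
        return null;
    }

    static Map<Modifier, String> parseModifiers(List<String> tokens) {
        // All modifiers start at their initial value.
        Map<Modifier, String> result = new EnumMap<>(Modifier.class);
        for (Modifier m : Modifier.values()) {
            result.put(m, "normal");
        }
        Set<Modifier> seen = EnumSet.noneOf(Modifier.class);
        int normalCount = 0;
        for (String token : tokens) {
            if (token.equals("normal")) {
                // Legal terminal, but nothing to do besides counting it.
                if (++normalCount > 3) {
                    throw new IllegalArgumentException("'normal' given more than 3 times");
                }
                continue;
            }
            Modifier m = classify(token);
            if (m == null) {
                throw new IllegalArgumentException("not a modifier value: " + token);
            }
            if (!seen.add(m)) {
                throw new IllegalArgumentException(m + " specified more than once");
            }
            result.put(m, token);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseModifiers(List.of("bold", "normal", "small-caps")));
    }
}
```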



> The books and web articles I read only discussed using recursive descent
> when the grammar is LL(1).  I have the feeling that despite the
> ambiguities in the grammar it is almost LL(k) because font-variant and
> font-style and font-weight almost have disjoint values.   It is at least
> LL(3) and I suspect it is LL(6).

The font-size property has the good idea of not allowing ‘normal’ as
a value. The ‘normal’ case for modifiers can be ignored as explained
above. So I think the grammar is still LL(1).



> I'm not as convinced as you are that recursive descent parsing or a
> formal bottom-up-parser will make the code simpler rather than more
> complex because of the complexities of a formal grammar.   Of course,
> however complex the grammar, a table-generating tool - like ANTLR - will
> generate code, however complex, which will faithfully reflect the
> inputted grammar.  However, none of the other properties in FOP use a
> table-generating tool like ANTLR - and I'm not sure what the
> consequences would be to FOP of introducing such a tool.  Given the
> complexities of the grammar, I'm sure that a recursive descent parser
> will be quite complex, and if we are going to use a grammar driven
> approach we would be better off with a tool that generates parsers from
> grammars rather than the recursive descent approach.  Also an advantage
> of parser generators is that one doesn't have to rewrite so much code to
> correct a mistake in one's grammar, if one makes a mistake, or if the
> grammar changes.  Recursive descent parsing can pose its own maintenance
> nightmares.

Using a grammar tool like ANTLR is probably overkill to parse just
a shorthand property. Moreover the grammar is not likely to change, so
that reduces its usefulness even more. That said, most properties can
accept expressions, where such a tool might actually be interesting.
I don’t know how far FOP goes in supporting expressions in other
properties.


> The current approach in FOP for font-shorthand is obscurely written but
> strikes me as basically sound.
> 
> 1)  One parses from right-to-left using the fact that spaces divide
> tokens

The problem is that font families can be specified with strings
containing whitespace, which must be handled specially and not
as a token delimiter. Otherwise parsing from right to left would
indeed probably be relatively easy.


> 2)  One lets property makers determine whether they apply to a
> token.  Each property maker is a little parser of the token one feeds
> it.  Because the property makers determine whether they apply to a
> token, one can handle the fact that variant, weight and style can occur
> in any order by feeding the current token to each of the property makers
> for font-variant, font-weight, and font-style in turn.  Whatever they
> accept is ipso-facto a font-variant or a font-weight or font-style.
> 
> Just want to let you know I take the problem seriously, and I'm not
> trying to duck the responsibility of coming up with an adequate
> solution.  I'm not sure what I did fits into a "job priority" which is
> why I spent many hours this weekend on this research.

Thanks for looking into this issue. I hope my comments above can help.

As an alternative: the fact that the shorthand property can be parsed
using a regular expression shows that its grammar is a regular
language [1], thus can also be parsed using an automaton. I’ve quickly
sketched such an automaton that [...]

Re: omit first table header/last footer

2009-09-28 Thread Vincent Hennebert
Hi Carlos,

Carlos Villegas wrote:
> Hi,
> 
> I searched the mailing lists and it seems that although some people had
> worked at several times at trying to implement retrieve-table-marker,
> it's not yet done. Is somebody working on this? What's the status?

It’s not being worked on at the moment. This is still a missing feature.


> In many use cases omitting the first table header and the last table
> footer will do the trick.
> 
> How easy is this to implement?
> What will be the steps to add such an extension to FOP?
> I just started looking at the code so I'm exploring whether this is
> viable solution.

That might work. You would need to change the
o.a.f.layoutmgr.table.TableContentLayoutManager.getNextKnuthElements
method. There is an “if (getTableLM().getTable().omitHeaderAtBreak())”
test that you could augment with a “&& !(omitFirstHeader)” clause.
Likewise for the footer.

The easiest is to directly modify that class and re-build FOP. A bit
less easy would be to add a variable in the configuration file, so that
you can enable it only for certain FO files. Even less easy would be to
add an extension property to fo:table so that you can enable it only for
some tables of an FO document. Please ask if you need more details.

All that said, such a change would be very hacky and, unless there is
overwhelming demand from the user community, I would oppose integrating
it into the code base. This is a patch that you would have to maintain on
your side. Better would be of course to actually implement
retrieve-table-marker. Although this would be more involved than
implementing this little trick...


HTH,
Vincent

