Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-28 Thread Clay Leeds
> However, if someone actually renamed my variables after I have declared my 
> position, then I would interpret that as "doing bad things to the code". In 
> fact, i would revert such a change.
> 
> If folks aren't willing to respect my style of coding along with my promise 
> to document short names, then I will withdraw my request for a merge and 
> abrogate my ICLA. I would not wish my work to be used in a community that 
> does not have sufficient respect for personal coding styles of contributors 
> that do not in any way vary from documented conventions.

This is a long thread and my hope is that we will be able to resolve it 
amicably, and that our community can grow from this experience. 

The ComplexScripts functionality being merged is a valued and important 
addition to the Apache FOP Project. 

More important than the code to me personally, is the continued community and 
its growth. 

FOP is a community built around a common desire to create a robust and open 
source tool for generating XSL Formatting Output in various common formats for 
print, screen, archival and transfer purposes.  

There is an expressed concern about code readability and reusability. What 
concerns me most is that this may cause division.

I'm concerned about the level of compromise and intent. It may take more of a 
stretch on each or our parts to move forward.

Clay

"My religion is simple. My religion is kindness."
- HH The Dalai Lama of Tibet



RE: Merge Request - Temp_ComplexScripts into Trunk

2011-10-27 Thread Eric Douglas
There is little room for individuality in a coding community.
I'd say it doesn't matter what you call your variables as long as
someone who's never seen the code can understand the purpose through the
name and/or comments, and they conform to any predefined naming standard
for the project.
Ages ago all variables had short names because disk space and/or memory
was at a premium.  Today that shouldn't be an excuse.
Try to avoid overly simplistic names like x1 unless they have overly
simplistic purpose (create, use, destroy within a 5 line method), and
try to avoid overly verbose names.
As long as what you write makes sense I'd agree with you no one should
change the code simply for the sake of personal preference.
You should certainly change it if you don't want someone else to change
it if someone unfamiliar with it can't understand it.
If they're changing your names that should be fine as long as they're
fixing a bug or enhancing something where the name change makes sense to
go with the new logic.
 



From: Glenn Adams [mailto:gl...@skynav.com] 
Sent: Thursday, October 27, 2011 3:56 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Merge Request - Temp_ComplexScripts into Trunk




On Thu, Oct 27, 2011 at 3:41 PM, Simon Pepping 
wrote:


> > > Ninth, spending time changing variable names is a waste of
time when I
> > could
> > > be working on adding support for other scripts.
> >
> > So someone else is going to have to waste all that time
converting those
> > names into more readable ones. That's a bit unfair, isn't
it?
> >
>
> I would advise against anyone wasting their time by changing
my names.
> Indeed, I will likely react very negatively to such an
attempt. What you
> want to do in your code is your business, but don't imagine
you are going to
> start rewriting my code to meet your style. Or at least don't
do so if you
> wish me to be a part of this team.
>
> I would take such an action as a direct affront.


This is a big no. At the moment you hand in your code to FOP, it
belongs to the community. Anyone can touch it. That is where
team
membership kicks in. Team members trust each other not to do bad
things to the code.

>
> If in the indefinite future I am not working on this code,
then feel free to
> change it as you like. In the mean time, I'd appreciate a
little respect.


Respect yes, but not touching it, no.



I did not say or imply hands off, so of course I presume that anyone can
touch it. Why do you insist on reading me otherwise?

However, if someone actually renamed my variables after I have declared
my position, then I would interpret that as "doing bad things to the
code". In fact, i would revert such a change.

If folks aren't willing to respect my style of coding along with my
promise to document short names, then I will withdraw my request for a
merge and abrogate my ICLA. I would not wish my work to be used in a
community that does not have sufficient respect for personal coding
styles of contributors that do not in any way vary from documented
conventions.
 


Simon Pepping





Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-27 Thread Glenn Adams
On Thu, Oct 27, 2011 at 3:41 PM, Simon Pepping wrote:

> > > > Ninth, spending time changing variable names is a waste of time when
> I
> > > could
> > > > be working on adding support for other scripts.
> > >
> > > So someone else is going to have to waste all that time converting
> those
> > > names into more readable ones. That’s a bit unfair, isn’t it?
> > >
> >
> > I would advise against anyone wasting their time by changing my names.
> > Indeed, I will likely react very negatively to such an attempt. What you
> > want to do in your code is your business, but don't imagine you are going
> to
> > start rewriting my code to meet your style. Or at least don't do so if
> you
> > wish me to be a part of this team.
> >
> > I would take such an action as a direct affront.
>
> This is a big no. At the moment you hand in your code to FOP, it
> belongs to the community. Anyone can touch it. That is where team
> membership kicks in. Team members trust each other not to do bad
> things to the code.
> >
> > If in the indefinite future I am not working on this code, then feel free
> to
> > change it as you like. In the mean time, I'd appreciate a little respect.
>
> Respect yes, but not touching it, no.
>

I did not say or imply hands off, so of course I presume that anyone can
touch it. Why do you insist on reading me otherwise?

However, if someone actually renamed my variables after I have declared my
position, then I would interpret that as "doing bad things to the code". In
fact, i would revert such a change.

If folks aren't willing to respect my style of coding along with my promise
to document short names, then I will withdraw my request for a merge and
abrogate my ICLA. I would not wish my work to be used in a community that
does not have sufficient respect for personal coding styles of contributors
that do not in any way vary from documented conventions.


>
> Simon Pepping
>


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-27 Thread Simon Pepping
> > > Ninth, spending time changing variable names is a waste of time when I
> > could
> > > be working on adding support for other scripts.
> >
> > So someone else is going to have to waste all that time converting those
> > names into more readable ones. That’s a bit unfair, isn’t it?
> >
> 
> I would advise against anyone wasting their time by changing my names.
> Indeed, I will likely react very negatively to such an attempt. What you
> want to do in your code is your business, but don't imagine you are going to
> start rewriting my code to meet your style. Or at least don't do so if you
> wish me to be a part of this team.
> 
> I would take such an action as a direct affront.

This is a big no. At the moment you hand in your code to FOP, it
belongs to the community. Anyone can touch it. That is where team
membership kicks in. Team members trust each other not to do bad
things to the code.
> 
> If in the indefinite future I am not working on this code, then feel free to
> change it as you like. In the mean time, I'd appreciate a little respect.

Respect yes, but not touching it, no.

Simon Pepping


RE: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Eric Douglas
I haven't looked at the code in question on this particular discussion
so this is not to criticize.

Overly concise variables names should be acceptable within limited
scope.
Calling an ObjectOutputStream oos may be sufficient when it's created
and destroyed within one little method.
Using i or z may suffice as loop counters within a single simple method,
while you may want a longer name simply to track the loop if it gets
more complex nesting loops.
A project should have defined standards for meaningful variable naming,
particularly when they're declared at the class level or they're public,
protected, or passed in to the method.

The simplest readability standard is of course the layout.  Eclipse has
plenty of preferences and an option to export them.  Line wraps, comment
format, etc should be consistant within a project.

Of course if code must be reused it helps if standard naming can be
enforced by such as abstract methods and interfaces.

-Original Message-
From: Peter Hancock [mailto:peter.hanc...@gmail.com] 
Sent: Wednesday, October 26, 2011 9:34 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Merge Request - Temp_ComplexScripts into Trunk

> I wonder what you think about the code in 
> o.a.f.hyphenation.TernaryTree, where the author apparently did not 
> know Java, and introduces the libc functions strcmp, strcpy, and 
> strlen, and which uses the Java char type (within the String type) for
coding tree pointers!

My apprehension about certain areas of your code (and not the
majority!) stems from such examples, and the headaches theycan
bring.  This is old code that I had no influence over at the time and I
do not want it to have any bearing on where the  project is heading.

> If you wanted to make a serious case against using short names, you 
> would start first by analyzing existing FOP usage and using such an 
> analysis to establish concrete metrics.

I do not think I have focused on the length of variable or member
names have I?  I did a PhD in mathematics and I have a soft spot
for the aesthetic value of short names.  It is always pleasing to
distill a mathematical proof to the simplist form possible and
using consise variable naming is often a part of that.  That said, I
do not think that working codebenefits from this approach:
what can seem like an efficient and powerful piece of code when
written can prove to be an  overly difficult thing to read later.
Unlike yourself, apparently, my memory ain't so good and I benefit from
code that has clear intention.

Peter


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Glenn Adams
On Wed, Oct 26, 2011 at 9:34 PM, Peter Hancock wrote:

> > I wonder what you think about the code in o.a.f.hyphenation.TernaryTree,
> > where the author apparently did not know Java, and introduces the libc
> > functions strcmp, strcpy, and strlen, and which uses the Java char type
> > (within the String type) for coding tree pointers!
>
> My apprehension about certain areas of your code (and not the
> majority!) stems from such examples, and the headaches theycan
> bring.  This is old code that I had no influence over at the time and
> I do not want it to have any bearing on where the  project is heading.
>
> > If you wanted to make a serious case against using short names, you would
> > start first by analyzing existing FOP usage and using such an analysis to
> > establish concrete metrics.
>
> I do not think I have focused on the length of variable or member
> names have I?  I did a PhD in mathematics and I have a soft spot
> for the aesthetic value of short names.  It is always pleasing to
> distill a mathematical proof to the simplist form possible and
> using consise variable naming is often a part of that.  That said, I
> do not think that working codebenefits from this approach:
> what can seem like an efficient and powerful piece of code when
> written can prove to be an  overly difficult thing to read later.
> Unlike yourself, apparently, my memory ain't so good and I benefit
> from code that has clear intention.


Yet you continue to imply that:

short variable names != clear intention

This I must disagree with. I could use long random names and obfuscate
intention. I can uses short names and document intention (in comments). I
have agreed to do the latter. Is that not enough?


> Peter
>


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Peter Hancock
On Wed, Oct 26, 2011 at 2:13 PM, Glenn Adams  wrote:
> Notice also the considerable use of nested classes (and interfaces), which
> tends to make the file longer, but nevertheless encapsulates abstractions in
> smaller units. True, this file could be sub-divided into smaller files, and
> I may yet do that. However, I found it convenient to keep it in one file for
> the initial implementation.

I appreciate that Java does not always help us when striving for well
encapsulated code AND manageable file lengths!

I really do not think you implementation is fundamentally that far off
the mark and the amount of constructive attention it has received has
naturally been proportional to the quantity - something that is very
impressive!

Peter


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Peter Hancock
> I wonder what you think about the code in o.a.f.hyphenation.TernaryTree,
> where the author apparently did not know Java, and introduces the libc
> functions strcmp, strcpy, and strlen, and which uses the Java char type
> (within the String type) for coding tree pointers!

My apprehension about certain areas of your code (and not the
majority!) stems from such examples, and the headaches theycan
bring.  This is old code that I had no influence over at the time and
I do not want it to have any bearing on where the  project is heading.

> If you wanted to make a serious case against using short names, you would
> start first by analyzing existing FOP usage and using such an analysis to
> establish concrete metrics.

I do not think I have focused on the length of variable or member
names have I?  I did a PhD in mathematics and I have a soft spot
for the aesthetic value of short names.  It is always pleasing to
distill a mathematical proof to the simplist form possible and
using consise variable naming is often a part of that.  That said, I
do not think that working codebenefits from this approach:
what can seem like an efficient and powerful piece of code when
written can prove to be an  overly difficult thing to read later.
Unlike yourself, apparently, my memory ain't so good and I benefit
from code that has clear intention.

Peter


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Glenn Adams
While you are at it, Peter, you may also take note that I have made liberal
use of *assert* in the file I reference below (NumberConverter). If we are
going to improve not only understandability but also real quality, how about
a campaign to maximize use of assertions to document code assumptions?

I notice that nobody has pointed out my liberal use of assertions as a
positive for understanding and quality, while instead focusing on the narrow
issue of short symbol name length as a negative. Given that symbol length
has no impact at the JVM layer (except for consuming more runtime memory
resources than shorter symbol names), perhaps we should focus on something
which does have an impact, such as runtime assertion testing.

On Wed, Oct 26, 2011 at 9:13 PM, Glenn Adams  wrote:

> BTW, sometimes I choose to use longer names for local variables: see my
> reimplementation of number to string conversion in
> o.a.f.util.NumberConverter, which is a new (and large) class I added in the
> CS branch. I use a few short names here, but not as many as longer names. So
> you can see that sometimes I find it useful to use longer names. This sort
> of decision (when to use long or short) should be based on an author's
> preferences, and not established by fiat.
>
> Notice also the considerable use of nested classes (and interfaces), which
> tends to make the file longer, but nevertheless encapsulates abstractions in
> smaller units. True, this file could be sub-divided into smaller files, and
> I may yet do that. However, I found it convenient to keep it in one file for
> the initial implementation.
>
>
> On Wed, Oct 26, 2011 at 8:54 PM, Glenn Adams  wrote:
>
>>
>>
>> On Wed, Oct 26, 2011 at 8:36 PM, Peter Hancock 
>> wrote:
>>
>>> >> On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
>>> > are you claiming my code is not maintainable by other developers? if
>>> so,
>>> > then please prove it objectively; otherwise, let's stop talking about
>>> this,
>>> > and move on with the merge vote
>>> How would one go about proving objectively that code is not maintainable?
>>>
>>
>> My point is that this is a subjective exercise we are having here, and not
>> particularly fruitful. It can be done objectively, or at least more
>> objectively if one wants to take the time to do so. For example, by defining
>> specific objective metrics and tools that measure those metrics against the
>> existing code base and against new code.
>>
>> But instead of doing that, we are presently dealing with argument by
>> innuendo.
>>
>>
>>>
>>> There are many aspects to writing maintainable code, spanning from the
>>> synax level through to the structuring of classes  and modules
>>> (packages).  Importantly we should encourage:
>>> Code reuse - (using trusted libraries, applying the DRY principle)
>>> hard to measure objectively
>>> A consistent style - this may be an emergent aspect of a project and
>>> choosing guidelines at the start or even retrospectively may be too
>>> difficult, but we can largely infer the style from the current state.
>>> An imperfect but consistent style is arguably favorable to
>>> inconsistency.
>>> Idiomatic language usage - applying common solutions that leverage the
>>> constructs of, and the philosophies behind a language (e.g applying OO
>>> design patterns in Java applications).
>>>
>>> I find that writing code that is in keeping with a the style of a
>>> project and using the language as recommended makes it easier to
>>> distill the intention of a piece of code from the implementation and
>>> can lead towards self-documenting code.
>>>
>>> The inner workings of FOP are complex and I think that all efforts to
>>> boost understandability are essential.
>>>
>>
>> It is rather ironic that I find myself being interpreted as somehow trying
>> to decrease coding understandability, or being interpreted as promoting
>> idiomatic or inconsistent usage. You should ask some of the hundred or so
>> developers who have worked under me their opinion about my code. You would
>> come to a different conclusion.
>>
>> I wonder what you think about the code in o.a.f.hyphenation.TernaryTree,
>> where the author apparently did not know Java, and introduces the libc
>> functions strcmp, strcpy, and strlen, and which uses the Java char type
>> (within the String type) for coding tree pointers!
>>
>> I also note the author of this file uses short names for (exposed,
>> non-private) instance variables, such as:
>>
>> protected char[] lo;
>> protected char[] hi;
>> protected char[] eq;
>> protected char[] sc;
>> protected CharVector kv;
>>
>> At least in my case, I use long names for instance variables, even when
>> they are private.
>>
>> If you wanted to make a serious case against using short names, you would
>> start first by analyzing existing FOP usage and using such an analysis to
>> establish concrete metrics. That would allow objective comparisons to be
>> made.
>>
>> G.
>>
>>
>>>
>>> Peter
>>>
>>> On 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Glenn Adams
BTW, sometimes I choose to use longer names for local variables: see my
reimplementation of number to string conversion in
o.a.f.util.NumberConverter, which is a new (and large) class I added in the
CS branch. I use a few short names here, but not as many as longer names. So
you can see that sometimes I find it useful to use longer names. This sort
of decision (when to use long or short) should be based on an author's
preferences, and not established by fiat.

Notice also the considerable use of nested classes (and interfaces), which
tends to make the file longer, but nevertheless encapsulates abstractions in
smaller units. True, this file could be sub-divided into smaller files, and
I may yet do that. However, I found it convenient to keep it in one file for
the initial implementation.

On Wed, Oct 26, 2011 at 8:54 PM, Glenn Adams  wrote:

>
>
> On Wed, Oct 26, 2011 at 8:36 PM, Peter Hancock wrote:
>
>> >> On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
>> > are you claiming my code is not maintainable by other developers? if so,
>> > then please prove it objectively; otherwise, let's stop talking about
>> this,
>> > and move on with the merge vote
>> How would one go about proving objectively that code is not maintainable?
>>
>
> My point is that this is a subjective exercise we are having here, and not
> particularly fruitful. It can be done objectively, or at least more
> objectively if one wants to take the time to do so. For example, by defining
> specific objective metrics and tools that measure those metrics against the
> existing code base and against new code.
>
> But instead of doing that, we are presently dealing with argument by
> innuendo.
>
>
>>
>> There are many aspects to writing maintainable code, spanning from the
>> synax level through to the structuring of classes  and modules
>> (packages).  Importantly we should encourage:
>> Code reuse - (using trusted libraries, applying the DRY principle)
>> hard to measure objectively
>> A consistent style - this may be an emergent aspect of a project and
>> choosing guidelines at the start or even retrospectively may be too
>> difficult, but we can largely infer the style from the current state.
>> An imperfect but consistent style is arguably favorable to
>> inconsistency.
>> Idiomatic language usage - applying common solutions that leverage the
>> constructs of, and the philosophies behind a language (e.g applying OO
>> design patterns in Java applications).
>>
>> I find that writing code that is in keeping with a the style of a
>> project and using the language as recommended makes it easier to
>> distill the intention of a piece of code from the implementation and
>> can lead towards self-documenting code.
>>
>> The inner workings of FOP are complex and I think that all efforts to
>> boost understandability are essential.
>>
>
> It is rather ironic that I find myself being interpreted as somehow trying
> to decrease coding understandability, or being interpreted as promoting
> idiomatic or inconsistent usage. You should ask some of the hundred or so
> developers who have worked under me their opinion about my code. You would
> come to a different conclusion.
>
> I wonder what you think about the code in o.a.f.hyphenation.TernaryTree,
> where the author apparently did not know Java, and introduces the libc
> functions strcmp, strcpy, and strlen, and which uses the Java char type
> (within the String type) for coding tree pointers!
>
> I also note the author of this file uses short names for (exposed,
> non-private) instance variables, such as:
>
> protected char[] lo;
> protected char[] hi;
> protected char[] eq;
> protected char[] sc;
> protected CharVector kv;
>
> At least in my case, I use long names for instance variables, even when
> they are private.
>
> If you wanted to make a serious case against using short names, you would
> start first by analyzing existing FOP usage and using such an analysis to
> establish concrete metrics. That would allow objective comparisons to be
> made.
>
> G.
>
>
>>
>> Peter
>>
>> On Wed, Oct 26, 2011 at 12:55 PM, Glenn Adams  wrote:
>> > inline
>> > On Wed, Oct 26, 2011 at 7:17 PM, Vincent Hennebert <
>> vhenneb...@gmail.com>
>> > wrote:
>> >>
>> >> On 24/10/11 14:05, Glenn Adams wrote:
>> >> > On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl
>> >> > wrote:
>> >> >
>> >> >> Hello Glenn,
>> >> >>
>> >> >>> (2) there is no standard for symbol length documented in FOP
>> practice
>> >> >>> or enforced by checkstyle; I decline to exchange my choice of
>> symbols
>> >> >>> with longer symbols simply because you prefer it that way; I have
>> >> >>> offered to add comments to my uses, and that is the most I'm
>> willing
>> >> >>> to do to address this matter;
>> >> >>
>> >> >> You probably spent more years programming than I am alive, so please
>> >> >> excuse
>> >> >> me if that’s a stupid question: What is the reasoning/advantage
>> behind
>> >> >> those
>> >> >> short variable n

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Glenn Adams
On Wed, Oct 26, 2011 at 8:36 PM, Peter Hancock wrote:

> >> On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
> > are you claiming my code is not maintainable by other developers? if so,
> > then please prove it objectively; otherwise, let's stop talking about
> this,
> > and move on with the merge vote
> How would one go about proving objectively that code is not maintainable?
>

My point is that this is a subjective exercise we are having here, and not
particularly fruitful. It can be done objectively, or at least more
objectively if one wants to take the time to do so. For example, by defining
specific objective metrics and tools that measure those metrics against the
existing code base and against new code.

But instead of doing that, we are presently dealing with argument by
innuendo.


>
> There are many aspects to writing maintainable code, spanning from the
> synax level through to the structuring of classes  and modules
> (packages).  Importantly we should encourage:
> Code reuse - (using trusted libraries, applying the DRY principle)
> hard to measure objectively
> A consistent style - this may be an emergent aspect of a project and
> choosing guidelines at the start or even retrospectively may be too
> difficult, but we can largely infer the style from the current state.
> An imperfect but consistent style is arguably favorable to
> inconsistency.
> Idiomatic language usage - applying common solutions that leverage the
> constructs of, and the philosophies behind a language (e.g applying OO
> design patterns in Java applications).
>
> I find that writing code that is in keeping with a the style of a
> project and using the language as recommended makes it easier to
> distill the intention of a piece of code from the implementation and
> can lead towards self-documenting code.
>
> The inner workings of FOP are complex and I think that all efforts to
> boost understandability are essential.
>

It is rather ironic that I find myself being interpreted as somehow trying
to decrease coding understandability, or being interpreted as promoting
idiomatic or inconsistent usage. You should ask some of the hundred or so
developers who have worked under me their opinion about my code. You would
come to a different conclusion.

I wonder what you think about the code in o.a.f.hyphenation.TernaryTree,
where the author apparently did not know Java, and introduces the libc
functions strcmp, strcpy, and strlen, and which uses the Java char type
(within the String type) for coding tree pointers!

I also note the author of this file uses short names for (exposed,
non-private) instance variables, such as:

protected char[] lo;
protected char[] hi;
protected char[] eq;
protected char[] sc;
protected CharVector kv;

At least in my case, I use long names for instance variables, even when they
are private.

If you wanted to make a serious case against using short names, you would
start first by analyzing existing FOP usage and using such an analysis to
establish concrete metrics. That would allow objective comparisons to be
made.

G.


>
> Peter
>
> On Wed, Oct 26, 2011 at 12:55 PM, Glenn Adams  wrote:
> > inline
> > On Wed, Oct 26, 2011 at 7:17 PM, Vincent Hennebert  >
> > wrote:
> >>
> >> On 24/10/11 14:05, Glenn Adams wrote:
> >> > On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl
> >> > wrote:
> >> >
> >> >> Hello Glenn,
> >> >>
> >> >>> (2) there is no standard for symbol length documented in FOP
> practice
> >> >>> or enforced by checkstyle; I decline to exchange my choice of
> symbols
> >> >>> with longer symbols simply because you prefer it that way; I have
> >> >>> offered to add comments to my uses, and that is the most I'm willing
> >> >>> to do to address this matter;
> >> >>
> >> >> You probably spent more years programming than I am alive, so please
> >> >> excuse
> >> >> me if that’s a stupid question: What is the reasoning/advantage
> behind
> >> >> those
> >> >> short variable names?
> >> >>
> >> >
> >> > First, I don't use short names everywhere. Mostly I just use in local
> >> > variables, but generally not as class variables.
> >> >
> >> > Second, I was trained in Physics and Mathematics, which uses short
> >> > variable
> >> > names (E = M C ^ 2).
> >>
> >> Welcome to the Computer Science world, where longer variable names rule
> >> because they allow to make a program easier to understand and maintain.
> >>
> >> When I read the paper about the total-fit algorithm for breaking
> >> paragraphs into line, I found that the numerous one-letter variable
> >> names were an impediment to understanding it. It was difficult to
> >> remember what concept each variable was associated to.
> >
> > I had no trouble understanding it. In fact, I re-implemented it in Lisp
> > (Scheme), and fixed a few issues in the process, which I reported to Don
> > Knuth and for which he sent me a check for $2.56. See attached file. Note
> > that I used long names for (structure) member names and dynami

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Peter Hancock
>> On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
> are you claiming my code is not maintainable by other developers? if so,
> then please prove it objectively; otherwise, let's stop talking about this,
> and move on with the merge vote
How would one go about proving objectively that code is not maintainable?

There are many aspects to writing maintainable code, spanning from the
synax level through to the structuring of classes  and modules
(packages).  Importantly we should encourage:
Code reuse - (using trusted libraries, applying the DRY principle)
hard to measure objectively
A consistent style - this may be an emergent aspect of a project and
choosing guidelines at the start or even retrospectively may be too
difficult, but we can largely infer the style from the current state.
An imperfect but consistent style is arguably favorable to
inconsistency.
Idiomatic language usage - applying common solutions that leverage the
constructs of, and the philosophies behind a language (e.g applying OO
design patterns in Java applications).

I find that writing code that is in keeping with a the style of a
project and using the language as recommended makes it easier to
distill the intention of a piece of code from the implementation and
can lead towards self-documenting code.

The inner workings of FOP are complex and I think that all efforts to
boost understandability are essential.

Peter

On Wed, Oct 26, 2011 at 12:55 PM, Glenn Adams  wrote:
> inline
> On Wed, Oct 26, 2011 at 7:17 PM, Vincent Hennebert 
> wrote:
>>
>> On 24/10/11 14:05, Glenn Adams wrote:
>> > On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl
>> > wrote:
>> >
>> >> Hello Glenn,
>> >>
>> >>> (2) there is no standard for symbol length documented in FOP practice
>> >>> or enforced by checkstyle; I decline to exchange my choice of symbols
>> >>> with longer symbols simply because you prefer it that way; I have
>> >>> offered to add comments to my uses, and that is the most I'm willing
>> >>> to do to address this matter;
>> >>
>> >> You probably spent more years programming than I am alive, so please
>> >> excuse
>> >> me if that’s a stupid question: What is the reasoning/advantage behind
>> >> those
>> >> short variable names?
>> >>
>> >
>> > First, I don't use short names everywhere. Mostly I just use in local
>> > variables, but generally not as class variables.
>> >
>> > Second, I was trained in Physics and Mathematics, which uses short
>> > variable
>> > names (E = M C ^ 2).
>>
>> Welcome to the Computer Science world, where longer variable names rule
>> because they allow to make a program easier to understand and maintain.
>>
>> When I read the paper about the total-fit algorithm for breaking
>> paragraphs into line, I found that the numerous one-letter variable
>> names were an impediment to understanding it. It was difficult to
>> remember what concept each variable was associated to.
>
> I had no trouble understanding it. In fact, I re-implemented it in Lisp
> (Scheme), and fixed a few issues in the process, which I reported to Don
> Knuth and for which he sent me a check for $2.56. See attached file. Note
> that I used long names for (structure) member names and dynamic variables,
> but often short names for local (lexical) variables in this code which I
> wrote 20 years ago. I haven't changed my style since then, and I don't
> intend to do so now.
>
>>
>> > Third, I started programming in the 1960s with BAL 360, APL, then
>> > FORTRAN
>> > IV. We use short names there.
>>
>> Yes, it is very fortunate that the computer world has learnt from those
>> old days, and moved on to embrace better programming practices.
>
> We are apparently in different generations, and this influences our
> thinking. I am not judging your style, but you seem to be quick to judge my
> style. Personally, I find ideology counterproductive.
>
>>
>> > Fourth, I happen to have a good memory and I have no trouble remembering
>> > the
>> > meaning of variable names.
>> >
>> > Fifth, I find that short names prevents making lines too long and gives
>> > me
>> > more room for comments.
>>
>> By putting only one statement per line it is rare to bump into the 100
>> characters per line limit.
>>
>>
>> > Sixth, I am going to be maintaining this code. If anyone has a problem
>> > with
>> > specific code during a merge or regression, they merely need ask me.
>>
>> As Simon has already pointed out, this is not the way it should be in an
>> open-source project.
>>
>>
>> > Seventh, that's just my style, and I assert it is as valid as doing it
>> > with
>> > long names.
>>
>> When I joined the FOP project, I adjusted my style to follow the
>> project’s practices. It seemed obvious to me to do so, because
>> a consistent style within a project avoids unnecessary distraction when
>> wandering through its different parts. I would expect any contributor to
>> do the same.
>>
>>
>> > Eighth, asking me to adhere to an undocumented convention that is not
>> > otherwise enfor

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Glenn Adams
inline

On Wed, Oct 26, 2011 at 6:49 PM, Vincent Hennebert wrote:

> On 21/10/11 08:29, Glenn Adams wrote:
> > On Thu, Oct 20, 2011 at 10:31 PM, Peter Hancock  >wrote:
> >
> >>
> >> From the surface I would have been very much in favor of supporting a
> >> merge in the near future, however I have had the chance to review some
> >> areas of the complex script branch and I have some concerns.
> >> The treatment of Unicode Bidi spans from the creation of the FO Tree
> >> through to the construction of the Area Tree and I would have liked to
> >> have seen the complex scripts solution integrate the Unicode Bidi
> >> Algorithm more directly into the core process:  For example, the
> >> implementation performs a post process on the FO Tree to resolve the
> >> Bidi properties of FONodes relating to text. It would be preferable to
> >> see the construction of the FO Tree embracing this Bidi aspect:
> >> FONodes should be responsible for determining their own bidi state
> >> from the fo node semantics in context to the position in the tree.
> >> Such an implementation would immediately force the maintainer to
> >> consider how a change would effect the Bidi process.
> >>
> >
> > Please review XSL-FO 1.1 Section 5.8, and, in particular:
> >
> > "the algorithm is applied to a sequence of characters coming from the
> > content of one or more formatting objects. The sequence of characters is
> > created by processing a fragment of the formatting object tree. A
> *fragment* is
> > any contiguous sequence of children of some formatting object in the
> tree.
> > The sequence is created by doing a pre-order traversal of the fragment
> down
> > to the fo:character level."
> >
> > "the final, text reordering step is not done during refinement. Instead,
> the
> > XSL equivalent of re-ordering is done during area tree generation"
> >
> > The current implementation adheres to the XSL-FO specification in this
> > regard, while your suggestion that this behavior be isolated to
> individual
> > FONodes is contrary to the specification and does not permit correct
> > implementation of the functionality required.
>
> Section 5 of the XSL-FO 1.1 Recommendation starts with the following
> note:
> “Although the refinement process is described in a series of steps, this
> is solely for the convenience of exposition and does not imply they must
> be implemented as separate steps in any conforming implementation.
> A conforming implementation must only achieve the same effect.”
>
> So we are free to implement the algorithm any way we see adequate.
>

And I have done just that: implement the [Bidi] algorithm in a way I found
adequate.


> But even so, a fragment is “any contiguous sequence of children of some
> formatting object in the tree“. That formatting object doesn’t have to
> be a whole page-sequence and can as well be a single block or inline or
> anything else.
>
> Implementing Bidi in individual FONode sub-classes allows to keep the
> treatment encapsulated in each FO element, and adapt it to the specific
> semantics of that element.
>

I have not ruled this option out. I have stated previously that this may be
possible; however, there are possible problems with confining this behavior
to FONode classes such as the fact that what constitutes a "fragment" and a
"delimited text range" (see XSL-FO 1.1 Section 5.8) is not confined to FO
node boundaries.


> If this is done in a single BidiUtil class, all the behaviours that are
> specific to each element are mixed together. Implementation details that
> should be kept within individual classes are being exposed to the rest
> of them. Elements must be handled in lenghty sequences of if/else
> statement using ‘instanceof’ and casts to the concrete class.
>

In adding support to Bidi to FOP, I was faced with the problem of (1) not
knowing the implementation, and (2) desiring to minimize the point of
contact of my changes with existing code in order to better isolate
behavioral regressions.

I made the determination that it would be most convenient and expedient to
code the bidi level and order resolution functionality in a single set of
utility functions (with other helper classes, such as UnicodeBidiAlgorithm).
That approach has worked so far. However, that doesn't mean it has to remain
that way.

It is not a straightforward task to add Bidi support to a large
implementation like FOP which failed to take the needs of
internationalization into account in its design. As a consequence, adding
this support must be done delicately, and in stages.


> If a new FO element is being implemented this will be very likely that
> it will be forgotten to add the appropriate ‘if’ statement for that
> element in the BidiUtil class. If it’s not forgotten, it will be
> difficult to find out where to put that statement.
>

Although I have not added tests for right-to-left writing mode for all
existing layoutengine standard tests, I am in the process of doing just
that, and have already put a number of te

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Vincent Hennebert
On 24/10/11 14:05, Glenn Adams wrote:
> On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl wrote:
> 
>> Hello Glenn,
>>
>>> (2) there is no standard for symbol length documented in FOP practice
>>> or enforced by checkstyle; I decline to exchange my choice of symbols
>>> with longer symbols simply because you prefer it that way; I have
>>> offered to add comments to my uses, and that is the most I'm willing
>>> to do to address this matter;
>>
>> You probably spent more years programming than I am alive, so please excuse
>> me if that’s a stupid question: What is the reasoning/advantage behind those
>> short variable names?
>>
> 
> First, I don't use short names everywhere. Mostly I just use in local
> variables, but generally not as class variables.
> 
> Second, I was trained in Physics and Mathematics, which uses short variable
> names (E = M C ^ 2).

Welcome to the Computer Science world, where longer variable names rule
because they allow to make a program easier to understand and maintain.

When I read the paper about the total-fit algorithm for breaking
paragraphs into line, I found that the numerous one-letter variable
names were an impediment to understanding it. It was difficult to
remember what concept each variable was associated to.


> Third, I started programming in the 1960s with BAL 360, APL, then FORTRAN
> IV. We use short names there.

Yes, it is very fortunate that the computer world has learnt from those
old days, and moved on to embrace better programming practices.


> Fourth, I happen to have a good memory and I have no trouble remembering the
> meaning of variable names.
> 
> Fifth, I find that short names prevents making lines too long and gives me
> more room for comments.

By putting only one statement per line it is rare to bump into the 100
characters per line limit.


> Sixth, I am going to be maintaining this code. If anyone has a problem with
> specific code during a merge or regression, they merely need ask me.

As Simon has already pointed out, this is not the way it should be in an
open-source project.


> Seventh, that's just my style, and I assert it is as valid as doing it with
> long names.

When I joined the FOP project, I adjusted my style to follow the
project’s practices. It seemed obvious to me to do so, because
a consistent style within a project avoids unnecessary distraction when
wandering through its different parts. I would expect any contributor to
do the same.


> Eighth, asking me to adhere to an undocumented convention that is not
> otherwise enforced, and for which there is no evidence or analysis of having
> been previously followed in FOP contributions is unwarranted.

There is no documented convention simply because it has never occurred
to anybody in the past that short variable names could be an option.
This is this kind of unwritten convention that everybody takes for
granted.


> Ninth, spending time changing variable names is a waste of time when I could
> be working on adding support for other scripts.

So someone else is going to have to waste all that time converting those
names into more readable ones. That’s a bit unfair, isn’t it?


> I can probably throw in a few more random reasons, but this should be
> sufficient.
> 
> I've offered to add comments, take it or leave it.

Would you at least agree to use more readable variable names in new
code? That would be of great help to people who will get involved in the
Bidi stuff in the future.


Thanks,
Vincent


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-26 Thread Vincent Hennebert
On 21/10/11 08:29, Glenn Adams wrote:
> On Thu, Oct 20, 2011 at 10:31 PM, Peter Hancock 
> wrote:
> 
>>
>> From the surface I would have been very much in favor of supporting a
>> merge in the near future, however I have had the chance to review some
>> areas of the complex script branch and I have some concerns.
>> The treatment of Unicode Bidi spans from the creation of the FO Tree
>> through to the construction of the Area Tree and I would have liked to
>> have seen the complex scripts solution integrate the Unicode Bidi
>> Algorithm more directly into the core process:  For example, the
>> implementation performs a post process on the FO Tree to resolve the
>> Bidi properties of FONodes relating to text. It would be preferable to
>> see the construction of the FO Tree embracing this Bidi aspect:
>> FONodes should be responsible for determining their own bidi state
>> from the fo node semantics in context to the position in the tree.
>> Such an implementation would immediately force the maintainer to
>> consider how a change would effect the Bidi process.
>>
> 
> Please review XSL-FO 1.1 Section 5.8, and, in particular:
> 
> "the algorithm is applied to a sequence of characters coming from the
> content of one or more formatting objects. The sequence of characters is
> created by processing a fragment of the formatting object tree. A *fragment* 
> is
> any contiguous sequence of children of some formatting object in the tree.
> The sequence is created by doing a pre-order traversal of the fragment down
> to the fo:character level."
> 
> "the final, text reordering step is not done during refinement. Instead, the
> XSL equivalent of re-ordering is done during area tree generation"
> 
> The current implementation adheres to the XSL-FO specification in this
> regard, while your suggestion that this behavior be isolated to individual
> FONodes is contrary to the specification and does not permit correct
> implementation of the functionality required.

Section 5 of the XSL-FO 1.1 Recommendation starts with the following
note:
“Although the refinement process is described in a series of steps, this
is solely for the convenience of exposition and does not imply they must
be implemented as separate steps in any conforming implementation.
A conforming implementation must only achieve the same effect.”

So we are free to implement the algorithm any way we see adequate.

But even so, a fragment is “any contiguous sequence of children of some
formatting object in the tree“. That formatting object doesn’t have to
be a whole page-sequence and can as well be a single block or inline or
anything else.

Implementing Bidi in individual FONode sub-classes allows to keep the
treatment encapsulated in each FO element, and adapt it to the specific
semantics of that element.

If this is done in a single BidiUtil class, all the behaviours that are
specific to each element are mixed together. Implementation details that
should be kept within individual classes are being exposed to the rest
of them. Elements must be handled in lenghty sequences of if/else
statement using ‘instanceof’ and casts to the concrete class.

If a new FO element is being implemented this will be very likely that
it will be forgotten to add the appropriate ‘if’ statement for that
element in the BidiUtil class. If it’s not forgotten, it will be
difficult to find out where to put that statement.

Doing treatment specific to an object outside its implementation screams
for trouble and regressions as soon as a change is made in one or the
other place. Unless people are aware that they must keep an eye on that
BidiUtil class, which is unlikely for newcomer who don’t know the code
well.

This BidiUtil class has clearly been written in a procedural style, and
there are reasons why that style was abandoned years ago in favour of an
object-oriented paradigm, that allows to write more flexible,
maintainable programs, and easier to understand by people who are not
the original authors.

I’d like to know what’s your view on this?


> I realize this is a complex subject area that requires considerable domain
> knowledge, but you can take my word as a domain expert (having implemented
> this functionality multiple times in commercial products) that the approach
> I have taken is (1) the most consistent with the XSL-FO specification, (2)
> the behavior required to achieve the desired functionality, and (3) the
> minimum changes and points of dependency to existing code.
> 
> In contrast, a more distributed approach such as you suggest would (1)
> diverge from XSL-FO specified behavior, (2) increase and distribute the
> number of points of interaction with existing code so as to make behavior
> harder to understand, test, and debug, and, most telling, (3) not provide
> any functional or performance advantage.
> 
> Regards,
> Glenn


Thanks,
Vincent


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Glenn Adams
are you claiming my code is not maintainable by other developers? if so,
then please prove it objectively; otherwise, let's stop talking about this,
and move on with the merge vote

On Tue, Oct 25, 2011 at 1:21 AM, Simon Pepping wrote:

> On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
> > Sixth, I am going to be maintaining this code. If anyone has a problem
> with
> > specific code during a merge or regression, they merely need ask me.
>
> That is a big no. There will always be a moment when someone else must
> or wants to work on this code. FOP code cannot depend on a single
> person, it must be maintainable by other developers.
>
> Simon
>


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Simon Pepping
On Mon, Oct 24, 2011 at 09:05:34PM +0800, Glenn Adams wrote:
> Sixth, I am going to be maintaining this code. If anyone has a problem with
> specific code during a merge or regression, they merely need ask me.

That is a big no. There will always be a moment when someone else must
or wants to work on this code. FOP code cannot depend on a single
person, it must be maintainable by other developers.

Simon


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Pascal Sancho
Hi,

I'm not sure that short variables names affect readability when long
mathematical formulas are used.
sometimes, code concision can help in understanding what code does:
 depending on what you can read and understand at a glance.
Readability should be in a place between concision and verbose,
depending on the threated topic.

that can be discussed, but this should not prevent from merging GA's works.

+1 for merging it now.

Le 24/10/2011 15:26, Eric Douglas a écrit :
> Short variable names should use less memory, which is mostly irrelevant
> these days.
> In an open project where other people could be working on the same code
> (or other code in the same package) it helps if all names are consistant.
> Personally I could never figure out what variable naming conventions
> are.  Each class I write seems to provide reason to use an entirely new
> convention.
> As long as someone who has never seen your code before can determine the
> purpose of each variable, I'd say you're good.
> If that requires comments, definitely add comments.  When in doubt, comment.
> 
> 
> *From:* Glenn Adams [mailto:gl...@skynav.com]
> *Sent:* Monday, October 24, 2011 9:06 AM
> *To:* fop-dev@xmlgraphics.apache.org
> *Subject:* Re: Merge Request - Temp_ComplexScripts into Trunk
> 
> 
> On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl  <mailto:georg.datt...@geneon.de>> wrote:
> 
> Hello Glenn,
> 
> > (2) there is no standard for symbol length documented in FOP practice
> > or enforced by checkstyle; I decline to exchange my choice of symbols
> > with longer symbols simply because you prefer it that way; I have
> > offered to add comments to my uses, and that is the most I'm willing
> > to do to address this matter;
> 
> You probably spent more years programming than I am alive, so please
> excuse me if that’s a stupid question: What is the
> reasoning/advantage behind those short variable names?
> 
> 
> First, I don't use short names everywhere. Mostly I just use in local
> variables, but generally not as class variables.
> 
> Second, I was trained in Physics and Mathematics, which uses short
> variable names (E = M C ^ 2).
> 
> Third, I started programming in the 1960s with BAL 360, APL, then
> FORTRAN IV. We use short names there.
> 
> Fourth, I happen to have a good memory and I have no trouble remembering
> the meaning of variable names.
> 
> Fifth, I find that short names prevents making lines too long and gives
> me more room for comments.
> 
> Sixth, I am going to be maintaining this code. If anyone has a problem
> with specific code during a merge or regression, they merely need ask me.
> 
> Seventh, that's just my style, and I assert it is as valid as doing it
> with long names.
> 
> Eighth, asking me to adhere to an undocumented convention that is not
> otherwise enforced, and for which there is no evidence or analysis of
> having been previously followed in FOP contributions is unwarranted.
> 
> Ninth, spending time changing variable names is a waste of time when I
> could be working on adding support for other scripts.
> 
> I can probably throw in a few more random reasons, but this should be
> sufficient.
> 
> I've offered to add comments, take it or leave it.
> 

-- 
Pascal


RE: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Eric Douglas
Short variable names should use less memory, which is mostly irrelevant
these days.
In an open project where other people could be working on the same code
(or other code in the same package) it helps if all names are
consistant.
Personally I could never figure out what variable naming conventions
are.  Each class I write seems to provide reason to use an entirely new
convention.
As long as someone who has never seen your code before can determine the
purpose of each variable, I'd say you're good.
If that requires comments, definitely add comments.  When in doubt,
comment.



From: Glenn Adams [mailto:gl...@skynav.com] 
Sent: Monday, October 24, 2011 9:06 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Merge Request - Temp_ComplexScripts into Trunk



On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl 
wrote:


Hello Glenn,


> (2) there is no standard for symbol length documented in FOP
practice
> or enforced by checkstyle; I decline to exchange my choice of
symbols
> with longer symbols simply because you prefer it that way; I
have
> offered to add comments to my uses, and that is the most I'm
willing
> to do to address this matter;


You probably spent more years programming than I am alive, so
please excuse me if that's a stupid question: What is the
reasoning/advantage behind those short variable names?



First, I don't use short names everywhere. Mostly I just use in local
variables, but generally not as class variables.

Second, I was trained in Physics and Mathematics, which uses short
variable names (E = M C ^ 2).

Third, I started programming in the 1960s with BAL 360, APL, then
FORTRAN IV. We use short names there.

Fourth, I happen to have a good memory and I have no trouble remembering
the meaning of variable names.

Fifth, I find that short names prevents making lines too long and gives
me more room for comments.

Sixth, I am going to be maintaining this code. If anyone has a problem
with specific code during a merge or regression, they merely need ask
me.

Seventh, that's just my style, and I assert it is as valid as doing it
with long names.

Eighth, asking me to adhere to an undocumented convention that is not
otherwise enforced, and for which there is no evidence or analysis of
having been previously followed in FOP contributions is unwarranted.

Ninth, spending time changing variable names is a waste of time when I
could be working on adding support for other scripts.

I can probably throw in a few more random reasons, but this should be
sufficient.

I've offered to add comments, take it or leave it.



Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Glenn Adams
On Mon, Oct 24, 2011 at 8:26 PM, Georg Datterl wrote:

> Hello Glenn,
>
> > (2) there is no standard for symbol length documented in FOP practice
> > or enforced by checkstyle; I decline to exchange my choice of symbols
> > with longer symbols simply because you prefer it that way; I have
> > offered to add comments to my uses, and that is the most I'm willing
> > to do to address this matter;
>
> You probably spent more years programming than I am alive, so please excuse
> me if that’s a stupid question: What is the reasoning/advantage behind those
> short variable names?
>

First, I don't use short names everywhere. Mostly I just use in local
variables, but generally not as class variables.

Second, I was trained in Physics and Mathematics, which uses short variable
names (E = M C ^ 2).

Third, I started programming in the 1960s with BAL 360, APL, then FORTRAN
IV. We use short names there.

Fourth, I happen to have a good memory and I have no trouble remembering the
meaning of variable names.

Fifth, I find that short names prevents making lines too long and gives me
more room for comments.

Sixth, I am going to be maintaining this code. If anyone has a problem with
specific code during a merge or regression, they merely need ask me.

Seventh, that's just my style, and I assert it is as valid as doing it with
long names.

Eighth, asking me to adhere to an undocumented convention that is not
otherwise enforced, and for which there is no evidence or analysis of having
been previously followed in FOP contributions is unwarranted.

Ninth, spending time changing variable names is a waste of time when I could
be working on adding support for other scripts.

I can probably throw in a few more random reasons, but this should be
sufficient.

I've offered to add comments, take it or leave it.


AW: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Georg Datterl
Hello Glenn,

> (2) there is no standard for symbol length documented in FOP practice
> or enforced by checkstyle; I decline to exchange my choice of symbols
> with longer symbols simply because you prefer it that way; I have
> offered to add comments to my uses, and that is the most I'm willing
> to do to address this matter;

You probably spent more years programming than I am alive, so please excuse me 
if that’s a stupid question: What is the reasoning/advantage behind those short 
variable names?

Regards,

Georg Datterl

-- Kontakt --

Georg Datterl

Geneon media solutions gmbh
Gutenstetter Straße 8a
90449 Nürnberg

HRB Nürnberg: 17193
Geschäftsführer: Yong-Harry Steiert

Tel.: 0911/36 78 88 - 26
Fax: 0911/36 78 88 - 20

www.geneon.de

Weitere Mitglieder der Willmy MediaGroup:

IRS Integrated Realization Services GmbH:www.irs-nbg.de
Willmy PrintMedia GmbH:  www.willmy.de
Willmy Consult & Content GmbH:   www.willmycc.de




Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Glenn Adams
Vincent,

We apparently disagree on whether coding should be based on ideology or on
practical results. You appear to favor the former, I favor the latter. I
think we will have to leave it at that. I'm not going to alter my
programming style in order to adhere to your notion of ideal programming
practice. It would be one thing if there was an established consensus on
these matters based on objective reasoning, but there is not:

(1) the limit on file length appear to be arbitrary, and not a result of an
explicit reasoned process; i had reasons for coding the way I did (including
the use of nested classes), and I feel no need to alter or justify that
process; if someone can make an objectively reasoned argument on a case by
case basis, then I'm willing to consider refactoring at some point in the
future when other more important tasks are completed; e.g., I would be
willing to break out the nested class UnicodeBidiAlgorithm from
BidiUtil.java into a separate file (which would in address your comment
about separating bidi level determination from reordering);

(2) there is no standard for symbol length documented in FOP practice or
enforced by checkstyle; I decline to exchange my choice of symbols with
longer symbols simply because you prefer it that way; I have offered to add
comments to my uses, and that is the most I'm willing to do to address this
matter;

Note that in both of these cases, I am offering to take concrete steps to
address your comments, though not necessarily in the manner or to the full
extent you would prefer. This is called compromise, an important aspect of
working in a team.

G.

On Mon, Oct 24, 2011 at 6:54 PM, Vincent Hennebert wrote:

> On 22/10/11 01:22, Glenn Adams wrote:
> > inline
> >
> > On Sat, Oct 22, 2011 at 12:04 AM, Chris Bowditch <
> bowditch_ch...@hotmail.com
> >> wrote:
> >
> >>
> >> Since Thunderhead also needs this feature we are willing to invest some
> >> time into it too. Currently my team are telling me it would take 9
> person
> >> months to port this code into our branch of FOP, partly because of some
> >> merge conflicts, but also partly because we are not comfortable with
> some
> >> aspects of the code as it has already been pointed out. The estimate
> would
> >> include the time to refactor long files into small ones, deal with the
> >> variable names and restructuring the code.
> >>
> >
> > I would advise against this, since it would it is functionally
> unnecessary
> > and since it will make future merges more difficult. I will be making
> > additional changes as more features in this area are added.
> >
> > I don't see what refactoring long files into small ones buys you.
> However,
> > if you make a reasoned argument for factoring specific long files (i.e.,
> why
> > such factoring improves architecture, modularity, etc), rather than
> simply
> > say "all long files must be refactored", then I will seriously discuss
> doing
> > so post merge.
>
> When I read this I’m really puzzled because that should really be the
> other way around: what is the reason to keep those classes so big (and
> that must be a really good one)? Most likely the Single Responsibility
> Principle is being violated in those classes.
>
> BidiUtil is a good example of this: it computes Bidi levels on the FO
> tree, /and/ also re-orders areas on the area tree. Those two things
> should most probably be done in two separate classes. Especially since
> from the quick look I had they seem to be using two distinct sets or
> private methods.
>
>
> >> I appreciate your commitment to add comments to short identifiers
> >> declarations, so at least it will be easier for the team to translate
> the
> >> short variables to longer equivalents. Just so we are clear on what you
> >> propose, do you mean this:
> >>
> >> int gi = 0; // Glyph Index
> >>
> >
> > Yes. Note that I already do this in most cases, such as:
> >
> > private static void resolveExplicit ( int[] wca, int defaultLevel, int[]
> ea
> > ) {
> > int[] es = new int [ MAX_LEVELS ];  /* embeddings stack */
> > int ei = 0; /* embeddings stack index
> */
> > int ec = defaultLevel;  /* current embedding
> level
> > */
> > for ( int i = 0, n = wca.length; i < n; i++ ) {
> > int bc = wca [ i ]; /* bidi class of current
> > char */
> > int el; /* embedding level to
> assign
> > to current char */
> > switch ( bc ) {
> > case LRE:   // start left-to-right
> > embedding
> > case RLE:   // start right-to-left
> > embedding
> >  case LRO:   // start
> left-to-right
> > override
> > case RLO:   // start right-to-left
> > override
> > {
> > int en; /* new embedding level */
> > if ( ( 

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-24 Thread Vincent Hennebert
On 22/10/11 01:22, Glenn Adams wrote:
> inline
> 
> On Sat, Oct 22, 2011 at 12:04 AM, Chris Bowditch > wrote:
> 
>>
>> Since Thunderhead also needs this feature we are willing to invest some
>> time into it too. Currently my team are telling me it would take 9 person
>> months to port this code into our branch of FOP, partly because of some
>> merge conflicts, but also partly because we are not comfortable with some
>> aspects of the code as it has already been pointed out. The estimate would
>> include the time to refactor long files into small ones, deal with the
>> variable names and restructuring the code.
>>
> 
> I would advise against this, since it would it is functionally unnecessary
> and since it will make future merges more difficult. I will be making
> additional changes as more features in this area are added.
> 
> I don't see what refactoring long files into small ones buys you. However,
> if you make a reasoned argument for factoring specific long files (i.e., why
> such factoring improves architecture, modularity, etc), rather than simply
> say "all long files must be refactored", then I will seriously discuss doing
> so post merge.

When I read this I’m really puzzled because that should really be the
other way around: what is the reason to keep those classes so big (and
that must be a really good one)? Most likely the Single Responsibility
Principle is being violated in those classes.

BidiUtil is a good example of this: it computes Bidi levels on the FO
tree, /and/ also re-orders areas on the area tree. Those two things
should most probably be done in two separate classes. Especially since
from the quick look I had they seem to be using two distinct sets or
private methods.


>> I appreciate your commitment to add comments to short identifiers
>> declarations, so at least it will be easier for the team to translate the
>> short variables to longer equivalents. Just so we are clear on what you
>> propose, do you mean this:
>>
>> int gi = 0; // Glyph Index
>>
> 
> Yes. Note that I already do this in most cases, such as:
> 
> private static void resolveExplicit ( int[] wca, int defaultLevel, int[] ea
> ) {
> int[] es = new int [ MAX_LEVELS ];  /* embeddings stack */
> int ei = 0; /* embeddings stack index */
> int ec = defaultLevel;  /* current embedding level
> */
> for ( int i = 0, n = wca.length; i < n; i++ ) {
> int bc = wca [ i ]; /* bidi class of current
> char */
> int el; /* embedding level to assign
> to current char */
> switch ( bc ) {
> case LRE:   // start left-to-right
> embedding
> case RLE:   // start right-to-left
> embedding
>  case LRO:   // start left-to-right
> override
> case RLO:   // start right-to-left
> override
> {
> int en; /* new embedding level */
> if ( ( bc == RLE ) || ( bc == RLO ) ) {
> en = ( ( ec & ~OVERRIDE ) + 1 ) | 1;
> } else {
> en = ( ( ec & ~OVERRIDE ) + 2 ) & ~1;
> }
> if ( en < ( MAX_LEVELS + 1 ) ) {
> es [ ei++ ] = ec;
> if ( ( bc == LRO ) || ( bc == RLO ) ) {
> ec = en | OVERRIDE;
> } else {
> ec = en & ~OVERRIDE;
> }
> } else {
> // max levels exceeded, so don't change level or
> override
> }
> el = ec;
> break;
> }
> ...
> 
> What I'm agreeing to do in the relative near future (after merge, before new
> release) is to add such comments to those places where I have not already
> done so, which are probably a minority of such cases.

This is good to hear, although it only marginally helps. Again, what’s
the rationale behind having 2 or 3 letter variables when every course
about programming will emphasise the importance of having reasonably
long, self-describing variable names? What amount of typing is there to
save when any modern IDE will auto-complete variable names?

Explaining the purpose of a variable in a comment creates two problems:
first, it forces the developer to constantly scroll from where the
variable is being used to where it is declared, in order to remember
what its purpose is. By doing so, they lose the context in which the
variable is used and have to read the code again. This makes it
extremely painful and difficult to understand the code.

But more importantly, there is no guarantee that the comment is
accurate. It’s notorious that comments tend to be left behind when
changes are made to the code. Which means that they quickly become
misleading or even plai

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Glenn Adams
inline

On Sat, Oct 22, 2011 at 12:04 AM, Chris Bowditch  wrote:

>
> Since Thunderhead also needs this feature we are willing to invest some
> time into it too. Currently my team are telling me it would take 9 person
> months to port this code into our branch of FOP, partly because of some
> merge conflicts, but also partly because we are not comfortable with some
> aspects of the code as it has already been pointed out. The estimate would
> include the time to refactor long files into small ones, deal with the
> variable names and restructuring the code.
>

I would advise against this, since it would it is functionally unnecessary
and since it will make future merges more difficult. I will be making
additional changes as more features in this area are added.

I don't see what refactoring long files into small ones buys you. However,
if you make a reasoned argument for factoring specific long files (i.e., why
such factoring improves architecture, modularity, etc), rather than simply
say "all long files must be refactored", then I will seriously discuss doing
so post merge.


> I appreciate your commitment to add comments to short identifiers
> declarations, so at least it will be easier for the team to translate the
> short variables to longer equivalents. Just so we are clear on what you
> propose, do you mean this:
>
> int gi = 0; // Glyph Index
>

Yes. Note that I already do this in most cases, such as:

private static void resolveExplicit ( int[] wca, int defaultLevel, int[] ea
) {
int[] es = new int [ MAX_LEVELS ];  /* embeddings stack */
int ei = 0; /* embeddings stack index */
int ec = defaultLevel;  /* current embedding level
*/
for ( int i = 0, n = wca.length; i < n; i++ ) {
int bc = wca [ i ]; /* bidi class of current
char */
int el; /* embedding level to assign
to current char */
switch ( bc ) {
case LRE:   // start left-to-right
embedding
case RLE:   // start right-to-left
embedding
 case LRO:   // start left-to-right
override
case RLO:   // start right-to-left
override
{
int en; /* new embedding level */
if ( ( bc == RLE ) || ( bc == RLO ) ) {
en = ( ( ec & ~OVERRIDE ) + 1 ) | 1;
} else {
en = ( ( ec & ~OVERRIDE ) + 2 ) & ~1;
}
if ( en < ( MAX_LEVELS + 1 ) ) {
es [ ei++ ] = ec;
if ( ( bc == LRO ) || ( bc == RLO ) ) {
ec = en | OVERRIDE;
} else {
ec = en & ~OVERRIDE;
}
} else {
// max levels exceeded, so don't change level or
override
}
el = ec;
break;
}
...

What I'm agreeing to do in the relative near future (after merge, before new
release) is to add such comments to those places where I have not already
done so, which are probably a minority of such cases.

G.


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Chris Bowditch

On 21/10/2011 15:13, Glenn Adams wrote:

Chris,


Hi Glenn,



 I would really like to see an acknowledgement from Glenn that there 
are some imperfections that need addressing.


I wasn't aware I had given anyone the impression of presenting a 
perfect submission. In fact, one of my favorite quotes is Voltaire's 
/le mieux est l'ennemi du bien/ "the best [perfect] is the enemy of 
the good", which a former colleague Charlie Sandbank (now deceased) 
used to love to cite at least once a day in ITU committee meetings.


Many thanks for taking the time to write this long e-mail. I do 
appreciate it. I reached this impression because I see Vincent and Peter 
giving feedback but I've yet to see any of the suggestions actioned. It 
could be that some of their suggestions aren't appropriate, but I do 
believe some of them are good points.




My coding philosophy is by and large based on a step-wise refinement 
process:


(1) make it (some subset of possible features) work
(2) make sure its correct (and regression testable) - or if following 
BDD, then do this first
(3) make it (more) understandable/maintainable, i.e., re-factor, 
improve comments, documentation, etc

(4) optionally, make it faster and smaller
(5) optionally, add more features
(6) go to (1)

Right now, I've finished steps (1) and (2) to a certain degree for an 
initial set of features, a degree that I think is sufficient to merit 
moving it into trunk. I have not yet seriously addressed (3) through 
(6). It would seem strange to expect that all these points have been 
addressed at this point in the process.


Thanks for clarifying. Just to be clear, I didn't mean to say that 
reaching the end of development was a pre-requisite to merging to trunk. 
It just seemed like a major milestone and therefore seemed like a 
suitable prompt for the opening of a discussion about our concerns.




If this work goes into the trunk, it will provide greater exposure to 
help drive and prioritize the remaining steps. There are trade-offs in 
time and money about how to spend my effort on steps (3) to (6). I 
have a well defined set of issues already waiting for item (5) [1], 
so, post merge, I can spend my time on these features or work on (3) 
or (4), or could attempt to divide my time between them. The bottom 
line is that it is a process that started some time ago and will 
continue for an indefinite time into the future. The current request 
for merging is just one step in that process.


That makes sense to me.



I expect to be maintaining this code and feature set for the 
indefinite future, according to the desires of my sponsor, Basis 
Technologies, who has a definite interest in the production use of an 
internationalized FOP, as well as others who have expressed a similar 
interest. As a consequence, there is little chance that any other FOP 
dev is going to have to work on this code any time soon. Of course, if 
they want to do so, I would certainly welcome community contributions 
to additional features, testing, optimization, documentation, etc., in 
this area. I have no personal need to *own* this code if others wish 
to help, and I certainly encourage it; however, at the same time, I do 
have a sponsor who is willing to continue investing in improving FOP 
in this area, and that willingness should not be discounted.


Since Thunderhead also needs this feature we are willing to invest some 
time into it too. Currently my team are telling me it would take 9 
person months to port this code into our branch of FOP, partly because 
of some merge conflicts, but also partly because we are not comfortable 
with some aspects of the code as it has already been pointed out. The 
estimate would include the time to refactor long files into small ones, 
deal with the variable names and restructuring the code.




[1] http://skynav.trac.cvsdude.com/fop/report/1

Regarding the comments about line length, file length etc., I will 
note that, at least with line length, I have maintained the existing 
(but arbitrary) limit of 100 in existing files. In the case of new 
files, I have chosen to use a longer line length that works better for 
my development process. I use GNU EMACS on a wide-screen MacBook Pro 
that has an effective width of 200 columns, and which, when this limit 
is exceeded, wraps the line (on display only). I find that arbitrary 
line lengths like 100 curtail my coding style, and as I've been coding 
for 40+ years, it's a pretty well established style (though back in 
the days when I wrote Fortran IV using an IBM 026 card punch 
, I 
had to stick with 80 columns). [If line length followed Moore's Law, 
we would be using lines of length 1760, starting from 1967 with 80 
columns.]


I personally don't have a problem with line length, but I think the File 
length is a small issue that we would like to address. I think the code 
would be easier to maintain if the larger classes were brok

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Vincent Hennebert
On 21/10/11 09:36, Simon Pepping wrote:
> I am pleased to learn that you are also in need of this new
> functionality.
> 
> I share some of Vincent and Peter's concerns about technical points of
> the code. On the other hand, this is the only implementation of
> complex scripts we have, created by Glenn, in the style of Glenn. It
> is an initial implementation, and it is normal that it requires
> further work, maybe even design changes to make it more flexible. Does
> keeping it in a branch make that further work easier? Merging it into
> trunk will enhance its visibility, and make it available to more
> users.

If it’s merged into Trunk, anyone who makes changes to the Trunk that
break the Complex Scripts feature will have to fix what is breaking. And
for the reasons I’ve given earlier, I believe that this would put too
much of a burden on developers.


> Simon
> 
> On Thu, Oct 20, 2011 at 02:02:10PM +0100, Chris Bowditch wrote:
>> On 19/10/2011 19:32, Simon Pepping wrote:
>>
>> Hi Simon,
>>
>> I think you misunderstood my mail. I don't want to stop the merge. I
>> simply thought it was an appropriate time to discuss some concerns
>> that Vincent and Peter had identified. You are preaching to the
>> converted about the need for supporting Complex scripts. It is an
>> urgent requirement for us too.
>>
>> If we don't discuss our concerns over the code now, then when do we
>> discuss it?
>>
>> Vincent and Peter will be replying to this thread shortly and will
>> outline their primary concerns then.
>>
>> Thanks,
>>
>> Chris


Vincent


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Clay Leeds
Quick question about this.

Please forgive my naïveté but, does this code affect processing
if you're not using ComplexScript support?

Thanks,

Clay

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Glenn Adams
Chris,

I would really like to see an acknowledgement from Glenn that there are
some imperfections that need addressing.
I wasn't aware I had given anyone the impression of presenting a perfect
submission. In fact, one of my favorite quotes is Voltaire's  *le mieux est
l'ennemi du bien* "the best [perfect] is the enemy of the good", which a
former colleague Charlie Sandbank (now deceased) used to love to cite at
least once a day in ITU committee meetings.

My coding philosophy is by and large based on a step-wise refinement
process:

(1) make it (some subset of possible features) work
(2) make sure its correct (and regression testable) - or if following BDD,
then do this first
(3) make it (more) understandable/maintainable, i.e., re-factor, improve
comments, documentation, etc
(4) optionally, make it faster and smaller
(5) optionally, add more features
(6) go to (1)

Right now, I've finished steps (1) and (2) to a certain degree for an
initial set of features, a degree that I think is sufficient to merit moving
it into trunk. I have not yet seriously addressed (3) through (6). It would
seem strange to expect that all these points have been addressed at this
point in the process.

If this work goes into the trunk, it will provide greater exposure to help
drive and prioritize the remaining steps. There are trade-offs in time and
money about how to spend my effort on steps (3) to (6). I have a well
defined set of issues already waiting for item (5) [1], so, post merge, I
can spend my time on these features or work on (3) or (4), or could attempt
to divide my time between them. The bottom line is that it is a process that
started some time ago and will continue for an indefinite time into the
future. The current request for merging is just one step in that process.

I expect to be maintaining this code and feature set for the indefinite
future, according to the desires of my sponsor, Basis Technologies, who has
a definite interest in the production use of an internationalized FOP, as
well as others who have expressed a similar interest. As a consequence,
there is little chance that any other FOP dev is going to have to work on
this code any time soon. Of course, if they want to do so, I would certainly
welcome community contributions to additional features, testing,
optimization, documentation, etc., in this area. I have no personal need to
*own* this code if others wish to help, and I certainly encourage it;
however, at the same time, I do have a sponsor who is willing to continue
investing in improving FOP in this area, and that willingness should not be
discounted.

[1] http://skynav.trac.cvsdude.com/fop/report/1

Regarding the comments about line length, file length etc., I will note
that, at least with line length, I have maintained the existing (but
arbitrary) limit of 100 in existing files. In the case of new files, I have
chosen to use a longer line length that works better for my development
process. I use GNU EMACS on a wide-screen MacBook Pro that has an effective
width of 200 columns, and which, when this limit is exceeded, wraps the line
(on display only). I find that arbitrary line lengths like 100 curtail my
coding style, and as I've been coding for 40+ years, it's a pretty well
established style (though back in the days when I wrote Fortran IV using an IBM
026 card 
punch,
I had to stick with 80 columns). [If line length followed Moore's Law, we
would be using lines of length 1760, starting from 1967 with 80 columns.]

By the way, I would note that the FOP Coding Conventions [2] do not specify
a maximum line length. In searching through the FOP DEV archives, I also
don't see an explicit discussion about line length (though I may have missed
it). What I do see is the initial introduction of the checkstyle target by
Jeremias in 2002 [4], where it appears he basically adjusted the checkstyle
rules to pass most of the extant code at that time.

[2] http://xmlgraphics.apache.org/fop/dev/conventions.html
[3]
http://marc.info/?l=fop-dev&w=2&r=1&s=%2Bcheckstyle+%2Bline+%2Blength&q=b
[4] http://marc.info/?l=fop-dev&m=103124169719869&w=2

My opinion about the various length limits (parameter count, method length,
file length, line length) as reported by checkstyle is that these should
interpreted as guidelines, and not hard rules. In the case of existing
files, it makes sense to attempt to adhere to them when possible (and I have
done this), while recognizing that some exceptions are inevitable. In the
case of new files, I agree they are reasonable targets, but I would not
readily agree to slavish adherence. Some latitude should be afforded to
individual coders, particularly when they are adding a substantial amount of
code in new files.

Regarding identifier length, neither the FOP coding conventions [2] nor
checkstyle indicates or checks for any kind of limitation; so I will
respectfully decline to change my code based on other autho

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Chris Bowditch

On 21/10/2011 09:36, Simon Pepping wrote:

Hi Simon,


I am pleased to learn that you are also in need of this new
functionality.

I share some of Vincent and Peter's concerns about technical points of
the code. On the other hand, this is the only implementation of
complex scripts we have, created by Glenn, in the style of Glenn. It
is an initial implementation, and it is normal that it requires
further work, maybe even design changes to make it more flexible. Does
keeping it in a branch make that further work easier? Merging it into
trunk will enhance its visibility, and make it available to more
users.


I'm not opposing the merge, I simply saw it as an appropriate milesone 
at which to open the debate on our concerns. It feels like we are making 
some progress here, so thanks for helping the debate along. I would 
really like to see an acknowledgement from Glenn that there are some 
imperfections that need addressing. If I saw that then I would give my 
full backing, but even without that I would vote +0 for the merge for 
the reasons you highlight above.


Thanks,

Chris



Simon

On Thu, Oct 20, 2011 at 02:02:10PM +0100, Chris Bowditch wrote:

On 19/10/2011 19:32, Simon Pepping wrote:

Hi Simon,

I think you misunderstood my mail. I don't want to stop the merge. I
simply thought it was an appropriate time to discuss some concerns
that Vincent and Peter had identified. You are preaching to the
converted about the need for supporting Complex scripts. It is an
urgent requirement for us too.

If we don't discuss our concerns over the code now, then when do we
discuss it?

Vincent and Peter will be replying to this thread shortly and will
outline their primary concerns then.

Thanks,

Chris







Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Simon Pepping
I am pleased to learn that you are also in need of this new
functionality.

I share some of Vincent and Peter's concerns about technical points of
the code. On the other hand, this is the only implementation of
complex scripts we have, created by Glenn, in the style of Glenn. It
is an initial implementation, and it is normal that it requires
further work, maybe even design changes to make it more flexible. Does
keeping it in a branch make that further work easier? Merging it into
trunk will enhance its visibility, and make it available to more
users.

Simon

On Thu, Oct 20, 2011 at 02:02:10PM +0100, Chris Bowditch wrote:
> On 19/10/2011 19:32, Simon Pepping wrote:
> 
> Hi Simon,
> 
> I think you misunderstood my mail. I don't want to stop the merge. I
> simply thought it was an appropriate time to discuss some concerns
> that Vincent and Peter had identified. You are preaching to the
> converted about the need for supporting Complex scripts. It is an
> urgent requirement for us too.
> 
> If we don't discuss our concerns over the code now, then when do we
> discuss it?
> 
> Vincent and Peter will be replying to this thread shortly and will
> outline their primary concerns then.
> 
> Thanks,
> 
> Chris
> 


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Simon Pepping
On Thu, Oct 20, 2011 at 02:53:54PM +0100, Vincent Hennebert wrote:
> Here are the sizes of some new files:
> 1075 src/java/org/apache/fop/fonts/GlyphSequence.java
> 1089 src/java/org/apache/fop/fonts/GlyphProcessingState.java
> 1269 
> src/codegen/unicode/java/org/apache/fop/text/bidi/GenerateBidiTestData.java
> 2034 src/java/org/apache/fop/layoutmgr/BidiUtil.java
> 3449 test/java/org/apache/fop/complexscripts/util/TTXFile.java
> 
> This latter one contains more than 50 field
> declarations, and the Handler.startElement method alone is more than
> 1800 lines long.
> 
> Also, the o.a.f.fonts.truetype.TTFFile class has now grown to
> 5502 lines. That’s 3 times its original size which was already too big.
> I regularly find myself looking at bits of this class, and I would be
> unable to do so on a 5500 line class.

I agree that these are undesirably long.
 
> I don’t think it needs to be explained why big classes are undesirable?
> 
> Also, most files disable Checkstyle checks, the most important ones
> being line length and white space. Many files have too long lines which
> makes it a pain to read through, having to horizontally scroll all the
> time. We agreed on a certain coding style in the project and it would be
> good if new code could adhere to it.

I agree that this is not in compliance with the code style of the FOP
project.

> Speaking of variable names, here is a method picked from
> o.a.f.fonts.GlyphSequence:
> /**
>  * Merge overlapping and abutting sub-intervals.
>  */
> private static int[] mergeIntervals ( int[] ia ) {
> int ni = ia.length;
> int i, n, nm, is, ie;
> // count merged sub-intervals
> for ( i = 0, n = ni, nm = 0, is = ie = -1; i < n; i += 2 ) {
> int s = ia [ i + 0 ];
> int e = ia [ i + 1 ];
> if ( ( ie < 0 ) || ( s > ie ) ) {
> is = s;
> ie = e;
> nm++;
> } else if ( s >= is ) {
> if ( e > ie ) {
> ie = e;
> }
> }
> }
> int[] mi = new int [ nm * 2 ];
> // populate merged sub-intervals
> for ( i = 0, n = ni, nm = 0, is = ie = -1; i < n; i += 2 ) {
> int s = ia [ i + 0 ];
> int e = ia [ i + 1 ];
> int k = nm * 2;
> if ( ( ie < 0 ) || ( s > ie ) ) {
> is = s;
> ie = e;
> mi [ k + 0 ] = is;
> mi [ k + 1 ] = ie;
> nm++;
> } else if ( s >= is ) {
> if ( e > ie ) {
> ie = e;
> }
> mi [ k - 1 ] = ie;
> }
> }
> return mi;
> }
> 

It could be refactored as follows:

/**
 * Merge overlapping and abutting sub-intervals.
 * @param inputArray The array to be merged
 */
private static int[] mergeIntervals ( int[] inputArray ) {
int numMerge = 0;
// count merged sub-intervals
for ( int i = 0, iCurrent = -1, iNext = -1; i < inputArray.length; i += 2 ) 
{
int current = inputArray [ i + 0 ];
int next = inputArray [ i + 1 ];
if ( ( iNext < 0 ) || ( current > iNext ) ) {
iCurrent = current;
iNext = next;
numMerge++;
} else if ( current >= iCurrent ) {
if ( next > iNext ) {
iNext = next;
}
}
}
int[] returnArray = new int [ numMerge * 2 ];
// populate merged sub-intervals
for ( int i = 0, numMerge2 = 0, iCurrent = -1, iNext = -1; i < 
inputArray.length; i += 2 ) {
int current = inputArray [ i + 0 ];
int next = inputArray [ i + 1 ];
int numMerge2Square = numMerge2 * 2;
if ( ( iNext < 0 ) || ( current > iNext ) ) {
iCurrent = current;
iNext = next;
returnArray [ numMerge2Square + 0 ] = iCurrent;
returnArray [ numMerge2Square + 1 ] = iNext;
numMerge2++;
} else if ( current >= iCurrent ) {
if ( next > iNext ) {
iNext = next;
}
returnArray [ numMerge2Square - 1 ] = iNext;
}
}
return returnArray;
}

Many variables are now limited in scope to the loops. Only numMerge
and returnArray are visible outside the loop. The names make it
somewhat clearer what the code does (if I guessed the names right).

BTW Is there a requirement that the length of the input array is even?
If not, inputArray [ i + 1 ] will raise an exception if i == n-1.

So, yes, I agree with your objections against the actual format of the
code.

Simon


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-21 Thread Glenn Adams
On Thu, Oct 20, 2011 at 10:31 PM, Peter Hancock wrote:

>
> From the surface I would have been very much in favor of supporting a
> merge in the near future, however I have had the chance to review some
> areas of the complex script branch and I have some concerns.
> The treatment of Unicode Bidi spans from the creation of the FO Tree
> through to the construction of the Area Tree and I would have liked to
> have seen the complex scripts solution integrate the Unicode Bidi
> Algorithm more directly into the core process:  For example, the
> implementation performs a post process on the FO Tree to resolve the
> Bidi properties of FONodes relating to text. It would be preferable to
> see the construction of the FO Tree embracing this Bidi aspect:
> FONodes should be responsible for determining their own bidi state
> from the fo node semantics in context to the position in the tree.
> Such an implementation would immediately force the maintainer to
> consider how a change would effect the Bidi process.
>

Please review XSL-FO 1.1 Section 5.8, and, in particular:

"the algorithm is applied to a sequence of characters coming from the
content of one or more formatting objects. The sequence of characters is
created by processing a fragment of the formatting object tree. A *fragment* is
any contiguous sequence of children of some formatting object in the tree.
The sequence is created by doing a pre-order traversal of the fragment down
to the fo:character level."

"the final, text reordering step is not done during refinement. Instead, the
XSL equivalent of re-ordering is done during area tree generation"

The current implementation adheres to the XSL-FO specification in this
regard, while your suggestion that this behavior be isolated to individual
FONodes is contrary to the specification and does not permit correct
implementation of the functionality required.

I realize this is a complex subject area that requires considerable domain
knowledge, but you can take my word as a domain expert (having implemented
this functionality multiple times in commercial products) that the approach
I have taken is (1) the most consistent with the XSL-FO specification, (2)
the behavior required to achieve the desired functionality, and (3) the
minimum changes and points of dependency to existing code.

In contrast, a more distributed approach such as you suggest would (1)
diverge from XSL-FO specified behavior, (2) increase and distribute the
number of points of interaction with existing code so as to make behavior
harder to understand, test, and debug, and, most telling, (3) not provide
any functional or performance advantage.

Regards,
Glenn


RE: Merge Request - Temp_ComplexScripts into Trunk

2011-10-20 Thread Jonathan Levinson
Hi Simon,

I've contacted my management and asked what our teams can do to help test.  I 
report to our development not to our quality departments, and I can't speak for 
our quality departments.  I've contacted our international teams about what 
they can do to help test.

The bottom-line to our testing is that when a new FOP is released, we test our 
software on the new FOP, and testing is done with every application that uses 
FOP.  With an international FOP, international applications would be tested.  
That much is guaranteed by our normal testing process.  It is a tribute to the 
quality of FOP that we have never had to report a FOP issue, even though our 
reports can be quite complicated.  But I take it you would like us to get 
involved with the testing effort before a new FOP is released.  When you are 
discussing our involvement, are you discussing our testing the FOP that results 
from the merger onto the trunk, once that is accomplished?   I understand you 
are ironing out the details of what that merger would look like. 

You said bug reports should go to fop-users, but isn't it the case that .fo 
attachments won't be accepted by fop-users?  Don't bug reports have to be 
created through Bugzilla?  We can discuss what we see in terms of bugs on 
fop-users, but if we can't provide .fo files, won't our discussion be less 
helpful?

Best Regards,
Jonathan Levinson
Senior Software Developer
Object Group
InterSystems
+1 617-621-0600
jonathan.levin...@intersystems.com


> -Original Message-
> From: Simon Pepping [mailto:spepp...@leverkruid.eu]
> Sent: Thursday, October 20, 2011 3:19 AM
> To: fop-dev@xmlgraphics.apache.org
> Subject: Re: Merge Request - Temp_ComplexScripts into Trunk
> 
> Jonathan,
> 
> Obviously, FOP's strongest supporters over the past years do not require this
> new functionality. FOP needs the additional support of new stakeholders of 
> this
> new functionality. Could your teams test it on their documents and report 
> their
> findings to the fop-user email list?
> 
> Simon Pepping
> 
> On Wed, Oct 19, 2011 at 03:20:40PM -0400, Jonathan Levinson wrote:
> > We -- at InterSystems -- deploy an application that runs in upwards of 40
> countries, using many of the languages for which complex script support is
> required.
> >
> > We definitely need complex script support.  It is a requirement for us.



Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-20 Thread Peter Hancock
This is a tough one.

The need for complex script support in FOP is likely high on the wish
list of any global supporter of FOP and it is certainly in the
interest of the project to support. The amount of work that Glenn has
done is considerable: the volume of code, the test coverage and the
obvious depth of domain knowledge.

>From the surface I would have been very much in favor of supporting a
merge in the near future, however I have had the chance to review some
areas of the complex script branch and I have some concerns.
The treatment of Unicode Bidi spans from the creation of the FO Tree
through to the construction of the Area Tree and I would have liked to
have seen the complex scripts solution integrate the Unicode Bidi
Algorithm more directly into the core process:  For example, the
implementation performs a post process on the FO Tree to resolve the
Bidi properties of FONodes relating to text. It would be preferable to
see the construction of the FO Tree embracing this Bidi aspect:
FONodes should be responsible for determining their own bidi state
from the fo node semantics in context to the position in the tree.
Such an implementation would immediately force the maintainer to
consider how a change would effect the Bidi process.

The layout and area tree construction phase interact a little more
directly  with the Bidi process, however most of the Bidi work is
delegated to static methods of a utility class (..layoutmgr.BidiUtil,
also used heavily in the Bidi post-process of the FO Tree) and
consequently require many instanceof expressions because of
differences in Bidi behavior between the Area classes.  Areas should
be responsible for the character re-ordering process.

Having this functionality in FOP is desirable and I would encourage
steps to address these concerns, and those outlined by Chris and
Vincent.

Peter

On Thu, Oct 20, 2011 at 2:53 PM, Vincent Hennebert  wrote:
> The Complex Scripts feature is obviously a great enhancement and we
> would all love to have it implemented in FOP. However, that should not
> come at the expense of maintainability and the implementation of other
> new features.
>
> When I look at the code in the Temp_ComplexScripts branch, I have
> serious concerns regarding the latter two points. I would oppose merging
> the branch back to Trunk until those are resolved.
>
> Here are the sizes of some new files:
> 1075 src/java/org/apache/fop/fonts/GlyphSequence.java
> 1089 src/java/org/apache/fop/fonts/GlyphProcessingState.java
> 1269 
> src/codegen/unicode/java/org/apache/fop/text/bidi/GenerateBidiTestData.java
> 2034 src/java/org/apache/fop/layoutmgr/BidiUtil.java
> 3449 test/java/org/apache/fop/complexscripts/util/TTXFile.java
>
> This latter one contains more than 50 field
> declarations, and the Handler.startElement method alone is more than
> 1800 lines long.
>
> Also, the o.a.f.fonts.truetype.TTFFile class has now grown to
> 5502 lines. That’s 3 times its original size which was already too big.
> I regularly find myself looking at bits of this class, and I would be
> unable to do so on a 5500 line class.
>
> I don’t think it needs to be explained why big classes are undesirable?
>
> Also, most files disable Checkstyle checks, the most important ones
> being line length and white space. Many files have too long lines which
> makes it a pain to read through, having to horizontally scroll all the
> time. We agreed on a certain coding style in the project and it would be
> good if new code could adhere to it.
>
> Speaking of variable names, here is a method picked from
> o.a.f.fonts.GlyphSequence:
>    /**
>     * Merge overlapping and abutting sub-intervals.
>     */
>    private static int[] mergeIntervals ( int[] ia ) {
>        int ni = ia.length;
>        int i, n, nm, is, ie;
>        // count merged sub-intervals
>        for ( i = 0, n = ni, nm = 0, is = ie = -1; i < n; i += 2 ) {
>            int s = ia [ i + 0 ];
>            int e = ia [ i + 1 ];
>            if ( ( ie < 0 ) || ( s > ie ) ) {
>                is = s;
>                ie = e;
>                nm++;
>            } else if ( s >= is ) {
>                if ( e > ie ) {
>                    ie = e;
>                }
>            }
>        }
>        int[] mi = new int [ nm * 2 ];
>        // populate merged sub-intervals
>        for ( i = 0, n = ni, nm = 0, is = ie = -1; i < n; i += 2 ) {
>            int s = ia [ i + 0 ];
>            int e = ia [ i + 1 ];
>            int k = nm * 2;
>            if ( ( ie < 0 ) || ( s > ie ) ) {
>                is = s;
>                ie = e;
>                mi [ k + 0 ] = is;
>                mi [ k + 1 ] = ie;
>                nm++;
>            } else if ( s >= is ) {
>                if ( e > ie ) {
>                    ie = e;
>                }
>                mi [ k - 1 ] = ie;
>            }
>        }
>        return mi;
>    }
>
> Now I fully appreciate that one has to have some knowledge of an area to
> understand code relating to that a

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-20 Thread Vincent Hennebert
The Complex Scripts feature is obviously a great enhancement and we
would all love to have it implemented in FOP. However, that should not
come at the expense of maintainability and the implementation of other
new features.

When I look at the code in the Temp_ComplexScripts branch, I have
serious concerns regarding the latter two points. I would oppose merging
the branch back to Trunk until those are resolved.

Here are the sizes of some new files:
1075 src/java/org/apache/fop/fonts/GlyphSequence.java
1089 src/java/org/apache/fop/fonts/GlyphProcessingState.java
1269 src/codegen/unicode/java/org/apache/fop/text/bidi/GenerateBidiTestData.java
2034 src/java/org/apache/fop/layoutmgr/BidiUtil.java
3449 test/java/org/apache/fop/complexscripts/util/TTXFile.java

This latter one contains more than 50 field
declarations, and the Handler.startElement method alone is more than
1800 lines long.

Also, the o.a.f.fonts.truetype.TTFFile class has now grown to
5502 lines. That’s 3 times its original size which was already too big.
I regularly find myself looking at bits of this class, and I would be
unable to do so on a 5500 line class.

I don’t think it needs to be explained why big classes are undesirable?

Also, most files disable Checkstyle checks, the most important ones
being line length and white space. Many files have too long lines which
makes it a pain to read through, having to horizontally scroll all the
time. We agreed on a certain coding style in the project and it would be
good if new code could adhere to it.

Speaking of variable names, here is a method picked from
o.a.f.fonts.GlyphSequence:
/**
 * Merge overlapping and abutting sub-intervals.
 */
private static int[] mergeIntervals ( int[] ia ) {
int ni = ia.length;
int i, n, nm, is, ie;
// count merged sub-intervals
for ( i = 0, n = ni, nm = 0, is = ie = -1; i < n; i += 2 ) {
int s = ia [ i + 0 ];
int e = ia [ i + 1 ];
if ( ( ie < 0 ) || ( s > ie ) ) {
is = s;
ie = e;
nm++;
} else if ( s >= is ) {
if ( e > ie ) {
ie = e;
}
}
}
int[] mi = new int [ nm * 2 ];
// populate merged sub-intervals
for ( i = 0, n = ni, nm = 0, is = ie = -1; i < n; i += 2 ) {
int s = ia [ i + 0 ];
int e = ia [ i + 1 ];
int k = nm * 2;
if ( ( ie < 0 ) || ( s > ie ) ) {
is = s;
ie = e;
mi [ k + 0 ] = is;
mi [ k + 1 ] = ie;
nm++;
} else if ( s >= is ) {
if ( e > ie ) {
ie = e;
}
mi [ k - 1 ] = ie;
}
}
return mi;
}

Now I fully appreciate that one has to have some knowledge of an area to
understand code relating to that area, but no level of expertise,
however high, will help me to quickly understand this code. This is just
too easy to mistake one variable for another one when they differ by
only one letter.

Combined, these would make the code very hard to maintain by anyone
other than the original author, and this is why I’m opposed to merging
it to Trunk in its current form. When I commit code I do my very best to
make it as easy to read and understand by other people, and I would
really appreciate it I could have the same in return.


Thanks,
Vincent


On 19/10/11 19:32, Simon Pepping wrote:
> Over the past ten years computing has pervaded life in all its facets,
> and spread over the world. As a consequence computing is now used in
> all languages and all scripts.
> 
> When I open my devanagari test file in emacs, it just works. When I
> open it in firefox, it just works. The same when I open it in
> LibreOffice Writer. I am sure that, if I would open it in *the* *Word*
> processor, it would just work. When I process the file with FOP, it
> does not work. With the complex scripts functionality, it works,
> dependent on the use of supported or otherwise suitable fonts. (That
> is true for all above applications, but apparently those come
> configured with my system.)
> 
> So what does a person do who believes in the XML stack to maintain his
> documentation, and wants to send his documents in Hindi to his
> customers? See that XSL-FO with FOP is not a suitable solution for him
> because Hindi uses a complex script?
> 
> FOP needs the complex scripts functionality to remain a player in the
> global playing field.
> 
> This is for me the single overarching consideration to want this
> functionality in FOP's trunk code, and in, say, half a year in a
> release. All other considerations are minor, unless one wants to claim
> that this code will block FOP's further development and maintenance in
> the coming years.
> 
> Of course, not everybody needs this functionality, and there is a fear
> of increased maintenanc

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-20 Thread Chris Bowditch

On 19/10/2011 19:32, Simon Pepping wrote:

Hi Simon,

I think you misunderstood my mail. I don't want to stop the merge. I 
simply thought it was an appropriate time to discuss some concerns that 
Vincent and Peter had identified. You are preaching to the converted 
about the need for supporting Complex scripts. It is an urgent 
requirement for us too.


If we don't discuss our concerns over the code now, then when do we 
discuss it?


Vincent and Peter will be replying to this thread shortly and will 
outline their primary concerns then.


Thanks,

Chris


Over the past ten years computing has pervaded life in all its facets,
and spread over the world. As a consequence computing is now used in
all languages and all scripts.

When I open my devanagari test file in emacs, it just works. When I
open it in firefox, it just works. The same when I open it in
LibreOffice Writer. I am sure that, if I would open it in *the* *Word*
processor, it would just work. When I process the file with FOP, it
does not work. With the complex scripts functionality, it works,
dependent on the use of supported or otherwise suitable fonts. (That
is true for all above applications, but apparently those come
configured with my system.)

So what does a person do who believes in the XML stack to maintain his
documentation, and wants to send his documents in Hindi to his
customers? See that XSL-FO with FOP is not a suitable solution for him
because Hindi uses a complex script?

FOP needs the complex scripts functionality to remain a player in the
global playing field.

This is for me the single overarching consideration to want this
functionality in FOP's trunk code, and in, say, half a year in a
release. All other considerations are minor, unless one wants to claim
that this code will block FOP's further development and maintenance in
the coming years.

Of course, not everybody needs this functionality, and there is a fear
of increased maintenance overhead. But the question is: For whom do we
develop FOP? Also for the large part of the world that uses complex
scripts?

With the development of the complex scripts functionality, Glenn Adams
and his sponsor Basis Technologies have created a new reality, which
is not going to go away. If this functionality does not end up in FOP,
it will probably live on elsewhere. If the FOP team is fine with that,
say no to the merge request, and feel comfortable with a trusted niche
application.

Simon Pepping

On Wed, Oct 19, 2011 at 09:50:24AM +0100, Chris Bowditch wrote:

On 18/10/2011 19:55, Simon Pepping wrote:

I merged the ComplexScripts branch into trunk. Result:

Hi Simon,

As well of the question of how to do the merge there is also the
question should we do the merge? Of course this is a valuable
feature to the community, and Glenn has invested a lot of time in
its development but is it truely production ready? I asked Vincent
to take a look at the branch earlier in the year as it's a feature
we also need, but he had several concerns that have not be
adequately answered. Take a look at comment #30;
https://issues.apache.org/bugzilla/show_bug.cgi?id=49687#c30

I'm not sure why Vincent describes it as a "brief look" because he
spent several days on it. I also asked Peter to take a look and he
had similar concerns. 2 or 3 letter variable names are a barrier for
any committer wanting to maintain this code and I don't think it is
a sufficient argument to say that a pre-requisite to maintaining
this code is to be a domain expert. I would hope that any
experienced committer with a debugger should be able to solve some
bugs. Obviously certain problems will require domain expertise, but
the variables names are a key barrier to being able to maintain this
code.

I realise my comments might be a little controversial and I don't
mean any disrespect to Glenn or his work (which is largely
excellent), but we should at least discuss these topics before the
merge is completed.






Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-20 Thread Simon Pepping
Jonathan,

Obviously, FOP's strongest supporters over the past years do not
require this new functionality. FOP needs the additional support of
new stakeholders of this new functionality. Could your teams test it
on their documents and report their findings to the fop-user email
list?

Simon Pepping

On Wed, Oct 19, 2011 at 03:20:40PM -0400, Jonathan Levinson wrote:
> We -- at InterSystems -- deploy an application that runs in upwards of 40 
> countries, using many of the languages for which complex script support is 
> required.
> 
> We definitely need complex script support.  It is a requirement for us.


RE: Merge Request - Temp_ComplexScripts into Trunk

2011-10-19 Thread Jonathan Levinson
We -- at InterSystems -- deploy an application that runs in upwards of 40 
countries, using many of the languages for which complex script support is 
required.

We definitely need complex script support.  It is a requirement for us.

Thanks,
Jonathan Levinson
Senior Software Developer
Object Group
InterSystems
+1 617-621-0600
jonathan.levin...@intersystems.com


> -Original Message-
> From: Simon Pepping [mailto:spepp...@leverkruid.eu]
> Sent: Wednesday, October 19, 2011 2:32 PM
> To: fop-dev@xmlgraphics.apache.org
> Subject: Re: Merge Request - Temp_ComplexScripts into Trunk
> 
> Over the past ten years computing has pervaded life in all its facets, and 
> spread
> over the world. As a consequence computing is now used in all languages and 
> all
> scripts.
> 
> When I open my devanagari test file in emacs, it just works. When I open it in
> firefox, it just works. The same when I open it in LibreOffice Writer. I am 
> sure
> that, if I would open it in *the* *Word* processor, it would just work. When I
> process the file with FOP, it does not work. With the complex scripts
> functionality, it works, dependent on the use of supported or otherwise 
> suitable
> fonts. (That is true for all above applications, but apparently those come
> configured with my system.)
> 
> So what does a person do who believes in the XML stack to maintain his
> documentation, and wants to send his documents in Hindi to his customers? See
> that XSL-FO with FOP is not a suitable solution for him because Hindi uses a
> complex script?
> 
> FOP needs the complex scripts functionality to remain a player in the global
> playing field.
> 
> This is for me the single overarching consideration to want this 
> functionality in
> FOP's trunk code, and in, say, half a year in a release. All other 
> considerations
> are minor, unless one wants to claim that this code will block FOP's further
> development and maintenance in the coming years.
> 
> Of course, not everybody needs this functionality, and there is a fear of
> increased maintenance overhead. But the question is: For whom do we develop
> FOP? Also for the large part of the world that uses complex scripts?
> 
> With the development of the complex scripts functionality, Glenn Adams and his
> sponsor Basis Technologies have created a new reality, which is not going to 
> go
> away. If this functionality does not end up in FOP, it will probably live on
> elsewhere. If the FOP team is fine with that, say no to the merge request, and
> feel comfortable with a trusted niche application.
> 
> Simon Pepping
> 
> On Wed, Oct 19, 2011 at 09:50:24AM +0100, Chris Bowditch wrote:
> > On 18/10/2011 19:55, Simon Pepping wrote:
> > >I merged the ComplexScripts branch into trunk. Result:
> >
> > Hi Simon,
> >
> > As well of the question of how to do the merge there is also the
> > question should we do the merge? Of course this is a valuable feature
> > to the community, and Glenn has invested a lot of time in its
> > development but is it truely production ready? I asked Vincent to take
> > a look at the branch earlier in the year as it's a feature we also
> > need, but he had several concerns that have not be adequately
> > answered. Take a look at comment #30;
> > https://issues.apache.org/bugzilla/show_bug.cgi?id=49687#c30
> >
> > I'm not sure why Vincent describes it as a "brief look" because he
> > spent several days on it. I also asked Peter to take a look and he had
> > similar concerns. 2 or 3 letter variable names are a barrier for any
> > committer wanting to maintain this code and I don't think it is a
> > sufficient argument to say that a pre-requisite to maintaining this
> > code is to be a domain expert. I would hope that any experienced
> > committer with a debugger should be able to solve some bugs. Obviously
> > certain problems will require domain expertise, but the variables
> > names are a key barrier to being able to maintain this code.
> >
> > I realise my comments might be a little controversial and I don't mean
> > any disrespect to Glenn or his work (which is largely excellent), but
> > we should at least discuss these topics before the merge is completed.



Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-19 Thread Simon Pepping
Over the past ten years computing has pervaded life in all its facets,
and spread over the world. As a consequence computing is now used in
all languages and all scripts.

When I open my devanagari test file in emacs, it just works. When I
open it in firefox, it just works. The same when I open it in
LibreOffice Writer. I am sure that, if I would open it in *the* *Word*
processor, it would just work. When I process the file with FOP, it
does not work. With the complex scripts functionality, it works,
dependent on the use of supported or otherwise suitable fonts. (That
is true for all above applications, but apparently those come
configured with my system.)

So what does a person do who believes in the XML stack to maintain his
documentation, and wants to send his documents in Hindi to his
customers? See that XSL-FO with FOP is not a suitable solution for him
because Hindi uses a complex script?

FOP needs the complex scripts functionality to remain a player in the
global playing field.

This is for me the single overarching consideration to want this
functionality in FOP's trunk code, and in, say, half a year in a
release. All other considerations are minor, unless one wants to claim
that this code will block FOP's further development and maintenance in
the coming years.

Of course, not everybody needs this functionality, and there is a fear
of increased maintenance overhead. But the question is: For whom do we
develop FOP? Also for the large part of the world that uses complex
scripts?

With the development of the complex scripts functionality, Glenn Adams
and his sponsor Basis Technologies have created a new reality, which
is not going to go away. If this functionality does not end up in FOP,
it will probably live on elsewhere. If the FOP team is fine with that,
say no to the merge request, and feel comfortable with a trusted niche
application.

Simon Pepping

On Wed, Oct 19, 2011 at 09:50:24AM +0100, Chris Bowditch wrote:
> On 18/10/2011 19:55, Simon Pepping wrote:
> >I merged the ComplexScripts branch into trunk. Result:
> 
> Hi Simon,
> 
> As well of the question of how to do the merge there is also the
> question should we do the merge? Of course this is a valuable
> feature to the community, and Glenn has invested a lot of time in
> its development but is it truely production ready? I asked Vincent
> to take a look at the branch earlier in the year as it's a feature
> we also need, but he had several concerns that have not be
> adequately answered. Take a look at comment #30;
> https://issues.apache.org/bugzilla/show_bug.cgi?id=49687#c30
> 
> I'm not sure why Vincent describes it as a "brief look" because he
> spent several days on it. I also asked Peter to take a look and he
> had similar concerns. 2 or 3 letter variable names are a barrier for
> any committer wanting to maintain this code and I don't think it is
> a sufficient argument to say that a pre-requisite to maintaining
> this code is to be a domain expert. I would hope that any
> experienced committer with a debugger should be able to solve some
> bugs. Obviously certain problems will require domain expertise, but
> the variables names are a key barrier to being able to maintain this
> code.
> 
> I realise my comments might be a little controversial and I don't
> mean any disrespect to Glenn or his work (which is largely
> excellent), but we should at least discuss these topics before the
> merge is completed.


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-19 Thread Pascal Sancho
HI,

IMHO, "Production ready" should only cited before a FOP release, not for
a merge of branch to trunk.
At this stage, the only questions are about regression tests (and code
readability, since open source).

Merging the branch now should encourage more users to test these new
features and give feedback.

Le 19/10/2011 14:25, Glenn Adams a écrit :
> You ask about whether this merge is truly production ready? Is this a
> requirement for doing a merge from a temporary branch? How would you
> define truly production ready? Is FOP itself truly production
> ready? I've spend the last couple of months creating over 500,000 tests.
> I wonder if any other part of FOP is so thoroughly tested? Frankly
> speaking, I would like to put even more tests into place, and I will
> over time, but I'm not going to do that until the current work is merged.
>
-- 
Pascal


Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-19 Thread Glenn Adams
I provided a detailed comment on Vincent's brief review at:

https://issues.apache.org/bugzilla/show_bug.cgi?id=49687#c31

With the exception of the the following comment, the remaining comments are
editorial in nature or have no actionable response.

> How feasible is it to run the BIDI algorithm on a subset of the FO tree? If 
> I'm
> right, ATM it is launched on a whole page-sequence. When we refactor the code
> so that layout can be started before a whole page-sequence has been parsed, in
> order to avoid the infamous memory issue that a lot of users are running into,
> will that code allow to do it?

Regarding the point(s) at which the bidi algorithm is invoked, I agree there
are possible, alternative points of invocation. Some of which may distribute
the cost of its performance as opposed to concentrating that work in one
call. However, even if there are alternatives, it is not clear whether they
are desirable, and whether it would have any impact on memory usage or time
performance. Further, "when we refactor the code" is an unknown, future
possibility, and making a substantial change at this time in order to
perform certain optimizations that only come into play in the eventuality
that such code refactoring occurs seems to be an example of premature
optimization.

As regards to symbol name length, I would merely cite that the FOP Coding
Conventions  do not
specify length of an identifier (either short or long) as an established
convention. Let's not use this valuable contribution as a *cause célèbre* to
argue or establish new conventions that do not exist. Otherwise we will soon
be arguing over whether i, j, and k are reasonable identifiers for
enumeration variables.

You ask about whether this merge is truly production ready? Is this a
requirement for doing a merge from a temporary branch? How would you define
truly production ready? Is FOP itself truly production ready? I've spend the
last couple of months creating over 500,000 tests. I wonder if any other
part of FOP is so thoroughly tested? Frankly speaking, I would like to put
even more tests into place, and I will over time, but I'm not going to do
that until the current work is merged.

If there is a substantial bug or regression that would be caused by this
merger, then I'd be happy to address that problem before merging. I look
forward to any report of this nature.

If folks wish to keep pushing the issue of symbol name length, then please
enumerate, by source file and line number each case to which is objected.
Also, please provide objective comparative data showing how the reported
data diverges from statistically similar usage elsewhere in FOP. Please also
define an objective measure of the level of domain knowledge needed for
working with specific code in this merge for which a comment about symbol
length is an issue. In particular, please show how a reader skilled in the
arts of the code being considered would not be able to readily infer the
meaning of a specific symbol in cases where I have not explicitly provided a
comment at the place of definition.

G.

On Wed, Oct 19, 2011 at 4:50 PM, Chris Bowditch
wrote:

> On 18/10/2011 19:55, Simon Pepping wrote:
>
>> I merged the ComplexScripts branch into trunk. Result:
>>
>
> Hi Simon,
>
> As well of the question of how to do the merge there is also the question
> should we do the merge? Of course this is a valuable feature to the
> community, and Glenn has invested a lot of time in its development but is it
> truely production ready? I asked Vincent to take a look at the branch
> earlier in the year as it's a feature we also need, but he had several
> concerns that have not be adequately answered. Take a look at comment #30;
> https://issues.apache.org/**bugzilla/show_bug.cgi?id=**49687#c30
>
> I'm not sure why Vincent describes it as a "brief look" because he spent
> several days on it. I also asked Peter to take a look and he had similar
> concerns. 2 or 3 letter variable names are a barrier for any committer
> wanting to maintain this code and I don't think it is a sufficient argument
> to say that a pre-requisite to maintaining this code is to be a domain
> expert. I would hope that any experienced committer with a debugger should
> be able to solve some bugs. Obviously certain problems will require domain
> expertise, but the variables names are a key barrier to being able to
> maintain this code.
>
> I realise my comments might be a little controversial and I don't mean any
> disrespect to Glenn or his work (which is largely excellent), but we should
> at least discuss these topics before the merge is completed.
>
> Thanks
>
> Chris
>
>
>
>> --- Merging r981451 through r1185769 into '.':
>>
>> Summary of conflicts:
>>   Text conflicts: 58
>>   Tree conflicts: 126
>>
>> Most tree conflicts are probably an artifact of subversion. See
>>
>>> svn info lib/xmlgraphics-co

Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-19 Thread Chris Bowditch

On 18/10/2011 19:55, Simon Pepping wrote:

I merged the ComplexScripts branch into trunk. Result:


Hi Simon,

As well of the question of how to do the merge there is also the 
question should we do the merge? Of course this is a valuable feature to 
the community, and Glenn has invested a lot of time in its development 
but is it truely production ready? I asked Vincent to take a look at the 
branch earlier in the year as it's a feature we also need, but he had 
several concerns that have not be adequately answered. Take a look at 
comment #30; https://issues.apache.org/bugzilla/show_bug.cgi?id=49687#c30


I'm not sure why Vincent describes it as a "brief look" because he spent 
several days on it. I also asked Peter to take a look and he had similar 
concerns. 2 or 3 letter variable names are a barrier for any committer 
wanting to maintain this code and I don't think it is a sufficient 
argument to say that a pre-requisite to maintaining this code is to be a 
domain expert. I would hope that any experienced committer with a 
debugger should be able to solve some bugs. Obviously certain problems 
will require domain expertise, but the variables names are a key barrier 
to being able to maintain this code.


I realise my comments might be a little controversial and I don't mean 
any disrespect to Glenn or his work (which is largely excellent), but we 
should at least discuss these topics before the merge is completed.


Thanks

Chris



--- Merging r981451 through r1185769 into '.':

Summary of conflicts:
   Text conflicts: 58
   Tree conflicts: 126

Most tree conflicts are probably an artifact of subversion. See

svn info lib/xmlgraphics-commons-1.5svn.jar|tail -n 4

Tree conflict: local add, incoming add upon merge
   Source  left: (file) 
https://svn.apache.org/repos/asf/xmlgraphics/fop/trunk/lib/xmlgraphics-commons-1.5svn.jar@981450
   Source right: (file) 
https://svn.apache.org/repos/asf/xmlgraphics/fop/branches/Temp_ComplexScripts/lib/xmlgraphics-commons-1.5svn.jar@1185769

This will cause quite some work.

I also merged trunk into ComplexScripts. Result:

--- Merging r1177231 through r1185780 into '.':

Summary of conflicts:
   Text conflicts: 2
   Tree conflicts: 2

I resolved the text conflicts easily. Again the tree conflicts were
not real conflicts.

Both merges should result in the same code: trunk + ComplexScripts.

I did not commit the merge of trunk into ComplexScripts to the
repository. I do not think it would facilitate merging ComplexScripts
into trunk.

Simon

On Sat, Oct 15, 2011 at 06:17:49PM +0800, Glenn Adams wrote:

With this latest patch, I am satisfied that there is sufficient testing and
stability in the CS branch to support its merger into trunk. Therefore, I
request that such a merge be accomplished after applying patch 5 to the CS
branch as described below.






Re: Merge Request - Temp_ComplexScripts into Trunk

2011-10-18 Thread Simon Pepping
I merged the ComplexScripts branch into trunk. Result:

--- Merging r981451 through r1185769 into '.':

Summary of conflicts:
  Text conflicts: 58
  Tree conflicts: 126

Most tree conflicts are probably an artifact of subversion. See
>svn info lib/xmlgraphics-commons-1.5svn.jar|tail -n 4
Tree conflict: local add, incoming add upon merge
  Source  left: (file) 
https://svn.apache.org/repos/asf/xmlgraphics/fop/trunk/lib/xmlgraphics-commons-1.5svn.jar@981450
  Source right: (file) 
https://svn.apache.org/repos/asf/xmlgraphics/fop/branches/Temp_ComplexScripts/lib/xmlgraphics-commons-1.5svn.jar@1185769

This will cause quite some work.

I also merged trunk into ComplexScripts. Result:

--- Merging r1177231 through r1185780 into '.':

Summary of conflicts:
  Text conflicts: 2
  Tree conflicts: 2

I resolved the text conflicts easily. Again the tree conflicts were
not real conflicts.

Both merges should result in the same code: trunk + ComplexScripts.

I did not commit the merge of trunk into ComplexScripts to the
repository. I do not think it would facilitate merging ComplexScripts
into trunk.

Simon

On Sat, Oct 15, 2011 at 06:17:49PM +0800, Glenn Adams wrote:
> With this latest patch, I am satisfied that there is sufficient testing and
> stability in the CS branch to support its merger into trunk. Therefore, I
> request that such a merge be accomplished after applying patch 5 to the CS
> branch as described below.


Merge Request - Temp_ComplexScripts into Trunk

2011-10-15 Thread Glenn Adams
With this latest patch, I am satisfied that there is sufficient testing and
stability in the CS branch to support its merger into trunk. Therefore, I
request that such a merge be accomplished after applying patch 5 to the CS
branch as described below.

This newest patch primarily fills out the a thorough set of test cases for
CS functionality, particularly as related to Arabic and RTL scripts. In
particular, there are now approximately 562,000 test items covering:

   - bidi classes and algorithm - ~216,000 tests
   - opentype advanced typographic tables (gdef/gsub/gpos) - ~6000 tests
   - arabic script word forms - ~340,000 tests

On my MacBook Pro (5 years old), these tests take about 43 seconds (see
junit-complexscripts).

Note that there remains work to be done on CS support, including adding
support for:

   - additional scripts
   - additional output formats

At present, support is provided for:

   - Arabic, Hebrew, and Devanagari Scripts
   - PDF output format

I expect that additional support for other scripts and formats will be added
over time, either by myself, or other members of the community. However, the
absence of support for all complex scripts and all output formats should not
be a deterrent to active use of the support already present. It is now a
good time to broaden the user community of the CS features, and the best way
to do that is to bring it into the trunk at this time.

Regards,
Glenn

On Sat, Oct 15, 2011 at 6:00 PM,  wrote:

> https://issues.apache.org/bugzilla/show_bug.cgi?id=49687
>
> --- Comment #41 from Glenn Adams  2011-10-15 10:00:16
> UTC ---
> This patch is to be applied to revision 1183594 of branch
> Temp_ComplexScripts
> as follows:
>
> cd ${FOP}/branches/Temp_ComplexScripts
> svn update -r 1183594 --force
> svn revert -R .
> svn status # check and remove any unexpected changes prior to patching
> gzcat ${DOWNLOADS}/fop-i18n.arabic-patch-5.diff.gz | patch -p0
> ant resgen-complexscripts clean
> svn add test/java/org/apache/fop/complexscripts/gsub
> svn add test/java/org/apache/fop/complexscripts/gdef
> svn add test/java/org/apache/fop/complexscripts/gpos
> svn add
> test/java/org/apache/fop/complexscripts/arabic/GenerateArabicTestData.java
> svn add test/java/org/apache/fop/complexscripts/arabic/ArabicTestCase.java
> svn add
> test/java/org/apache/fop/complexscripts/arabic/ArabicTestConstants.java
> svn add test/java/org/apache/fop/layoutengine/LayoutEngineTestUtils.java
> svn add test/java/org/apache/fop/util/NumberConverterTestCase.java
> svn add
>
> test/layoutengine/standard-testcases/region-body_column-count_1_writing-mode_rl.xml
> svn add
> test/layoutengine/standard-testcases/page-sequence-force-page-count-odd.xml
> svn add
>
> test/layoutengine/standard-testcases/region-body_column-count_2_writing-mode_rl.xml
> svn add test/resources/complexscripts/arab/data
> svn propdel svn:mime-type
> test/resources/complexscripts/arab/data/arab-001.txt
> svn add src/java/org/apache/fop/util/NumberConverter.java
> svn rm --force
> test/java/org/apache/fop/complexscripts/arabic/ArabicScriptTestSuite.java
> svn commit ...
>
> This patch includes the following:
>
> * bug fixes
> * new layout engine tests for right-to-left writing mode
> * new test utility for using TTX files in testing advanced typographic
> tables
> * new test cases for GDEF/GSUB/GPOS advanced typographic tables
> * new test cases for ~85000 arabic word forms against four fonts
> * new implementation of number formatter for number to string conversion,
> which
>  adds support for arabic, hebrew, thai, and kana scripts
>
> See milestone "Patch 5" at http://skynav.trac.cvsdude.com/fop/report/6 for
> further details.
>
> Regards,
> Glenn
>
> --
> Configure bugmail:
> https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
> --- You are receiving this mail because: ---
> You are the assignee for the bug.
>