Re: Difficulties with Flat XML under source control

2012-06-22 Thread Johannes Sixt
On 20.06.2012 14:48, Thorsten Behrens wrote:
 Johannes Sixt wrote:
 - Measurements change. E.g. (just to pick one case), in
 style:graphic-properties the draw:visible-area-width changes from
 6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?

 Ah; nasty, some rounding problem / internal representation issue -
 possibly again looking at the code we could do better here to make it
 more predictable; possibly using more precision we could do better
 (doubles instead of floats) ?

 Probably. Looking at this again, these changes seem to happen only for
 draw:visible-area-*. Hence, it may also be a matter of conversion
 between screen dimensions (pixels?) and cm/mm/in/etc.

 Hrm, yeah - and we *really* don't want this slow drift - any chance
 you can file a bug with a preferably small sample doc?

Here we go:

https://bugs.freedesktop.org/show_bug.cgi?id=51334

draw:visible-area-width and -height are properties that pertain only to
OLE objects, IIUC.

-- Hannes
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


RE: Difficulties with Flat XML under source control

2012-06-21 Thread Dennis E. Hamilton
I think it is necessary to look at round-trip out-in conversion preservation.

For out-in (which this is, presumably), you want to record a decimal expression 
of the internal value that will convert back to the exact internal value on 
re-input.  (The in-out case is that the input conversion provide whatever 
internal representation that will convert to the read value on re-output.  
Without additional information, it is generally very difficult to have these be 
the same.)

It is also desirable, of course, that any other ODF consumer use the same 
technique so that its in-out conversion satisfies the out-in condition of the 
original source of the decimal expression of the value.  

There are old technical papers on how to have this work.  The name David Matula 
comes to mind.

There might be solutions in the conversions that exist in the basic Java 
classes for float data types.  I think this was addressed in Common Lisp also.  
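
The out-in condition can be illustrated with a minimal Python sketch. The value and the helper names below are made up for illustration and are not LibreOffice code; the point is only that a fixed three-decimal rendering generally does not read back to the same double, while a shortest round-trip rendering (Python's repr) or 17 significant digits does.

```python
def fixed_decimals(value, digits=3):
    """Format with a fixed number of decimals, as in '6.088'."""
    return f"{value:.{digits}f}"


def round_trips(value, text):
    """True if reading `text` back yields exactly the same double."""
    return float(text) == value


# Hypothetical internal value: a width of 6.088 cm held in inches.
internal = 6.088 / 2.54

print(round_trips(internal, fixed_decimals(internal)))  # False: 3 decimals lose bits
print(round_trips(internal, repr(internal)))            # True: shortest round-trip form
print(round_trips(internal, format(internal, ".17g")))  # True: 17 significant digits
```

Seventeen significant digits are always enough for an IEEE double; the shortest round-trip form is usually much shorter and therefore friendlier to diffs.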



Re: Difficulties with Flat XML under source control

2012-06-21 Thread Stephan Bergmann

On 06/20/2012 03:07 PM, Dennis E. Hamilton wrote:

I think it is necessary to look at round-trip out-in conversion preservation.

For out-in (which this is, presumably), you want to record a decimal expression 
of the internal value that will convert back to the exact internal value on 
re-input.  (The in-out case is that the input conversion provide whatever 
internal representation that will convert to the read value on re-output.  
Without additional information, it is generally very difficult to have these be 
the same.)

It is also desirable, of course, that any other ODF consumer use the same 
technique so that its in-out conversion satisfies the out-in condition of the 
original source of the decimal expression of the value.

There are old technical papers on how to have this work.  The name David Matula 
comes to mind.

There might be solutions in the conversions that exist in the basic Java 
classes for float data types.  I think this was addressed in Common Lisp also.


Hasn't there been progress in that field recently?  Wait, yes, 
http://dl.acm.org/citation.cfm?id=1806623 Printing floating-point 
numbers quickly and accurately with integers by Florian Loitsch.


Stephan


Re: Difficulties with Flat XML under source control

2012-06-21 Thread Thorsten Behrens
Stephan Bergmann wrote:
 Hasn't there been progress in that field recently?  Wait, yes,
 http://dl.acm.org/citation.cfm?id=1806623 Printing floating-point
 numbers quickly and accurately with integers by Florian Loitsch.
 
Nice catch - and some code is here: http://code.google.com/p/double-conversion/

Cheers,

-- Thorsten




Re: Difficulties with Flat XML under source control

2012-06-21 Thread Michael Stahl
On 21/06/12 14:07, Stephan Bergmann wrote:
 On 06/20/2012 03:07 PM, Dennis E. Hamilton wrote:
 I think it is necessary to look at round-trip out-in conversion preservation.

 For out-in (which this is, presumably), you want to record a decimal 
 expression of the internal value that will convert back to the exact 
 internal value on re-input.  (The in-out case is that the input conversion 
 provide whatever internal representation that will convert to the read value 
 on re-output.  Without additional information, it is generally very 
 difficult to have these be the same.)

 It is also desirable, of course, that any other ODF consumer use the same 
 technique so that its in-out conversion satisfies the out-in condition of 
 the original source of the decimal expression of the value.

 There are old technical papers on how to have this work.  The name David 
 Matula comes to mind.

 There might be solutions in the conversions that exist in the basic Java 
 classes for float data types.  I think this was addressed in Common Lisp 
 also.
 
 Hasn't there been progress in that field recently?  Wait, yes, 
 http://dl.acm.org/citation.cfm?id=1806623 Printing floating-point 
 numbers quickly and accurately with integers by Florian Loitsch.

i am in awe that it's possible to get a paper on this topic published in
this day and age; one would think this kind of problem would have been
solved 30 years ago, and the developers of popular office suites were
just ignorant of the solutions :)




Re: Difficulties with Flat XML under source control

2012-06-21 Thread Michael Stahl
On 17/06/12 22:10, Johannes Sixt wrote:
 - The text:list xml:id=list533178598 changes. That xml:id does not
 seem to be used anywhere. Can I just remove it? What will I lose?

these are sadly auto-generated, which is a bug in itself; they are used
in ODF itself for continuations, i.e. there can be another list that
continues an existing list by referring to its text:id/xml:id;  then
there is another use in ODF 1.2 where RDF metadata can refer to the
element by its xml:id, but that only works if the xml:id is actually
persistent, i.e. the same value that is imported is then exported again;
making the ids persistent requires extending the Writer core, which is a
bit of work...



Re: Difficulties with Flat XML under source control

2012-06-20 Thread Miklos Vajna
On Tue, Jun 19, 2012 at 07:56:08PM +0200, Johannes Sixt j...@kdbg.org wrote:
  The code to poke at is in:
  
  xmloff/
  and
  sw/source/filter/xml/
 
 Been there, done that. But it's way over my head (and time budget). See
 
 http://thread.gmane.org/gmane.comp.documentfoundation.libreoffice.devel/23528/focus=23543

Still, once you have such a clean script, it would be nice to see what
tricks it does, so we could (step by step) fix LO itself; in the long
term you would then not need such a filter. ;-)


Re: Difficulties with Flat XML under source control

2012-06-20 Thread Thorsten Behrens
Johannes Sixt wrote:
  - Measurements change. E.g. (just to pick one case), in
  style:graphic-properties the draw:visible-area-width changes from
  6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?
  
  Ah; nasty, some rounding problem / internal representation issue -
  possibly again looking at the code we could do better here to make it
  more predictable; possibly using more precision we could do better
  (doubles instead of floats) ?
 
 Probably. Looking at this again, these changes seem to happen only for
 draw:visible-area-*. Hence, it may also be a matter of conversion
 between screen dimensions (pixels?) and cm/mm/in/etc.
 
Hrm, yeah - and we *really* don't want this slow drift - any chance
you can file a bug with a preferably small sample doc?

Thanks,

-- Thorsten




Re: Difficulties with Flat XML under source control

2012-06-20 Thread Thorsten Behrens
Dennis E. Hamilton wrote:
 For out-in (which this is, presumably), you want to record a
 decimal expression of the internal value that will convert back to
 the exact internal value on re-input.  (The in-out case is that
 the input conversion provide whatever internal representation that
 will convert to the read value on re-output.  Without additional
 information, it is generally very difficult to have these be the
 same.)
 
 It is also desirable, of course, that any other ODF consumer use
 the same technique so that its in-out conversion satisfies the
 out-in condition of the original source of the decimal expression
 of the value.  

Hi Dennis,

yes - but in a first approximation, one can probably relax this a
bit (for the use case at hand): only _after_ the first save
operation this needs to hold. Also, most people would probably be
content with this to work for *one* ODF editing application.

 It is also desirable, of course, that any other ODF consumer use
 the same technique so that its in-out conversion satisfies the
 out-in condition of the original source of the decimal expression
 of the value.  
 
Note that there's a difference between spreadsheet values (for which
I think de facto the above holds true - likely everyone stores those
in IEEE doubles), and other content: consumers might employ rather
complex transformations to arrive at internal values, given e.g. a
gradient center coordinate - asking for common behaviour is very
close to asking for a common ODF application model.

Cheers,

-- Thorsten




RE: Difficulties with Flat XML under source control

2012-06-20 Thread Dennis E. Hamilton
It occurs to me that Postscript and PDF have dealt with this for imaging models 
that work consistently.  Here, the in is to a renderer, but the model for 
representation of decimal expressions of fine-sensitivity values seems to have 
been handled (for years).  Those specifications may be some help too.

 - Dennis



Re: Difficulties with Flat XML under source control

2012-06-19 Thread Michael Meeks
Hi Johannes,

On Sun, 2012-06-17 at 22:10 +0200, Johannes Sixt wrote:
 I want to place a software manual under source control. It seems most
 feasible to use a flat XML format, in particular, .fodt.

Yes - that's a good plan :-)

 But I have some difficulties because when LO 3.5.4 opens a .fodt and
 saves it again without making any changes, the resulting file changes
 nevertheless.

Right - this is a regular annoyance ! :-)

 I'm writing a small tool that transforms the XML into a canonical format
 so that only substantial changes remain. The question is: Which
 transformations are allowed?

Oh - so ... why write an external tool to do this, and not just fix it
in LibreOffice ! ? :-)

We'd be -very- interested in some patches that we can apply that will
sort the automatic styles, and generate them with consistent naming in a
sensible order :-)

 (This seems to work so far.)

The style rendering sounds sensible.

 But there are other changes:
 
 - office:meta changes. It's not a problem, I don't care about this.

Some level of sorting here might help too.

 - office:settings changes. I don't know, yet, whether I mind or not.
 
 - The draw:frame draw:z-index=251 attribute changes. Can I just
 replace the z-index with 1 or 2? What will happen?

Odd :-) perhaps when we have smaller changes we can chase these
oddnesses down better.

 - The text:list xml:id=list533178598 changes. That xml:id does not
 seem to be used anywhere. Can I just remove it? What will I lose?

No idea; if it's unused just try removing it and see what happens.

 - Measurements change. E.g. (just to pick one case), in
 style:graphic-properties the draw:visible-area-width changes from
 6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?

Ah; nasty, some rounding problem / internal representation issue -
possibly again looking at the code we could do better here to make it
more predictable; possibly using more precision we could do better
(doubles instead of floats) ?

 Any insights are welcome!

So - the best place to fix this stuff is inside LibreOffice itself :-)
then it is permanently fixed for everyone: you are not the only person
with this pain - soon we'll be using flat odf for our templates and will
suffer the same way :-) 

The code to poke at is in:

xmloff/
and
sw/source/filter/xml/

It's not too hard to build LibreOffice, check out:

http://www.libreoffice.org/developers-2/

Patches are very much more than welcome ! :-)

Thanks !

Michael.

-- 
michael.me...@suse.com  , Pseudo Engineer, itinerant idiot



Re: Difficulties with Flat XML under source control

2012-06-19 Thread Johannes Sixt
Michael,

thanks for your feedback!

On 19.06.2012 10:48, Michael Meeks wrote:
 On Sun, 2012-06-17 at 22:10 +0200, Johannes Sixt wrote:
 I'm writing a small tool that transforms the XML into a canonical format
 so that only substantial changes remain. The question is: Which
 transformations are allowed?
 
   Oh - so ... why write an external tool to do this, and not just fix it
 in LibreOffice ! ? :-)

Because I'm using git, and then it's just a matter of a simple 'clean
filter'. :-)

 - office:meta changes. It's not a problem, I don't care about this.
 
   Some level of sorting here might help too.

Not only that. Most of the stuff is irrelevant (diverse counts, editing
duration, time of last edit). That should just be removed if the
document is placed under source control. Such stuff leads to merge
conflicts almost by definition.
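
A minimal sketch of such a clean step in Python, using only the standard library. Which elements count as volatile is an assumption here (document statistics, editing duration, editing cycles, last-modified date), and the function names are hypothetical.

```python
import sys
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Children of office:meta treated as volatile; this selection is an assumption.
VOLATILE = (
    "meta:document-statistic",
    "meta:editing-duration",
    "meta:editing-cycles",
    "dc:date",
)


def strip_volatile_meta(root):
    """Remove volatile children of office:meta in place; return the count."""
    removed = 0
    for meta in list(root.iter(f"{{{NS['office']}}}meta")):
        for name in VOLATILE:
            for child in meta.findall(name, NS):
                meta.remove(child)
                removed += 1
    return removed


def clean(xml_text):
    """Return the document text with the volatile metadata removed."""
    for prefix, uri in NS.items():
        ET.register_namespace(prefix, uri)  # keep readable prefixes on output
    root = ET.fromstring(xml_text)
    strip_volatile_meta(root)
    return ET.tostring(root, encoding="unicode")


def main():
    """Filter entry point: read the document on stdin, write it to stdout."""
    sys.stdout.write(clean(sys.stdin.read()))
```

To use it as a git clean filter, call main() at the end of the script, register it with something like git config filter.fodt.clean "python3 clean_fodt.py" (the script name is hypothetical), and add the line "*.fodt filter=fodt" to .gitattributes. Note that ElementTree only keeps the prefixes registered above; other namespaces are re-serialized with generated prefixes, so a real filter would register all ODF namespaces it expects.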

(And, BTW, to be able to keep different modifications of the manual in
different branches and *merge* them again is the whole point of this
exercise.)

 - office:settings changes. I don't know, yet, whether I mind or not.

I'll try removing this entire section and hope that LO does something
sensible.

 - The text:list xml:id=list533178598 changes. That xml:id does not
 seem to be used anywhere. Can I just remove it? What will I lose?
 
   No idea; if it's unused just try removing it and see what happens.

The ids are sometimes used in a text:continue-list attribute. Hence,
they can't be stripped out blindly.
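
A minimal sketch of the safer variant: drop the xml:id of a text:list only when no text:continue-list attribute refers to it. The function name is hypothetical, and the sketch deliberately ignores any other kind of reference to an xml:id (for example from metadata), so it is an illustration rather than a complete rule.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
LIST = f"{{{TEXT_NS}}}list"
CONTINUE_LIST = f"{{{TEXT_NS}}}continue-list"


def drop_unreferenced_list_ids(root):
    """Remove xml:id from text:list elements that no list continues.

    Returns the number of ids removed. Only text:continue-list references
    are considered (an assumption of this sketch).
    """
    referenced = {
        el.get(CONTINUE_LIST)
        for el in root.iter(LIST)
        if el.get(CONTINUE_LIST) is not None
    }
    dropped = 0
    for el in root.iter(LIST):
        list_id = el.get(XML_ID)
        if list_id is not None and list_id not in referenced:
            del el.attrib[XML_ID]
            dropped += 1
    return dropped
```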

 - Measurements change. E.g. (just to pick one case), in
 style:graphic-properties the draw:visible-area-width changes from
 6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?
 
   Ah; nasty, some rounding problem / internal representation issue -
 possibly again looking at the code we could do better here to make it
 more predictable; possibly using more precision we could do better
 (doubles instead of floats) ?

Probably. Looking at this again, these changes seem to happen only for
draw:visible-area-*. Hence, it may also be a matter of conversion
between screen dimensions (pixels?) and cm/mm/in/etc.
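
To illustrate how a unit conversion alone can change a written measurement, here is a minimal Python sketch. It is not taken from the LibreOffice code: the integer internal unit (twips, 1/1440 inch) and the choice of rounding function are assumptions made only for the demonstration.

```python
import math

TWIPS_PER_INCH = 1440
CM_PER_INCH = 2.54


def load(text, to_int):
    """Parse e.g. '6.088cm' into integer twips using the given rounding."""
    if not text.endswith("cm"):
        raise ValueError(f"unsupported unit in {text!r}")
    cm = float(text[:-2])
    return to_int(cm / CM_PER_INCH * TWIPS_PER_INCH)


def save(twips):
    """Write integer twips back as centimetres with three decimals."""
    return f"{twips * CM_PER_INCH / TWIPS_PER_INCH:.3f}cm"


def cycle(text, to_int, times):
    """Load and save repeatedly; return the text written in each cycle."""
    history = []
    for _ in range(times):
        text = save(load(text, to_int))
        history.append(text)
    return history


print(cycle("6.088cm", round, 3))       # ['6.087cm', '6.087cm', '6.087cm']
print(cycle("6.088cm", math.floor, 4))  # ['6.087cm', '6.085cm', '6.084cm', '6.084cm']
```

With rounding to nearest, the value changes once and is stable afterwards; with truncation it keeps moving for several cycles. Either behaviour shows up as a spurious diff under source control.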

   So - the best place to fix this stuff is inside LibreOffice itself :-)
 then it is permanently fixed for everyone: you are not the only person
 with this pain - soon we'll be using flat odf for our templates and will
 suffer the same way :-) 
 
   The code to poke at is in:
 
   xmloff/
 and
   sw/source/filter/xml/

Been there, done that. But it's way over my head (and time budget). See

http://thread.gmane.org/gmane.comp.documentfoundation.libreoffice.devel/23528/focus=23543

-- Hannes


Difficulties with Flat XML under source control

2012-06-18 Thread Johannes Sixt
I want to place a software manual under source control. It seems most
feasible to use a flat XML format, in particular, .fodt.

But I have some difficulties because when LO 3.5.4 opens a .fodt and
saves it again without making any changes, the resulting file changes
nevertheless.

I'm writing a small tool that transforms the XML into a canonical format
so that only substantial changes remain. The question is: Which
transformations are allowed?

- I bring the styles under office:automatic-styles into a canonical
order. Do styles in this section only reference styles from the
office:styles section (e.g. via style:parent-style-name), which occurs
earlier in the file?

- I give the automatic style canonical names because due to the
re-ordering they are re-numbered, which leads to a wealth of unwanted
changes in text:span style-name=... attributes.

(This seems to work so far.)
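
The two steps above can be sketched roughly as follows in Python. This is not the actual tool: the sort key, the "auto-<family>-<n>" naming scheme and the function names are assumptions, and the sketch presumes that automatic style names are unique across families and that every reference lives in an attribute whose name ends in "style-name".

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
AUTO_STYLES = f"{{{OFFICE}}}automatic-styles"
STYLE_EL = f"{{{STYLE}}}style"
STYLE_NAME = f"{{{STYLE}}}name"
STYLE_FAMILY = f"{{{STYLE}}}family"


def canonical_key(el, skip=None):
    """Order-independent description of an element, ignoring attribute `skip`."""
    attrs = tuple(sorted((k, v) for k, v in el.attrib.items() if k != skip))
    children = tuple(canonical_key(child) for child in el)
    return (el.tag, attrs, (el.text or "").strip(), children)


def canonicalize_automatic_styles(root):
    """Sort and rename the automatic styles in place; return old -> new names."""
    auto = root.find(AUTO_STYLES)
    if auto is None:
        return {}
    styles = [el for el in auto if el.tag == STYLE_EL]
    others = [el for el in auto if el.tag != STYLE_EL]
    # Sort by family, then by content; the current name plays no role.
    styles.sort(key=lambda el: (el.get(STYLE_FAMILY, ""),
                                canonical_key(el, skip=STYLE_NAME)))
    mapping, counters = {}, {}
    for el in styles:
        family = el.get(STYLE_FAMILY, "")
        counters[family] = counters.get(family, 0) + 1
        mapping[el.get(STYLE_NAME)] = f"auto-{family}-{counters[family]}"
    for el in styles:
        el.set(STYLE_NAME, mapping[el.get(STYLE_NAME)])
    auto[:] = others + styles  # non-style children keep their relative order
    # Rewrite every reference such as text:style-name or draw:style-name.
    for el in root.iter():
        for attr, value in list(el.attrib.items()):
            if attr.endswith("style-name") and value in mapping:
                el.set(attr, mapping[value])
    return mapping
```

Because the new names depend only on the content of each style, two saves that differ merely in the numbering or order of the automatic styles end up identical after this step.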

But there are other changes:

- office:meta changes. It's not a problem, I don't care about this.

- office:settings changes. I don't know, yet, whether I mind or not.

- The draw:frame draw:z-index=251 attribute changes. Can I just
replace the z-index with 1 or 2? What will happen?

- The text:list xml:id=list533178598 changes. That xml:id does not
seem to be used anywhere. Can I just remove it? What will I lose?

- Measurements change. E.g. (just to pick one case), in
style:graphic-properties the draw:visible-area-width changes from
6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?

Any insights are welcome!

Thanks,
-- Hannes