Re: Difficulties with Flat XML under source control
On 20.06.2012 14:48, Thorsten Behrens wrote:
> Johannes Sixt wrote:
>>>> - Measurements change. E.g. (just to pick one case), in style:graphic-properties the draw:visible-area-width changes from 6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?
>>> Ah; nasty, some rounding problem / internal representation issue - possibly again looking at the code we could do better here to make it more predictable; possibly using more precision we could do better (doubles instead of floats)?
>> Probably. Looking at this again, these changes seem to happen only for draw:visible-area-*. Hence, it may also be a matter of conversion between screen dimensions (pixels?) and cm/mm/in/etc.
> Hrm, yeah - and we *really* don't want this slow drift - any chance you can file a bug with a preferably small sample doc?

Here we go: https://bugs.freedesktop.org/show_bug.cgi?id=51334

draw:visible-area-width and -height are properties that pertain only to OLE objects, IIUC.

-- Hannes
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice
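As a purely illustrative sketch of how a conversion between screen units and cm could change a stored measurement: the snippet below sends a length through whole pixels and back, keeping three decimals. The 96 dpi resolution, both rounding steps, and all function names are assumptions for illustration; this is not LibreOffice's actual code path for draw:visible-area-*.

```python
# Hypothetical illustration only: 96 dpi and both rounding steps are assumed,
# not taken from LibreOffice.
CM_PER_INCH = 2.54
DPI = 96  # assumed screen resolution


def cm_to_pixels(cm: float) -> int:
    """Convert centimetres to a whole number of pixels (lossy)."""
    return round(cm / CM_PER_INCH * DPI)


def pixels_to_cm(pixels: int) -> float:
    """Convert pixels back to centimetres, keeping three decimals."""
    return round(pixels * CM_PER_INCH / DPI, 3)


def save_and_reload(cm: float) -> float:
    """One simulated save/load cycle through the pixel representation."""
    return pixels_to_cm(cm_to_pixels(cm))


first = save_and_reload(6.088)    # value after the first cycle
second = save_and_reload(first)   # value after a second cycle
print(first, second)
```

In this toy model the written value differs from the one that was read, even though nothing was edited in between.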
RE: Difficulties with Flat XML under source control
I think it is necessary to look at round-trip out-in conversion preservation.

For out-in (which this is, presumably), you want to record a decimal expression of the internal value that will convert back to the exact internal value on re-input. (The in-out case is that the input conversion provides whatever internal representation will convert to the read value on re-output. Without additional information, it is generally very difficult to have these be the same.)

It is also desirable, of course, that any other ODF consumer use the same technique so that its in-out conversion satisfies the out-in condition of the original source of the decimal expression of the value.

There are old technical papers on how to have this work. The name David Matula comes to mind. There might be solutions in the conversions that exist in the basic Java classes for float data types. I think this was addressed in Common Lisp also.

-----Original Message-----
From: libreoffice-bounces+dennis.hamilton=acm@lists.freedesktop.org [mailto:libreoffice-bounces+dennis.hamilton=acm@lists.freedesktop.org] On Behalf Of Thorsten Behrens
Sent: Wednesday, June 20, 2012 05:49
To: Johannes Sixt
Cc: libreoffice-dev
Subject: Re: Difficulties with Flat XML under source control

Johannes Sixt wrote:
>>> - Measurements change. E.g. (just to pick one case), in style:graphic-properties the draw:visible-area-width changes from 6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?
>> Ah; nasty, some rounding problem / internal representation issue - possibly again looking at the code we could do better here to make it more predictable; possibly using more precision we could do better (doubles instead of floats)?
> Probably. Looking at this again, these changes seem to happen only for draw:visible-area-*. Hence, it may also be a matter of conversion between screen dimensions (pixels?) and cm/mm/in/etc.

Hrm, yeah - and we *really* don't want this slow drift - any chance you can file a bug with a preferably small sample doc?

Thanks,
-- Thorsten
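The out-in condition described above can be illustrated with Python, whose `repr()` for floats has, since Python 3.1, produced the shortest decimal string that converts back to the same double. This is only a sketch of the property, not of how LibreOffice writes lengths; the three-decimal format is an assumed stand-in for a fixed-precision writer, and `out_in_exact` is a hypothetical helper name.

```python
def out_in_exact(value: float, text: str) -> bool:
    """True if reading `text` back yields exactly the internal `value`."""
    return float(text) == value


# Some internal double, here a length of 6.088 cm expressed in inches.
internal = 6.088 / 2.54

shortest = repr(internal)    # shortest decimal string that round-trips
fixed = f"{internal:.3f}"    # fixed three decimals (assumed writer)

print(shortest, out_in_exact(internal, shortest))
print(fixed, out_in_exact(internal, fixed))
```

The first form satisfies the out-in condition by construction; the second loses the exact internal value because three decimals cannot represent it.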
Re: Difficulties with Flat XML under source control
On 06/20/2012 03:07 PM, Dennis E. Hamilton wrote:
> I think it is necessary to look at round-trip out-in conversion preservation.
>
> For out-in (which this is, presumably), you want to record a decimal expression of the internal value that will convert back to the exact internal value on re-input. (The in-out case is that the input conversion provides whatever internal representation will convert to the read value on re-output. Without additional information, it is generally very difficult to have these be the same.)
>
> It is also desirable, of course, that any other ODF consumer use the same technique so that its in-out conversion satisfies the out-in condition of the original source of the decimal expression of the value.
>
> There are old technical papers on how to have this work. The name David Matula comes to mind. There might be solutions in the conversions that exist in the basic Java classes for float data types. I think this was addressed in Common Lisp also.

Hasn't there been progress in that field recently? Wait, yes, http://dl.acm.org/citation.cfm?id=1806623 "Printing floating-point numbers quickly and accurately with integers" by Florian Loitsch.

Stephan
Re: Difficulties with Flat XML under source control
Stephan Bergmann wrote:
> Hasn't there been progress in that field recently? Wait, yes, http://dl.acm.org/citation.cfm?id=1806623 "Printing floating-point numbers quickly and accurately with integers" by Florian Loitsch.

Nice catch - and some code is here: http://code.google.com/p/double-conversion/

Cheers,
-- Thorsten
Re: Difficulties with Flat XML under source control
On 21/06/12 14:07, Stephan Bergmann wrote:
> On 06/20/2012 03:07 PM, Dennis E. Hamilton wrote:
>> I think it is necessary to look at round-trip out-in conversion preservation.
>>
>> For out-in (which this is, presumably), you want to record a decimal expression of the internal value that will convert back to the exact internal value on re-input. (The in-out case is that the input conversion provides whatever internal representation will convert to the read value on re-output. Without additional information, it is generally very difficult to have these be the same.)
>>
>> It is also desirable, of course, that any other ODF consumer use the same technique so that its in-out conversion satisfies the out-in condition of the original source of the decimal expression of the value.
>>
>> There are old technical papers on how to have this work. The name David Matula comes to mind. There might be solutions in the conversions that exist in the basic Java classes for float data types. I think this was addressed in Common Lisp also.
>
> Hasn't there been progress in that field recently? Wait, yes, http://dl.acm.org/citation.cfm?id=1806623 "Printing floating-point numbers quickly and accurately with integers" by Florian Loitsch.

i am in awe that it's possible to get a paper on this topic published in this day and age; one would think this kind of problem would have been solved 30 years ago, and the developers of popular office suites were just ignorant of the solutions :)
Re: Difficulties with Flat XML under source control
On 17/06/12 22:10, Johannes Sixt wrote:
> - The text:list xml:id=list533178598 changes. That xml:id does not seem to be used anywhere. Can I just remove it? What will I lose?

these are sadly auto-generated, which is a bug in itself.

they are used in ODF itself for continuations, i.e. there can be another list that continues an existing list by referring to its text:id/xml:id.

then there is another use in ODF 1.2 where RDF metadata can refer to the element by its xml:id, but that only works if the xml:id is actually persistent, i.e. the same value that is imported is then exported again; making the ids persistent requires extending the Writer core, which is a bit of work...
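A minimal sketch of what a cautious external tool could do with these ids, assuming the only in-document references are text:continue-list attributes on text:list elements: remove an xml:id only when no list continues it. The function name and sample document are hypothetical, and RDF metadata references (the ODF 1.2 use mentioned above) are not checked here.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
LIST_TAG = f"{{{TEXT_NS}}}list"
CONTINUE_LIST = f"{{{TEXT_NS}}}continue-list"


def strip_unreferenced_list_ids(root):
    """Remove xml:id from text:list elements that no list continues.

    Returns the number of ids removed. Assumption: references only
    come from text:continue-list; RDF metadata is ignored.
    """
    referenced = {
        el.get(CONTINUE_LIST)
        for el in root.iter(LIST_TAG)
        if el.get(CONTINUE_LIST) is not None
    }
    removed = 0
    for el in root.iter(LIST_TAG):
        list_id = el.get(XML_ID)
        if list_id is not None and list_id not in referenced:
            del el.attrib[XML_ID]
            removed += 1
    return removed


# Hypothetical sample: the second list continues the first one.
SAMPLE = (
    f'<root xmlns:text="{TEXT_NS}">'
    '<text:list xml:id="list1"/>'
    '<text:list xml:id="list2" text:continue-list="list1"/>'
    "</root>"
)

root = ET.fromstring(SAMPLE)
removed_count = strip_unreferenced_list_ids(root)
lists = list(root.iter(LIST_TAG))
print(removed_count, [el.get(XML_ID) for el in lists])
```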
Re: Difficulties with Flat XML under source control
On Tue, Jun 19, 2012 at 07:56:08PM +0200, Johannes Sixt j...@kdbg.org wrote:
>> The code to poke at is in: xmloff/ and sw/source/filter/xml/
> Been there, done that. But it's way over my head (and time budget). See http://thread.gmane.org/gmane.comp.documentfoundation.libreoffice.devel/23528/focus=23543

Still, once you have such a clean script it would be nice to see what tricks it does, so we could (step by step) fix LO itself; in the long term you would then not need such a filter. ;-)
Re: Difficulties with Flat XML under source control
Johannes Sixt wrote:
>>> - Measurements change. E.g. (just to pick one case), in style:graphic-properties the draw:visible-area-width changes from 6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?
>> Ah; nasty, some rounding problem / internal representation issue - possibly again looking at the code we could do better here to make it more predictable; possibly using more precision we could do better (doubles instead of floats)?
> Probably. Looking at this again, these changes seem to happen only for draw:visible-area-*. Hence, it may also be a matter of conversion between screen dimensions (pixels?) and cm/mm/in/etc.

Hrm, yeah - and we *really* don't want this slow drift - any chance you can file a bug with a preferably small sample doc?

Thanks,
-- Thorsten
Re: Difficulties with Flat XML under source control
Dennis E. Hamilton wrote:
> For out-in (which this is, presumably), you want to record a decimal expression of the internal value that will convert back to the exact internal value on re-input. (The in-out case is that the input conversion provides whatever internal representation will convert to the read value on re-output. Without additional information, it is generally very difficult to have these be the same.)
>
> It is also desirable, of course, that any other ODF consumer use the same technique so that its in-out conversion satisfies the out-in condition of the original source of the decimal expression of the value.

Hi Dennis,

yes - but in a first approximation, one can probably relax this a bit (for the use case at hand): this only needs to hold _after_ the first save operation. Also, most people would probably be content if this works for *one* ODF editing application.

> It is also desirable, of course, that any other ODF consumer use the same technique so that its in-out conversion satisfies the out-in condition of the original source of the decimal expression of the value.

Note that there's a difference between spreadsheet values (for which I think de facto the above holds true - likely everyone stores those in IEEE doubles), and other content: consumers might employ rather complex transformations to arrive at internal values, given e.g. a gradient center coordinate - asking for common behaviour is very close to asking for a common ODF application model.

Cheers,
-- Thorsten
RE: Difficulties with Flat XML under source control
It occurs to me that Postscript and PDF have dealt with this for imaging models that work consistently. Here, the "in" is to a renderer, but the model for representation of decimal expressions of find-sensitivity values seems to have been handled (for years). Those specifications may be some help too.

- Dennis

-----Original Message-----
From: Thorsten [mailto:netsr...@googlemail.com] On Behalf Of Thorsten Behrens
Sent: Wednesday, June 20, 2012 06:32
To: Dennis E. Hamilton
Cc: 'libreoffice-dev'
Subject: Re: Difficulties with Flat XML under source control

Dennis E. Hamilton wrote:
> For out-in (which this is, presumably), you want to record a decimal expression of the internal value that will convert back to the exact internal value on re-input. (The in-out case is that the input conversion provides whatever internal representation will convert to the read value on re-output. Without additional information, it is generally very difficult to have these be the same.)
>
> It is also desirable, of course, that any other ODF consumer use the same technique so that its in-out conversion satisfies the out-in condition of the original source of the decimal expression of the value.

Hi Dennis,

yes - but in a first approximation, one can probably relax this a bit (for the use case at hand): this only needs to hold _after_ the first save operation. Also, most people would probably be content if this works for *one* ODF editing application.

> It is also desirable, of course, that any other ODF consumer use the same technique so that its in-out conversion satisfies the out-in condition of the original source of the decimal expression of the value.

Note that there's a difference between spreadsheet values (for which I think de facto the above holds true - likely everyone stores those in IEEE doubles), and other content: consumers might employ rather complex transformations to arrive at internal values, given e.g. a gradient center coordinate - asking for common behaviour is very close to asking for a common ODF application model.

Cheers,
-- Thorsten
Re: Difficulties with Flat XML under source control
Hi Johannes,

On Sun, 2012-06-17 at 22:10 +0200, Johannes Sixt wrote:
> I want to place a software manual under source control. It seems most feasible to use a flat XML format, in particular, .fodt.

Yes - that's a good plan :-)

> But I have some difficulties because when LO 3.5.4 opens a .fodt and saves it again without making any changes, the resulting file changes nevertheless.

Right - this is a regular annoyance! :-)

> I'm writing a small tool that transforms the XML into a canonical format so that only substantial changes remain. The question is: Which transformations are allowed?

Oh - so ... why write an external tool to do this, and not just fix it in LibreOffice!? :-) We'd be -very- interested in some patches that we can apply that will sort the automatic styles, and generate them with consistent naming in a sensible order :-)

> (This seems to work so far.)

The style rendering sounds sensible.

> But there are other changes:
> - office:meta changes. It's not a problem, I don't care about this.

Some level of sorting here might help too.

> - office:settings changes. I don't know, yet, whether I mind or not.
> - The draw:frame draw:z-index=251 attribute changes. Can I just replace the z-index with 1 or 2? What will happen?

Odd :-) perhaps when we have smaller changes we can chase these oddnesses down better.

> - The text:list xml:id=list533178598 changes. That xml:id does not seem to be used anywhere. Can I just remove it? What will I lose?

No idea; if it's unused just try removing it and see what happens.

> - Measurements change. E.g. (just to pick one case), in style:graphic-properties the draw:visible-area-width changes from 6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?

Ah; nasty, some rounding problem / internal representation issue - possibly again looking at the code we could do better here to make it more predictable; possibly using more precision we could do better (doubles instead of floats)?

> Any insights are welcome!

So - the best place to fix this stuff is inside LibreOffice itself :-) then it is permanently fixed for everyone: you are not the only person with this pain - soon we'll be using flat odf for our templates and will suffer the same way :-)

The code to poke at is in:

xmloff/ and sw/source/filter/xml/

It's not too hard to build libreoffice, check out: http://www.libreoffice.org/developers-2/

Patches are very much more than welcome! :-)

Thanks!

Michael.

-- 
michael.me...@suse.com, Pseudo Engineer, itinerant idiot
Re: Difficulties with Flat XML under source control
Michael, thanks for your feedback!

On 19.06.2012 10:48, Michael Meeks wrote:
> On Sun, 2012-06-17 at 22:10 +0200, Johannes Sixt wrote:
>> I'm writing a small tool that transforms the XML into a canonical format so that only substantial changes remain. The question is: Which transformations are allowed?
> Oh - so ... why write an external tool to do this, and not just fix it in LibreOffice!? :-)

Because I'm using git, and then it's just a matter of a simple 'clean filter'. :-)

>> - office:meta changes. It's not a problem, I don't care about this.
> Some level of sorting here might help too.

Not only that. Most of the stuff is irrelevant (various counts, editing duration, time of last edit). That should just be removed if the document is placed under source control. Such stuff leads to merge conflicts almost by definition. (And, BTW, to be able to keep different modifications of the manual in different branches and *merge* them again is the whole point of this exercise.)

>> - office:settings changes. I don't know, yet, whether I mind or not.

I'll try removing this entire section and hope that LO does something sensible.

>> - The text:list xml:id=list533178598 changes. That xml:id does not seem to be used anywhere. Can I just remove it? What will I lose?
> No idea; if it's unused just try removing it and see what happens.

The ids are sometimes used in a text:continue-list attribute. Hence, they can't be stripped out blindly.

>> - Measurements change. E.g. (just to pick one case), in style:graphic-properties the draw:visible-area-width changes from 6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?
> Ah; nasty, some rounding problem / internal representation issue - possibly again looking at the code we could do better here to make it more predictable; possibly using more precision we could do better (doubles instead of floats)?

Probably. Looking at this again, these changes seem to happen only for draw:visible-area-*. Hence, it may also be a matter of conversion between screen dimensions (pixels?) and cm/mm/in/etc.

> So - the best place to fix this stuff is inside LibreOffice itself :-) then it is permanently fixed for everyone: you are not the only person with this pain - soon we'll be using flat odf for our templates and will suffer the same way :-)
>
> The code to poke at is in: xmloff/ and sw/source/filter/xml/

Been there, done that. But it's way over my head (and time budget). See http://thread.gmane.org/gmane.comp.documentfoundation.libreoffice.devel/23528/focus=23543

-- Hannes
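For context, a git clean filter is a command configured as `filter.<name>.clean` and attached to paths in `.gitattributes`; git passes it the working-tree file on stdin and stores what it writes to stdout. The sketch below shows only the office:meta part of such a filter. The choice of elements treated as volatile is an assumption matching "counts, editing duration, time of last edit", and `drop_volatile_meta` and the sample document are hypothetical, not the actual tool discussed here.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
META = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
DC = "http://purl.org/dc/elements/1.1/"

# Assumed set of metadata elements that change on every save.
VOLATILE = {
    f"{{{META}}}editing-duration",
    f"{{{META}}}editing-cycles",
    f"{{{META}}}document-statistic",
    f"{{{DC}}}date",
}


def drop_volatile_meta(root):
    """Remove volatile children of every office:meta element in place."""
    for meta in root.iter(f"{{{OFFICE}}}meta"):
        for child in list(meta):  # copy: we modify meta while looping
            if child.tag in VOLATILE:
                meta.remove(child)
    return root


# Hypothetical, heavily shortened stand-in for a .fodt file.
SAMPLE = (
    f'<office:document xmlns:office="{OFFICE}" xmlns:meta="{META}" '
    f'xmlns:dc="{DC}"><office:meta>'
    "<dc:title>Manual</dc:title>"
    "<meta:editing-duration>PT1H2M</meta:editing-duration>"
    "<dc:date>2012-06-19T10:48:00</dc:date>"
    '<meta:document-statistic meta:page-count="12"/>'
    "</office:meta></office:document>"
)

root = drop_volatile_meta(ET.fromstring(SAMPLE))
remaining = [child.tag for child in root.find(f"{{{OFFICE}}}meta")]
print(remaining)
```

A real filter would read the document from stdin and write the result to stdout. Note that ElementTree rewrites namespace prefixes on output unless they are registered with `ET.register_namespace`, so the prefixes would need to be registered to keep diffs small.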
Difficulties with Flat XML under source control
I want to place a software manual under source control. It seems most feasible to use a flat XML format, in particular, .fodt. But I have some difficulties because when LO 3.5.4 opens a .fodt and saves it again without making any changes, the resulting file changes nevertheless.

I'm writing a small tool that transforms the XML into a canonical format so that only substantial changes remain. The question is: Which transformations are allowed?

- I bring the styles under office:automatic-styles into a canonical order. Do styles in this section only reference styles from the office:styles section (e.g. via style:parent-style-name), which occurs earlier in the file?

- I give the automatic styles canonical names because due to the re-ordering they are re-numbered, which leads to a wealth of unwanted changes in text:span style-name=... attributes.

(This seems to work so far.) But there are other changes:

- office:meta changes. It's not a problem, I don't care about this.

- office:settings changes. I don't know, yet, whether I mind or not.

- The draw:frame draw:z-index=251 attribute changes. Can I just replace the z-index with 1 or 2? What will happen?

- The text:list xml:id=list533178598 changes. That xml:id does not seem to be used anywhere. Can I just remove it? What will I lose?

- Measurements change. E.g. (just to pick one case), in style:graphic-properties the draw:visible-area-width changes from 6.088cm to 6.089cm. Is there a remedy to avoid changes of this kind?

Any insights are welcome!

Thanks,
-- Hannes
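One possible way to implement the first two transformations, sketched under stated assumptions: each automatic style gets a name derived from a hash of its content, the styles are sorted by that name, and text:style-name references are rewritten. The `auto_` naming scheme, the helper names, and the test documents are hypothetical; only text:style-name references are handled, and style families sharing one name are not distinguished. This is not the tool described in the message.

```python
import hashlib
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
FO = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
STYLE_TAG = f"{{{STYLE}}}style"
STYLE_NAME = f"{{{STYLE}}}name"
TEXT_STYLE_REF = f"{{{TEXT}}}style-name"


def fingerprint(el):
    """Describe a style independent of attribute order, ignoring its name."""
    attrs = tuple(sorted((k, v) for k, v in el.attrib.items() if k != STYLE_NAME))
    return (el.tag, attrs, tuple(fingerprint(child) for child in el))


def canonicalize_automatic_styles(root):
    """Rename and sort automatic styles in place; return old -> new names."""
    auto = root.find(f"{{{OFFICE}}}automatic-styles")
    if auto is None:
        return {}
    renames, seen = {}, set()
    for st in auto.findall(STYLE_TAG):
        old_name = st.get(STYLE_NAME)
        if old_name is None:
            continue
        digest = hashlib.sha1(repr(fingerprint(st)).encode("utf-8")).hexdigest()[:8]
        new_name = f"auto_{digest}"  # hypothetical naming scheme
        renames[old_name] = new_name
        if new_name in seen:
            auto.remove(st)  # identical content: keep a single definition
        else:
            seen.add(new_name)
            st.set(STYLE_NAME, new_name)
    auto[:] = sorted(auto, key=lambda el: el.get(STYLE_NAME) or "")
    for el in root.iter():
        ref = el.get(TEXT_STYLE_REF)
        if ref is not None and ref in renames:
            el.set(TEXT_STYLE_REF, renames[ref])
    return renames


NS = (f'xmlns:office="{OFFICE}" xmlns:style="{STYLE}" '
      f'xmlns:text="{TEXT}" xmlns:fo="{FO}"')


def make_doc(styles, spans):
    """Build a tiny test document from (name, property) pairs and span refs."""
    style_xml = "".join(
        f'<style:style style:name="{name}" style:family="text">'
        f"<style:text-properties {prop}/></style:style>"
        for name, prop in styles)
    span_xml = "".join(
        f'<text:span text:style-name="{name}">x</text:span>' for name in spans)
    return ET.fromstring(
        f"<office:document {NS}><office:automatic-styles>{style_xml}"
        f"</office:automatic-styles><office:body>{span_xml}</office:body>"
        "</office:document>")


BOLD, ITALIC = 'fo:font-weight="bold"', 'fo:font-style="italic"'
# Same content, but styles numbered and ordered differently.
doc_a = make_doc([("T1", BOLD), ("T2", ITALIC)], ["T1", "T2"])
doc_b = make_doc([("T7", ITALIC), ("T3", BOLD)], ["T3", "T7"])
for doc in (doc_a, doc_b):
    canonicalize_automatic_styles(doc)
print(ET.tostring(doc_a) == ET.tostring(doc_b))
```

Content-derived names have the side effect that adding or removing one style does not rename the others, which sequential renumbering would do.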