Re: DocFormats - Open source OOXML implementation
Peter Kelly wrote:
> On 16 Aug 2014, at 5:26 am, Andrea Pescetti wrote:
>> Does this mean that
>> $ dfutil/dfutil filename.docx filename.html
>> $ dfutil/dfutil filename.html filename2.docx
>> should produce a "filename2.docx" that is quite similar to "filename.docx"?
>> It is failing rather badly (invalid OOXML output in the second conversion,
>> ZIP container clearly missing files and possible breaking order) in a simple
>> test I did with a 1-page docx file.
>
> I'm not surprised this is the first issue to come up :$ There's a *lot* of knowledge I need to document for others; questions from you and others are the best way to motivate me to get that written ;)

I've also been fixing (or breaking, who knows!) some documentation on my clone (my "fork" as Github likes to call it) but I'll submit a pull request only when basic things work.

> Since the filename.html you generated does, it tries to map these to elements in the docx file, failing badly.

OK, but the following fails equally badly (producing an invalid OOXML file, even though this time it looks more consistent in size and internal content with filename.docx):

$ dfutil/dfutil filename.docx filename.html
Created filename.html
$ dfutil/dfutil filename.html filename.docx

What is the best channel to report this issue and the 38 tests that are failing in my setup (provided they are all expected to pass)?

> - Include a hash of the .docx file (or relevant parts of it) in the HTML file, e.g. as a meta element or as part of the prefix on all id attributes

Seems a good idea. Perhaps having it as a meta element will be enough, unless it makes sense for some reason to link each attribute to a specific .docx file. Still, this won't solve the problem above.

Regards,
Andrea.

-
To unsubscribe, e-mail: dev-unsubscr...@openoffice.apache.org
For additional commands, e-mail: dev-h...@openoffice.apache.org
Re: DocFormats - Open source OOXML implementation
On 16 Aug 2014, at 5:26 am, Andrea Pescetti wrote:
> Does this mean that
> $ dfutil/dfutil filename.docx filename.html
> $ dfutil/dfutil filename.html filename2.docx
> should produce a "filename2.docx" that is quite similar to "filename.docx"?
> It is failing rather badly (invalid OOXML output in the second conversion,
> ZIP container clearly missing files and possible breaking order) in a simple
> test I did with a 1-page docx file.
>
> What is the best channel to report issues?

Currently just email to me (or here on the list), but we should ideally get a dedicated mailing list/bug tracking system set up for it soon.

--
Dr. Peter M. Kelly
Founder, UX Productivity
pe...@uxproductivity.com
http://www.uxproductivity.com/
http://www.kellypmk.net/
PGP key: http://www.kellypmk.net/pgp-key
(fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
Re: DocFormats - Open source OOXML implementation
On 16 Aug 2014, at 5:26 am, Andrea Pescetti wrote:
> On 15/08/2014 Peter Kelly wrote:
>> Those of you interested in OOXML may want to have a look at my own
>> implementation of (a subset of) the spec, which is part of a library
>> I've just made available as open source (license is ASLv2):
>> https://github.com/uxproductivity/DocFormats
>
> It's very interesting. I hope that in future it may become relevant to
> OpenOffice or to Apache at large.
>
>> The design is based on bidirectional transformation, as a way of
>> achieving non-destructive editing of foreign file formats. This permits
>> incremental implementation of a given spec without risking data loss due
>> to incomplete features, since unsupported features of a given file
>> format are left untouched on save.
>
> Does this mean that
> $ dfutil/dfutil filename.docx filename.html
> $ dfutil/dfutil filename.html filename2.docx
> should produce a "filename2.docx" that is quite similar to "filename.docx"?
> It is failing rather badly (invalid OOXML output in the second conversion,
> ZIP container clearly missing files and possible breaking order) in a simple
> test I did with a 1-page docx file.

I'm not surprised this is the first issue to come up :$ There's a *lot* of knowledge I need to document for others; questions from you and others are the best way to motivate me to get that written ;)

What's happening here is that when filename.html is produced in the first step, each of its elements is given an id attribute containing a numeric identifier that refers to a specific element in the source docx file (specifically, the word/document.xml file within the package). These numeric identifiers are generated during parsing, and correspond to the position of the element in document order (so 1, 2, 3, etc.). When you convert from HTML to .docx, it uses the id attributes to re-establish these relationships, so that it knows which elements in the HTML file correspond to which elements in the .docx file.
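To illustrate the numbering described above, here is a minimal Python sketch. It is not DocFormats code (the library itself is not written in Python); the helper name assign_ids and the toy XML are assumptions made purely for illustration. It numbers elements by their position in document order and shows that the number given to an element depends on everything that precedes it.

```python
import xml.etree.ElementTree as ET


def assign_ids(root):
    """Number every element by its position in document order (1, 2, 3, ...).

    Hypothetical helper for illustration; not part of DocFormats.
    """
    return {elem: seq for seq, elem in enumerate(root.iter(), start=1)}


# Toy stand-in for word/document.xml.
document = ET.fromstring("<body><p>one</p><p>two</p></body>")
second_para = document[1]

ids_before = assign_ids(document)

# Simulate an update that inserts a new paragraph ahead of the second one.
document.insert(1, ET.Element("p"))
ids_after = assign_ids(document)

# The same element now has a different sequence number, so an id recorded
# against the old numbering would point at the wrong element.
print(ids_before[second_para], ids_after[second_para])
```

Any id recorded in the HTML is therefore only meaningful against the exact document it was generated from.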
The problem you encountered stems from the fact that this mapping is only valid in specific circumstances - that is, when the .docx file being updated is exactly the same as the original from which the HTML was generated. If this is not the case, then the identifier assigned to a given node will be different whenever other nodes have been inserted before it. So for example if you do the following:

    dfutil filename.docx filename.html
    # Modify filename.html
    dfutil filename.html filename.docx
    dfutil filename.html filename.docx

Then the third run will fail, because in the second the docx file will have been updated based on the changes in the HTML, changing the sequence numbers assigned to each node, so that on the third run the mapping will no longer be valid.

The conversion works on the assumption that the docx file is the same as the original. The way that UX Write uses the library, it ensures this is the case, but the library does not check for this (and yes, it should; more on this below).

Your case is similar, though in this case you're creating a new docx file, not updating an existing one. However what it actually does in this case is to create an empty .docx file, and then "update" that based on the HTML. In doing so, it assumes that the HTML does not contain any mappings (that is, id attributes with the prefix "word"). Since the filename.html you generated does, it tries to map these to elements in the docx file, failing badly.

The only workaround for this at present is to manually edit the HTML file and remove all of these id attributes. The quickest way to do this is with the following command:

    sed -i '' -E 's/ id="word[0-9]+"//g' filename.html

Then, when you run dfutil, it will see that there is no mapping for any of the elements in the HTML file, and thus avoid the problems in the output you observed.
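For anyone without a suitable sed at hand, the same workaround can be sketched in Python. This is only an illustration of the substitution shown in the sed command above; the function name strip_mapping_ids is made up for this example, and it assumes the mapping ids have exactly the form id="word<digits>" that the sed pattern matches.

```python
import re

# Same pattern as the sed expression: a space, then id="word<digits>".
MAPPING_ID = re.compile(r' id="word[0-9]+"')


def strip_mapping_ids(html: str) -> str:
    """Remove every mapping id attribute; other id attributes are left alone."""
    return MAPPING_ID.sub("", html)


sample = (
    '<p id="word12">Hello <span id="word13">world</span></p>'
    '<div id="toc">contents</div>'
)
cleaned = strip_mapping_ids(sample)
print(cleaned)
```

Note that re.sub replaces every occurrence on a line, so several mapped elements on the same line are all handled.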
Now, onto the fix: The library needs to have some way of checking that the HTML file being used as part of an update operation has a mapping (id attributes) that matches the docx file being updated (in the case of creating a new file, this is just an empty docx file). In the event that this is not the case, it could still do the update, but would act as if the entire document had been replaced with a completely new one.

The solution I'll likely implement (and this should really be my first task, given the potential for problems like the above) is this:

- Include a hash of the .docx file (or relevant parts of it) in the HTML file, e.g. as a meta element or as part of the prefix on all id attributes
- On update, re-compute the hash of the .docx file and compare it against the one stored in the HTML file (if any), and if there's no match, treat the HTML file as a complete replacement of all content

>
> What is the best channel to report issues?
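The two steps proposed above could look roughly like the following Python sketch. Everything specific in it is an assumption for illustration: the meta element name "docx-hash", the choice of SHA-256, hashing only the bytes of word/document.xml, and the function names. The message does not say which of these the library will actually use.

```python
import hashlib
import re

# Hypothetical meta element name; the real library may choose something else.
HASH_META = re.compile(r'<meta name="docx-hash" content="([0-9a-f]+)">')


def docx_hash(document_xml: bytes) -> str:
    """Hash the relevant part of the package (here: word/document.xml)."""
    return hashlib.sha256(document_xml).hexdigest()


def embed_hash(html: str, document_xml: bytes) -> str:
    """Step 1: record the hash in the generated HTML as a meta element."""
    tag = f'<meta name="docx-hash" content="{docx_hash(document_xml)}">'
    return html.replace("<head>", "<head>" + tag, 1)


def mapping_is_valid(html: str, document_xml: bytes) -> bool:
    """Step 2: on update, trust the id mapping only if the hashes match.

    A missing hash or a mismatch means the HTML should be treated as a
    complete replacement of the document content.
    """
    found = HASH_META.search(html)
    return found is not None and found.group(1) == docx_hash(document_xml)


original = b"<w:document><w:p/></w:document>"
html = embed_hash("<html><head></head><body></body></html>", original)

print(mapping_is_valid(html, original))
print(mapping_is_valid(html, b"<w:document><w:p/><w:p/></w:document>"))
```

Storing the hash once in a meta element keeps the check cheap; putting it in the prefix of every id attribute, the other option mentioned, would instead tie each individual attribute to a specific .docx file.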
How to add generate symbols for gdb?
Hi,

I tried to configure with --enable-symbols to build with debugging symbols. Here is my command line:

./configure \
  --with-dmake-url=http://dmake.apache-extras.org.codespot.com/files/dmake-4.12.tar.bz2 \
  --with-epm-url=http://www.msweet.org/files/project2/epm-3.7-source.tar.gz \
  --disable-odk \
  --disable-binfilter \
  --with-lang="en-US zh-TW" \
  --enable-symbols

Then in the module I wanted to debug, I built with

build debug=t dbglevel=0 && deliver

However gdb still told me that there were no debugging symbols found. Do I have to build everything with debug=t in order to get symbol tables? Which steps are necessary and which are not? Please suggest.

Thanks.
Best Regards
Re: [RELEASE]: propose RC2 on revision 1616946
On 12/08/2014 Louis Suárez-Potts wrote:
> On 2014-08-12, at 08:33, Jürgen Schmidt wrote:
>> it could but confluence is a piece of shit and I have to save so many times until it is really saved correct that I personally don't have interest to edit it for now.
>
> Do you know if a bug report has been filed against this with Atlassian?

They know. They even have blog posts out: http://blogs.atlassian.com/2011/11/why-we-removed-wiki-markup-editor-in-confluence-4/ ; fact is, imposing a WYSIWYG editor is fine in many cases, but not if one has to prepare a huge table built in large part by a script.

Regards,
Andrea.
Re: DocFormats - Open source OOXML implementation
On 15/08/2014 Peter Kelly wrote:
> Those of you interested in OOXML may want to have a look at my own implementation of (a subset of) the spec, which is part of a library I've just made available as open source (license is ASLv2): https://github.com/uxproductivity/DocFormats

It's very interesting. I hope that in future it may become relevant to OpenOffice or to Apache at large.

> The design is based on bidirectional transformation, as a way of achieving non-destructive editing of foreign file formats. This permits incremental implementation of a given spec without risking data loss due to incomplete features, since unsupported features of a given file format are left untouched on save.

Does this mean that

$ dfutil/dfutil filename.docx filename.html
$ dfutil/dfutil filename.html filename2.docx

should produce a "filename2.docx" that is quite similar to "filename.docx"? It is failing rather badly (invalid OOXML output in the second conversion, ZIP container clearly missing files and possible breaking order) in a simple test I did with a 1-page docx file.

What is the best channel to report issues?

> I'll be presenting on this at ApacheCon EU this November - see the talk "Addressing File Format Compatibility in Word Processors" at http://apacheconeu2014.sched.org

Looking forward to seeing it live!

Regards,
Andrea.
On the use of C++ exception handling
Dear all,

I've been investigating the use of C++ exception handling constructs in open-source C++ projects (including OpenOffice). Currently, I am conducting a survey on this subject and I would really appreciate it if you could contribute to this research by answering a few questions. The survey is available on-line: https://pt.surveymonkey.com/s/exceptionHandling

All the best,
Rodrigo.
download preparation for Apache OpenOffice 4.1.1
Changes to necessary files for the download process from http://www.openoffice.org/download/ are in: http://svn.apache.org/viewvc/openoffice/ooo-site/trunk/content/download-NEXT2/ Final editing will be done later today (PDT). Files remaining in this area can just be copied to http://svn.apache.org/viewvc/openoffice/ooo-site/trunk/content/download/ and committed for release 4.1.1. -- - MzK "For evil to flourish, it only requires good men to do nothing." -- Simon Wiesenthal
Notification and alerts about future forum/wiki outages
Apache has a new status page and notification system for outages (there is no ongoing or planned outage at the moment, it is just for your information). You can find a report of current problems (none at the moment) at http://status.apache.org/ Everyone can also subscribe and receive alerts when the OpenOffice forum or wiki is down. You can do so at http://status.apache.org/pubsub/manage.cgi ; of course, everybody who has access to the relevant servers and can act in case of an outage is expected to subscribe, but you are free to subscribe just to be informed immediately in case of an outage (so heavy forum or wiki users are welcome to subscribe too). All new tools have been made available by Apache Infra. Regards, Andrea.
Re: [RELEASE]: RC2 due to dictionary updates
On 05/08/2014 Mathias Röllig wrote:
> Please have a look at issue 125348.

It is indeed quite annoying. I can reproduce it with 4.1.1-RC3 Italian. Maybe not a blocker (especially since I can reproduce it with 4.1.0 too, and since there seems to be a workaround in my case), but still an issue deserving attention.

Regards,
Andrea.
How to change default template from source
Hello everyone,

I've succeeded in building OpenOffice, and as a next step I want the Writer I built to create new files from my own template. But I don't know where I should put the template file, or how to change the source files. I am not asking how to do it from the UI, but how to do it from the source. Could anyone help me?

Best regards
--
Aron
Re: [discussion] how to stop mails being sent to multiple lists.
On Thu, Aug 14, 2014 at 04:50:02PM +0200, Jürgen Schmidt wrote:
> On 14/08/14 15:59, Rob Weir wrote:
> > On Thu, Aug 14, 2014 at 9:15 AM, Jürgen Schmidt
> > wrote:
> >> On 14/08/14 14:02, jan i wrote:
> >>> hi.
> >>>
> >>> Have you also noticed that the amount of AOO mails have exploded
> >>> latelysounds good you think lots of activity
> >>>
> >>> SADLY the truth is different, we have a fair amoun of real mails, but
> >>> something like 8 of 10 are sent to multiple ML.
> >>>
> >>> It is unwise to send the same mail to multiple lists, for a couple of
> >>> reasons:
> >>> - the discussion becomes scattered over multiple ML
> >>> - you force your fellow community members to read the same mail several
> >>> times
> >>> - you waste our bandwith

I find more annoying (and for sure they waste more bandwidth) mails that reply to a single paragraph but don't clean the message from all the quoted text, like this one I'm sending on purpose (I didn't even remove the mailing list footer texts).

For the annoying duplication in your mailboxes, there are solutions for the problem depending on the software you use; dovecot, for example, has a Sieve extension vnd.dovecot.duplicate http://hg.rename-it.nl/dovecot-2.1-pigeonhole/raw-file/tip/doc/rfc/spec-bosch-sieve-duplicate.txt that can solve it with one line (if duplicate { setflag "\\seen"; }).

The main problem with mailing lists is not the concept in itself, but the lack of knowledge of the proper tools.

> >>> If most mails are copied to most MLthen why do we have so many ML, one
> >>> solution could be to only have dev@ and let the others be an alias
> >>> (absolutely not one I prefer).
> >>>
> >>> How can we stop this tendency to blow up our inboxes ?
> >>
> >> We don't do that normally but for example my mails regarding the
> >> availability of the RC builds are important for at least 3 lists. And I
> >> know that not all people are subscribed or reading dev. But I know that
> >> people read (are subscribed) to QA and l10n.
So how can I reach all > >> interested parties? For me it's simple I send an email to all lists. The > >> people who are subscribed to all lists potentially found this annoying > >> but for me it is more important to reach all. > >> > >> If you have a good idea how to solve this, I am interested to learn. > >> > > > > Sometimes I will send to one list and cc the others (bcc will look > > spammy to some tools), and then put in the first line of the body: > > "Responses to Foo list only". > > I did this as well several times but even this got ignored and > discussion were on more than one list. So I gave up this approach. > > Juergen > > > > > Regards, > > > > -Rob > > > > > >> > >> Juergen > >> > >> > >> - > >> To unsubscribe, e-mail: dev-unsubscr...@openoffice.apache.org > >> For additional commands, e-mail: dev-h...@openoffice.apache.org > >> > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@openoffice.apache.org > > For additional commands, e-mail: dev-h...@openoffice.apache.org > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@openoffice.apache.org > For additional commands, e-mail: dev-h...@openoffice.apache.org > -- Ariel Constenla-Haile La Plata, Argentina signature.asc Description: Digital signature
Testing Apache Open Office with JDK 8 EA builds
Hi,

I am from the OpenJDK QA group at Oracle, and we're trying to get an idea about how much community testing is happening on JDK EA builds and to encourage more of it to happen as part of our engagement in the Adoption Group in OpenJDK. [0] [1]

I'm curious if you have been testing against JDK 7/8/9 EA builds, if you've run into showstopper issues, and if you'd like to continue to discuss the subject on the quality-discuss mailing list in OpenJDK, of course. [2]

Rgds, Rory

[0] https://wiki.openjdk.java.net/display/Adoption/Quality+Out+Reach
[1] http://mail.openjdk.java.net/pipermail/adoption-discuss/2014-August/000305.html
[2] http://mail.openjdk.java.net/pipermail/quality-discuss/

--
Rgds, Rory O'Donnell
Quality Engineering Manager
Oracle EMEA, Dublin, Ireland
[VOTE]: Release Apache OpenOffice 4.1.1 (RC3)
Hi all,

this is a call for a vote on releasing the available release candidate (RC3) as Apache OpenOffice 4.1.1.

Apache OpenOffice 4.1.1 is mainly a bugfix release with some important bugfixes. We can again provide more complete UI translations and now have support for 41 languages. New languages for this release compared to 4.1.0 are Catalan, Catalan (Valencia AVL) and Catalan (Valencia RACV). Apache OpenOffice 4.1.1 is the continuation of high quality software releases.

An overview of the integrated release issues can be found under: http://ci.apache.org/projects/openoffice/milestones/4.1.1-rc3-r1617669/AOO4.1.1_fixes.html

The release candidate artifacts (source release, as well as binary releases for 41 languages) and further information on how to verify and review Apache OpenOffice 4.1.1 can be found on the following wiki page: https://cwiki.apache.org/confluence/display/OOOUSERS/Development+Snapshot+Builds (alternatively, directly via http://ci.apache.org/projects/openoffice/milestones/4.1.1-rc3-r1617669)

*.dmg files are still not recognized as binaries and have to be saved manually (save link as ...).

The RC is based on the release branch AOO410, revision 1617669! A fresh and clean RAT scan output of this revision can be found under http://ci.apache.org/projects/openoffice/milestones/4.1.1-rc3-r1617669/AOO4.1.1_RAT_Scan.html

Please vote on releasing this package as Apache OpenOffice 4.1.1

The vote starts now and will be open until: Tuesday, 19 August: 2014-08-19 12:00am UTC+2.

We invite all people to vote (non binding) on this RC. We would like to provide a release that is supported by the majority of our project members.

[ ] +1 Release this package as Apache OpenOffice 4.1.1
[ ] 0 Don't care
[ ] -1 Do not release this package because...