Re: [REVIEW-3-6] fdo#49819 - slightly inconsistent docx files causing grief ...
OK, I pushed to master http://cgit.freedesktop.org/libreoffice/core/commit/?id=5db7ac239278634c39cbb15f0173db0524b5dcd6 which *I* consider as a good solution. I checked around other unzip implementations and it does not look like they consider the timestamp differences as a brokenness. Cheers F. On 21/09/12 22:36, Michael Meeks wrote: Hi guys, I've not worked out where these odd ZIP container inconsistencies are coming from, but ... since people appear to see them, presumably it's worth being more accepting: Not 100% confident about this, the more I read SfxMedium friends, the more convinced I am we need some root+branch stream re-work, but hey. I'd love someone expert in writerfilter to review it, and (IMHO) we need to do the same work for PPTX / XLSX to ensure that we can prompt and enter the Repair mode. Then I think we need to undo the hack that is the fix for bug#54609 :-) Thoughts appreciated. Michael. ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
Re: [REVIEW-3-6] fdo#49819 - slightly inconsistent docx files causing grief ...
On Mon, 2012-09-24 at 09:30 +0200, Fridrich Strba wrote: OK, I pushed to master http://cgit.freedesktop.org/libreoffice/core/commit/?id=5db7ac239278634c39cbb15f0173db0524b5dcd6 which *I* consider as a good solution. I checked around other unzip implementations and it does not look like they consider the timestamp differences as a brokenness. Right - it seems a bit paranoid; particularly with no 'repair' fallback. In general, I think the right approach is to ignore this inconsistency - particuarly if Microsoft/other-common-impl. is creating this sort of nonsense timestampage. Lets get that into -3-6 to nail the issue. Thanks :-) Michael. -- michael.me...@suse.com , Pseudo Engineer, itinerant idiot ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
[REVIEW-3-6] fdo#49819 - slightly inconsistent docx files causing grief ...
Hi guys, I've not worked out where these odd ZIP container inconsistencies are coming from, but ... since people appear to see them, presumably it's worth being more accepting: Not 100% confident about this, the more I read SfxMedium friends, the more convinced I am we need some root+branch stream re-work, but hey. I'd love someone expert in writerfilter to review it, and (IMHO) we need to do the same work for PPTX / XLSX to ensure that we can prompt and enter the Repair mode. Then I think we need to undo the hack that is the fix for bug#54609 :-) Thoughts appreciated. Michael. -- michael.me...@suse.com , Pseudo Engineer, itinerant idiot From ff300e59e74ee88aa6a4981b57a51af416c9e991 Mon Sep 17 00:00:00 2001 From: Michael Meeks michael.me...@suse.com Date: Fri, 21 Sep 2012 21:32:11 +0100 Subject: [PATCH] fdo#49819 - allow slightly inconsistent docx files to be repaired --- sfx2/source/doc/objstor.cxx|8 +++- writerfilter/inc/dmapper/DomainMapper.hxx |3 ++- writerfilter/inc/ooxml/OOXMLDocument.hxx |1 + writerfilter/source/dmapper/DomainMapper.cxx |6 +++--- writerfilter/source/filter/ImportFilter.cxx|7 +++ writerfilter/source/filter/RtfFilter.cxx |3 ++- writerfilter/source/ooxml/OOXMLStreamImpl.cxx |9 ++--- writerfilter/source/ooxml/OOXMLStreamImpl.hxx |2 +- .../debugservices/ooxml/OOXMLAnalyzeService.cxx|2 +- .../debugservices/ooxml/OOXMLTestService.cxx |2 +- 10 files changed, 27 insertions(+), 16 deletions(-) diff --git a/sfx2/source/doc/objstor.cxx b/sfx2/source/doc/objstor.cxx index cf8d0c2..9e7407a 100644 --- a/sfx2/source/doc/objstor.cxx +++ b/sfx2/source/doc/objstor.cxx @@ -33,6 +33,7 @@ #include com/sun/star/document/XExporter.hpp #include com/sun/star/document/FilterOptionsRequest.hpp #include com/sun/star/document/XInteractionFilterOptions.hpp +#include com/sun/star/packages/zip/ZipIOException.hpp #include com/sun/star/task/XInteractionHandler.hpp #include com/sun/star/task/XInteractionAskLater.hpp #include com/sun/star/task/FutureDocumentVersionProductUpdateRequest.hpp @@ -2219,7 +2220,12 @@ sal_Bool SfxObjectShell::ImportFrom( SfxMedium rMedium, bool bInsert ) } return xLoader-filter( aArgs ); -}catch(...) +} +catch (const packages::zip::ZipIOException) +{ +SetError( ERRCODE_IO_BROKENPACKAGE, Badness in the underlying package format. ); +} +catch(...) {} } diff --git a/writerfilter/inc/dmapper/DomainMapper.hxx b/writerfilter/inc/dmapper/DomainMapper.hxx index 4dbbe87..1d9b2f0 100644 --- a/writerfilter/inc/dmapper/DomainMapper.hxx +++ b/writerfilter/inc/dmapper/DomainMapper.hxx @@ -80,7 +80,8 @@ public: DomainMapper(const ::com::sun::star::uno::Reference ::com::sun::star::uno::XComponentContext xContext, ::com::sun::star::uno::Reference ::com::sun::star::io::XInputStream xInputStream, ::com::sun::star::uno::Reference ::com::sun::star::lang::XComponent xModel, -SourceDocumentType eDocumentType ); +bool bRepairStorage, +SourceDocumentType eDocumentType); virtual ~DomainMapper(); // Stream diff --git a/writerfilter/inc/ooxml/OOXMLDocument.hxx b/writerfilter/inc/ooxml/OOXMLDocument.hxx index f3365db..c97e2b2 100644 --- a/writerfilter/inc/ooxml/OOXMLDocument.hxx +++ b/writerfilter/inc/ooxml/OOXMLDocument.hxx @@ -248,6 +248,7 @@ public: static OOXMLStream::Pointer_t createStream(uno::Referenceuno::XComponentContext rContext, uno::Referenceio::XInputStream rStream, + bool bRepairStorage, OOXMLStream::StreamType_t nStreamType = OOXMLStream::DOCUMENT); static OOXMLStream::Pointer_t diff --git a/writerfilter/source/dmapper/DomainMapper.cxx b/writerfilter/source/dmapper/DomainMapper.cxx index 38a9961..cd4a17c 100644 --- a/writerfilter/source/dmapper/DomainMapper.cxx +++ b/writerfilter/source/dmapper/DomainMapper.cxx @@ -88,7 +88,8 @@ struct _PageSz DomainMapper::DomainMapper( const uno::Reference uno::XComponentContext xContext, uno::Reference io::XInputStream xInputStream, uno::Reference lang::XComponent xModel, -SourceDocumentType eDocumentType) : +bool bRepairStorage, +SourceDocumentType eDocumentType ) : LoggedProperties(dmapper_logger, DomainMapper), LoggedTable(dmapper_logger, DomainMapper), LoggedStream(dmapper_logger, DomainMapper), @@ -101,12 +102,11 @@ LoggedStream(dmapper_logger, DomainMapper), uno::makeAny( false ) ); //import document properties - try {
Re: [REVIEW-3-6] fdo#49819 - slightly inconsistent docx files causing grief ...
First of all, the timestamps in that file are really bogus, I cannot imagine an ooxml file produces in 2004, only if the machine has wrongly set the clock. Second, when working on a zip implementation for libcdr and for the LO shell extension, I realized that the timestamps were never really good to check for consistency with the local entry header and the central directory entry. I think that the best would be to compare them by crc. And one has to take into account that crc can be 0 and then the real crc is to be found just after the stream in a structure that is to be found by a magic and there, it is possible not to have even the real compressed and uncompressed sizes in the local entry header. My proposal would be to be much less strict in what we consider as corrupted for the zip-based documents. Instead check things that are for sure to be the same in the two structs, like the encryption, crc32 and compression type. Cheers F. On 21/09/12 22:36, Michael Meeks wrote: Hi guys, I've not worked out where these odd ZIP container inconsistencies are coming from, but ... since people appear to see them, presumably it's worth being more accepting: Not 100% confident about this, the more I read SfxMedium friends, the more convinced I am we need some root+branch stream re-work, but hey. I'd love someone expert in writerfilter to review it, and (IMHO) we need to do the same work for PPTX / XLSX to ensure that we can prompt and enter the Repair mode. Then I think we need to undo the hack that is the fix for bug#54609 :-) Thoughts appreciated. Michael. ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
Re: [REVIEW-3-6] fdo#49819 - slightly inconsistent docx files causing grief ...
Ah, the timestamp actually is not a time_t, they are two shorts (2 bytes long) from which the first one gives the modification time and the second the modification date. So the difference can be some seconds. It is possible that some zip implementations dump the timestamps of the directory creation in the cdir entries and timestamps of the file addition to the local file header. So, the timestamp in the cdir should never be smaller then in local file header. Maybe it is good test of corruption to check that instead of identity. F. Sent from Samsung Mobile Fridrich Strba fridrich.st...@graduateinstitute.ch wrote: First of all, the timestamps in that file are really bogus, I cannot imagine an ooxml file produces in 2004, only if the machine has wrongly set the clock. Second, when working on a zip implementation for libcdr and for the LO shell extension, I realized that the timestamps were never really good to check for consistency with the local entry header and the central directory entry. I think that the best would be to compare them by crc. And one has to take into account that crc can be 0 and then the real crc is to be found just after the stream in a structure that is to be found by a magic and there, it is possible not to have even the real compressed and uncompressed sizes in the local entry header. My proposal would be to be much less strict in what we consider as corrupted for the zip-based documents. Instead check things that are for sure to be the same in the two structs, like the encryption, crc32 and compression type. Cheers F. On 21/09/12 22:36, Michael Meeks wrote: Hi guys, I've not worked out where these odd ZIP container inconsistencies are coming from, but ... since people appear to see them, presumably it's worth being more accepting: Not 100% confident about this, the more I read SfxMedium friends, the more convinced I am we need some root+branch stream re-work, but hey. I'd love someone expert in writerfilter to review it, and (IMHO) we need to do the same work for PPTX / XLSX to ensure that we can prompt and enter the Repair mode. Then I think we need to undo the hack that is the fix for bug#54609 :-) Thoughts appreciated. Michael. ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice