It's not clear to me that the DOCX files are broken. What we know is that import into OOo and LO is unsuccessful. (People will say that is because DOCX is not OOXML, but I've never seen actual evidence for that, only the claim offered as a weak justification for why DOC should be used instead. There's no question that DOC import/export tends to succeed more often.)
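
One rough way to separate "the file is broken" from "the importer fails" is to check the package itself. A DOCX file is a ZIP archive of XML parts, so a minimal sketch in Python (standard library only) can test whether the archive opens and whether each XML part is well-formed. The function name `check_docx` is hypothetical, and the check assumes the main part sits at the conventional `word/document.xml` location. It tests well-formedness only, not validity against the OOXML schemas, so a clean result does not prove the file conforms to the standard.

```python
import zipfile
import xml.etree.ElementTree as ET


def check_docx(path):
    """Return a list of problems found in a DOCX package.

    An empty list means the ZIP container opened and every XML part
    parsed as well-formed XML. `path` may be a filename or a binary
    file-like object. (Hypothetical helper, for illustration only.)
    """
    problems = []
    try:
        with zipfile.ZipFile(path) as pkg:
            names = pkg.namelist()
            # Assumption: the main document part is at its conventional location.
            if "word/document.xml" not in names:
                problems.append("missing word/document.xml")
            for name in names:
                if name.endswith(".xml") or name.endswith(".rels"):
                    try:
                        ET.fromstring(pkg.read(name))
                    except ET.ParseError as exc:
                        # Record which part failed and where the parser stopped.
                        problems.append(f"{name}: {exc}")
    except zipfile.BadZipFile:
        problems.append("not a valid ZIP package")
    return problems
```

If this reports no problems for a file that OOo and LO still refuse to open, that would point at the import filters and not at the XML.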
Perhaps Wolf can clarify the complete use case and what he means by "broken-xml docx."

-----Original Message-----
From: Dave Fisher [mailto:dave2w...@comcast.net]
Sent: Wednesday, January 18, 2012 11:49
To: ooo-dev@incubator.apache.org
Subject: Re: Has docs converter been created?

Hi Wolf,

On Jan 18, 2012, at 5:16 AM, Wolf Halton wrote:

> The department I work for has just standardized on MS Office 2010 and I am
> constantly getting broken-xml docx files that I cannot look at in either
> OOo or LibreOffice 3.4. I was reading a thread on the forum from 2007 when
> docx first came out and we were having to suggest people use MS Office for
> docx. Are we any closer to being able to read and convert docx?

I don't know about here at AOO, but if you can use Java then you will find support for docx, xlsx, and pptx in Apache POI.

Nick Burch and Yegor Kozlov are mentors for Apache ODF Toolkit (incubating) and on the POI PMC. I am also on the POI PMC.

You might want to take your questions about broken docx's to the poi developers list.

http://poi.apache.org/mailinglists.html

Regards,
Dave

>
> --
> This Apt Has Super Cow Powers - http://sourcefreedom.com
> Advancing Libraries Together - http://LYRASIS.org