Randomthots wrote:

I honestly have no idea how I would go about recovering data from something like that.

Even if you don't, the file format itself is less prone to damage. Because XML is well structured and clearly defined, when damage /does/ occur, it is often possible to *guess* what the data should have been. That's part of the beauty of XML.


Consider this example. Say the original text is:

<table:table-cell office:value-type="string">
    <text:p>Cell content</text:p>
</table:table-cell>

Suppose we lose a few bytes:

<table:table-cell ofxfice:valuetype="string">
    <text:p>Cell content</text:>
</table:table-cell>

You can still reconstruct the original XML. For that matter, so can OOo (to some degree).
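
If you'd rather let a parser point at the damage than eyeball it, here is a minimal sketch in Python. It uses the standard library's expat binding, not Tidy, and find_xml_error is just a name I made up for illustration. Namespace processing is left off so the fragment can be checked without its xmlns declarations.

```python
import xml.parsers.expat

ORIGINAL = """<table:table-cell office:value-type="string">
    <text:p>Cell content</text:p>
</table:table-cell>"""

DAMAGED = """<table:table-cell ofxfice:valuetype="string">
    <text:p>Cell content</text:>
</table:table-cell>"""


def find_xml_error(fragment):
    """Return (line, column, message) for the first well-formedness error, or None."""
    # Hypothetical helper. Without namespace processing, prefixes such as
    # "table:" are treated as ordinary parts of the name.
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(fragment, True)
    except xml.parsers.expat.ExpatError as err:
        return err.lineno, err.offset, xml.parsers.expat.ErrorString(err.code)
    return None


print(find_xml_error(ORIGINAL))
print(find_xml_error(DAMAGED))
```

The intact fragment comes back as None; for the damaged one the parser points at line 2, where the closing tag no longer matches its opening tag. Note what it does not catch: ofxfice:valuetype is still a well-formed attribute name, so spotting that one takes a human (or a schema check).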

Compare this with a binary data structure. It will, in some way or another, have the form of an n-ary tree (they all do). Suppose that a node gets deleted. Now you've lost everything below that node (possibly a few paragraphs). Or worse, it might make the file impossible to parse.

Now look at the XML again. Think of how many bytes you'd have to lose (and lose _sequentially_) for you to lose a "node".

Ain't it cool? :-)


I suppose you could just start out with some global find-and-replaces to get rid of the tags and try to get closer to something that looked like a csv. I guess an xml editor would be handy.

Yes. XML Tidy is your friend. It can spot errors in the XML structure. That's your first line of defence.


I've repaired damaged OOo files by hand (not many). One of them was a book by an Italian writer. It took him months to write and ran to a few hundred pages. One day, as OOo was writing to the disk, the power went out and the file got corrupted.

He sent me the file. I unzipped it and ran it through XML Tidy. Tidy complained about a malformed tag at row x, column y. I went there, fixed the tag, and zipped it again. Voilà, the file was fixed.
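
For anyone who wants to script that unzip / check / re-zip round trip, here is a rough sketch in Python. It is not what I actually ran (that was Tidy and a text editor); it assumes the zip container itself is still readable, and check_package and repack are hypothetical names. The fix itself still has to be made by hand.

```python
import xml.parsers.expat
import zipfile


def find_xml_error(data):
    """Return (line, column, message) for the first well-formedness error, or None."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        return err.lineno, err.offset, xml.parsers.expat.ErrorString(err.code)
    return None


def check_package(path_or_file):
    """Map every .xml member of the package to its first error (None if clean)."""
    report = {}
    with zipfile.ZipFile(path_or_file) as zf:
        for name in zf.namelist():
            if name.endswith(".xml"):
                report[name] = find_xml_error(zf.read(name))
    return report


def repack(src, dst, replacements):
    """Copy the package from src to dst, swapping in hand-repaired members.

    replacements maps a member name to its corrected bytes. Entry order and
    per-entry compression are copied from the source archive; as far as I
    know the format expects the "mimetype" entry first and uncompressed.
    """
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            data = replacements.get(info.filename, zin.read(info.filename))
            zout.writestr(info, data)
```

Typical use would be to call check_package on the damaged file, open the reported member at the given line and column, fix the tag in an editor, and then pass the corrected bytes to repack to produce a new file next to the original.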

This took me 5 minutes, and it saved this writer months of work.

Maybe you could use some of the infamous *nix tools like grep and sed to pull stuff out. Not a trivial task in any case.

Tidy made it very easy. :-)

Cheers,
Daniel.
