[Scribus] New wiki page about frameslist.py

Gregory Pittman Sat, 27 Oct 2007 12:26:11 -0400

After improving the script a bit, I have created a wiki page:

http://wiki.scribus.net/index.php/Extracting_All_Text_from_a_Document


which includes the script, frameslist.py, for doing this. The script now 
recognizes text and image frames by frame type rather than by name. I've 
also dealt with the duplication problem in linked frames by testing for 
it -- see the sample output and the note at the bottom of the page.

Something I discovered while testing is that Scribus's files are once 
again not well-formed by XML standards, because they include '&#x5' 
(Ctrl-M) for hard carriage returns. This doesn't interfere with the 
script, but it will pose a problem for XML parsing. Interestingly, when 
I used 'cat' to show the text file's contents in the console, lines 
with Ctrl-M did not display. kedit shows the Ctrl-M as carriage 
returns, while emacs displays '^M'. I ran this in 1.3.3.10svn and also 
in 1.3.4 (I don't have 1.3.5svn on this computer).
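
For anyone who does need to run an XML parser over such a file, here is 
a rough sketch of one possible workaround, not something frameslist.py 
does. XML 1.0 allows only tab, line feed and carriage return below 
0x20, so a reference like '&#x5;' makes a strict parser reject the 
whole file. The sketch swaps any disallowed numeric character reference 
for a line feed before parsing. The function names, the choice of 
'&#xA;' as the replacement, and the one-line sample fragment are all 
assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Numeric character references, decimal or hexadecimal (e.g. &#x5; or &#13;).
_CHAR_REF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")


def _is_xml10_char(cp):
    """Return True if code point cp is allowed by the XML 1.0 Char production."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def replace_invalid_char_refs(xml_text, replacement="&#xA;"):
    """Replace character references that XML 1.0 forbids.

    Hypothetical helper: valid references are left untouched, invalid
    ones (such as &#x5;) become `replacement`.
    """
    def fix(match):
        body = match.group(1)
        cp = int(body[1:], 16) if body[0] == "x" else int(body)
        return match.group(0) if _is_xml10_char(cp) else replacement
    return _CHAR_REF.sub(fix, xml_text)


# Illustrative fragment only; the element and attribute names are assumed.
sample = '<ITEXT CH="first paragraph&#x5;second paragraph"/>'

# A strict parser refuses the raw fragment because of &#x5;.
try:
    ET.fromstring(sample)
    parsed_raw = True
except ET.ParseError:
    parsed_raw = False

# After the substitution the same fragment parses.
cleaned = replace_invalid_char_refs(sample)
element = ET.fromstring(cleaned)
```

To apply this to a real document, read the file as text, pass it 
through replace_invalid_char_refs(), and hand the result to the parser. 
Work on a copy, since the substitution changes the stored character.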

Greg
