Le Tigre wrote:
>> However, if you follow the instructions at
>> http://wiki.scribus.net/index.php/Web_optimised_PDF , you will find that
>> (apart from compressing the PDF files, which you were not asking for) the
>> text extracted by pdftotext now becomes an almost perfect representation
>> of the original text.
>> I haven't investigated this in detail, and there may be encoding issues,
>> etc, but I found the results striking.
>>
>
> Well it does work very fine indeed!
>
> So sla -> pdf -> ps -> pdf -> txt is a perfect process.
>
FWIW, here is my short script to pull out the contents of text frames and put them into a text file. It needs some tweaking -- for example, a better test for a text frame than checking whether the frame's name begins with 'Text'.
Nonetheless, it's a way from within Scribus to extract the text content quickly. Because of its brevity, it should be easy to understand.

Greg
-------------- next part --------------
A non-text attachment was scrubbed...
Name: frameslist.py
Type: text/x-python
Size: 1149 bytes
Desc: not available
Url : http://nashi.altmuehlnet.de/pipermail/scribus/attachments/20070525/a64857ee/attachment.py
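
For readers who cannot fetch the attachment, here is a rough sketch of the kind of script described above. It is not the attached frameslist.py, only an illustration of the same idea. It assumes a Scribus build whose scripter runs Python 3, and it assumes that getObjectType() reports 'TextFrame' for text frames, which would replace the name-begins-with-'Text' test. The function names collect_frame_text and format_frames and the default file name frames.txt are made up for this example.

```python
# Sketch only: collect the text of every text frame in the open Scribus
# document. Intended to be run from Script > Execute Script inside Scribus.
try:
    import scribus  # only importable inside Scribus' own interpreter
except ImportError:
    scribus = None


def collect_frame_text(api):
    """Return a list of (page, frame_name, text) for each text frame.

    `api` is the scribus module, or a stand-in offering the same functions.
    """
    results = []
    for page in range(1, api.pageCount() + 1):
        api.gotoPage(page)
        for name in api.getAllObjects():
            # Assumption: getObjectType() returns 'TextFrame' for text frames,
            # so the frame name no longer has to start with 'Text'.
            if api.getObjectType(name) != "TextFrame":
                continue
            results.append((page, name, api.getText(name)))
    return results


def format_frames(frames):
    """Render the collected frames as plain text, one block per frame."""
    blocks = []
    for page, name, text in frames:
        blocks.append("[page %d] %s\n%s\n" % (page, name, text))
    return "\n".join(blocks)


if scribus is not None and scribus.haveDoc():
    output = format_frames(collect_frame_text(scribus))
    # Ask where to save; an empty result means the dialog was cancelled.
    path = scribus.fileDialog("Save frame text as", "Text files (*.txt)",
                              "frames.txt", issave=True)
    if path:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(output)
```

Passing the scribus module in as a parameter keeps the frame-walking logic separate from the file dialog, so it can be tried outside Scribus with a stand-in object. Linked frames and encoding quirks may still need the tweaking mentioned above.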
