Hi all
I need the mnogosearch crawler to process PDF documents with 
Latin1 encoding. I run the command line interface as follows:

$ /usr/bin/php -q typo3/cli_dispatch.phpsh mnogosearch -v 3 -w

By looking at the cli_mnogosearch.php script, I found the following line:

  Mime application/pdf text/plain "pdftotext -enc UTF-8 $1 -"

I guess this forces pdftotext to use UTF-8 encoding. By changing that line 
to

  Mime application/pdf text/plain "pdftotext -enc Latin1 $1 -"

I get the desired result, but of course this is an inelegant solution that 
I'd like to avoid: I don't want to hardcode it in the extension code.
I tried putting the above line into a text file, which I set in the 
IncludeFile field of the mnogosearch configuration through the Ext 
Manager, but that didn't work: pdftotext still appears to output 
UTF-8, not Latin1.
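
To check which encoding a converter really emits, independent of the 
crawler, a minimal sketch like the following may help. It assumes iconv is 
installed; the function name detect_enc is hypothetical, not part of 
mnogosearch or pdftotext. Note that plain ASCII output is valid in both 
encodings, so the check only tells something when the text contains 
accented characters.

```shell
# Hypothetical helper: report whether the bytes on stdin are valid UTF-8.
# Assumption: iconv exits non-zero on an invalid or incomplete sequence.
detect_enc() {
  if iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    echo "valid UTF-8"
  else
    echo "not UTF-8 (possibly Latin1)"
  fi
}
```

For example, piping the output of "pdftotext -enc Latin1 sample.pdf -" into 
detect_enc (sample.pdf being a placeholder for any PDF with accented 
characters) should report that the text is not UTF-8.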
Any hints?
Thanks in advance

Claudio
_______________________________________________
TYPO3-english mailing list
[email protected]
http://lists.netfielders.de/cgi-bin/mailman/listinfo/typo3-english
