- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: takerushi
Subject: Re: Converter
Hi Maxime,
1. I think I have resolved the content-type issue in both Apache and
indexer.conf, but could you please double-check it with me?
In my Apache httpd.conf file, I have added the following lines:
###################
AddType application/x-tar .tgz
AddType application/msword .doc
AddType application/vnd.ms-powerpoint .ppt
AddType application/vnd.ms-excel .xls
AddType text/x-postscript .ps
AddType text/rtf .rtf
###################
In my indexer.conf file, I have edited the following lines:
####################
CheckOnly \.tex$ \.texi$ \.xls$ \.doc$ \.texinfo$ \.ppt$
CheckOnly \.rtf$ \.pdf$ \.cdf$ \.ps$
#Disallow \.tex$ \.texi$ \.xls$ \.doc$ \.texinfo$
#Disallow \.rtf$ \.pdf$ \.cdf$ \.ps$
AddType application/pdf \.pdf$
AddType application/msword \.doc$
AddType application/vnd.ms-excel \.xls$
AddType application/msexcel \.xls$
AddType application/vnd.ms-powerpoint \.ppt$
AddType application/mspowerpoint \.ppt$
MIME application/pdf text/html /usr/bin/pdftotext -htmlmeta $in $out
MIME application/msword text/html "/usr/bin/AbiWord -to html $in"
MIME application/vnd.ms-excel text/html /usr/local/bin/xlhtml $in > $out
MIME application/msexcel text/html /usr/local/bin/xlhtml $in > $out
MIME application/vnd.ms-powerpoint text/html /usr/local/bin/ppthtml $in
MIME application/mspowerpoint text/html /usr/local/bin/ppthtml $in
####################
Please tell me if anything is wrong.
2. What is the best indexer command (and with which options) to reindex all
of this? So far I have been running ./indexer -a -u"%.ppt%", ./indexer -a
-u"%.doc%" and ./indexer -a -u"%.xls%" just to see how the indexer handles
each file type. PDF files are indexed correctly and appear in the search
results. Only the MS Office files are still missing.
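For reference, the three per-type runs above can be wrapped in a small loop.
This is only a sketch of what I am running by hand: the INDEXER variable and
the function names are placeholders of my own, and the only indexer options
used are the -a and -u ones already shown.

```shell
#!/bin/sh
# Sketch only: re-run the indexer once per file extension.
# INDEXER is a placeholder variable; it defaults to ./indexer as used above.
INDEXER="${INDEXER:-./indexer}"

# Build the URL filter pattern for one extension, e.g. ppt -> %.ppt%
url_pattern() {
    printf '%%.%s%%' "$1"
}

# Run the indexer with -a and a -u URL filter for each extension given.
reindex_types() {
    for ext in "$@"; do
        "$INDEXER" -a -u"$(url_pattern "$ext")" \
            || echo "indexer failed for .$ext" >&2
    done
}

# Usage (hypothetical): reindex_types ppt doc xls
```

Calling reindex_types ppt doc xls runs the same three commands one after
another, so a failure for one file type is reported without stopping the rest.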
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here:
http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=02;topic_id=1156294787