- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: dmitriy
Subject: Re: About MaxDocsPerServer

indexer.conf (this is how I'm running it right now)
DBAddr  mysql://mysql:[EMAIL PROTECTED]/search/?dbmode=cache?socket=/var/run/mysqld/mysqld.sock&cached=LocalHost:7000&stored=Localhost:7004
 
VarDir /projects/wsm/var
AccentExtensions yes
LocalCharset UTF-8
CrossWords yes
CollectLinks yes
DoStore yes

Include stopwords.conf
Include langmap.conf

MinWordLength 3
MaxWordLength 32
MaxDocSize 524288
MinDocSize 512
IndexDocSizeLimit 65536
URLSelectCacheSize 1024

include crawler_user_agent.conf

# syslog facility to use
SyslogFacility local5
LogLevel 5

Disallow *.b    *.sh   *.md5  *.rpm
Disallow *.arj  *.tar  *.zip  *.tgz  *.gz   *.z     *.bz2 
Disallow *.lha  *.lzh  *.rar  *.zoo  *.ha   *.tar.Z
Disallow *.gif  *.jpg  *.jpeg *.bmp  *.tiff *.tif   *.xpm  *.xbm *.pcx
Disallow *.vdo  *.mpeg *.mpe  *.mpg  *.avi  *.movie *.mov  *.dat
Disallow *.mid  *.mp3  *.rm   *.ram  *.wav  *.aiff  *.ra
Disallow *.vrml *.wrl  *.png  *.psd
Disallow *.exe  *.com  *.cab  *.dll  *.bin  *.class *.ex_
Disallow *.tex  *.texi *.xls  *.doc  *.texinfo
Disallow *.rtf  *.pdf  *.cdf  *.ps
Disallow *.ai   *.eps  *.ppt  *.hqx
Disallow *.cpt  *.bms  *.oda  *.tcl
Disallow *.o    *.a    *.la   *.so 
Disallow *.pat  *.pm   *.m4   *.am   *.css
Disallow *.map  *.aif  *.sit  *.sea
Disallow *.m3u  *.qt   *.mov
Disallow *D=A *D=D *M=A *M=D *N=A *N=D *S=A *S=D *O=A *O=D
Disallow  Regex \.r[0-9][0-9]$ \.a[0-9][0-9]$ \.so\.[0-9]$

HoldBadHrefs 30d
DeleteOlder 21d
UseRemoteContentType yes

AddType image/x-xpixmap *.xpm
AddType image/x-xbitmap *.xbm
AddType image/gif       *.gif

AddType text/plain                      *.txt  *.pl *.js *.h *.c *.pm *.e
AddType text/html                       *.html *.htm

AddType text/rtf                        *.rtf
AddType application/pdf                 *.pdf
AddType application/msword              *.doc
AddType application/vnd.ms-excel        *.xls
AddType text/x-postscript               *.ps


AddType application/unknown *.*

ParserTimeOut 300
Period 7d


DefaultLang ru
MaxHops 64
TrackHops yes
MaxDocsPerServer -1
PopRankUseTracking no
MaxNetErrors 16
ReadTimeOut 15s
DocTimeOut 25s
NetErrorDelayTime 1d

Robots yes
DetectClones yes

Index yes
DoStore yes
ServerWeight 1
PopRankMethod Neo
PopRankSkipSameSite yes
PopRankFeedBack yes
PopRankNeoIterations 3

Section body                    1       256
Section title                   2       256
Section header.server           3       64
Section url                     4       128
Section url.file                5       128
Section url.path                6       128
Section url.host                7       128
Section url.proto               8       128
Section crosswords              9
Section Charset                 10      128
Section Content-Type            11      64

######################################################################

URLCharset UTF-8
Realm HrefOnly Regex ^http://.*\.(net|edu)/

IndexIf Content-Type text/plain
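
For reference, the Realm pattern above can be checked outside the indexer. This is only a minimal sketch: it assumes Python's `re` module treats this simple pattern the same way the indexer's regex engine does, and that matching is case-sensitive (the indexer's own comparison options may differ). The helper name `matches_realm` and the example.* URLs are hypothetical, used only for illustration.

```python
import re

# Same pattern as in the "Realm HrefOnly Regex" line of the config.
# Assumption: Python's regex semantics agree with the indexer's for this pattern.
REALM_PATTERN = re.compile(r"^http://.*\.(net|edu)/")


def matches_realm(url: str) -> bool:
    """Return True if the URL matches the Realm regex (hypothetical helper)."""
    return REALM_PATTERN.search(url) is not None


if __name__ == "__main__":
    # Illustrative URLs only; none of them come from the original post.
    samples = (
        "http://example.net/index.html",
        "http://www.example.edu/",
        "http://example.com/page.html",
        "https://example.net/",
        "http://example.net",
        "http://example.com/mirror.net/file.txt",
    )
    for url in samples:
        print(url, matches_realm(url))
```

Two things this makes visible: the pattern requires a slash after `.net` or `.edu`, so a bare `http://example.net` does not match, and because `.*` can span slashes, a URL on another domain whose path contains `.net/` or `.edu/` matches as well.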

- - - - - - - - - - - - - - - - - - - - - - - - - - - -

Read the full topic here:
http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=05;topic_id=1151935941;page=3
