[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: ssharry
Subject: Re: No

Thank you!

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.com/cgi-bin/simpleforum.cgi?fid=02;topic_id=1217405250
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: No

It looks like you have entered a Server command without a trailing slash. Try correcting it like this:

Server http://www.sina.com.cn/

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.com/cgi-bin/simpleforum.cgi?fid=02;topic_id=1217405250
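Editorial note: the trailing-slash advice above can be checked mechanically before running indexer. The following is a minimal sketch in Python, not part of DataparkSearch. It assumes that the URL is the last token on a Server line and that lines starting with # are comments; the function name servers_missing_slash is hypothetical.

```python
import re

# Assumption: a Server line looks like "Server [options...] <url>",
# with the URL as the last whitespace-separated token.
SERVER_RE = re.compile(r"^\s*Server\s+(.*\S)\s*$")


def servers_missing_slash(conf_text):
    """Return (line_number, url) pairs for Server URLs of the form
    scheme://host with no slash after the host part."""
    problems = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip commented-out lines
        match = SERVER_RE.match(line)
        if not match:
            continue
        url = match.group(1).split()[-1]
        scheme, sep, rest = url.partition("://")
        # "http://www.sina.com.cn" has no "/" after the scheme separator
        if sep and "/" not in rest:
            problems.append((lineno, url))
    return problems


if __name__ == "__main__":
    sample = "Server http://www.sina.com.cn\nServer http://www.sina.com.cn/\n"
    for lineno, url in servers_missing_slash(sample):
        print(f"line {lineno}: {url} has no trailing slash")
```

The check only flags a bare scheme://host form; it does not try to reproduce how indexer itself matches URLs against Server commands.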
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: ooptimum
Subject: Re: No

If an address does not resolve, indexer reports it honestly. It looks like the problem is gone in the new snapshot. I will take a closer look tomorrow with fresh eyes. Thanks.

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=06;topic_id=1197747364;page=2
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: No

It looks like www.varorud.org does not resolve. Please check the DNS servers listed in /etc/resolv.conf.

If the DNS servers are working and this site resolves via nslookup, try the latest snapshot: http://www.dataparksearch.org/dpsearch-4.49-13122007.tar.gz

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=06;topic_id=1197747364;page=2
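Editorial note: besides nslookup, the resolution check suggested above can be scripted. This is a rough sketch in Python using the standard socket module. It goes through the system resolver, which may or may not behave exactly like the lookup indexer performs. The helper name resolve_host and its resolver parameter are hypothetical and exist only for illustration and testing.

```python
import socket


def resolve_host(hostname, resolver=socket.getaddrinfo):
    """Return a sorted list of addresses for hostname,
    or an empty list if the lookup fails."""
    try:
        infos = resolver(hostname, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    host = "www.varorud.org"
    addresses = resolve_host(host)
    if addresses:
        print(f"{host} resolves to: {', '.join(addresses)}")
    else:
        print(f"{host} does not resolve; check the DNS servers in /etc/resolv.conf")
```

An empty result here points at the DNS setup rather than at the indexer configuration.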
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: ooptimum
Subject: Re: No

[pre]
spider dpsearch # sbin/indexer -qaimv5 -u http://www.varorud.org/
indexer.cfg[5754]: {00} URLDB: 8 records fetched
.
indexer[5754]: {00} DpsOpenCache:
indexer[5754]: {00} Done.
indexer[5754]: {00} indexer from dpsearch-4.48-mysql-freetds started with '/usr/local/dpsearch/etc/indexer.conf'
indexer[5754]: {00} Chinese dictionary with 0 entries
indexer[5754]: {00} Korean dictionary with 0 entries
indexer[5754]: {00} Thai dictionary with 0 entries
indexer[5754]: {00} LogsOnly: no
indexer[5754]: {00} mutexes used: 256
indexer[5754]: {01} DpsOpenCache:
indexer[5754]: {01} Done.
indexer[5754]: {01} Target.body:
indexer[5754]: {01} Target.Charset:
indexer[5754]: {01} Target.Content-Language:
indexer[5754]: {01} Target.Content-Length: 0
indexer[5754]: {01} Target.Content-Type:
indexer[5754]: {01} Target.crc32: 0
indexer[5754]: {01} Target.crosswords:
indexer[5754]: {01} Target.DP_ID: 3396233
indexer[5754]: {01} Target.E_URL: http://www.varorud.org/
indexer[5754]: {01} Target.Hops: 0
indexer[5754]: {01} Target.meta.description:
indexer[5754]: {01} Target.meta.keywords:
indexer[5754]: {01} Target.Pop_Rank: 0.25
indexer[5754]: {01} Target.PrevStatus: 0
indexer[5754]: {01} Target.Referrer-ID: 0
indexer[5754]: {01} Target.Since: 1197914870
indexer[5754]: {01} Target.Status: 0
indexer[5754]: {01} Target.title:
indexer[5754]: {01} Target.url: http://www.varorud.org/
indexer[5754]: {01} Target.URL_ID: 125589599
indexer[5754]: {01} URL: http://www.varorud.org/
Subnet.pton: addr: @ net/mask:c16f0a00/fe00 [193.111.10.0/23] => 1
Subnet.pton: addr: @ net/mask:c38c8000/fe00 [195.140.128.0/23] => 1
Subnet.pton: addr: @ net/mask:d90bb000/f000 [217.11.176.0/20] => 1
Subnet.pton: addr: @ net/mask:c16f4c00/fe00 [193.111.76.0/23] => 1
Subnet.pton: addr: @ net/mask:d472/fe00 [212.114.0.0/23] => 1
Subnet.pton: addr: @ net/mask:c33a3500/ff00 [195.58.53.0/24] => 1
Subnet.pton: addr: @ net/mask:3e104300/ff00 [62.16.67.0/24] => 1
Subnet.pton: addr: @ net/mask:c33a2000/fff0 [195.58.32.0/28] => 1
Subnet.pton: addr: @ net/mask:c33a3d50/fff8 [195.58.61.80/29] => 1
Subnet.pton: addr: @ net/mask:d4a5b400/ff00 [212.165.180.0/24] => 1
Subnet.pton: addr: @ net/mask:d4a5b500/ff80 [212.165.181.0/25] => 1
Subnet.pton: addr: @ net/mask:52c61500/ff00 [82.198.21.0/24] => 1
Subnet.pton: addr: @ net/mask:52c61600/ff00 [82.198.22.0/24] => 1
Subnet.pton: addr: @ net/mask:3e400a00/ff00 [62.64.10.0/24] => 1
Subnet.pton: addr: @ net/mask:55098000/c000 [85.9.128.0/18] => 1
Subnet.pton: addr: @ net/mask:c243d300/ff00 [194.67.211.0/24] => 1
Subnet.pton: addr: @ net/mask:c243d400/ff00 [194.67.212.0/24] => 1
indexer[5754]: {01} No 'Server' command for url
indexer[5754]: {01} Deleting http://www.varorud.org/
indexer[5754]: {01} Done (0 seconds, 0 documents, 0 bytes, 0.00 Kbytes/sec.)
indexer[5754]: {00} Total 1 seconds, 0 documents, 0 bytes, 0.00 Kbytes/sec, 0.00 sec/doc, 0 bytes/doc.
indexer[5754]: {00} Neo PopRank: 0 documents, 0 pas, 0.00 Kpas/sec, 0.00 sec/doc, 0.00 pas/doc.
[/pre]

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=06;topic_id=1197747364;page=2
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: No

Please uncomment #define DEBUG_MATCH 1 at the top of src/match.c and rebuild dpsearch, then rerun the command sbin/indexer -qaimv5 -u http://www.varorud.org/

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=06;topic_id=1197747364
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: ooptimum
Subject: Re: No

[pre]
spider dpsearch # sbin/indexer -qaimv5 -u http://www.varorud.org/
indexer.cfg[14726]: {00} URLDB: 8 records fetched
indexer.cfg[14726]: {00} URLDB: http://www.1tv.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} Server applied: site_id: 1194709586 URL: http://www.1tv.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} URLDB: http://amcu.gki.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} Server applied: site_id: 169636386 URL: http://amcu.gki.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} URLDB: http://www.andoz.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} Server applied: site_id: 148809168 URL: http://www.andoz.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} URLDB: http://www.approach.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} Server applied: site_id: -449816968 URL: http://www.approach.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} URLDB: http://www.arzon-mobile.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} Server applied: site_id: 511035512 URL: http://www.arzon-mobile.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} URLDB: http://www.asiagrandhotel.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} Server applied: site_id: 1849837000 URL: http://www.asiagrandhotel.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} URLDB: http://www.asiatrade.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} Server applied: site_id: -149122977 URL: http://www.asiatrade.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} URLDB: http://avto777.tj/
indexer.cfg[14726]: {00} Allow by default
indexer.cfg[14726]: {00} Server applied: site_id: 1002450279 URL: http://avto777.tj/
indexer.cfg[14726]: {00} Allow by default
indexer[14726]: {00} DpsOpenCache:
indexer[14726]: {00} Done.
indexer[14726]: {00} indexer from dpsearch-4.48-mysql-freetds started with '/usr/local/dpsearch/etc/indexer.conf'
indexer[14726]: {00} Chinese dictionary with 0 entries
indexer[14726]: {00} Korean dictionary with 0 entries
indexer[14726]: {00} Thai dictionary with 0 entries
indexer[14726]: {00} LogsOnly: no
indexer[14726]: {00} mutexes used: 256
indexer[14726]: {01} DpsOpenCache:
indexer[14726]: {01} Done.
indexer[14726]: {01} Target.body:
indexer[14726]: {01} Target.Charset:
indexer[14726]: {01} Target.Content-Language:
indexer[14726]: {01} Target.Content-Length: 0
indexer[14726]: {01} Target.Content-Type:
indexer[14726]: {01} Target.crc32: 0
indexer[14726]: {01} Target.crosswords:
indexer[14726]: {01} Target.DP_ID: 3394939
indexer[14726]: {01} Target.E_URL: http://www.varorud.org/
indexer[14726]: {01} Target.Hops: 0
indexer[14726]: {01} Target.meta.description:
indexer[14726]: {01} Target.meta.keywords:
indexer[14726]: {01} Target.Pop_Rank: 0.25
indexer[14726]: {01} Target.PrevStatus: 0
indexer[14726]: {01} Target.Referrer-ID: 0
indexer[14726]: {01} Target.Since: 1197904904
indexer[14726]: {01} Target.Status: 0
indexer[14726]: {01} Target.title:
indexer[14726]: {01} Target.url: http://www.varorud.org/
indexer[14726]: {01} Target.URL_ID: 125589599
indexer[14726]: {01} URL: http://www.varorud.org/
indexer[14726]: {01} No 'Server' command for url
indexer[14726]: {01} Deleting http://www.varorud.org/
indexer[14726]: {01} Done (1 seconds, 0 documents, 0 bytes, 0.00 Kbytes/sec.)
indexer[14726]: {00} Total 1 seconds, 0 documents, 0 bytes, 0.00 Kbytes/sec, 0.00 sec/doc, 0 bytes/doc.
indexer[14726]: {00} Neo PopRank: 0 documents, 0 pas, 0.00 Kpas/sec, 0.00 sec/doc, 0.00 pas/doc.
[/pre]

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=06;topic_id=1197747364
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: No

Please show the output of the command ./indexer -qaimv5 -u http://www.varorud.org/

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=06;topic_id=1197747364
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: ooptimum
Subject: Re: No

By the way, I noticed that I now get a lot of these messages saying that there is no Server command for some URLs. This only started with version 4.48; before that, these URLs were indexed normally and made it into the database, and the config has not changed since the previous versions. For example, here is typical output:

[pre]
indexer[12959]: {05} URL: http://www.varorud.org/index.php?option=com_content&task=view&id=4344&Itemid=107
indexer[12959]: {05} No 'Server' command for url
indexer[12959]: {05} Deleting http://www.varorud.org/index.php?option=com_content&task=view&id=4344&Itemid=107
[/pre]

OK, let's look in the DB:

[pre]
mysql> select count(*) from url where url like 'http://www.varorud.org%';
+----------+
| count(*) |
+----------+
|    76069 |
+----------+
1 row in set (0.52 sec)

mysql> select distinct server_id from url where url like 'http://www.varorud.org%';
+------------+
| server_id  |
+------------+
| 1845846513 |
+------------+
1 row in set (0.62 sec)

mysql> select parent,url from server where rec_id=1845846513;
+--------+-----------------+
| parent | url             |
+--------+-----------------+
|      0 | 193.111.10.0/23 |
+--------+-----------------+
1 row in set (0.05 sec)
[/pre]

We can see that all records in the url table that refer to http://www.varorud.org point to one and the same server, but that server is not http://www.varorud.org. It is 193.111.10.0/23, for which I have the following parameter in indexer.conf:

Subnet 193.111.10.0/23

So it turns out that something has changed in the internal logic of DPS. I wonder what?

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=06;topic_id=1197747364
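Editorial note: to see which Subnet entry a host's address would fall under, the membership test can be reproduced outside indexer. This is a small sketch with Python's standard ipaddress module. The address 193.111.11.20 is a made-up example, not the real address of www.varorud.org, and matching_subnets is a hypothetical helper. It says nothing about how DPS assigns server_id internally.

```python
import ipaddress


def matching_subnets(address, subnets):
    """Return the subnets (in the given order) that contain the address."""
    ip = ipaddress.ip_address(address)
    return [s for s in subnets if ip in ipaddress.ip_network(s)]


if __name__ == "__main__":
    # Subnet values as they might appear in indexer.conf
    configured = ["193.111.10.0/23", "195.58.53.0/24", "62.16.67.0/24"]
    # Hypothetical address used only for illustration
    print(matching_subnets("193.111.11.20", configured))
```

A /23 such as 193.111.10.0/23 covers 193.111.10.0 through 193.111.11.255, so any host in that range matches the one Subnet entry.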
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: No

If you would like to limit indexing to the specified folder only, you need to use the following Server command:

Server path file:///path/to/folder/

Please run indexer with the -v5 switch. This enables maximum debug output, which includes why each page is accepted or rejected for indexing.

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=02;topic_id=1176625362
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Yelena
Subject: Re: No

Yesterday the Server command worked without a trailing slash :( Today it doesn't work. I use the Server command like this:

Server file:///path/to/folder/

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=02;topic_id=1176625362
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: No

First check the trailing slash in the Server command, as described above.

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=02;topic_id=1176625362
[dataparksearch] [Forum] Re: No
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Yelena
Subject: Re: No

> At 03:43:40 16/04/07, Scarlett wrote:
>thank you!
>
>I know why!

And why? I have the same problem.

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=02&topic_id=1176625362