- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
I tried, and still got the same no server command error.
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
Try a similar one using a regex:
Realm regex page nofollow .
Note: the dot at the end of the command above is important.
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
Next one to try:
Server regex nofollow .
Please show more lines of indexer output if this doesn't help.
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
Still the same, I wonder if it has anything to do with my other settings in
indexer.conf or other config files?
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
The only server/realm/subnet command I added to the end of the index.conf file
is:
Server page nofollow *
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
Sorry for misleading you. Change the command to:
Realm page nofollow *
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
You have to put all Server/Realm/Subnet commands into the indexer.conf file; the
url.txt file is a simple list of URLs, one per line.
Configuration loading, including URL importing, is always performed in
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
I put this in index.conf:
Server world
Then I ran /usr/local/dpsearch/sbin/indexer -i -f
/usr/local/dpsearch/etc/url1.txt -v5
I got the error message:
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
You have to specify a pattern with the Server command; see
http://www.dataparksearch.org/dpsearch-follow.en.html
To allow indexing of any URL, you can add the following command:
Server world *
but
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
I only want to index the URLs in the url.txt file. What is the Server command
for this?
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
Server page nofollow *
This Server command allows indexing only for URLs already in the database
and does not allow following the hrefs in those documents.
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
I added this command to index.conf and ran /usr/local/dpsearch/sbin/indexer -i
-f /usr/local/dpsearch/etc/url1.txt -v5 again. I still got messages like
parsing http://abc.com/...;
Then I tested
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
This is normal output for the debug verbosity level, which you have set with
the -v5 switch for indexer.
Add trailing slashes to the site URLs in your list, i.e.
http://www.dataparksearch.org/
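A minimal sketch of that advice, not part of the original reply: the sed
expression below appends a trailing slash to any line that consists only of a
scheme and host name, and leaves URLs that already have a path untouched. The
file names url_demo.txt and url_fixed.txt are hypothetical, and www.abc.com is
the placeholder host used elsewhere in this thread.

```shell
# Build a small sample list (hypothetical file name, sample URLs).
printf '%s\n' \
  'http://www.dataparksearch.org' \
  'http://www.abc.com/page1.html' \
  'http://www.abc.com/' > url_demo.txt

# Append "/" only when the line is scheme://host with no path at all.
sed -E 's#^(https?://[^/]+)$#\1/#' url_demo.txt > url_fixed.txt

cat url_fixed.txt
```

Writing to a second file keeps the original list intact, so the result can be
checked before it is passed to indexer.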
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
These are the output messages:
indexer[9937]: {01} Log 2F3 updated in 0.01 sec., ndel:1, nwrd:0
indexer[9937]: {01} Log 2F4 updated in 0.00 sec., ndel:1, nwrd:0
indexer[9937]: {01} Log 2F5 updated in
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
Yes, it's normal, but it's not complete. Please show the full output
regardless of its size.
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
Run
/usr/local/dpsearch/sbin/indexer -ami -f /usr/local/dpsearch/etc/url1.txt -v4
2> err.log
then look in the err.log file.
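For reference, a small sketch of how shell redirection of this kind behaves,
assuming the command is meant to send the indexer's diagnostic output
(standard error) to a file with 2>. A relative name such as err.log is created
in the directory the command is run from. Here run_noisy is a hypothetical
stand-in for the indexer, and err_demo.log is a made-up file name.

```shell
# Hypothetical stand-in for the indexer: one line to stdout, one to stderr.
run_noisy() {
  echo "normal output"
  echo "debug output" >&2
}

# "2>" redirects only stderr; the file lands in the current directory.
run_noisy 2> err_demo.log

echo "log written to: $(pwd)/err_demo.log"
cat err_demo.log
```

Only the stderr line ends up in the file; the stdout line still goes to the
terminal.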
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
I did that. Where can I find the err.log file?
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
I figured this has something to do with my recent changes to the index.conf,
cached.conf, searchd.conf, and stored.conf. I have reversed some of them, but
it still produced some errors and
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
How long has it been stopped? Perhaps after a few hundred documents are indexed
it needs to flush the cached buffers, and this operation takes some time.
This isn't an error message, this is debug info, if
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
I also want to know how to index pages in the subdirectory of each url in the
url list.
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: Maxime
Subject: Re: one page url list
Do you have an appropriate Server/Realm/Subnet command defined for the URLs
you are importing?
All Server commands assume the path option by default, which means indexing all
files in the subdirectory.
So if you
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: betty
Subject: Re: one page url list
Where should I define server/realm/subnet command for the importing urls?
Inside the url.txt file or in the index.conf file?
I tried adding server site http://www.abc.com/page1.html; in the