Thanks, Kir. I've checked the database and it's OK.

Here is some more information:

I'm running "index -N 20", although right now there are 27 index processes
on the system.

This is the process forest:

24287 ?        S      0:03      |   \_ index -N 20 -s 200
24288 ?        S      0:00      |       \_ index -N 20 -s 200
24289 ?        S      0:00      |       \_ index -N 20 -s 200
24290 ?        S      0:00      |       \_ index -N 20 -s 200
24291 ?        S      0:00      |       \_ index -N 20 -s 200
24292 ?        S      0:00      |       \_ index -N 20 -s 200
24295 ?        S      0:00      |       \_ index -N 20 -s 200
24296 ?        S      0:43      |           \_ index -N 20 -s 200
24297 ?        S      0:41      |           \_ index -N 20 -s 200
24298 ?        S      0:42      |           \_ index -N 20 -s 200
24299 ?        S      0:35      |           \_ index -N 20 -s 200
24300 ?        S      0:43      |           \_ index -N 20 -s 200
24301 ?        S      0:44      |           \_ index -N 20 -s 200
24302 ?        S      0:42      |           \_ index -N 20 -s 200
24303 ?        S      0:37      |           \_ index -N 20 -s 200
24304 ?        S      0:04      |           \_ index -N 20 -s 200
24305 ?        S      0:01      |           \_ index -N 20 -s 200
24306 ?        S      0:00      |           \_ index -N 20 -s 200
24307 ?        S      0:03      |           \_ index -N 20 -s 200
24308 ?        S      0:41      |           \_ index -N 20 -s 200
24309 ?        S      0:43      |           \_ index -N 20 -s 200
24310 ?        S      0:40      |           \_ index -N 20 -s 200
24311 ?        S      0:40      |           \_ index -N 20 -s 200
24312 ?        S      0:00      |           \_ index -N 20 -s 200
24313 ?        S      0:40      |           \_ index -N 20 -s 200
24314 ?        S      0:39      |           \_ index -N 20 -s 200
24315 ?        T      0:02      |           \_ index -N 20 -s 200
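
To double-check the count, here is a minimal sketch that groups the lines of
such a listing by the column where the "\_" marker appears. The function name
and the shortened sample are mine, not part of aspseek or ps.

```python
import re
from collections import Counter
from typing import List

def levels(forest: str) -> List[int]:
    """Return the number of processes at each nesting depth, outermost first."""
    counts: Counter = Counter()
    for line in forest.splitlines():
        marker = re.search(r"\\_ ", line)  # the "\_ " drawn by the forest view
        if marker is not None:
            counts[marker.start()] += 1    # deeper column = deeper nesting
    return [count for _, count in sorted(counts.items())]

# Shortened sample in the same layout as the listing above.
SAMPLE = r"""24287 ?        S      0:03      |   \_ index -N 20 -s 200
24288 ?        S      0:00      |       \_ index -N 20 -s 200
24295 ?        S      0:00      |       \_ index -N 20 -s 200
24296 ?        S      0:43      |           \_ index -N 20 -s 200
"""

print(levels(SAMPLE))
```

Applied to the full listing, this should give 1, 6 and 20 processes at the
three depths, which adds up to the 27 mentioned above.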

Using strace I found that the first-level children (24288, 24289, 24290,
24291, 24292) are all waiting for input on file descriptor 3; process
24295 is polling its children, and the others are running this loop:

nanosleep({1, 0}, NULL)                 = 0
time([1034347672])                      = 1034347672
gettimeofday({1034347672, 467966}, NULL) = 0
fcntl64(0x3, 0x4, 0x802, 0x4)           = 0
read(3, 0x454ed008, 249856)             = -1 EAGAIN (Resource temporarily unavailable)
fcntl64(0x3, 0x4, 0x2, 0x4)             = 0
write(3, "\311\0\0\0\3SELECT url_id,site_id,next_"..., 205) = 205
read(3, "\1\0\0\1", 4)                  = 4
read(3, "\3", 1)                        = 1
read(3, "\31\0\0\2", 4)                 = 4
read(3, "\7urlword\6url_id\3\v\0\0\1\3\3\3B\0", 25) = 25
read(3, "\32\0\0\3", 4)                 = 4
read(3, "\7urlword\7site_id\3\v\0\0\1\3\3\1@\0", 26) = 26
read(3, "\"\0\0\4", 4)                  = 4
read(3, "\7urlword\17next_index_time\3\v\0\0\1\3\3\t"..., 34) = 34
read(3, "\1\0\0\5", 4)                  = 4
read(3, "\376", 1)                      = 1
read(3, "\1\0\0\6", 4)                  = 4
read(3, "\376", 1)                      = 1
gettimeofday({1034347672, 469338}, NULL) = 0
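
Assuming file descriptor 3 is the MySQL connection (the write() carries a
SELECT), each 4-byte read looks like a MySQL packet header: a 3-byte
little-endian payload length followed by a 1-byte sequence number. The sketch
below decodes the headers copied from the trace; the function name is mine.

```python
from typing import Tuple

def decode_header(header: bytes) -> Tuple[int, int]:
    """Split a 4-byte packet header into (payload_length, sequence_id)."""
    if len(header) != 4:
        raise ValueError("a packet header is exactly 4 bytes")
    length = int.from_bytes(header[:3], "little")
    return length, header[3]

# Headers from the trace, with strace's octal escapes rewritten as hex.
HEADERS = [
    b"\xc9\x00\x00\x00",  # start of the 205-byte write (the SELECT)
    b"\x01\x00\x00\x01",  # followed by a 1-byte read: "\3"
    b"\x19\x00\x00\x02",  # followed by a 25-byte read (url_id)
    b"\x1a\x00\x00\x03",  # followed by a 26-byte read (site_id)
    b"\x22\x00\x00\x04",  # followed by a 34-byte read (next_index_time)
    b"\x01\x00\x00\x05",  # followed by a 1-byte read: "\376"
    b"\x01\x00\x00\x06",  # followed by a 1-byte read: "\376"
]

for header in HEADERS:
    print(decode_header(header))
```

The decoded lengths (201, 1, 25, 26, 34, 1, 1) match the sizes of the calls
that follow each header, and 4 + 201 gives the 205 bytes written. If the two
"\376" payloads are end-of-data markers, as I understand the protocol, then
the server answers with three column definitions and no rows, so the
connection itself is responding but the query finds nothing to index.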


Looking at logs.txt, I found that index ran OK for 9 hours and then
started to log "Not available!" on every line, with a few exceptions.

Now I can connect to the database and even run "index -S". Searchd is
also working. Only the indexer seems to be hung.


When I tried to stop the process with "index -E", it hung while acquiring an
exclusive lock on the pid file:

open("/usr/local/aspseek/var/aspseek/pid", O_RDONLY) = 4
flock(4, LOCK_EX|LOCK_NB)               = -1 EAGAIN (Resource temporarily unavailable)
read(4, "\337^\0\0", 4)                 = 4
kill(24287, SIGTERM)                    = 0
flock(4, LOCK_EX <unfinished ...>
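
The same sequence can be reproduced in a small sketch, assuming the pid file
holds the pid as a 4-byte little-endian integer ("\337^\0\0" decodes to 24287
that way, which matches the kill() call). The temporary file and the helper
names are stand-ins of mine, not the real aspseek code.

```python
import fcntl
import os
import struct
import tempfile

def try_exclusive_lock(fd: int) -> bool:
    """Try LOCK_EX|LOCK_NB; return False instead of blocking if the lock is held."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:  # EAGAIN, as in the trace
        return False

def read_pid(fd: int) -> int:
    """Decode a pid stored as a 4-byte little-endian integer (an assumption)."""
    return struct.unpack("<i", os.pread(fd, 4, 0))[0]

# Hypothetical stand-in for /usr/local/aspseek/var/aspseek/pid.
with tempfile.NamedTemporaryFile() as pidfile:
    pidfile.write(struct.pack("<i", 24287))
    pidfile.flush()
    holder = os.open(pidfile.name, os.O_RDONLY)   # plays the running index
    stopper = os.open(pidfile.name, os.O_RDONLY)  # plays "index -E"
    fcntl.flock(holder, fcntl.LOCK_EX)
    blocked = not try_exclusive_lock(stopper)     # lock is held elsewhere
    pid = read_pid(stopper)                       # whom to send SIGTERM to
    os.close(holder)                              # holder goes away, lock is released
    acquired = try_exclusive_lock(stopper)
    os.close(stopper)

print(blocked, pid, acquired)
```

In the sketch the second attempt succeeds only once the holder's descriptor is
closed. In my case the blocking flock() never returns, which suggests that
some process still holds the lock after the SIGTERM.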

Also, the parent process 24287 seems to be in an uninterruptible state,
because it won't die (and I can't see what it's doing, because strace
can't attach to it).


Then I tried the following to stop it:

1.> fuser /usr/local/aspseek/var/aspseek/pid
/usr/local/aspseek/var/aspseek/pid: 24287 24295 24296 24315
2.> kill 24315 24296 24295
3.> index -E

And that finally stopped index.
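
For anyone who wants to script step 1, here is a rough equivalent of the
fuser call, assuming a Linux /proc filesystem. The function name is mine, and
it only lists the pids; it does not send any signal.

```python
import os
from typing import List

def holders(path: str) -> List[int]:
    """Return the pids that have `path` open, by scanning /proc/<pid>/fd."""
    target = os.path.realpath(path)
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we may not inspect it
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    found.append(int(entry))
                    break
            except OSError:
                continue  # descriptor was closed while we were looking
    return sorted(found)

if __name__ == "__main__":
    print(holders("/usr/local/aspseek/var/aspseek/pid"))
```

Without root it only sees processes owned by the same user, so it should be
run as the user that started index.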


Do you have any idea what may be going wrong?


Thanks.
Ernesto.

On Thu, 10 Oct 2002, Kir Kolyshkin wrote:

> Check MySQL databases integrity using 'myisamchk' utility.
> 
> Ing. Ernesto Rapetti wrote:
> > I've been running aspseek 1.2.10 on RedHat 7.3 for a few weeks, and it
> > was working fine until now, when the index process halts after several
> > hours.
> > 
> > The database now has a total of 248400 documents. On Monday I saw the
> > error "MySQL server has gone away" reported on this list and set the
> > max_allowed_packet=10M variable in my.cnf as John advised.
> > 
> > In the new run index worked fine for 9 hours, and then began to report
> > the following in logs.txt:
> > 
> > Got next      1 URLs for:   0.001 seconds. Queued docs:   630.Time
> > 1034163064-1034163065.Not available!
> > 
> > 
> > What is not available?
> > Running strace, I noticed this line:
> > 
> > read(3, 0x984c8b8, 81920)               = -1 EAGAIN (Resource temporarily unavailable)
> > 
> > So maybe the connection to the database is dead.
> > 
> > Running "mysqladmin processlist" shows a connection that has been idle
> > since index started reporting the error, so on the database side the
> > connection is not closed, but index can't fetch new documents.
> > 
> > I'll run "index -E" and start the process again, but I wonder whether
> > I'll have to repeat this every day.
> > 
> > Ernesto.
> > 
> 
> 
> -- 
> [EMAIL PROTECTED]  7551596@ICQ  [EMAIL PROTECTED]
> 
> Dream like you'll live forever...Love like you've never been hurt...
> Work like you don't need the money...and Dance like nobody is watching!
>         -- Satchel Paige
> 
