Hi Andrew,
* Andrew M. Bishop <[EMAIL PROTECTED]> [10. Jan. 2004]:
> Gregor Zattler <[EMAIL PROTECTED]> writes:
>
> > I want to build a new full index of the WWWOFFLE cache:
> > /var/cache/wwwoffle/search/htdig/{tmp,db,db-lasttime} are empty.
> > /var/cache/wwwoffle/search/htdig/scripts/wwwoffle-htdig-full,
> > precisely "htdig -i -c search/htdig/conf/htdig-full.conf", then
> > hangs endlessly when indexing of *.pdf, *.doc etc. is enabled but
> > proceeds well if not.

OK, I disabled indexing of PDFs because I wanted at least an index of
the HTML files.  But htdig hangs in either case...

> > I assume there is a problem with a single specific *.pdf, *.doc
> > file or such.  I would like to remove it to have a full index.
> > How can I find out which files were opened last?
>
> When you are using the wwwoffle-htdig-full script to do the indexing
> of the WWWOFFLE cache with htdig there is a log file created.  The
> file is called search/htdig/wwwoffle-htdig.log in the WWWOFFLE cache
> directory.  This will tell you what the last file was that htdig
> looked at and also it should have all htdig error messages.


Thank you for this hint, but it didn't help.

I'm running Debian/Sid, so wwwoffle-htdig.log is in /var/log.
But the log file is empty:

# ls -Al /var/log/wwwoffle-htdig.log
-rw-r--r--    1 root     root            0 2004-01-18 09:15 /var/log/wwwoffle-htdig.log



My wwwoffle cache isn't that big:

# du -sm /var/cache/wwwoffle/http/
1215    /var/cache/wwwoffle/http


In recent months a wwwoffle-htdig-full run completed in about 4
hours.  Now it hangs indefinitely:

# ps fax|grep tty6
  561 tty6     S      0:02 -bash
  740 tty6     S      0:00  \_ /bin/sh search/htdig/scripts/wwwoffle-htdig-full
  741 tty6     R    1457:55      \_ htdig -i -c search/htdig/conf/htdig-full.conf



But the files in /var/cache/wwwoffle/search/htdig/db were last updated
more than 24 hours ago:

# ls -Al /var/cache/wwwoffle/search/htdig/db
total 378612
-rw-r--r--    1 root     root     179151872 2004-01-17 16:11 db.docdb
-rw-r--r--    1 root     root     208151269 2004-01-17 16:11 db.wordlist



This time I enabled indexing of *.pdf, *.doc and the like.  A few
days ago I ran wwwoffle-htdig-full with indexing of *.pdf and *.doc
disabled.  Same result: htdig hangs and the db files are no longer
updated...


Do you have any idea how to debug this?
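
The only approach I can think of myself is to look at which files the
hanging htdig process currently has open.  This is only a rough sketch,
not something from the WWWOFFLE or htdig documentation: it assumes a
Linux /proc filesystem, the helper name open_files is my own invention,
and 741 is simply the htdig PID from the ps output above.

```shell
#!/bin/sh
# Hypothetical helper: list the files a running process has open.
# Assumes Linux, where /proc/PID/fd holds one symlink per open descriptor.
open_files() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # Skip the unexpanded glob if the process does not exist.
        [ -L "$fd" ] || continue
        # Print "descriptor -> target" (file, pipe or socket).
        printf '%s -> %s\n' "${fd##*/}" "$(readlink "$fd")"
    done
}

# Example (run as root, since htdig runs as root here):
# open_files 741
```

If one of the entries points into /var/cache/wwwoffle/http, that might
be the cached file htdig is stuck on, but I cannot say whether htdig
keeps the file open while it loops.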



Thanx,

Gregor


P.S.:
It's not a problem of resources:

top - 18:09:13 up 2 days,  6:10,  9 users,  load average: 2.00, 2.00, 2.22
Tasks:  83 total,   8 running,  75 sleeping,   0 stopped,   0 zombie
Cpu(s):  99.9% user,   0.1% system,   0.0% nice,   0.0% idle
Mem:    515940k total,   505904k used,    10036k free,    80872k buffers
Swap:   875500k total,   181708k used,   693792k free,    84192k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  741 root      20   0 23116  732  676 R 49.9  0.1   1457:41 htdig
 1791 root      20   0  7560 7288  608 R 49.9  1.4  32:09.04 dpkg
15292 grfz      10   0  1048 1048  828 R  0.2  0.2   0:02.83 top
    1 root       8   0   476  444  420 S  0.0  0.1   0:05.02 init
    2 root       9   0     0    0    0 S  0.0  0.0   0:02.56 keventd

