Ok, I should not have assumed you were running some form of Unix.

If you have success with:

set TMPDIR=d//
./htdig -i -c ../conf/htdig.conf

Try

set DOC2HTML_LOG
set TMPDIR=d//
./htdig -i -c ../conf/htdig.conf

which should increase the information you get out of doc2html.pl.

Did you have to modify doc2html to get it to work under DOS?  If so, I would
like to know what changes you made so that they can go into the next release.

If you are using the C-shell then that would become:

setenv DOC2HTML_LOG
setenv TMPDIR d//
./htdig -i -c ../conf/htdig.conf

Whereas in the Bourne shell and (I think) Bash it would be:

DOC2HTML_LOG=""
TMPDIR=d//
export DOC2HTML_LOG TMPDIR
./htdig -i -c ../conf/htdig.conf
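
If you end up running this from a script, the Bourne shell version could be
wrapped up roughly as follows. This is only a sketch: the function names
setup_env and run_htdig, and the HTDIG and HTDIG_CONF variables, are my own
inventions for illustration, and the default paths are simply the ones from
the commands above.

```shell
#!/bin/sh
# Hypothetical wrapper (names are illustrative, not part of htdig).
# Defaults are assumptions copied from the commands earlier in this message.
HTDIG=${HTDIG:-./htdig}
HTDIG_CONF=${HTDIG_CONF:-../conf/htdig.conf}

setup_env() {
    # Export DOC2HTML_LOG with an empty value, and TMPDIR with the
    # directory given as the first argument (default: d//).
    DOC2HTML_LOG=""
    TMPDIR=${1:-d//}
    export DOC2HTML_LOG TMPDIR
}

run_htdig() {
    setup_env "$1"
    if [ -x "$HTDIG" ]; then
        # Same invocation as above: initial dig with the given config file.
        "$HTDIG" -i -c "$HTDIG_CONF"
    else
        echo "htdig not found at $HTDIG" >&2
        return 1
    fi
}
```

With those definitions loaded, calling "run_htdig d//" sets both variables
and then starts htdig, so they cannot be exported after the fact by mistake.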


--
David Adams
Computing Services
Southampton University


----- Original Message -----
From: "Marcus Valentine" <[EMAIL PROTECTED]>
To: "David Adams" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Tuesday, June 19, 2001 9:36 AM
Subject: Re: [htdig] doc2html.pl version 3 problems under windows NT


> At 09:10 19/06/01 +0100, David Adams wrote:
> >Doc2html.pl is not giving you any error messages, so it seems to be
> >working.
> >
> >Add
> >
> >DOC2HTML_LOG = ""
> >export DOC2HTML_LOG
> >
> >to the (Bourne shell) script that runs htdig and doc2html.pl will output a
> >line for every file indexed, which will include the number of bytes
> >extracted and sent back to htdig.
>
> Sorry - I'm not with you.  Presently I'm running htdig from the dos command
> line.  Are you saying I should create a script file and run it from within
> cygwin at the bash prompt? I tried
>
> DOC2HTML_LOG = ""
> ./htdig -i -c ../conf/htdig.conf
> export DOC2HTML_LOG
>
> but the "can't open file /tmp/htdext.???" problem recurs, which I fixed by
> running htdig from the dos command line and setting TMPDIR=d//
>
> Thanks
>
> >--
> >David Adams
> >Computing Services
> >Southampton University
> >
> >
> >----- Original Message -----
> >From: "Marcus Valentine" <[EMAIL PROTECTED]>
> >To: <[EMAIL PROTECTED]>
> >Sent: Monday, June 18, 2001 4:58 PM
> >Subject: Re: [htdig] doc2html.pl version 3 problems under windows NT
> >
> >
> >> At 14:11 14/06/01 -0500, Gilles Detillieux wrote:
> >> >According to Marcus Valentine:
> >> >> invoking doc2html.pl from htdig. When htdig spiders the site, for each
> >> >> pdf it comes across I get an error message like
> >> >>
> >> >> !!      Error: Couldn't open file '/cygdrive/d/htdext.326'
> >> >>
> >> >> This is to do with the temporary file used to pipe the output from
> >> >> doc2html.pl to htdig, yes?  I've tried various environment settings of
> >> >> tmp, tmpdir or whatever the hell it's trying to use (isn't there a
> >> >> similar issue with htmerge under NT, that thankfully I'm not suffering
> >> >> from) tinkering around with both at the dos prompt and the bash prompt
> >> >> to no avail. Can anyone shed some light on this?
> >> >
> >> >Both htdig and htmerge make use of the TMPDIR environment variable (note
> >> >the name is all caps).  That error message seems to be coming from
> >> >pdftotext, though, and not htdig or doc2html.pl.  That means that the
> >> >file is being created and htdig is calling doc2html.pl, which in turn
> >> >is calling pdftotext.
> >>
> >> All this cygdrive stuff was getting too complicated, as I had /cygwin/bin
> >> in my path. To simplify things, I took cygwin/bin out of my path and put
> >> cygwin1.dll into its own directory, with that directory in the path.
> >>
> >> Next I installed ActiveState Perl, as this appears to be the perl of
> >> choice of successful win32 htdig users. Then I set TMPDIR=d//
> >>
> >> Now htdig runs, with no errors when it encounters a pdf.  For example
> >>
> >> 15:15:1:http://marcusv_pc:8080/toracomm/pdf/DS012_Design_Services.pdf:
> >> size = 69129
> >>
> >> But when I run htmerge, I get for example
> >>
> >> Deleted, no excerpt:
> >> 15/http://marcusv_pc:8080/toracomm/pdf/DS012_Design_Services.pdf
> >>
> >> Is the pdf being indexed or not?  Anyone got any ideas?
> >>
> >> Marcus Valentine
>
>


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
