[htdig] ssl Patch for htdig

2000-11-15 Thread Michael Arndt

Hello,

I need an SSL version of htdig. In the archives I found a thread
about an SSL patch for htdig.
The only patch file I found is:
ftp://ftp.ccsf.org/htdig-patches/3.1.5/ssl.1

and that does not apply against a "clean htdig".
The only option would be applying all the patches manually.

Has anyone out there done this already?
Or can someone point me to a patch that applies against
a clean htdig, or send me patched sources?

merci
Micha


To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.
List archives:  http://www.htdig.org/mail/menu.html
FAQ:            http://www.htdig.org/FAQ.html




Re: [htdig] Additional variables for htsearch

2000-11-15 Thread Oliver Hoogvliet

Geoff Hutchison wrote:
 
 At 7:32 AM -0500 11/14/00, [EMAIL PROTECTED] wrote:
 I have a similar interest.  Appears that any environment variable, if set
 prior to invoking htsearch, is retained and can be used in output templates.
 
 Yes, all environment variables (i.e. CGI environment variables) are
 available to htsearch.
 
 At 12:16 PM +0100 11/14/00, Oliver Hoogvliet wrote:
 as I saw in the htsearch reference there are several standard variables
 to use in template files for htsearch. I would like to use more
 entry-specific variables that can be parsed by htsearch. Are there any
 options or possibilities for htdig and htsearch to save additional
 information which can be read out later with self-defined variables?
 
 It's probably a bit more than you want, but you can also use the
 allow_in_form attribute with your user-defined variable names. (The
 default being set in the conf file).
 
 http://www.htdig.org/attrs.html#allow_in_form
 
 Cheers,
 
Hello again,
thanks for the first information. Unfortunately I couldn't use it for my
problem. I will use an example:

The following lines show my "long.html" file:
__
1  <P><TABLE WIDTH="479" CELLPADDING="0" CELLSPACING="0" BORDER="0">
2  <TR>
3  <TD><FONT FACE="arial, helvetica,geneva" SIZE="-1"><B>$(TITLE)</B></FONT>
4  <BR><FONT FACE="arial, helvetica,geneva" SIZE="-4">$(MODIFIED) ???</FONT><BR>
5  </TD></TR>
6  <TR><TD><FONT SIZE="-1" FACE="arial, helvetica,geneva">$(EXCERPT)
7  <P align="right">Der vollst&auml;ndige <A HREF="$(URL)$(ANCHOR)">Artikel</A>
8  </FONT>
9  </TD></TR>
10 </TABLE>
_
At line 4 you can see three ???. At this place I would like to have one
of two alternative messages: (a) News  (b) Stories. 

Is there any way to get these alternatives into htdig by using
a META tag? Can this additional information then be used by htsearch
to produce output (i.e. with a user-defined variable)?

If not, is there any alternative solution for that?

For your assistance I am very grateful.

Oliver Hoogvliet






[htdig] bibtex format

2000-11-15 Thread campbel

Hello,

Just wondering if there is an external parser in HtDig for the bibtex
format (bib).

Thanks,
Sheri






Re: [htdig] ssl Patch for htdig

2000-11-15 Thread Joe R. Jah

On Wed, 15 Nov 2000, Michael Arndt wrote:

 Date: Wed, 15 Nov 2000 13:56:05 +0100
 From: Michael Arndt [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: [htdig] ssl Patch for htdig
 
 Hello,
 
 I need an SSL version of htdig. In the archives I found a thread
 about an SSL patch for htdig.
 The only patch file I found is:
 ftp://ftp.ccsf.org/htdig-patches/3.1.5/ssl.1
 
 and that does not apply against a "clean htdig".
 The only option would be applying all the patches manually.
 
 Has anyone out there done this already?
 Or can someone point me to a patch that applies against
 a clean htdig, or send me patched sources?

It was reported that the older patch:

ftp://ftp.ccsf.org/htdig-patches/3.1.5/0ld/ssl.0

applies to a "clean htdig-3.1.5" with the -l switch: 

cd /path/to/htdig-3.1.5/
patch -p1 -l < /path/to/ssl.0

Regards,

Joe
-- 
 _/   _/_/_/   _/  __o
 _/   _/   _/  _/ __ _-\,_
 _/  _/   _/_/_/   _/  _/ ..(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah[EMAIL PROTECTED]







Re: [htdig] ssl Patch for htdig

2000-11-15 Thread Joe R. Jah

On Wed, 15 Nov 2000, Joe R. Jah wrote:

 Date: Wed, 15 Nov 2000 10:26:20 -0800 (PST)
 From: "Joe R. Jah" [EMAIL PROTECTED]
 To: Michael Arndt [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]
 Subject: Re: [htdig] ssl Patch for htdig
 
 On Wed, 15 Nov 2000, Michael Arndt wrote:
 
  Date: Wed, 15 Nov 2000 13:56:05 +0100
  From: Michael Arndt [EMAIL PROTECTED]
  To: [EMAIL PROTECTED]
  Subject: [htdig] ssl Patch for htdig
  
  Hello,
  
  I need an SSL version of htdig. In the archives I found a thread
  about an SSL patch for htdig.
  The only patch file I found is:
  ftp://ftp.ccsf.org/htdig-patches/3.1.5/ssl.1
  
  and that does not apply against a "clean htdig".
  The only option would be applying all the patches manually.
  
  Has anyone out there done this already?
  Or can someone point me to a patch that applies against
  a clean htdig, or send me patched sources?
 
 It was reported that the older patch:
 
   ftp://ftp.ccsf.org/htdig-patches/3.1.5/0ld/ssl.0
 
 applies to a "clean htdig-3.1.5" with the -l switch: 
 
   cd /path/to/htdig-3.1.5/
   patch -p1 -l < /path/to/ssl.0

OK, I downloaded htdig-3.1.5.tar.gz; my htdig has been patched and
re-patched;)  I tested both versions of the patch, and found that ssl.1
does not apply, but ssl.0, the old patch, applies with the -l switch.  I added
the following lines to the beginning of the patch and placed it in the
archives as: 

ftp://ftp.ccsf.org/htdig-patches/3.1.5/ssl.2
_
Tabs in this patch have been converted to spaces;(  In order to apply the
patch to a clean htdig-3.1.5 please use the -l switch: 

gunzip -c htdig-3.1.5.tar.gz | tar xf -
cd htdig-3.1.5
patch -p1 -l < /path/to/ssl.2
_


Regards,

Joe
-- 
 _/   _/_/_/   _/  __o
 _/   _/   _/  _/ __ _-\,_
 _/  _/   _/_/_/   _/  _/ ..(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah[EMAIL PROTECTED]







Re: [htdig] ssl Patch for htdig

2000-11-15 Thread Joshua Gerth


Speaking of ssl patches.  I also downloaded 3.1.5 and patched it with the
ssl.0 patch and the -l flag.  However, I then ran into the additional
problem that urls of the form:
https://myserver.com

were being directed to port 80, and that only urls of the form:
https://myserver.com:433

were actually going to the encrypted port.  So I hacked my copy so that
any url which starts with 
https

goes to port 433 by default but 'http' still goes to 80 by default.  Of
course, both can still be overridden by using the :port on the url.

Did anyone else hit this?  Would this patch be useful to anyone?  If so
I'll try to post it assuming I have the rights to do so.

Joshua

 OK, I downloaded htdig-3.1.5.tar.gz; my htdig has been patched and
 re-patched;)  I tested both versions of the patch, and found that ssl.1
 does not apply, but ssl.0, the old patch, applies with the -l switch.  I added
 the following lines to the beginning of the patch and placed it in the
 archives as: 
 
   ftp://ftp.ccsf.org/htdig-patches/3.1.5/ssl.2
 _
 Tabs in this patch have been converted to spaces;(  In order to apply the
 patch to a clean htdig-3.1.5 please use the -l switch: 
 
 gunzip -c htdig-3.1.5.tar.gz | tar xf -
 cd htdig-3.1.5
 patch -p1 -l < /path/to/ssl.2
 _







Re: [htdig] 3.2b2 -- include:, config_dir

2000-11-15 Thread Doug Barton

[EMAIL PROTECTED] wrote:
 
 Documentation of these is less than entirely clear.
 
 As to "include:", what is the implied path:

What do you mean by "implied path?"  

I tried using an include: statement in a htdig.conf file that I'm
developing. I first tried "include: $config_dir/includefile," which worked
with rundig, but not with htsearch because htsearch couldn't seem to
determine the value of $config_dir. I don't remember if I tried it with an
absolute path to the include or not...


Doug
-- 
 Any sufficiently advanced technology is indistinguishable from magic.
 -- Arthur C. Clarke







Re: [htdig] 3.20/b2 -- SCORE Variable

2000-11-15 Thread Gilles Detillieux

According to [EMAIL PROTECTED]:
 I've been trying to experiment with the various weighting factors; by 
 specifying xxx_FACTOR values in the conf file.  
 
 Appears that, whatever values I use, long-format displays come out with a 
 single star.  
 
 Closely related, is there a simple way to actually see the value of $(SCORE) 
 and/or $(PERCENT) which was assigned to a given listing?  Documentation 
 implies that I'd have to alter the value of builtin-long; seems like there 
 ought to be a simpler way which doesn't require so many config changes.  
 
 (Or maybe what I'm really saying is that the "sample" config ought to include 
 actual specification of almost all parameters; possibly with 
 less-frequently-altered ones in a separate section).

There are bugs in the score calculation in 3.2.0b1 & 3.2.0b2, which are
fixed in the latest 3.2.0b3 development snapshot.  I suggest you give
that a try.  As for how the score is calculated, that was discussed at
length a few months ago (in August, I think), so you may want to search
the mailing list archives.

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






Re: [htdig] ssl Patch for htdig

2000-11-15 Thread Joshua Gerth


 errr make that 443

Joshua
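
For readers who want the rule spelled out, the behaviour described in this
thread (https defaults to 443 once the typo is corrected, http defaults to 80,
and an explicit :port always wins) can be sketched in a few lines of Python.
This is only an illustration of the rule, not the actual change made to the
htdig sources, which was not posted in this thread; the names effective_port
and DEFAULT_PORTS are made up for the example.

```python
from urllib.parse import urlsplit

# Default port per URL scheme (assumed table for this sketch).
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a crawler should connect to for the given URL."""
    parts = urlsplit(url)
    # An explicit ":port" in the URL overrides the scheme default.
    if parts.port is not None:
        return parts.port
    try:
        return DEFAULT_PORTS[parts.scheme]
    except KeyError:
        raise ValueError("no default port known for scheme %r" % parts.scheme)


if __name__ == "__main__":
    for url in ("http://myserver.com",
                "https://myserver.com",
                "https://myserver.com:8443"):
        print(url, effective_port(url))
```

The point of the table lookup is that the default is chosen from the scheme
alone, so a bare https URL never falls through to port 80.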

 Speaking of ssl patches.  I also downloaded 3.1.5 and patched it with the
 ssl.0 patch and the -l flag.  However, I then ran into the additional
 problem that urls of the form:
   https://myserver.com
 
 were being directed to port 80, and that only urls of the form:
   https://myserver.com:433
 
 were actually going to the encrypted port.  So I hacked my copy so that
 any url which starts with 
   https
 
 goes to port 433 by default but 'http' still goes to 80 by default.  Of
 course, both can still be overridden by using the :port on the url.
 
 Did anyone else hit this?  Would this patch be useful to anyone?  If so
 I'll try to post it assuming I have the rights to do so.
 
 Joshua
 
  OK, I downloaded htdig-3.1.5.tar.gz; my htdig has been patched and
  re-patched;)  I tested both versions of the patch, and found that ssl.1
  does not apply, but ssl.0, the old patch, applies with the -l switch.  I added
  the following lines to the beginning of the patch and placed it in the
  archives as: 
  
  ftp://ftp.ccsf.org/htdig-patches/3.1.5/ssl.2
  _
  Tabs in this patch have been converted to spaces;(  In order to apply the
  patch to a clean htdig-3.1.5 please use the -l switch: 
  
  gunzip -c htdig-3.1.5.tar.gz | tar xf -
  cd htdig-3.1.5
  patch -p1 -l < /path/to/ssl.2
  _
 
 
 
 
 







Re: [htdig] different search results

2000-11-15 Thread Gilles Detillieux

According to gkalter:
 Hope this mailing-list is the right one..;-)
 
 Today I got htdig to work pretty well on a site containing many
 PDF-Files.
 
 • Cobalt Raq2 microserver (mips) with RedHat based Linux
 
 After updating the C++ compiler (see mailing list) I got rid of the
 segmentation error messages and htdig worked well.
 
 Cryptic outputs of the search form were solved by adding a ".cgi"
 extension to htsearch
 in the local cgi-bin folder. Solution also found in the list - thanks to
 all those helpful people!

I think the FAQ also has some pointers on getting the CGI to work.

 Because I wanted to get direct links to single PDF Pages out of the
 found excerpts I got
 the pdftodig.py script for external parsing of PDF-Files. (Do I have to
 mention that python
 IS NOT installed on Cobalt Raqs?) O.K. this problem could also be
 solved.

It would also be a fairly trivial change to the perl scripts conv_doc.pl
or doc2html.pl to make it replace form feeds in pdftotext output with
the correct HTML <a name="..."> tags for the anchors.  You'd then be
using an external converter, rather than an external parser, and possibly
avoiding parser-related problems.
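
As a rough illustration of that idea (in Python rather than in the Perl of
conv_doc.pl or doc2html.pl, and not taken from either script), the sketch
below splits pdftotext-style output at form feeds and puts a named anchor in
front of each page. The anchor naming scheme "pageN" and the function name
pages_to_html are assumptions made for the example; use whatever anchor names
your links are expected to point at.

```python
import html


def pages_to_html(text):
    """Turn form-feed separated page text into HTML with one anchor per page."""
    pages = text.split("\f")
    # Text that ends with a form feed leaves an empty trailing chunk; drop it.
    if pages and pages[-1].strip() == "":
        pages.pop()
    out = ["<html><body>"]
    for number, page in enumerate(pages, start=1):
        # Hypothetical anchor names: page1, page2, ...
        out.append('<a name="page%d"></a>' % number)
        # Escape the page text so stray < and & cannot break the markup.
        out.append("<pre>%s</pre>" % html.escape(page))
    out.append("</body></html>")
    return "\n".join(out)


if __name__ == "__main__":
    print(pages_to_html("first page\fsecond page\f"))
```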

 Now everything works pretty good with one little exception.
 
 Using a complete search string e.g. "Sensor" lists all matching
 documents and the text contains
 the search word (bold typeface) with a link to the specific single Page
 of the found PDF file.
 (Great!)
 
 Typing just a substring e.g. "Senso" in the search form seems to list
 same results. But unfortunately the links within
 the texts are gone.

Sounds like one of two problems:

1) the maximum_word_length setting is too low, so you're getting truncated
words in the database causing false matches which aren't found in the
excerpt.

2) the pdftodig.py script is somehow truncating the words for the word
records, or otherwise generating word records that don't match the words
in the header record it puts out.  Try running it manually on one of the
PDFs where you had problems with false matches, and see what it puts out
both in the "h" record and in the "w" records, to see if there are any
discrepancies.

Generally, entering just a substring in the search form isn't enough
to get a match, unless you're using the prefix or substring fuzzy match
algorithms.  However, the fuzzy match algorithms generate an expanded list
of matches so that all matched words should be highlighted.  It seems
to me that somehow you're getting substrings in your word database,
which is the cause of the problem.

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






Re: [htdig] ssl Patch for htdig

2000-11-15 Thread Joe R. Jah

On Wed, 15 Nov 2000, Joshua Gerth wrote:

 Date: Wed, 15 Nov 2000 13:38:29 -0800 (PST)
 From: Joshua Gerth [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: Re: [htdig] ssl Patch for htdig
 
 
 Speaking of ssl patches.  I also downloaded 3.1.5 and patched it with the
 ssl.0 patch and the -l flag.  However, I then ran into the additional
 problem that urls of the form:
   https://myserver.com
 
 were being directed to port 80, and that only urls of the form:
   https://myserver.com:433
 
 were actually going to the encrypted port.  So I hacked my copy so that
 any url which starts with 
   https
 
 goes to port 433 by default but 'http' still goes to 80 by default.  Of
 course, both can still be overridden by using the :port on the url.
 
 Did anyone else hit this?  Would this patch be useful to anyone?  If so
 I'll try to post it assuming I have the rights to do so.

I only tested to see if the patch applies to a clean 3.1.5.  I am sure,
however, that your patch will be useful to someone;)  Go ahead and post it
to the list, or just upload it to: 

ftp://ftp.ccsf.org/incoming/

P.S.  It would be nice to document your patch; save potential users the
  guesswork and digging up relevant information in the list archives;)

Regards,

Joe
-- 
 _/   _/_/_/   _/  __o
 _/   _/   _/  _/ __ _-\,_
 _/  _/   _/_/_/   _/  _/ ..(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah[EMAIL PROTECTED]







Re: [htdig] How to exclude part of a html page ?

2000-11-15 Thread Gilles Detillieux

According to Davis, Iain E.:
  -Original Message-
  From: Geoff Hutchison [mailto:[EMAIL PROTECTED]]
  Sent: Monday, November 13, 2000 8:34 AM
  To: raphael hoffner
  Cc: [EMAIL PROTECTED]
  Subject: Re: [htdig] How to exclude part of a html page ?
  
  
  At 11:38 AM +0100 11/13/00, raphael hoffner wrote:
  My HTML page is in two parts: the navigation (links within the site) and 
  the content, each in its own table.
  I am looking for a way to exclude the table that contains the navigation.
  
  See http://www.htdig.org/attrs.html#noindex_start
 
 At first glance, I thought this would solve my difficulty with indexing the
 navigation menu on my site as well.  However, this causes it to not follow
 links in the indicated area as well as not indexing the text so it indexed
 only the first page and stopped...  Not really what I wanted! (on the
 upside, the database was smaller! ;)
 
 I guess I could achieve what I want by doing something like:
 
 <a href="..."><!--htdig_noindex-->PageName<!--/htdig_noindex--></a> for
 each link on the navigation bar...but that seems rather clunky to me. :)
 
 Is there a way to not index the text, but still have it follow the links?

Yes, but only on a whole-file basis, as far as I know.  See
http://www.htdig.org/FAQ.html#q4.15

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






Re: [htdig] not showing child frames?

2000-11-15 Thread Gilles Detillieux

According to Oguz Altun:
  When searching pages that have frames, Htdig returns both the main page
 and
  the child frame. How can I get rid of that lonely frame? ;)
 
  Well you can exclude certain patterns using the restrict and exclude
  fields in the search form. Or you could add META robots tags to the
  frames that you don't want, e.g.:
 
  <META name="ROBOTS" content="NOINDEX, NOFOLLOW">
 
 Well, I want framed pages to be indexed and returned, but just in their
 respective parent windows.

This is the big problem with frames.  Unless you define each page to
have its own unique frameset, there's no way to tell the browser with
just one URL to go back to a particular frameset and page combination.
If you do define each page as its own unique frameset, and you're 100%
consistent in the naming convention you use for this, then it would
probably be pretty easy to map the target page URLs back to their
corresponding frameset URLs.  However, I have yet to see a frames-based
web site where this would be possible.
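
To make the "consistent naming convention" idea concrete, here is a minimal
Python sketch of such a mapping. It assumes a purely hypothetical convention
in which the content page for a frameset foo.html is always named
foo_body.html; the suffix, the function name frameset_url and the example
host are all invented for illustration, and a real site would need its own
rule.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def frameset_url(content_url, suffix="_body"):
    """Map a content-page URL back to its frameset URL.

    Assumed convention: ".../name_body.html" lives in the frameset
    ".../name.html". URLs that do not follow it are returned unchanged.
    """
    parts = urlsplit(content_url)
    directory, filename = posixpath.split(parts.path)
    stem, ext = posixpath.splitext(filename)
    if not stem.endswith(suffix):
        return content_url
    new_path = posixpath.join(directory, stem[:-len(suffix)] + ext)
    return urlunsplit((parts.scheme, parts.netloc, new_path,
                       parts.query, parts.fragment))


if __name__ == "__main__":
    print(frameset_url("http://www.example.com/docs/intro_body.html"))
```

A mapping like this only works if every page follows the convention, which is
exactly the consistency requirement stated above.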

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






Re: [htdig] 3.2b2 -- include:, config_dir

2000-11-15 Thread Geoff Hutchison

On Wed, 15 Nov 2000, Doug Barton wrote:

   I tried using an include: statement in a htdig.conf file that I'm
 developing. I first tried "include: $config_dir/includefile," which worked
 with rundig, but not with htsearch because htsearch couldn't seem to
 determine the value of $config_dir. I don't remember if I tried it with an
 absolute path to the include or not...

There was a bug in the 3.2.0b1 and 3.2.0b2 releases as far as the include:
function. AFAIK, it is fixed in the latest snapshots.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/








Re: [htdig] not showing child frames?

2000-11-15 Thread Geoff Hutchison

On Wed, 15 Nov 2000, Gilles Detillieux wrote:

 If you do define each page as its own unique frameset, and you're 100%
 consistent in the naming convention you use for this, then it would
 probably be pretty easy to map the target page URLs back to their
 corresponding frameset URLs.  However, I have yet to see a frames-based
 web site where this would be possible.

Someone once told me that you could do this with JavaScript. (I guess it
detected that it was the only frame and assembled the frameset.)

I have not seen a working example of this. Perhaps some JavaScript gurus
out there could provide one.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/







Re: [htdig] 3.2b2 -- include:, config_dir

2000-11-15 Thread Geoff Hutchison

On Wed, 15 Nov 2000 [EMAIL PROTECTED] wrote:

 As to "include:", what is the implied path:
 A.  That of the conf file in which it appears?, or 

A is the correct behavior. As I mentioned in another message, there were
bugs in the config parser in 3.2.0b1 and 3.2.0b2, which AFAIK are fixed in
the current development snapshots. (Betas are for discovering bugs, after
all.)

 As to "config_dir", exactly when does the specification become effective?  
 Real ambiguity arises when the same .conf file includes both a "config_dir:" 
 statement, and one or more "include:" statements.

Yup. I'm not sure what good a config_dir assignment would do. For the
include files, it takes paths that are either absolute or relative to the
current file. For the command-line programs, you need to use the -c flag, and
for htsearch you have CONFIG_DIR assigned at compile time for security reasons.

If you think about it, specifying config_dir in a config file is
nonsensical. (What, are you saying that the config file that should be
read is one with the same name in a different directory?)
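
The intended lookup rule for "include:" (a path relative to the file that
contains the include, unless the path is absolute) can be sketched as below.
This is an illustration of the rule only, not the code of the htdig config
parser; the function name resolve_include and the example paths are
assumptions.

```python
import posixpath


def resolve_include(conf_path, include_path):
    """Resolve an include path against the config file that names it."""
    base_dir = posixpath.dirname(conf_path)
    # posixpath.join discards base_dir when include_path is absolute,
    # so absolute includes pass through untouched.
    return posixpath.normpath(posixpath.join(base_dir, include_path))


if __name__ == "__main__":
    conf = "/etc/htdig/htdig.conf"
    for inc in ("common.conf", "../shared/common.conf", "/opt/htdig/common.conf"):
        print(inc, "->", resolve_include(conf, inc))
```

Because the base directory comes from the including file itself, no separate
config_dir setting is needed to locate includes.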

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/







Re: [htdig] Multiple hits on the same page 'and' problem

2000-11-15 Thread Gilles Detillieux

According to Dhammika Gunawardena:
 I have 2 problems.
 
 This is an example of my first problem.
 
 I have the following contents on the same page:
 
 dog
 some text about dog
 
 
 good dog
 some text about good dog
 
 
 Now if I search for dog, only the first part appears in the search result.
 The results do not show that this page contains something about "good dog".
 I want my search results to show the same URL twice, one for "dog" and one
 for "good dog".

I think what you want is something similar to what Jim Cole implemented in
his multiple excerpt patch:

   ftp://ftp.ccsf.org/htdig-patches/3.1.5/Excerpts.0

The possible difference is his patch shows all matches in the excerpt, but
only shows the URL once, if I recall correctly.


 My second problem is related to 'and'
 Suppose my page has something like this:
 
 man - human
 
 good man - good human
 
 I search for "good man" , my results show the first part. I want it to
 display all matches.
 
 Please tell me how to configure htdig for these.

Well, if you're talking about phrase matching, i.e. only highlighting the
exact phrase "good man" and not the individual words, that's coming in
3.2.  See http://www.htdig.org/FAQ.html#q1.9

If you're not talking about phrase matching, then this is the same
problem as the first one, and therefore the same solution should work.

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






Re: [htdig] bibtex format

2000-11-15 Thread Geoff Hutchison

On Wed, 15 Nov 2000 [EMAIL PROTECTED] wrote:

 Just wondering if there is an external parser in HtDig for the bibtex
 format (bib).

Not that I know of, but since it's text it wouldn't be hard to write one.
(Just filter out parts you don't want, if any.)

What sorts of things would you want to index? The whole thing?

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/







Re: [htdig] bibtex format

2000-11-15 Thread Gilles Detillieux

According to [EMAIL PROTECTED]:
 Just wondering if there is an external parser in HtDig for the bibtex
 format (bib).

I don't know, but if there's a utility that produces plain text or HTML
output from bibtex files, it could easily be spliced into the doc2html
external converter script.  See http://www.htdig.org/files/contrib/parsers/
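
A filter of the kind suggested in this thread could look roughly like the
Python sketch below, which pulls a few fields out of a BibTeX file and prints
them as plain text. It is deliberately simplistic: it only handles fields
whose value sits on a single line, and the choice of fields to keep is an
assumption. It does not produce the external parser record format, so it
would be used as a converter to plain text rather than as a parser.

```python
import re

# Matches one-line fields such as:  title = {Some Title},  or  year = "1984",
FIELD_RE = re.compile(r'(\w+)\s*=\s*(?:\{(.*?)\}|"(.*?)")\s*,?\s*$',
                      re.MULTILINE)


def bibtex_to_text(source, keep=("author", "title", "abstract", "keywords")):
    """Return the values of the selected BibTeX fields, one per line."""
    lines = []
    for name, braced, quoted in FIELD_RE.findall(source):
        if name.lower() in keep:
            value = braced or quoted
            # Drop protective braces such as {TeX} inside a value.
            lines.append(value.replace("{", "").replace("}", ""))
    return "\n".join(lines)


if __name__ == "__main__":
    sample = """@article{knuth84,
  author = {Donald E. Knuth},
  title = {Literate Programming},
  year = {1984},
}"""
    print(bibtex_to_text(sample))
```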

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930






Re: [htdig] not showing child frames?

2000-11-15 Thread Doug Barton

Geoff Hutchison wrote:
 
 On Wed, 15 Nov 2000, Gilles Detillieux wrote:
 
  If you do define each page as its own unique frameset, and you're 100%
  consistent in the naming convention you use for this, then it would
  probably be pretty easy to map the target page URLs back to their
  corresponding frameset URLs.  However, I have yet to see a frames-based
  web site where this would be possible.
 
 Someone once told me that you could do this with JavaScript. (I guess it
 detected that it was the only frame and assembled the frameset.)
 
 I have not seen a working example of this. Perhaps some JavaScript gurus
 out there could provide one.

I am not a javascript guru, but I do play one on TV. :)

<HEAD>
...
<SCRIPT LANGUAGE="JavaScript">
<!-- comment out the script so old browsers skip it
if (window.name != "tocframe")
    location.href="index.html";
// -->
</SCRIPT>
</HEAD>

In this example, if someone clicks on a link in my search results page that
actually references my table of contents frame, it uses a fake 302
(location) header to cause that "page" to load up in the proper frameset by
going to the site's home, which then loads the frameset, loads the TOC,
etc. Obviously you should replace "tocframe" with the real name of the
frame that your page should be in. This works really well, as long as your
search form itself is not in a frameset. You can extend this of course with
a little imagination. 

A really good site for Javascript (and other) tips and tricks is
http://www.zdnet.com/devhead/, although I don't recall off hand where I
picked this one up. 

HTH,

Doug
-- 
 Any sufficiently advanced technology is indistinguishable from magic.
 -- Arthur C. Clarke







Re: [htdig] not showing child frames?

2000-11-15 Thread Duke Hillard

This is possible with JavaScript.  See the examples at
"http://www.louisiana.edu/Academic/Bulletin/UN/" and
"http://www.louisiana.edu/Academic/Bulletin/UN/".  The
application used to create the pages was Trellix 2.1 which
was bundled with WordPerfect Suite 2000.  Other versions
of Trellix exist.  This particular one was selected for its ability
to import individual pages of a long document and to transform
each page into a unique URI with full side-bar navigation.  Both
of the sites mentioned above are indexed and ready to be searched
thanks to ht://dig and Trellix.  Attempt to open the body of any page
in a new frame and JavaScript detects that the parent frameset is not
present.  JavaScript then fetches the unique parent frame, which loads
the page in the intended frameset.

HTH,

-- Duke Hillard, University Webmaster, UL Lafayette


Geoff Hutchison wrote:

 On Wed, 15 Nov 2000, Gilles Detillieux wrote:

  If you do define each page as its own unique frameset, and you're 100%
  consistent in the naming convention you use for this, then it would
  probably be pretty easy to map the target page URLs back to their
  corresponding frameset URLs.  However, I have yet to see a frames-based
  web site where this would be possible.

 Someone once told me that you could do this with JavaScript. (I guess it
 detected that it was the only frame and assembled the frameset.)

 I have not seen a working example of this. Perhaps some JavaScript gurus
 out there could provide one.

 --
 -Geoff Hutchison
 Williams Students Online
 http://wso.williams.edu/

 








Re: [htdig] not showing child frames?

2000-11-15 Thread Doug Barton

I just found a really good article on this topic at zdnet's aforementioned
site. http://www.zdnet.com/devhead/stories/articles/0,4413,2438662,00.html

Is this worth a FAQ entry? I could probably write something up

Enjoy,

Doug
PS, please don't hold the fact that it's a frontpage article against me. :)
-- 
 Any sufficiently advanced technology is indistinguishable from magic.
 -- Arthur C. Clarke







Re: [htdig] not showing child frames?

2000-11-15 Thread Geoff Hutchison

At 3:32 PM -0800 11/15/00, Doug Barton wrote:
> I just found a really good article on this topic at zdnet's aforementioned
> site. http://www.zdnet.com/devhead/stories/articles/0,4413,2438662,00.html
>
> Is this worth a FAQ entry? I could probably write something up

Sure. It's probably not as common a question as other parts of the FAQ,
but it's certainly a good thing to have there. One of these days the docs
should be restructured to have a "tips and tricks" section or HOWTO or
somesuch.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/






[htdig] Dummies Guide to Restricting searchs

2000-11-15 Thread Terry Collins

I'm after the dummies guide to restricting searches.

I have a number of email lists that I archive and make available on the
WWW with mhonarc.

I'm currently using htdig to allow searching of one list, but that is
invoked by 
http://www.woa.com.au/cgi-bin/htsearch as supplied.

I would like to expand the searching to a number of lists from one
database, but obviously someone chasing genealogical info doesn't want
Linux stuff, etc.

I understand from my reading that I can restrict the search to returning
only results from one list. i.e. http://www.woa.com.au/lists/list-name
by using exclude and restrict parameters when htsearch is called.

My problem is that I don't understand HOW to go about this?
Can anyone point me to some further info/how-to/etc?


 
--
   Terry Collins {:-)}}} Ph(02) 4627 2186 Fax(02) 4628 7861  
   email: [EMAIL PROTECTED]  www: http://www.woa.com.au  
   WOA Computer Services lan/wan, linux/unix, novell

 "People without trees are like fish without clean water"






Re: [htdig] Dummies Guide to Restricting searchs

2000-11-15 Thread Geoff Hutchison

At 12:14 PM +1100 11/16/00, Terry Collins wrote:
> I understand from my reading that I can restrict the search to returning
> only results from one list. i.e. http://www.woa.com.au/lists/list-name
> by using exclude and restrict parameters when htsearch is called.
>
> My problem is that I don't understand HOW to go about this?
> Can anyone point me to some further info/how-to/etc?

See http://www.htdig.org/hts_form.html

(esp. the "restrict" and "exclude" portions)

Cheers,

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/






Re: [htdig] Dummies Guide to Restricting searchs

2000-11-15 Thread Terry Collins

Geoff Hutchison wrote:
>
> At 12:14 PM +1100 11/16/00, Terry Collins wrote:
> > I understand from my reading that I can restrict the search to returning
> > only results from one list. i.e. http://www.woa.com.au/lists/list-name
> > by using exclude and restrict parameters when htsearch is called.
> >
> > My problem is that I don't understand HOW to go about this?
> > Can anyone point me to some further info/how-to/etc?
>
> See http://www.htdig.org/hts_form.html
>
> (esp. the "restrict" and "exclude" portions)

Thanks Geoff.

This is what I had read before posting, but the how-to-do-it eluded me
(my HTML is rather limited). Then, in the usual fashion of illumination
arriving as soon as you post to a list, I tried making the link read
http://www.woa.com.au/cgi-bin/htsearch?config=&restrict=lists/slug
which seems to work, and it carries the restriction forward for
successive searches. I say "seems" because at the moment it is still
indexing the other areas.

Is this automatic carry-forward of the restriction reliable?

However, I'd like to allow them to enter a search value for the first
search rather than return an error message first up.

I also looked at another site that had a form (something I've never
used before) and cut and pasted it into my page:

<form method="get" action="/cgi-bin/htsearch">
<p><input type="text" size="30" name="words" value="">
<input type="submit" value="Search"></p>
</form>

I changed action to read
action="/cgi-bin/htsearch?config=&restrict=/lists/slug" but the
restriction is dropped as soon as the form is submitted.

Is this the way to do it?
Can anyone tell me why the restriction is dropped?
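
One possible explanation, offered as an assumption rather than a confirmed diagnosis of this particular page: when a form uses method="get", the browser builds the query string from the form's own fields and discards any query string already present in the action URL. The sketch below imitates that behaviour. The function name submitGetForm and the search word "kernel" are made up for illustration, and the action URL is simplified to carry only the restrict parameter:

```javascript
// Sketch of what a browser does with <form method="get">: the query string
// in the action URL is thrown away and replaced by the form's fields.
function submitGetForm(action, fields) {
  const base = action.split("?")[0];          // query part of action is dropped
  const query = new URLSearchParams(fields);  // URL-encode the form fields
  return base + "?" + query.toString();
}

// Restriction given only in the action URL: it does not survive submission.
const lost = submitGetForm("/cgi-bin/htsearch?restrict=/lists/slug",
                           { words: "kernel" });

// Restriction given as a form field named "restrict": it is sent along.
const kept = submitGetForm("/cgi-bin/htsearch",
                           { restrict: "/lists/slug", words: "kernel" });

console.log(lost);
console.log(kept);
```

If that is what is happening here, the form equivalent would be to leave the action as /cgi-bin/htsearch and add an input of type "hidden" with name "restrict" and value "/lists/slug" next to the "words" field, which matches the "restrict" input described at http://www.htdig.org/hts_form.html.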


--
   Terry Collins {:-)}}} Ph(02) 4627 2186 Fax(02) 4628 7861  
   email: [EMAIL PROTECTED]  www: http://www.woa.com.au  
   WOA Computer Services lan/wan, linux/unix, novell

 "People without trees are like fish without clean water"

