Re: htdig: virtual hosts revisited

1998-12-15 Thread Walter Hafner
Webmaster writes: Geoff Hutchison writes: We have lots of links on our website and it's annoying to see duplicates in search results. But the problem with duplicate detection is deciding which duplicate to use! My current thought is to use the document with the lower hopcount.
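
A minimal sketch of the "keep the lower hopcount" idea discussed above. It is not htdig code: the function name, the tuple layout, the example URLs, and the use of an MD5 digest of the document body as the duplicate key are all assumptions made for illustration.

```python
import hashlib


def dedupe_by_checksum(docs):
    """Keep one document per distinct body, preferring the lowest hopcount.

    `docs` is an iterable of (url, hopcount, body_bytes) tuples.
    This layout is hypothetical and is not htdig's internal format.
    Returns a dict mapping checksum -> (url, hopcount).
    """
    best = {}
    for url, hopcount, body in docs:
        # Assumption: two documents are duplicates if their bodies hash equal.
        digest = hashlib.md5(body).hexdigest()
        current = best.get(digest)
        # On a tie in hopcount, the first document seen is kept.
        if current is None or hopcount < current[1]:
            best[digest] = (url, hopcount)
    return best


if __name__ == "__main__":
    sample = [
        ("http://www.example.com/index.html", 2, b"<html>home</html>"),
        ("http://www.example.com/", 0, b"<html>home</html>"),
        ("http://www.example.com/other.html", 1, b"<html>other</html>"),
    ]
    for url, hops in sorted(dedupe_by_checksum(sample).values()):
        print(hops, url)
```

Preferring the lower hopcount tends to keep the URL closest to the start of the crawl, which is usually the one a site links to most directly.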

Re: htdig: virtual hosts revisited

1998-12-15 Thread Walter Hafner
Geoff Hutchison writes: At 5:28 AM -0500 12/14/98, Walter Hafner wrote: 1) The lack of support for German umlauts (äöüß) My suggestion would be to look at the locale option. Oops, sorry. I stand corrected. Missed that one. 2) The somewhat limited queries. I think you'll have to

htdig: CORE file when using htsearch

1998-12-15 Thread Mike Dagan
Hi, I am using the latest htdig on a Linux machine. Every few minutes I find a core file in the cgi-bin directory. I used the "strings core" command to see if I could get any hints. Does it make any sense to anyone? CORE CORE htsearch htsearch CORE htsearch CORE #lx msize: %lu stack: %lu

Re: htdig: virtual hosts revisited

1998-12-15 Thread John Grohol PsyD
Geoff Hutchison writes: We have lots of links on our website and it's annoying to see duplicates in search results. But the problem with duplicate detection is deciding which duplicate to use! My current thought is to use the document with the lower hopcount. Walter Hafner replied: As I

Re: htdig: virtual hosts revisited

1998-12-15 Thread Geoff Hutchison
At 5:28 AM -0500 12/15/98, Walter Hafner wrote: I'd like to have real substring search and case sensitive search. And while I'm dreaming, a regexp subset would be nice. :-) There's already a substring search. As for case_sensitive, it's an idea, but currently all words are stored as lowercase,

Re: htdig: virtual hosts revisited

1998-12-15 Thread Geoff Hutchison
At 8:00 AM -0500 12/15/98, John Grohol PsyD wrote: How about a file_aliases option? For instance, on our server, the index.html file is nearly always a symbolic link to the actual file, which is named something different. If I could put "index.html" into a file_aliases option, I would solve a

Re: htdig: CORE file when using htsearch

1998-12-15 Thread Geoff Hutchison
At 7:39 AM -0500 12/15/98, Mike Dagan wrote: I am using the latest htdig on a Linux machine. Every few minutes I find a core file in the cgi-bin directory. I used the "strings core" command to see if I could get any hints. Does it make any sense to anyone? There are lots of lines like this:

Re: htdig: virtual hosts revisited

1998-12-15 Thread Geoff Hutchison
At 2:56 AM -0500 12/15/98, Walter Hafner wrote: And a second thought on document checksums: Quite often I see root documents more than 100 kb in size. Why not just compute a checksum of the header? An HTTP 1.1 'HEAD' command would be sufficient and imho this would save a lot of bandwidth and

htdig: Digging a site which requires a browser (client) certificate

1998-12-15 Thread Denis Bazinet
One of the sites I am attempting to dig is an extranet site which has two-factor authentication: You need a certificate and username/password. My options are: 1. Have HTDIG use a certificate (???) and the -u option for username and password. Problem is ... how do you make HTDIG present a

htdig: Using Templates

1998-12-15 Thread Vishal Shah
I am a beginner, trying to learn various ways of customizing htdig. How do I use customized templates to display the results? -- To unsubscribe from the htdig mailing list, send a message to [EMAIL PROTECTED] containing the

Re: htdig: Sorting results on date (2)

1998-12-15 Thread Gilles Detillieux
Yesterday, regarding my sort patch, I wrote: If you apply it to the 3.1.0b2 source, I can't promise it'll work, as I haven't tested it, but the patch program should be able to apply it. Turns out the patch applies, but it won't compile. It relied on a small change to the score calculation in

htdig: Newbie Query

1998-12-15 Thread INMASS/MRP
Good Morning! I guess the first question is whether I've ended up on the developers list or if this also is the best place for a cgi-ignoramus to ask questions about setting up the config for htdig. My ISP has given us a 'sample config' with about 5 lines of parameters. I've got the list of

Re: htdig: Newbie Query

1998-12-15 Thread Geoff Hutchison
My ISP has given us a 'sample config' with about 5 lines of parameters. I've got the list of configuration options and just wondered if there are specific rules as to how they should be added to the config file. Is there a particular order or do I just feed them in one per line and let the rest

No Subject

1998-12-15 Thread Geoff Hutchison
I'm happy to announce the release of ht://Dig 3.1.0b3. This release is the culmination of a lot of work by many people. Release notes for htdig-3.1.0b3 15 Dec 1998 This version adds only a few features and a significant number of bug fixes. This version has been pretty thoroughly

htdig: Still getting CORE file when using htsearch(2)

1998-12-15 Thread Mike Dagan
There are lots of lines like this: ILLEGAL PAGE TYPE: page: %lu type: %lu So I'd guess that your databases are corrupt. I'm sure you didn't want to hear that, but if it happens "every few minutes," it's probably something that affects most of your searches. I rebuilt the database (using the

Re: htdig: CORE file when using htsearch

1998-12-15 Thread Gilles Detillieux
According to Geoff Hutchison: At 7:39 AM -0500 12/15/98, Mike Dagan wrote: I am using the latest htdig on a Linux machine. Every few minutes I find a core file in the cgi-bin directory. I used the "strings core" command to see if I could get any hints. Does it make any sense to anyone?

Re: htdig: virtual hosts revisited

1998-12-15 Thread Gilles Detillieux
According to Geoff Hutchison: As far as using the HEAD for the checksum, my point is that most documents we already GET, so we don't save any bandwidth. I'm also not completely sure that there's enough to ensure the checksums are unique. (This is why I'd want to test the feature very

htdig:start_url problem

1998-12-15 Thread Jesús Arribi
Hi all, I want to index the following documents: http://www.cesga.es/gacetatri/galego/textos/textosaxenda/arquivoscorunha.html http://www.cesga.es/gacetatri/galego/textos/textosaxenda/cinemascor.html http://www.cesga.es/gacetatri/galego/textos/textosaxenda/copascorunha.html

Re: htdig: Still getting CORE file when using htsearch(2)

1998-12-15 Thread Gilles Detillieux
According to Mike Dagan: I rebuilt the database (using -i) with htdig, and I am still getting this core file. Here is a reminder of what I get using the "strings core" command: Could you give us a stack backtrace from gdb, and info about which version of Linux (kernel, distribution and

htdig: Multiple values for local_default_doc

1998-12-15 Thread Clyde Brown
Is there a way to specify multiple values for local_default_doc? What I want to do is this (in "htdig.conf"): local_default_doc:index.html index.htm index.shtml ... so that htdig will try to get index.html, then index.htm, then index.shtml from the local file system. But I get this
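
The behaviour asked for above, trying each default document name in order, can be sketched as follows. This is only an illustration of the requested lookup order, not htdig's implementation, and it does not imply that local_default_doc accepts a list in this release. The function name `resolve_default_doc` and its arguments are hypothetical.

```python
import os


def resolve_default_doc(directory, candidates):
    """Return the path of the first candidate file that exists in `directory`.

    `candidates` is an ordered list of file names, e.g. the result of
    splitting "index.html index.htm index.shtml" on whitespace.
    Returns None if none of them exists as a regular file.
    """
    for name in candidates:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    # Hypothetical whitespace-separated value, as in the question above.
    names = "index.html index.htm index.shtml".split()
    print(resolve_default_doc(".", names))
```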

Re: htdig:start_url problem

1998-12-15 Thread Gilles Detillieux
According to Jesús Arribi: I want to index the following documents: http://www.cesga.es/gacetatri/galego/textos/textosaxenda/arquivoscorunha.html http://www.cesga.es/gacetatri/galego/textos/textosaxenda/cinemascor.html http://www.cesga.es/gacetatri/galego/textos/textosaxenda/copascorunha.html

htdig: Custom Templates

1998-12-15 Thread Vishal Shah
I am indexing two separate sites on two separate platforms. The results should show the first few documents from each of the sites. How do I customize my template to do this?

htdig: Updating and merging

1998-12-15 Thread Clyde Brown
The "How it works" page in the ht://dig documentation says, "if you want to only update changed documents, these changes have to be merged into the searchable database." I'm not clear on how that is accomplished. Do I use the config file to limit digging to directories that I know have been

Re: htdig: Multiple values for local_default_doc

1998-12-15 Thread Geoff Hutchison
Is there a way to specify multiple values for local_default_doc? What I want to do is this (in "htdig.conf"): local_default_doc:index.html index.htm index.shtml ... so that htdig will try to get index.html, then index.htm, then index.shtml from the local file system. But I

htdig: problem indexing secure site

1998-12-15 Thread Rosemary
Hello, I am having problems getting htdig to access my secure server site. I tried setting the variable local_urls, so http://bla.bla.com/=/directory/path/to/DocumentRoot If set in conjunction with start_url, it hangs and does not create db files. If used by itself, then www.htdig.org

htdig: max length of titles ?

1998-12-15 Thread Tom
Htdig is a great tool. My only question is: what would be the best way to limit the size of the output of a title? I am indexing the homepages on an ISP and some of the users have really really really really long titles for whatever reason, and editing their pages is not an option. I've read
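
One generic way to cap title length when post-processing search output is sketched below. It is independent of htdig and does not use any htdig attribute. The function name `truncate_title` and the default of 60 characters are assumptions, and `max_len` is assumed to be larger than the length of the ellipsis.

```python
def truncate_title(title, max_len=60, ellipsis="..."):
    """Shorten `title` to at most `max_len` characters, ellipsis included.

    Cuts at the last space before the limit when there is one, so that
    words are not split in the middle.
    """
    if len(title) <= max_len:
        return title
    cut = title[: max_len - len(ellipsis)]
    head, sep, _ = cut.rpartition(" ")
    if sep:
        # A space was found: drop the partial word after it.
        cut = head
    return cut.rstrip() + ellipsis


if __name__ == "__main__":
    print(truncate_title("really " * 20 + "long title", max_len=20))
```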