Webmaster writes:
Geoff Hutchison writes:
We have lots of links on our website and it's annoying to see duplicates in
search results. But the problem with duplicate detection is deciding which
duplicate to use! My current thought is to use the document with the lower
hopcount.
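The "keep the copy with the lower hopcount" idea can be sketched as follows. This is a minimal illustration, not htdig's actual database schema: the dicts with url/checksum/hopcount fields are hypothetical stand-ins for document records.

```python
# Group documents by content checksum and keep, for each group, the copy
# with the lowest hopcount (i.e. the one closest to a start URL).
# The record layout here is a hypothetical stand-in for htdig's database.

def dedupe_by_hopcount(docs):
    """docs: list of dicts with 'url', 'checksum', and 'hopcount' keys."""
    best = {}
    for doc in docs:
        key = doc["checksum"]
        if key not in best or doc["hopcount"] < best[key]["hopcount"]:
            best[key] = doc
    return list(best.values())

docs = [
    {"url": "http://example.com/a.html",      "checksum": "c1", "hopcount": 2},
    {"url": "http://example.com/copy/a.html", "checksum": "c1", "hopcount": 4},
    {"url": "http://example.com/b.html",      "checksum": "c2", "hopcount": 1},
]
kept = dedupe_by_hopcount(docs)
```

Ties (equal hopcounts) would still need a rule, e.g. shortest URL, but the basic selection is just a per-checksum minimum.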
Geoff Hutchison writes:
At 5:28 AM -0500 12/14/98, Walter Hafner wrote:
1) The lack of support for German umlauts (äöüß)
My suggestion would be to look at the locale option.
Oops, sorry. I stand corrected. Missed that one.
2) The somewhat limited queries.
I think you'll have to
Hi,
I am using the latest htdig on a Linux machine. Every few minutes
I find a core file in the cgi-bin directory.
I used the "strings core" command to see if I can get any hints.
Does it make any sense to anyone?
CORE
CORE
htsearch
htsearch
CORE
htsearch
CORE
#lx msize: %lu
stack:
%lu
Geoff Hutchison writes:
We have lots of links on our website and it's annoying to see duplicates in
search results. But the problem with duplicate detection is deciding which
duplicate to use! My current thought is to use the document with the lower
hopcount.
Walter Hafner replied:
As I
At 5:28 AM -0500 12/15/98, Walter Hafner wrote:
I'd like to have real substring search and case sensitive search. And
while I'm dreaming, a regexp subset would be nice. :-)
There's already a substring search. As for case_sensitive, it's an idea,
but currently all words are stored as lowercase,
At 8:00 AM -0500 12/15/98, John Grohol PsyD wrote:
How about a file_aliases option? For instance, on our server,
the index.html file is nearly always a symbolic link to the
actual file, which is named something different. If I could
put "index.html" into a file_aliases option, I would solve a
At 7:39 AM -0500 12/15/98, Mike Dagan wrote:
I am using the latest htdig on a Linux machine. Every few minutes
I find a core file in the cgi-bin directory.
I used the "strings core" command to see if I can get any hints.
Does it make any sense to anyone?
There are lots of lines like this:
At 2:56 AM -0500 12/15/98, Walter Hafner wrote:
And a second thought on document checksums: quite often I see root
documents more than 100 KB in size. Why not just compute a checksum of
the header? An HTTP 1.1 'HEAD' command would be sufficient, and IMHO this
would save a lot of bandwidth and
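Walter's header-checksum idea can be sketched like this. The choice of headers and the MD5 digest are assumptions for illustration; the point is that everything hashed here is available from a HEAD response, so the body never has to be fetched.

```python
import hashlib

# Compute a cheap change-detection checksum from HTTP response headers
# alone (as returned by a HEAD request). Which headers to include is an
# assumption for this sketch; header names are assumed lowercased.

def header_checksum(headers):
    """headers: dict mapping lowercase HTTP header name -> value."""
    relevant = ("last-modified", "content-length", "etag")
    parts = ["%s:%s" % (name, headers.get(name, "")) for name in relevant]
    return hashlib.md5("\n".join(parts).encode("utf-8")).hexdigest()

old = header_checksum({"last-modified": "Mon, 14 Dec 1998 09:00:00 GMT",
                       "content-length": "102400"})
new = header_checksum({"last-modified": "Tue, 15 Dec 1998 09:00:00 GMT",
                       "content-length": "102400"})
changed = old != new
```

Note Geoff's caveat elsewhere in this thread still applies: headers alone may not be enough to make the checksums unique, so two distinct documents could collide.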
One of the sites I am attempting to dig is an extranet site which has
two-factor authentication: You need a certificate and
username/password. My options are:
1. Have HTDIG use a certificate (???) and the -u option for username
and password. Problem is ... how do you make HTDIG present a
I am a beginner, trying to learn various ways of customizing htdig.
How do I use customized templates to display the results?
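If it helps: as I read the htsearch attribute documentation, template_map is a list of triplets (display name, internal name, template file) and template_name picks the default. A hedged sketch, where the "mylook" name and the file path are placeholders:

```
# htsearch template configuration (names and paths are placeholders).
# Each template_map entry is a triplet: display name, internal name, file.
template_map: Long builtin-long builtin-long \
              MyLook mylook ${common_dir}/mylook.html
template_name: mylook
```

The per-match layout then lives in the template file itself; see the htsearch template documentation for the variables available there.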
--
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the
Yesterday, regarding my sort patch, I wrote:
If you apply it to the 3.1.0b2 source, I can't promise it'll work, as
I haven't tested it, but the patch program should be able to apply it.
Turns out the patch applies, but it won't compile. It relied on a small
change to the score calculation in
Good Morning! I guess the first question is whether I've ended up on the
developers list or if this also is the best place for a cgi-ignoramus to
ask questions about setting up the config for htdig.
My ISP has given us a 'sample config' with about 5 lines of parameters.
I've got the list of configuration options and just wondered if there are
specific rules as to how they should be added to the config file. Is there
a particular order or do I just feed them in one per line and let the rest
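As far as I know there are no ordering rules: each line is one "attribute: value" pair, unset attributes fall back to built-in defaults, ${name} references another attribute, and a trailing backslash continues a long value onto the next line. A sketch with placeholder paths and URLs:

```
# Sample htdig.conf sketch -- one "attribute: value" pair per line,
# in any order. Paths and URLs below are placeholders.
database_dir:  /home/user/htdig/db
start_url:     http://www.example.com/
limit_urls_to: ${start_url}
# A long list can be continued with a trailing backslash:
exclude_urls:  /cgi-bin/ .cgi \
               /tmp/
```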
I'm happy to announce the release of ht://Dig 3.1.0b3. This release is the
culmination of a lot of work by many people.
Release notes for htdig-3.1.0b3 15 Dec 1998
This version adds only a few features and a significant number of bug
fixes. This version has been pretty thoroughly
There are lots of lines like this:
ILLEGAL PAGE TYPE: page: %lu type: %lu
So I'd guess that your databases are corrupt. I'm sure you didn't want to
hear that, but if it happens "every few minutes," it's probably something
that affects most of your searches.
I rebuilt the database (using the
According to Geoff Hutchison:
At 7:39 AM -0500 12/15/98, Mike Dagan wrote:
I am using the latest htdig on a Linux machine. Every few minutes
I find a core file in the cgi-bin directory.
I used the "strings core" command to see if I can get any hints.
Does it make any sense to anyone?
According to Geoff Hutchison:
As far as using the HEAD for the checksum, my point is that most documents
we already GET, so we don't save any bandwidth. I'm also not completely
sure that there's enough to ensure the checksums are unique. (This is why
I'd want to test the feature very
Hi all,
I want to index the following documents:
http://www.cesga.es/gacetatri/galego/textos/textosaxenda/arquivoscorunha.html
http://www.cesga.es/gacetatri/galego/textos/textosaxenda/cinemascor.html
http://www.cesga.es/gacetatri/galego/textos/textosaxenda/copascorunha.html
According to Mike Dagan:
I rebuilt the database (using -i) with htdig, and I am still getting
this core file. Here is a reminder of what I get using the "strings
core" command:
Could you give us a stack backtrace from gdb, and info about which version
of Linux (kernel, distribution and
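For anyone unfamiliar with it, the backtrace being asked for looks roughly like this. The cgi-bin path is a placeholder; gdb needs the binary that actually produced the core:

```
$ gdb /usr/local/apache/cgi-bin/htsearch core
(gdb) bt        # prints the stack backtrace, innermost frame first
(gdb) quit
$ uname -a      # kernel version, for the report
```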
Is there a way to specify multiple values for local_default_doc? What I
want to do is this (in "htdig.conf"):
local_default_doc:index.html index.htm index.shtml
... so that htdig will try to get index.html, then index.htm, then
index.shtml from the local file system. But I get this
According to Jesús Arribi:
I want to index the following documents:
http://www.cesga.es/gacetatri/galego/textos/textosaxenda/arquivoscorunha.html
http://www.cesga.es/gacetatri/galego/textos/textosaxenda/cinemascor.html
http://www.cesga.es/gacetatri/galego/textos/textosaxenda/copascorunha.html
I am indexing two separate sites on two separate platforms.
The results should show the first few documents from each site.
How do I customize my template to do this?
The "How it works" page in the ht://dig documentation says, "if you want to
only update changed documents, these changes have to be merged into the
searchable database." I'm not clear on how that is accomplished. Do I use
the config file to limit digging to directories that I know have been
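As I understand the docs, the update pass is just htdig run without -i (so the existing database is reused and unchanged documents are skipped), followed by htmerge, which folds the new data into the searchable indexes. Roughly, with a placeholder config path:

```
$ htdig -v -c /etc/htdig/htdig.conf      # update pass: no -i, so only
                                         # new/changed documents are fetched
$ htmerge -v -c /etc/htdig/htdig.conf    # merge results into the search db
```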
Hello,
I am having problems with htdig being able to access my secure server
site.
Tried setting the variable local_urls, so
http://bla.bla.com/=/directory/path/to/DocumentRoot
If set in conjunction with start_url, it hangs and does not create db
files. If used by itself, then
www.htdig.org
Htdig is a great tool.
My only question is :
what would be the best way to limit the size of the output of a title?
I am indexing the homepages on an ISP and some of the users have really,
really long titles for whatever reason, and editing their pages is not
an option. I've read
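If the templates alone don't give enough control, one option is to clamp titles in a small post-processing step. A minimal sketch; the 60-character limit is an arbitrary assumption:

```python
# Truncate overly long page titles at a word boundary, appending an
# ellipsis. The 60-character default limit is an arbitrary choice.

def clamp_title(title, limit=60):
    if len(title) <= limit:
        return title
    cut = title[:limit].rsplit(" ", 1)[0]  # back up to the last space
    return cut + "..."

short = clamp_title("Home Page")
long_title = clamp_title("My " + "really " * 20 + "long home page title")
```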