[htdig] Servlets anyone?

2000-03-14 Thread Lim Swee Tat

Does anyone know where to find the servlet implementation of htsearch?  I'm
having problems finding it, even though it is supposed to be there.

Ciao
ST Lim

-- 
Lim Swee Tat
Software Engineer
NCS Corporate IS Dept
DID: (65) 774 9177



To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.



[htdig] Is there anyone can tell me how to test..

2000-03-14 Thread Gallant Chiu

I just installed htdig1.3.5, but I don't know how
to test it. I have modified htdig.conf and created
the database. I then created an HTML form with a
submit button to call the htdig search, but it doesn't work.

Can any expert here tell me what I should do
after finishing the installation, and how I can test it?


Thanks in advance.







[htdig] Nothing found

2000-03-14 Thread Croom

I have installed and configured ht://dig with its defaults.  I have run htdig and 
htmerge.  The /db files contain accurate information about the html files that were 
used in the dig and merge.  Then, however, when I attempt to do a search, I either get 
an error message (the page cannot be found) or, in certain instances, the system has 
believed that I wanted to download htsearch from its local location to who knows where.

I hope someone can help me past this.

Thanks,

Spike Parker





Re: [htdig] Using a Drop Down Option Box

2000-03-14 Thread Brett Baugh

"Simon Hyde - Webyte.co.uk" wrote:

> Hi, I am using htdig in a directory of businesses that I am
> creating, is it possible to remove all the other options from the
> search form and just have a drop down list of business types, so for
> example the end user would select plumber from the drop down list
> and then click search, and would then be presented with a list of
> plumbers ?? Thanks, Simon Hyde.

Just put a list of all the things you want in a select box called
"words". e.g.:


Plumbers
Carpenters
Zoo Keepers


Also, you'll need hidden fields for the things you would have put on
the form but didn't; e.g.,






etc, etc, etc, etc. (That "onChange" thing is just a scrap of
javascript that submits the form soon as they change that box; leave
it out if you want)
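
For readers rebuilding such a form, the request it ends up sending to htsearch can be sketched as below. This is only an illustration under stated assumptions: the `/cgi-bin/htsearch` path, the default values, and the helper name `buildSearchUrl` are hypothetical, and `method`, `format`, and `config` are standard htsearch form inputs standing in for whichever hidden fields a given form actually carries.

```javascript
// Sketch: build the query a stripped-down search form would submit.
// The CGI path and the default hidden values are assumptions for illustration.
function buildSearchUrl(
  businessType,
  hidden = { method: "and", format: "builtin-long", config: "htdig" }
) {
  const params = new URLSearchParams();
  // "words" carries the value chosen in the drop-down list.
  params.set("words", businessType);
  // Each hidden field supplies a setting the user no longer sees on the form.
  for (const [name, value] of Object.entries(hidden)) {
    params.set(name, value);
  }
  return "/cgi-bin/htsearch?" + params.toString();
}

console.log(buildSearchUrl("Plumbers"));
```

Opening the resulting URL in a browser is also a quick way to check that the hidden values behave as expected before wiring up the form itself.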

_
Brett Baugh
System Administrator/Unix Programmer
Cyberplex Interactive Media
http://austin.cyberplex.com
512.795.3050
"We have no intention of shipping another bloated OS and shoving it
down the throats of our users."
-- Paul Maritz, Microsoft group vice president



[htdig] Using a Drop Down Option Box

2000-03-14 Thread Simon Hyde - Webyte.co.uk



Hi,

I am using htdig in a directory of businesses that I am creating. Is it
possible to remove all the other options from the search form and just have
a drop down list of business types? For example, the end user would select
plumber from the drop down list and then click search, and would then be
presented with a list of plumbers.

Thanks,

Simon Hyde.


Re: [htdig] Multiple text boxes in search forms

2000-03-14 Thread mikeg

On Tue, 14 Mar 2000, Croom wrote:

> Is it possible to prompt for multiple keywords in a HTML search form,
> i.e. Lease Name, Lease Number, County, etc. & have the input from the
> forms be concatenated together to form a search string which would
> narrow down the search results?

Do an onsubmit javascript call to a function that concatenates the words
together into a hidden form field named 'words'.  I don't think there
should be any problem sending htdig extra form fields (from the
dropdown/text fields you named)... if so, just have the javascript remove
the values before submit.

This would, of course, require the client to be running javascript...

I would make sample code, but am preoccupied at the moment.
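
A minimal sketch of the onsubmit approach described above, assuming a form with a hidden field named `words` and three text inputs. The input names `leaseName`, `leaseNumber`, and `county` and both function names are hypothetical, chosen only to mirror the example in the question. Whether the combined terms actually narrow the results depends on the match method the form sends (for example "and").

```javascript
// Sketch: join several form values into one search string for the "words" field.
function combineWords(values) {
  return values
    .map((v) => v.trim())          // drop stray whitespace around each entry
    .filter((v) => v.length > 0)   // skip boxes the user left empty
    .join(" ");
}

// Intended as the form's onsubmit handler: <form onsubmit="return fillWords(this)">
// The field names below are hypothetical.
function fillWords(form) {
  form.words.value = combineWords([
    form.leaseName.value,
    form.leaseNumber.value,
    form.county.value,
  ]);
  return true; // let the submit proceed
}
```

Keeping the string handling in `combineWords` separate from the form access makes it easy to try outside a browser.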

-- 
Mike Giles <[EMAIL PROTECTED]>   ICQ: 65152812
Systems Administrator   Livewire Incorporated
http://www.lw.net/  (352) 373-7090






[htdig] Multiple text boxes in search forms

2000-03-14 Thread Croom

Is it possible to prompt for multiple keywords in a HTML search form, i.e. Lease Name, 
Lease Number, County, etc. & have the input from the forms be concatenated together to 
form a search string which would narrow down the search results?  If so, 
pointers/examples would be appreciated!  We are new to htdig & are attempting to use 
it locally to search for archived text documents, & the ability to narrow down the 
search results as much as possible is a requirement.   Also, we need to be able to 
search for documents by using dates as keywords.  Is that possible?

Mark Wright
Eastex Crude Company





Re: [htdig] How to change the directory for star.gif

2000-03-14 Thread Gilles Detillieux

According to wenlong:
> I am trying to use a different directory for the htdig program. Everything
> works fine except the star won't show up to indicate the match status. The
> default directory for the star.gif is /htdig/star.gif.  Which file
> should I modify to change the directory of star.gif, for example to
> "temp/htdig/star.gif"?

Set either image_url_prefix or both star_blank and star_image in your
htdig.conf, and change the directory name that refers to this image file
in common/header.html and common/wrapper.html.

See
http://www.htdig.org/attrs.html#image_url_prefix
http://www.htdig.org/attrs.html#star_blank
http://www.htdig.org/attrs.html#star_image

If you want to relocate other graphics, such as the buttons or the
ht://Dig logo, you should change all references to these in htdig.conf
and common/*.html.

-- 
Gilles R. Detillieux  E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930





[htdig] How to change the directory for star.gif

2000-03-14 Thread wenlong

Hello, All,

I am trying to use a different directory for the htdig program. Everything
works fine except the star won't show up to indicate the match status. The
default directory for the star.gif is /htdig/star.gif.  Which file
should I modify to change the directory of star.gif, for example to
"temp/htdig/star.gif"?

Thanks in advance.

Wenlong
University Computing Center
University of Maine








Re: [htdig] Identifying non-indexed URLs

2000-03-14 Thread Gilles Detillieux

According to me:
> No, you wouldn't see "not Parsable" in the output of htdig 3.1.5, as
> that message only appears in versions 3.1.0b1 and up.  In 3.1.5, the

Oops, I meant 3.2.0b1 and up.

-- 
Gilles R. Detillieux  E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930





Re: [htdig] Identifying non-indexed URLs

2000-03-14 Thread Gilles Detillieux

According to Bigler, Tyson MT SSI:
> I probably didn't explain myself very well.  :-D  I need to identify the
> reason for the difference between the number of documents seen and the
> number of documents indexed (e.g. the number of documents indexed is always
> lower than the number of documents "seen").  I don't recall seeing "Not
> Parsable" in the output -- would I only see that in -vv mode?  I've used all
> of the 3.1.x versions (currently using 3.1.5).

No, you wouldn't see "not Parsable" in the output of htdig 3.1.5, as
that message only appears in versions 3.1.0b1 and up.  In 3.1.5, the
message would be "not HTML" for any document it cannot parse, as I said
in my last e-mail.  You'd get that message with one or more -v options.

With two -v options (-vv), htdig will tell you about level 1 or level
2 rejections of URLs, and with three verbose options it will further
explain the reason for level 1 rejection, of which there may be several
(level 2 is because of limit_normalized).  The higher the verbose level,
though, the more output you have to wade through to get at these messages.

That should tell you all you need to know about why htdig is rejecting
URLs.  You may also need to look at why htmerge would reject some.
Reasons for this are less clearly explained in error messages.  The most
common message from htmerge (on a fresh database at least) is "Deleted, no
excerpt", which is usually because of a noindex directive in the document,
the document is disallowed by robots.txt, or server_max_docs was reached.
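
As a rough aid for wading through verbose output, the sketch below picks out the lines that carry the messages named in this thread ("not HTML" in 3.1.x, "not Parsable" in 3.2.x). It assumes the htdig output has been saved to a text file and relies only on those message strings. The layout of the rest of each line is not assumed, and the function name and the file name in the usage comment are hypothetical.

```javascript
// Sketch: find lines of saved htdig -v output that report an unparsable document.
// Only the message strings quoted in the thread are relied on.
const MARKERS = ["not html", "not parsable"];

function unparsedLines(logText) {
  return logText
    .split(/\r?\n/)
    .filter((line) => {
      const lower = line.toLowerCase(); // match regardless of capitalisation
      return MARKERS.some((marker) => lower.includes(marker));
    });
}

// Usage under Node.js (hypothetical file name):
//   const fs = require("fs");
//   console.log(unparsedLines(fs.readFileSync("htdig-output.txt", "utf8")).join("\n"));
```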

-- 
Gilles R. Detillieux  E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930





Re: [htdig] indexing local pages problem

2000-03-14 Thread Gilles Detillieux

According to Wilfried Geis:
> All the pages are in the directory /http/userpages/johndoe/
> These pages can be called with: http://www.myserver.de/homepages/johndoe
> 
> In order to index the pages I have created a small html-file with all
> the links to the pages in it and have set
> local_urls: http://www.myserver.de/homepages/=/http/userpages/
> 
> This works fine, however, htdig does not find the appropriate
> index-pages locally and thus falls back to http-retrieval.
> 
> Then I wanted to force htdig to fetch the pages locally and have set:
> local_default_doc: welcome.htm index.html
> local_urls_only: true
> 
> But with this setting htdig does not index anything.
> The only way it works is when my start-document (the script-generated
> index-page) contains the entire path including welcome.htm or
> index.html.
> 
> That looks to me like the setting 'local_default_doc' is not working. (I
> have upgraded to version 3.1.5)
> 
> Has anyone else had problems with this setting, or is there something I
> might have missed in the docs?

For URLs that point to directories, you must have the trailing "/" for
local indexing to work.  When indexing through HTTP, a directory URL that
lacks the trailing slash causes a redirect, to the same URL with the slash
appended, so that the directory index can be fetched normally.  When you
index files locally, this redirect does not occur, leaving htdig with no
fallback position when local_urls_only is true.

It would probably be a simple matter to change Document::RetrieveLocal()
to issue the redirect when the local file turns out to be a directory,
but so far no one has implemented this.

I use local_urls indexing almost exclusively myself, but this has never
been a problem for me because I never use incomplete URLs for directories,
to avoid forcing an unnecessary redirect.
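
One way to follow the advice above when the start document is generated by a script is to normalise directory URLs before writing them out. The sketch below is an illustration only: the function name is hypothetical, and the rule it uses (a last path segment without a "." is treated as a directory) is a heuristic assumption, not something htdig itself does.

```javascript
// Sketch: append "/" to URLs whose last path segment looks like a directory.
// Heuristic (an assumption): a last segment containing no "." is a directory.
function withTrailingSlash(url) {
  const u = new URL(url);
  const last = u.pathname.split("/").pop();
  if (last !== "" && !last.includes(".")) {
    u.pathname += "/";
  }
  return u.toString();
}

console.log(withTrailingSlash("http://www.myserver.de/homepages/johndoe"));
```

URLs that already end in a slash or that name a file such as index.html are returned unchanged.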

-- 
Gilles R. Detillieux  E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930





Re: [htdig] Questions please

2000-03-14 Thread Geoff Hutchison

On Tue, 14 Mar 2000, SiberSpace International Marketing wrote:

> We need a search engine that we will use on our site for about 30,000
> web sites relating to a certain field.

Let me get this straight. You want to index all the pages of these
30,000 or so websites. People will come to your page and search for them
to get individual pages that are a part of these sites. Do I have that
right, or do you want a catalogue of these 30,000 sites along the lines of
Yahoo? In other words, do you want full-text searching (which is what
ht://Dig provides)?

> Can the software also work with Hebrew and Hebrew web sites?

I haven't tried it myself, but if there's an 8-bit encoding for Hebrew and
a valid locale, it might work. Others may know from experience.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/






Re: [htdig] WebHost htdig PDF $

2000-03-14 Thread Geoff Hutchison


> [Aside: any Mindspring customers out there who successfully got the htdig
> PDF indexing to work? My suspicion is that 3.0.8b2 is too early to provide
> the external parser function. For sure it's too early for acroread.]

If you take a look at the release notes, you'll see that external parser
support was added with version 3.0.7.

IMHO, if MindSpring has *still* not upgraded from 3.0.8b2, then all
MindSpring users should leave to another ISP. A list is available at
 (if there are any errors, please let me
know). Why should you jump ship? Beyond the recent security hole, which
will let users read your files (even MindSpring's files), there are
numerous bugs fixed between 3.0.8b2 and 3.1.5. Database corruption
problems and bugs in the URL-matching code spring to mind.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/






Re: [htdig] Databases on different Platforms?

2000-03-14 Thread Geoff Hutchison

On Tue, 14 Mar 2000, Bill Carlson wrote:

> Are the databases transportable across platforms? I.e., if I dig on a SUN
> box, should I be able to move the resulting dbs to an intel box and expect
> htdig to work?

In general, no. We will be adding dump/load utilities to 3.2, but it's not
even in the CVS tree at the moment.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/






[htdig] WebHost htdig PDF $

2000-03-14 Thread Sean Downing

I am giving up on trying to get htdig (3.0.8b2) to index PDF files
(external_parsers) on my Mindspring web host. Anyone know of a web host
provider that's already worked this out? All I want is for htdig to index
my 1,000-odd PDF and HTML files.

[Aside: any Mindspring customers out there who successfully got the htdig
PDF indexing to work? My suspicion is that 3.0.8b2 is too early to provide
the external parser function. For sure it's too early for acroread.]





[htdig] Questions please

2000-03-14 Thread SiberSpace International Marketing

Hello

We need a search engine that we will use on our site for about 30,000
web sites relating to a certain field.

We will have an ISP through which people will connect to us and use our
search engine for a list of web sites whose URLs we will provide. Others
can also enter the web site from outside of our ISP by going directly to
our web site.

Can this software help us?

Can the software also work with Hebrew and Hebrew web sites?

Thank you in advance!

Brad Fogel









[htdig] indexing local pages problem

2000-03-14 Thread Wilfried Geis


Hi all,

I have been using htdig for quite a while on our site and it works
pretty well for us.

Recently I wanted to change the setup a bit so that it uses the local
file system rather than hitting our http-server for local pages. And I
think I have a few problems with it.

The setup is this:

All the pages are in the directory /http/userpages/johndoe/
These pages can be called with: http://www.myserver.de/homepages/johndoe

In order to index the pages I have created a small html-file with all
the links to the pages in it and have set
local_urls: http://www.myserver.de/homepages/=/http/userpages/

This works fine, however, htdig does not find the appropriate
index-pages locally and thus falls back to http-retrieval.

Then I wanted to force htdig to fetch the pages locally and have set:
local_default_doc: welcome.htm index.html
local_urls_only: true

But with this setting htdig does not index anything.
The only way it works is when my start-document (the script-generated
index-page) contains the entire path including welcome.htm or
index.html.

That looks to me like the setting 'local_default_doc' is not working. (I
have upgraded to version 3.1.5)

Has anyone else had problems with this setting, or is there something I
might have missed in the docs?

Any pointers to the solution would be welcome.

cheers


--

Wilfried Geis
ITM Research GmbH







[htdig] Databases on different Platforms?

2000-03-14 Thread Bill Carlson

Hey all,

I'm just starting to work on this and wanted to check with you all before
I got too involved.

Are the databases transportable across platforms? I.e., if I dig on a SUN
box, should I be able to move the resulting dbs to an intel box and expect
htdig to work?

Thanks!

Bill Carlson

Systems Programmer[EMAIL PROTECTED]|  Opinions are mine,
Virtual Hospital  http://www.vh.org/|  not my employer's.
University of Iowa Hospitals and Clinics|







Re: [htdig] Identifying non-indexed URLs

2000-03-14 Thread Gilles Detillieux

According to Bigler, Tyson MT SSI:
> Is it possible to log URLs which are not indexed?  The -vv flag will show
> level 1 & level 2 rejects due to explicit exceptions, but I'm interested in
> knowing which URLs were seen but not indexed because they weren't
> "parsable".  Is this easily done?

In the 3.1.x series, the message is a bit misleading.  With the -v flag,
htdig will report "not HTML" for any document it cannot parse.  In 3.2.x,
this message is changed to the more general "not Parsable".

-- 
Gilles R. Detillieux  E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930





Re: [htdig] Identifying non-indexed URLs

2000-03-14 Thread Geoff Hutchison

On Tue, 14 Mar 2000, Bigler, Tyson MT SSI wrote:

> knowing which URLs were seen but not indexed because they weren't
> "parsable".  Is this easily done?

I'm not quite sure what you mean. I'm assuming you want some listing of
URLs included in  tags that are malformed?

For better or worse, the URL-parsing code doesn't reject malformed URLs.
So you should see them rejected by the normal means. Granted, I haven't
run it through every possible URL-ish input (malformed or not), so it's
possible there are bugs.

Remember, if you want to take a look at every URL seen, you can set
create_url_list.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/






[htdig] Identifying non-indexed URLs

2000-03-14 Thread Bigler, Tyson MT SSI

Is it possible to log URLs which are not indexed?  The -vv flag will show
level 1 & level 2 rejects due to explicit exceptions, but I'm interested in
knowing which URLs were seen but not indexed because they weren't
"parsable".  Is this easily done?

Thanks for any insight!

Tyson

Tyson Bigler
Shell Services International, Inc.
EP Gas & Power, BTC Site Operations
3737 Bellaire Blvd. Room 1051B, Houston, TX  77025
713-245-7476


