From: greenough, dave [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 07, 2006 5:02 PM
To: Brockington,MJ,Michael,JPGA4 R
Subject: RE: [htdig] Problem indexing shtml files

Thanks for the response. I am a little new to this, so I apologize for any stupidity on my part.

When I browse to pages on the site and look at the properties from within my browser, I get a type of "text/html". I also tried putting something like this in the page, which didn't make any difference:

<meta http-equiv="content-type" content="text/html">

My config file is a slightly modified sample config file, which I have included (with all of the comments stripped out) in case you want to take a look.

One more thing I noticed is that the db.urls file does not appear to be updating and is empty; I am not sure if this is normal. I added create_url_list to the config to create a list of URLs, and it is showing the URLs though. It just caught my eye because I noticed somewhere that .shtml files are not supported when local_urls is in use, and I wondered if this could somehow be happening.

I don't expect to get a lot of results, as I have only added keywords and descriptions to a couple of pages (/investments/index.shtml, /investments/test1.html) to keep my testing simple. The output from running the index through my ISP's control panel is also included as indexlog.txt.

I hate to bother you with this; it is just driving me a little nuts, as it seems like it should be something so simple.

Dave.
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: November 6, 2006 4:27 AM
To: [EMAIL PROTECTED]; [email protected]
Subject: RE: [htdig] Problem indexing shtml files

Dave,

The item that stands out for me is your mention that changing the file type to .html makes things work okay. This makes me think that there must be a mismatch between the MIME-type mapping you have set within the htdig config and the MIME type your web server is sending for these files.

If I am right, and you need more help, let me know...

Regards,
Mike
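
One way to test the MIME-type theory above is to ask the web server directly what Content-Type it reports for a .shtml page and compare it with the value for the same page renamed to .html. The following is a minimal Python sketch, not part of ht://Dig; the function names `mime_type` and `served_mime_type` are made up for illustration, and it assumes the server answers HEAD requests.

```python
from email.message import Message
from urllib.request import Request, urlopen


def mime_type(content_type_header: str) -> str:
    """Return the bare, lower-cased MIME type from a Content-Type header value.

    Parameters such as "; charset=..." are dropped.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def served_mime_type(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request to `url` and return the MIME type the server reports.

    Assumption: the server supports HEAD; if it does not, a GET request
    would be needed instead.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        # response.headers is an email.message.Message subclass.
        return response.headers.get_content_type()
```

Calling `served_mime_type` on the .shtml URL and on the renamed .html URL, then comparing the two results, would show whether the server really sends the same type for both. The browser's "properties" view may not reflect the raw header exactly.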
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of greenough, dave
Sent: Wednesday, November 01, 2006 5:12 PM
To: '[email protected]'
Subject: [htdig] Problem indexing shtml files
I am trying to set up htdig to search our site, and have run into a problem. Most pages on our site are .shtml files, as they use server-side includes to pull in common graphics, menu structure, and sidebars on each page. I would like to index only by keywords and descriptions, and would like to replace the normal excerpt with the meta description from the file.
Here is where the problem occurs: if I set all of the index factors to 0 except title, keywords, and description, nothing gets indexed. If I set text_factor to something above 0, then all of the files get indexed, but use_meta_description does not work.
If I rename the files to .html, everything indexes fine and use_meta_description works like a charm. Of course, by doing this none of my pages would display properly.
Is there a way around this outside of renaming the files to .html and turning on the xbithack?
I am using version 3.1.6 of htdig (provided by an isp).
I have read some information in the mailing list archive and elsewhere, but I could not find anything specific to this issue. To make sure pages would get indexed, I created an HTML file with links to most of the pages on the site (as a lot of the page links are in the includes and have some JavaScript with them). The start URLs in the config file are all http:// addresses, as something I read mentioned that file-based searches have issues with .shtml files unless some of the htdig code is changed.
Thank you,
Dave.
[EMAIL PROTECTED]
************************
This email and any attachments may contain confidential and privileged information. If you are not the intended recipient, please notify the sender immediately by return e-mail, delete this e-mail and destroy any copies. Any dissemination or use of this information by a person other than the intended recipient is unauthorized and may be illegal. Unless otherwise stated, opinions expressed in this e-mail are those of the author and are not endorsed by the author's employer.
************************

