Re: [htdig] How Long???
Hi Pat:

It depends on what kind of hardware and resources it's running on. Ours has 10 times less than that and runs in about 5 minutes. Since our own pages change radically every day, I'm running htdig to build the database from scratch every day.

--
Noel Vargas Baltodano [EMAIL PROTECTED]
Gerente de Sistemas
Nicatechnologies, S.A.
http://www.nicatech.com.ni

To unsubscribe from the htdig mailing list, send a message to [EMAIL PROTECTED] You will receive a message to confirm this.
List archives: http://www.htdig.org/mail/menu.html
FAQ: http://www.htdig.org/FAQ.html
Re: [htdig] How Long???
Pat,

We're running RHL 6.2 on a PIII 700 with 128 MB of RAM. But 20 HOURS? It seems to me that you have a problem there. Try running a partial index (set start_url to a URL that contains only a few links) with the -vvv option and see what happens. If it takes as long as 15 minutes, check the output of htdig. Maybe you can put it into a file and send it to the list.

Pat Lennon wrote:
> WOW!! 5 minutes. I have the turtle version: Red Hat Linux 6.1, P150, 64 MB RAM. It's been running for approx. 20 hours now. :)
> -P
>
> "Ing. Noel Vargas Baltodano" wrote:
> > Hi Pat: It depends on what kind of hardware and resources it's running. Ours has 10 times less than that and runs in about 5 minutes. Since our own pages change radically every day, I'm running htdig to build the database from scratch every day.

--
Noel Vargas Baltodano [EMAIL PROTECTED]
Gerente de Sistemas
Nicatechnologies, S.A.
http://www.nicatech.com.ni
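The partial test dig suggested above can be sketched as follows. This is only an illustration: it assumes htdig is on the PATH, and the paths /tmp/partial.conf (a copy of your config whose start_url points at a small page) and /tmp/htdig-partial.log are hypothetical. The -c option selects the config file and -vvv raises the verbosity.

```python
import shutil
import subprocess


def build_htdig_command(config_path, verbosity=3):
    """Return the argv list for a verbose htdig run against one config file."""
    # -v may be repeated for more detail; -c selects the config file.
    return ["htdig", "-" + "v" * verbosity, "-c", config_path]


def run_and_log(command, log_path):
    """Run the command and save stdout and stderr to log_path."""
    with open(log_path, "w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


if __name__ == "__main__":
    # Hypothetical paths: a config copy with a small start_url, and a log file.
    cmd = build_htdig_command("/tmp/partial.conf")
    if shutil.which("htdig"):
        print("exit code:", run_and_log(cmd, "/tmp/htdig-partial.log"))
    else:
        print("htdig not found; would run:", " ".join(cmd))
```

The resulting log file is what you would attach when posting to the list.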
Re: [htdig] db.docdb db.wordlist
What version are you running? Did it come in one of Red Hat's RPM packages, or did you download it? I'm using 3.1.5 (I think that's the right version number...), which came with RH 6.2 Pro. I installed the RPM, changed the config to point at the server's homepage (you may add other URLs, it's up to you), then ran htdig. After that, I ran htmerge, and that was it. I had a little problem with users' accounts, which was promptly fixed (these guys really know their stuff), but otherwise I didn't have much trouble.

> same thing here
>
> On Sat, 20 Jan 2001, Cormac Robinson wrote:
> > I've just installed htdig and after running rundig I get the following error
> > htmerge: Unable to open word list file '/home/httpd/docs/search/test/db/db.wordlist DB" problem ...: /home/httpd/docs/search/test/db/db.docdb no such file or directory.
> > I've created a file in the db directory db.docdb - but I then get an error that it is not the correct file format. Is there an initial startup file I need to run on a first run... As far as I know the server I am running on is a Red Hat Linux 6.0
> > Thanks

--
Noel Vargas Baltodano [EMAIL PROTECTED]
Gerente de Sistemas
Nicatechnologies, S.A.
http://www.nicatech.com.ni
Re: [htdig] Memory requirements
> I have a Linux box with approx. 1 gig of HTML and PDF books. I want to use htdig for the search engine. I don't want to assume too much, but will 1 additional gig of hard disk cover the size of the index database? I figure double may be a safe starting point. Also, what memory requirements should I consider at a minimum? The hardware is a Cyrix 150 with 64 MB of RAM, running Red Hat 6.2 and the Apache web server. I know this is a vague question... I would just like some reasonable starting points.

It's enough! The first time I installed htdig, I did it on that very same configuration and it ran quite well. It took a few minutes to dig. I guess that with 1 GB of documents it will take *much* longer than that. Try to do the refresh during low-traffic hours, so it won't hog Apache.

--
Noel Vargas Baltodano [EMAIL PROTECTED]
Gerente de Sistemas
Nicatechnologies, S.A.
http://www.nicatech.com.ni
Re: [htdig] HOWTO? step-by-step?
Geordon,

I suppose that Apache is running fine, right? The next question would be: what exactly do you want to do with htdig? Just have a search engine for your site, or something else?

Geordon VanTassle wrote:
> It's version 3.1.5, on SuSE Pro 7.0, kernel 2.2.16. Apache version 1.3.14, PHP 4.0.4, FrontPage extensions 4.0.4.3.
> Geordon
>
> On 1/18/01, 2:29:14 PM, "Ing. Noel Vargas Baltodano" [EMAIL PROTECTED] wrote regarding Re: [htdig] HOWTO? step-by-step?:
> > Hi Geordon: First of all, what version of htdig did you install?
> > > Does someone have a step-by-step HOWTO for ht://Dig? I have downloaded and installed (configure; make; make install) everything and it is installed right. I just can't seem to get the bloody thing to WORK! Or, is there an issue using it on a webserver that uses the Microsoft FrontPage extensions?
> > > TIA, Geordon

--
Noel Vargas Baltodano [EMAIL PROTECTED]
Gerente de Sistemas
Nicatechnologies, S.A.
http://www.nicatech.com.ni
Re: [htdig] HOWTO? step-by-step?
Geordon,

You have to edit the htdig.conf file according to your directory structure (start_url, etc.) and your needs. Then you run htdig and htmerge. Try running htdig with the -i option to rebuild the database from scratch and the -vvv option to see exactly what it is indexing.

--
Noel Vargas Baltodano [EMAIL PROTECTED]
Gerente de Sistemas
Nicatechnologies, S.A.
http://www.nicatech.com.ni
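The sequence described above (htdig with -i and -vvv, then htmerge) can be sketched as a small wrapper. This is a rough illustration only: the config path /etc/htdig/htdig.conf is hypothetical and depends on how htdig was installed, and the -c option is used here to name the config file explicitly.

```python
import shutil
import subprocess


def rebuild_commands(config_path):
    """Return the two commands for a from-scratch rebuild, in run order."""
    return [
        # -i discards the old database; -vvv shows what is being indexed.
        ["htdig", "-i", "-vvv", "-c", config_path],
        # htmerge then builds the final databases from htdig's output.
        ["htmerge", "-c", config_path],
    ]


if __name__ == "__main__":
    # Hypothetical config location; adjust to your installation.
    commands = rebuild_commands("/etc/htdig/htdig.conf")
    missing = [cmd[0] for cmd in commands if shutil.which(cmd[0]) is None]
    if missing:
        print("not installed:", ", ".join(missing))
    else:
        for cmd in commands:
            # Stop at the first failure so htmerge never runs on a bad dig.
            if subprocess.run(cmd).returncode != 0:
                print("failed:", " ".join(cmd))
                break
```

Stopping after a failed dig is a design choice for this sketch, so that a broken run is noticed before htmerge touches the databases.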
[htdig] Duplicates
Hi Everybody:

I've successfully run htdig, and it scanned every file I wanted it to. The only thing now is that I get several duplicates. Is there a way to tell htdig to display 'unique' URLs only?

--
Noel Vargas Baltodano [EMAIL PROTECTED]
Gerente de Sistemas
Nicatechnologies, S.A.
http://www.nicatech.com.ni
Re: [htdig] Indexing Restricted Pages
Hi Dave and Douglas:

I'm an htdig newbie too. I haven't configured it yet on our production server. But I think I have a simple solution that may work. Why don't you create a username for the htdig software and give this user access to the protected directories so it can create the database?

--
Noel Vargas Baltodano [EMAIL PROTECTED]
Gerente de Sistemas
Nicatechnologies, S.A.
http://www.nicatech.com.ni
[htdig] Same problem with ~s
Hi everyone:

I've been searching the previous mail contents and I haven't found a clear and proper answer to the problem of indexing http://www.domain.dom/~username URLs. Right now I'm using RH 6.2 with the Apache web server in a test environment, without Internet access. I have several subdirs under the /httpd directory, and several accounts in the /home directory. All of them have the proper rights and are recognized by Apache. I just don't know how to make htdig check the /httpd subdirs and the /~username URLs. If anyone is kind enough to explain it to me AS CLEARLY as possible, or tell me where I can get the right answer to this problem, I'd be very grateful.

Noel Vargas Baltodano [EMAIL PROTECTED]
Re: [htdig] Same problem with ~s
Thanx for the answer, Geoff. I'll try it and I will let you know how it went. Now I've got another one: since the links to the ~ pages are on a different page than the Home Page, is there a way to declare more than one start_url? Can you give me an example of the syntax?

On Sun, 26 Nov 2000, Geoff Hutchison wrote:
> At 11:39 AM -0600 11/26/00, Ing. Noel Vargas Baltodano wrote:
> > I just don't know how to make htdig to check the /httpd subdirs and the /~username URLs. If anyone is kind enough to explain it to me AS CLEAR as posible, or tell me where I can get the right answer to this problem, I'd be very grateful.
>
> When indexing, htdig follows links from the start_url setting. If there's a link, either a standard <a href="..."> or a <link> tag in the header, it will be followed. But if you don't have a link to your user pages (for example), then it will never know they exist.
>
> So you can either make sure there are links from the start_url to the appropriate sections or make sure they're included in the start_url. Keep in mind you can insert files into the config file like so:
>
> start_url: `/path/to/url.file`
>
> --
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/
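The two ways of giving htdig more than one starting point can be sketched as follows. This is an illustration under assumptions: the backquote include is the syntax from the reply above, while listing several URLs on one start_url line separated by spaces, and putting one URL per line in the included file, are how the config format is commonly understood rather than something stated in this thread. The URLs and the file name start_urls.txt are hypothetical.

```python
import os
import tempfile


def inline_start_url(urls):
    """Config line listing every start URL, separated by spaces."""
    return "start_url: " + " ".join(urls)


def file_start_url(urls, list_path):
    """Write one URL per line to list_path and return the include line."""
    with open(list_path, "w") as handle:
        handle.write("\n".join(urls) + "\n")
    # Backquotes make htdig read the value from the named file.
    return "start_url: `" + list_path + "`"


if __name__ == "__main__":
    # Hypothetical URLs in the style used earlier in the thread.
    urls = [
        "http://www.domain.dom/",
        "http://www.domain.dom/~username/",
    ]
    print(inline_start_url(urls))
    list_path = os.path.join(tempfile.gettempdir(), "start_urls.txt")
    print(file_start_url(urls, list_path))
```

Either printed line would go into htdig.conf in place of the existing start_url entry; the file form is easier to maintain when the list of user pages changes often.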