[htdig] How to exclude particular files from indexing?
Dear all, how do I tell ht://Dig not to index particular filenames (indexint.html and userdata.dat)? Thanks, Martin
To unsubscribe from the htdig mailing list, send a message to [EMAIL PROTECTED] You will receive a message to confirm this. List archives: http://www.htdig.org/mail/menu.html FAQ: http://www.htdig.org/FAQ.html
[htdig] Duplicate results with directories some questions/bugs?
Hello!

I have finally configured htdig and it works great. I have two questions; I do not know whether they are bugs or something badly configured.

1) If I search for a word that matches the name of a directory and that also appears in a file under that directory, I get "duplicate" results (the same link 6, 8 or 9 times). How can this be fixed? What do the link endings ?N=D, ?M=D, ?S=D ... mean? By the way, the subdirectory doc is a symlink; I do not know whether that has anything to do with this wrong result. Some example results follow:

Index of /recursos/doc/tecnica/inktomi
Index of /recursos/doc/tecnica/inktomi Name Last modified Size Description [DIR] Parent Directory 16-Jan-2001 14:47 - [ ] robots_white_paper.pdf 09-Jan-2001 14:20 246k [TXT] instalar 10-Jan-2001 12:31 3k [ ] database.doc 09-Jan-2001 14:15 58k [ ] 4.0_Admin_Guide.pdf 09-Jan-2001 09:40 1 ...
http://correo.testanet.com/recursos/doc/tecnica/inktomi/?N=D , 1027 bytes

Index of /recursos/doc/tecnica/inktomi
Index of /recursos/doc/tecnica/inktomi Name Last modified Size Description [DIR] Parent Directory 16-Jan-2001 14:47 - [TXT] instalar 10-Jan-2001 12:31 3k [ ] robots_white_paper.pdf 09-Jan-2001 14:20 246k [ ] database.doc 09-Jan-2001 14:15 58k [ ] 4.0_Admin_Guide.pdf 09-Jan-2001 09:40 1 ...
http://correo.testanet.com/recursos/doc/tecnica/inktomi/?M=D , 1027 bytes

Index of /recursos/doc/tecnica/inktomi
Index of /recursos/doc/tecnica/inktomi Name Last modified Size Description [DIR] Parent Directory 16-Jan-2001 14:47 - [ ] 4.0_Admin_Guide.pdf 09-Jan-2001 09:40 1.4M [ ] robots_white_paper.pdf 09-Jan-2001 14:20 246k [ ] database.doc 09-Jan-2001 14:15 58k [TXT] instalar 10-Jan-2001 12:31 ...
http://correo.testanet.com/recursos/doc/tecnica/inktomi/?S=D , 1027 bytes

Index of /recursos/doc/tecnica/inktomi
Index of /recursos/doc/tecnica/inktomi Name Last modified Size Description [DIR] Parent Directory 16-Jan-2001 14:47 - [ ] robots_white_paper.pdf 09-Jan-2001 14:20 246k [TXT] instalar 10-Jan-2001 12:31 3k [ ] database.doc 09-Jan-2001 14:15 58k [ ] 4.0_Admin_Guide.pdf 09-Jan-2001 09:40 1 ...
http://correo.testanet.com/recursos/doc/tecnica/inktomi/?D=D , 1027 bytes

2) The Apache DocumentRoot is /home/httpd/html, and /home/httpd/recursos sits next to it (a "brother", not a "son") and is password-protected. The thing is that documents in directories under /home/httpd/html are not indexed by htdig unless I list them explicitly in start_url in htdig.conf:

start_url: http://correo.testanet.com \
 http://correo.testanet.com/recursos/ \
 http://correo.testanet.com/manual/ \
 ... etc.

BUT if I use the -u user:password flag, it behaves recursively and I do not need to list all the subdirectories in the conf file. Is this normal?

Thanks in advance
--
Evelio Martínez
Testanet. Dept. desarrollo software.
Av. Reino de Valencia, 15 - 5
46005 Valencia (Spain)
Tel: +34 96 395 90 00 Fax: +34 96 316 23 19
[htdig] Re: htdig SSL problems
Matt,

I'm still trying to get this to work. For some reason, Solaris (in your case SunOS) builds of htdig (with the patch) are not using the OpenSSL libraries correctly. Are you able to actually retrieve pages by running "./openssl s_client -host {hostname} -port 443" and then issuing the GET command manually ("GET /index.html HTTP/1.0")? In my case this works fine, so I know it's possible for htdig to retrieve the pages, but the patch just doesn't work quite right on these OSes. I'm going to go poking around and see if I can remedy this. Please let me know if you have any additional success with this problem. I'll let you know if I do.

Thanks!!! Jason

Matt Wong wrote:
> Hi Jason, I am running into the exact same problem you posted in the thread with the message http://www.htdig.org/mail/2000/12/0119.html (unable to connect to https) running SunOS 5.7, Apache 1.3.14, openssl-0.9.6. Did you ever manage to get this working?
> Thanks -Matt
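The manual check Jason describes (open a TLS connection, type a GET request by hand) can also be scripted. The following is a minimal sketch using only the Python standard library, not anything from htdig or the SSL patch; the function names `build_request` and `fetch_https` and the default path `/index.html` are illustrative assumptions. Note that the request path needs a leading slash.

```python
import socket
import ssl


def build_request(host, path="/index.html"):
    """Build the same kind of HTTP/1.0 request one would type into s_client."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def fetch_https(host, port=443, path="/index.html", timeout=10):
    """Open a TLS connection, send one GET request, return the raw response."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host, path))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:  # server closed the connection (HTTP/1.0)
                    break
                chunks.append(data)
    return b"".join(chunks)
```

If `fetch_https("your.host.example")` returns a status line and headers from the same machine, the server and network path are fine and the problem is more likely in how the indexer was built against the SSL library.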
Re: [htdig] How to exclude particular files from indexing?
On Fri, 19 Jan 2001, Martin Mielke wrote:
> how do I tell ht://Dig not to index particular filenames (indexint.html and userdata.dat)?

See exclude_urls: http://www.htdig.org/attrs.html#exclude_urls

--
-Geoff Hutchison
Williams Students Online http://wso.williams.edu/
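To make the effect of exclude_urls concrete, here is a small sketch of substring-based exclusion. It assumes that a URL is rejected when it contains any of the listed patterns; it is an illustration, not ht://Dig's own code, and the example.com URLs are made up.

```python
# Sketch only. The corresponding htdig.conf line would look something like:
#   exclude_urls: indexint.html userdata.dat
# Setting the attribute replaces any earlier value, so patterns you already
# rely on need to be repeated on the same line.

EXCLUDE_URLS = ["indexint.html", "userdata.dat"]


def is_excluded(url, patterns=EXCLUDE_URLS):
    """Return True if any pattern occurs anywhere in the URL (assumed rule)."""
    return any(pattern in url for pattern in patterns)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/docs/indexint.html",
        "http://www.example.com/data/userdata.dat",
        "http://www.example.com/docs/index.html",
    ):
        print(url, "->", "skip" if is_excluded(url) else "index")
```

Because the match is on substrings, a pattern such as `userdata.dat` would also exclude a file named `old-userdata.dat`; pick patterns that are specific enough for the site.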
Re: [htdig] append database problem
On Fri, 19 Jan 2001, Julian C. Dunn wrote:
> And htdig started to index all the URLs that were indexed with the original htdig.conf, and never indexed the documents at the start_url in htdig-bc.conf .

How are you sure that it "never" indexed the documents in the start_url? When you run htdig with an existing database, it loads all the URLs in the database and checks them for changes. Typically, this is much faster than an initial dig: if the server supports the If-Modified-Since: header, many documents won't be sent over the connection and only modified documents will be reindexed. However, once it has updated the existing URLs, it will go through any new URLs listed in the start_url. So I would be very much surprised if it "never" got to the new URLs.

--
-Geoff Hutchison
Williams Students Online http://wso.williams.edu/
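The order of work described above (recheck known URLs first, then visit new start URLs) can be modelled in a few lines. This is a simplified simulation and not htdig's implementation: `server_status` stands in for a web server that honours If-Modified-Since, and all names and data structures are assumptions made for the example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def server_status(last_modified, if_modified_since_header):
    """Status code a server honouring If-Modified-Since would answer with."""
    since = parsedate_to_datetime(if_modified_since_header)
    return 200 if last_modified > since else 304


def update_dig(known, start_urls, server_mtimes):
    """Simulate an update run.

    known:         url -> datetime (UTC) of the last indexing run
    start_urls:    URLs listed in the config's start_url
    server_mtimes: url -> datetime (UTC) the document last changed
    Returns (url, action) pairs in the order they are handled.
    """
    visited = []
    # Pass 1: every URL already in the database gets a conditional request.
    for url, indexed_at in known.items():
        header = format_datetime(indexed_at, usegmt=True)
        status = server_status(server_mtimes[url], header)
        visited.append((url, "reindexed" if status == 200 else "unchanged"))
    # Pass 2: only now are start URLs that are not yet known picked up.
    for url in start_urls:
        if url not in known:
            visited.append((url, "new"))
    return visited
```

With a large existing database, the first pass can take a while even when most answers are 304, which may make it look as if the new start_url is being ignored.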
Re: [htdig] Duplicate results with directories some questions/bugs?
On Fri, 19 Jan 2001, Evelio Martinez wrote:
> How can it be fix ? What does it mean the links endings? ?N=D ?M=D ?S=D ... By the way, the subdirectory doc is a symlink. I do not know it it has something to do with this wrong result.

It means that you're running Apache with FancyIndexing turned on. This generates links at the top of the columns to sort the directory index by name, size, date, etc. You will probably want to either:
a) Turn off this behavior in Apache -- I believe the option is "SuppressColumnSorting"
b) Enter the patterns into exclude_urls, e.g.:
exclude_urls: ?N=D ?N=A ?S=D ?S=A ?M=D ?M=A ...

> 2) The apache DocumentRoot is /home/httpd/html and I have the /home/httpd/recursos as "brother" not "son" and password-protected.
> [snip]
> BUT if I use the -u user:password flag it behaves in a recursive way and I do not need to write all the subdirectories in the conf file.

If you do not supply a password to htdig, it will be rejected when it attempts to access password-protected areas. So if parts of your site are protected, you will need to supply the password for them to be indexed.

--
-Geoff Hutchison
Williams Students Online http://wso.williams.edu/
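For option (b), the full set of patterns can be generated rather than typed by hand. The sketch below assumes the sort links use the column letters N (name), M (last modified), S (size) and D (description) with A/D for ascending/descending; the D column is inferred from the `?D=D` URL in the original report, and the helper names are made up for this example.

```python
# Sketch only: builds an exclude_urls line covering Apache's column-sort links.
COLUMNS = ["N", "M", "S", "D"]  # assumed: name, modified, size, description
ORDERS = ["A", "D"]             # ascending, descending


def sort_link_patterns():
    """All '?<column>=<order>' query strings the index page can link to."""
    return [f"?{column}={order}" for column in COLUMNS for order in ORDERS]


def exclude_urls_line(extra=()):
    """Format an htdig.conf line; 'extra' holds patterns you already use."""
    return "exclude_urls: " + " ".join(list(extra) + sort_link_patterns())


def is_sort_link(url):
    """True if the URL is one of the re-sorted views of a directory index."""
    return any(pattern in url for pattern in sort_link_patterns())


if __name__ == "__main__":
    print(exclude_urls_line())
```

Since assigning exclude_urls replaces its previous value, any patterns already in use should be passed in through `extra` so they end up on the same line.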
[htdig] rundig error
I thought everything installed right, no errors finally :) However, when I type rundig I get this:

ajax:/export/netapp/user/rpy/htdig-3.1.5/installdir/ rundig
rundig: @BIN_DIR@/htdig: not found
rundig: @BIN_DIR@/htmerge: not found
rundig: @BIN_DIR@/htnotify: not found
ajax:/export/netapp/user/rpy/htdig-3.1.5/installdir/

So I moved up a directory, and no bin directory has been made. What can I do? If I make a bin directory (which is what CONFIG says) and copy those executables into it, is that going to work? I'm going to try, but that still doesn't explain why there is no bin directory when CONFIG says there should be one... hmm.

ron
[htdig] Plural, singular
Dear users,

I have set up htdig 3.2 (Spanish) on my PC and I have a problem. Imagine a text that contains the word "ganaderos". I'd like to find it by searching for "ganadero", the singular of "ganaderos", but I can't. I can find it by searching for "ganaderos", but that is not what I want. The same thing happens with all words and their singular forms. What can I do?

Thank you
Re: [htdig] rundig error
Well, I made a bin directory, moved those executables into it (dig, notice, etc.) and hardcoded ../bin in rundig. It appears to be doing something, so we'll see.

ron

On Fri, 19 Jan 2001, Ronald Edward Petty wrote:
> I thought everything installed right..no errors finally :) however.. i type in rundig and get this
> [snip]
[htdig] word file
When rundig runs, a word file is made, and htmerge needs it. But for some reason mine is not created, and the only error I get is that htmerge needs /db/db.word or whatever it is called. Where is the -vvv debug output supposed to go? I am lost.

Ron
[htdig] more problems with CONFIG
OK, I think I have found out where some of the problems are coming from: the CONFIG file does not work correctly, or I should say I'm not using it right. This is what I do:

./configure
vi CONFIG (and mess with it)
make
make install

So what is wrong with this? All the paths in the CONFIG file are writable by me. htdig finds the server but finds 0 docs (this is with the rundig script), and I am having to hardcode all the links. What am I doing wrong?

Thanks, Ron
Re: [htdig] rundig error
On Fri, 19 Jan 2001, Ronald Edward Petty wrote:
> I thought everything installed right..no errors finally :) however.. i type in rundig and get this

You apparently did not run "make install", which sets the paths in the rundig script and copies the script and the binaries into the directories set in the CONFIG file.

--
-Geoff Hutchison
Williams Students Online http://wso.williams.edu/
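The "@BIN_DIR@/htdig: not found" errors suggest the script being run is still the template, with its @NAME@ placeholders never filled in. As a rough illustration of that kind of substitution (assuming it is plain text replacement; the actual install step may work differently, and the path `/opt/www/bin` is a made-up value):

```python
import re

PLACEHOLDER = re.compile(r"@([A-Z_]+)@")


def fill_placeholders(template, values):
    """Replace @NAME@ tokens with configured values; leave unknown ones alone."""
    return PLACEHOLDER.sub(
        lambda match: values.get(match.group(1), match.group(0)), template
    )


def unfilled(text):
    """List placeholder names still present, e.g. in a copied-by-hand script."""
    return sorted(set(PLACEHOLDER.findall(text)))


if __name__ == "__main__":
    line = "@BIN_DIR@/htdig -i $opts"
    print(fill_placeholders(line, {"BIN_DIR": "/opt/www/bin"}))
    print(unfilled(line))
```

Running a check like `unfilled` over the rundig file actually being executed is a quick way to tell whether it is the installed copy or the untouched template from the source tree.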
Re: [htdig] rundig error
I did, and it's not working correctly. What should I send you guys to look at? I'm not getting any permission problems or anything... ugh.

ron

On Fri, 19 Jan 2001, Geoff Hutchison wrote:
> You apparently did not run "make install" which will set paths in the rundig script and copy the script and the binaries into the directories set in the CONFIG file.
> [snip]
Re: [htdig] rundig error
OK, I type make install and get no errors, but only /db is created; none of the other directories are. So this is where my problem is. What could be causing it? I do have CONFIG inside the htdig-3.1.5 directory.

Ron
[htdig] Problem compiling?
Is this a minor error when compiling?

Bad tag in serialized data: 90

Mike
Re: [htdig] Problem compiling?
On Fri, 19 Jan 2001, htdighelp wrote:
> Is this error a light one while compiling?

I doubt you'll ever see this while compiling the program itself, though you may see it when indexing. It is not a fatal error, but it usually indicates some form of database corruption.

--
-Geoff Hutchison
Williams Students Online http://wso.williams.edu/
Re:[htdig] db.docdb db.wordlist
I've just installed htdig, and after running rundig I get the following error:

htmerge: Unable to open word list file '/home/httpd/docs/search/test/db/db.wordlist
DB" problem ...: /home/httpd/docs/search/test/db/db.docdb no such file or directory.

I've created a file db.docdb in the db directory, but I then get an error that it is not in the correct file format. Is there an initial startup file I need to run on a first run? As far as I know, the server I am running on is Red Hat Linux 6.0.

Thanks
Re:[htdig] db.docdb db.wordlist
Same thing here.

On Sat, 20 Jan 2001, Cormac Robinson wrote:
> I've just installed htdig and after running rundig I get the following error
> [snip]
[htdig] multiple run instances?
Can I run htdig multiple times on the same system? I see that I can use a separate config file, but not whether I can run multiple instances at once. I've done so, but I keep having problems.

Mike