Hi all, I have a problem with query strings when digging a server. The server produces URLs that carry the same query parameters, but one of those parameters keeps changing its value. As a result, the same page is indexed many times, and, as a second problem, the digging process either never ends or ends only with a very high count of URLs.
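To make the goal concrete, here is a minimal sketch of the normalization I have in mind, written as standalone C++ and not against the htdig sources. The helper name strip_query_param is hypothetical and not part of htdig; the changing parameter is the '_last' one visible in the log further down. The idea is that two URLs differing only in that parameter should reduce to the same string before they are compared with the URLs already seen.

```cpp
#include <string>

// Hypothetical helper (not part of htdig): returns `url` with every
// query parameter named `name` removed. All other parameters keep
// their original order. Fragments ("#...") are not handled specially.
std::string strip_query_param(const std::string& url, const std::string& name) {
    const std::string::size_type qpos = url.find('?');
    if (qpos == std::string::npos) {
        return url;  // no query string, nothing to strip
    }

    const std::string query = url.substr(qpos + 1);
    std::string kept;  // the parameters we keep, re-joined with '&'
    std::string::size_type start = 0;

    while (start <= query.size()) {
        std::string::size_type end = query.find('&', start);
        if (end == std::string::npos) {
            end = query.size();
        }
        const std::string pair = query.substr(start, end - start);
        // Key is everything before '='; a pair without '=' is all key.
        const std::string key = pair.substr(0, pair.find('='));
        if (!pair.empty() && key != name) {
            if (!kept.empty()) {
                kept += '&';
            }
            kept += pair;
        }
        start = end + 1;
    }

    std::string result = url.substr(0, qpos);
    if (!kept.empty()) {
        result += '?';
        result += kept;
    }
    return result;
}
```

With this, the two 'bilder/leer.htm' URLs in the log that differ only in '_last' would both reduce to the same URL. Where such a step has to be applied inside htdig is exactly my open question.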
I know this is a problem with this particular server, but I am not able to change the server's structure, because it is an external server. So I thought I could solve the problem by changing htdig. I tried to eliminate the query parameter '_last' (see below) in the method 'push' of class Server (Server.cc), but that was not really successful, because the digging process then ended too early. Can someone give me an idea of where in htdig I should eliminate this bad query parameter?

Here is an example of digging this server (I stopped it manually!):

ht://dig Start Time: Thu Oct 25 10:15:46 2001
New server: www.xxxx.de, 80
 - Persistent connections: enabled
 - HEAD before GET: disabled
 - Timeout: 30
 - Connection space: 0
 - Max Documents: 10000
 - TCP retries: 1
 - TCP wait time: 5
0:2:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/arzt/pass/req.htm?_usr=&_pwd=&_last=01298101328&_ses=2177219424.01298101139: --- size = 3347
1:3:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/patient/frameset_patient.htm?_usr=&_pwd=: + size = 289
2:5:0:http://www.xxxx.de/: ++ size = 1663
3:8:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/bilder/leer.htm?_usr=&_pwd=&_last=01298101328&_ses=2177219424.01298101139: size = 317
4:7:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/bilder/leer.htm?_usr=&_pwd=: + size = 276
5:6:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/hauptframeset.htm?_usr=&_pwd=: + size = 278
6:11:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/detail_mittelpunkt_patient.htm?_usr=&_pwd=&_last=01298101210&_ses=2177219424.01298101139: size = 587
7:12:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/navi_service_patient.htm?_usr=&_pwd=&_last=01298101240&_ses=2177219424.01298101139: +++-- size = 4279
8:16:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/frameset_service_patient.htm?_usr=&_pwd=&_last=01298101306&_ses=2177219424.01298101139: +++ size = 1144
9:20:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/frameset_service_patient.htm?_usr=&_pwd=&_last=01298101220&_ses=2177219424.01298101139: *** size = 1144
10:19:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/service_patient.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: +-+-*--- size = 11294
11:18:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/navi_service_patient.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: ***-- size = 4279
12:23:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/banner_service.htm?_usr=&_pwd=&_last=01298101242&_ses=2177219424.01298101139: size = 462
13:24:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/patient/frameset_praxis_patient.htm?_usr=&_pwd=&_last=01298101302&_ses=2177219424.01298101139: +++ size = 1137
14:17:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/banner_service.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: size = 462
15:27:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/patient/praxis_patient.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: *--- size = 20050
16:26:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/patient/navi_praxis_patient.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: +++-- size = 4289
17:25:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/banner_praxis.htm?_usr=&_pwd=&_last=01298101547&_ses=2177219424.01298101139: size = 460
18:31:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/arzt/frameset_arzt.htm?_usr=&_pwd=&_last=01298101242&_ses=2177219424.01298101139: ++++ size = 2309
19:36:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/arzt/frameset_arzt.htm?_usr=&_pwd=&_last=01298101326&_ses=2177219424.01298101139: **** size = 2309
20:35:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/bilder/leer.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: size = 317
21:34:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/arzt/pass/req.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: --- size = 3347
22:33:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/navi_home.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: ++-- size = 3732
23:32:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/banner_home.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: size = 456
24:39:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/detail_nebenwirkungen.htm?_usr=&_pwd=&_last=01298101324&_ses=2177219424.01298101139: size = 594
25:40:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/frameset_service_patient.htm?_usr=&_pwd=&_last=01298101321&_ses=2177219424.01298101139: +++ size = 1144
26:44:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/patient/frameset_praxis_patient.htm?_usr=&_pwd=&_last=01298101324&_ses=2177219424.01298101139: +++ size = 1137
27:43:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/service/patient/service_patient.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: +-+-*--- size = 11294
28:47:0:http://www.xxxx.de/cgi-bin/dispatcher.cgi/praxis/patient/praxis_patient.htm?_usr=&_pwd=&_last=01298101548&_ses=2177219424.01298101139: +--- size = 20050
...

_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev
