[htdig] Redirection of Htdig output -- 3.20b2
My intent is to capture the STDOUT from htdig in a disk file. The following shell script works as intended (Linux system):

    #!/mybin/sh
    URLMAIN=mallst
    CONFDIR=/htdig3.2b2/sngl/conf
    DBDIR=/htdig3.2b2/sngl/data
    BINDIR=/htdig3.2b2/bin
    # echo "progname = $0 / $URLMAIN"
    TMPDIR=$DBDIR
    export TMPDIR
    CONFDIR_LOAD=/htdig3.2b2/sngl/conf
    DB_STAT_DIR=/htdig3.2b2/sngl/stat
    $BINDIR/htdig -svic $CONFDIR_LOAD/$URLMAIN.conf > $DB_STAT_DIR/$URLMAIN._htdig.log

The following line of Perl code is intended to run htdig and send STDOUT to /htdig3.2b2/autoshop-online._htdig.log:

    system "/htdig3.2b2/bin/htdig","-svic","/htdig3.2b2/sngl/conf/autoshop-online.conf",">", "/htdig3.2b2/autoshop-online._htdig.log";

htdig produces valid output, but it goes to STDOUT itself rather than to the specified file. As best I can tell from the Perl (5.005_03) documentation, the syntax of the above command is valid. I'm somewhat mystified...

Steven P Haver/602-242-9708

To unsubscribe from the htdig mailing list, send a message to [EMAIL PROTECTED] You will receive a message to confirm this.
List archives: http://www.htdig.org/mail/menu.html
FAQ: http://www.htdig.org/FAQ.html
Re: [htdig] Redirection of Htdig output -- 3.20b2
According to [EMAIL PROTECTED]:
> Following line of Perl code is intended to run htdig, and send STDOUT to /htdig3.2b2/autoshop-online._htdig.log;
>
>     system "/htdig3.2b2/bin/htdig","-svic","/htdig3.2b2/sngl/conf/autoshop-online.conf",">", "/htdig3.2b2/autoshop-online._htdig.log";
>
> The execution of Htdig produces valid content in STDOUT, but it goes to STDOUT itself (as opposed to the specified file). Best I can tell, from review of Perl (5.005_03) documentation, syntax of above command is valid.

While I'm no Perl expert, I've never seen "system" used in this way. I think

    system("/htdig3.2b2/bin/htdig -svic /htdig3.2b2/sngl/conf/autoshop-online.conf > /htdig3.2b2/autoshop-online._htdig.log");

will do what you want. The string just gets passed to the shell for parsing, as far as I know, so you use standard sh/ksh/bash syntax in the string. Perhaps in the syntax you used, the ">" got passed literally as argument 3 to the htdig program.

--
Gilles R. Detillieux              E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
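The difference described above can be sketched in plain sh. This is only an illustration under stated assumptions: show_args is a hypothetical stand-in for htdig and demo.conf is a made-up file name. A quoted ">" reaches the program as an ordinary argument, which is what happens when Perl's system() is given a list, since the list form skips the shell. An unquoted > is consumed by the shell and never reaches the program.

```shell
#!/bin/sh
# Hypothetical stand-in for htdig: reports how many arguments it received
# and prints each one on its own line.
show_args() {
    echo "argc=$#"
    for arg in "$@"; do
        echo "arg: $arg"
    done
}

out=$(mktemp)

# Quoted ">" is just another argument, as in the list form of Perl's system().
# The program sees the operator and the file name; nothing is redirected.
literal=$(show_args -svic demo.conf ">" "$out")

# Unquoted > is parsed by the shell: it opens the file, attaches it to stdout,
# and removes the operator and file name from the argument list.
show_args -svic demo.conf > "$out"
redirected=$(cat "$out")

echo "--- quoted (literal argument)"
echo "$literal"
echo "--- unquoted (shell redirection)"
echo "$redirected"

rm -f "$out"
```

In the first call the stand-in counts four arguments, in the second only two, with the output landing in the file instead.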
[htdig] ssl patch for ht://dig
Hi,

I'm trying to get this SSL patch to work. I finally got the whole patch applied, but now I'm getting an error when compiling:

    Server.cc: In method `Server::Server(char *, int, int, StringList * = 0)':
    Server.cc:44: passing `const char *' as argument 1 of `String::operator =(char *)' discards qualifiers
    make[1]: *** [Server.o] Error 1
    make[1]: Leaving directory `/tmp/work/htdig-3.1.5/htdig'
    make: *** [all] Error 1

This is a line that was modified by the patch. Here is the excerpt from Server.cc:

    //
    // Attempt to get a robots.txt file from the specified server
    //
    String url;
    url = ssl ? "https://" : "http://";    <-- this is where it's getting an error
    url << host << ':' << port << "/robots.txt";
    Document doc(url, 0);

    static int local_urls_only = config.Boolean("local_urls_only");
    time_t timeZero = 0;
    Document::DocStatus status;

Here's the diff:

    @@ -40,7 +40,8 @@
     //
     // Attempt to get a robots.txt file from the specified server
     //
    -String url = "http://";
    +String url;
    +url = ssl ? "https://" : "http://";
     url << host << ':' << port << "/robots.txt";
     Document doc(url, 0);

Any ideas?

Thanks in advance,
Jeremy

Jeremy Lyon, Associate IT Specialist
Qwest, Information Technologies
Re: [htdig] SSL Patch
According to Michael Arndt:
> When applying SSL.0 or SSL.2 (SSL.1 doesn't apply) to a htdig 3.1.5 fresh from Server, I get problems when trying to compile on a Linux box:
> ...
> Server.cc: In method `Server::Server(char *, int, int, StringList * = 0)':
> Server.cc:44: passing `const char *' as argument 1 of `String::operator =(char *)' discards qualifiers

Try replacing lines 43-44 of the patched Server.cc with the following construct to see if it would keep your compiler happy:

    String url = "http://";
    if (ssl) url = "https://";

--
Gilles R. Detillieux
Re: [htdig] ssl patch for ht://dig
Gilles,

Thank you so much. That worked. Now I have a new problem. I am indexing from the local file system. When I do a search, everything works fine except that the URLs returned for the SSL sites appear like this:

    http://ecom.uswest:443/path

It's storing them as regular http:// instead of https://, and it's cutting off the .com. Any ideas?

Thanks,
Jeremy

Gilles Detillieux wrote:
> According to Jeremy Lyon:
>> I'm trying to get this ssl patch to work. I finally installed all the patch, but now I'm getting an error when compiling.
>>
>>     Server.cc: In method `Server::Server(char *, int, int, StringList * = 0)':
>>     Server.cc:44: passing `const char *' as argument 1 of `String::operator =(char *)' discards qualifiers
>>     make[1]: *** [Server.o] Error 1
>>     make[1]: Leaving directory `/tmp/work/htdig-3.1.5/htdig'
>>     make: *** [all] Error 1
>>
>> This is a line that was modified by the patch. Here is the excerpt from Server.cc
>>
>>     //
>>     // Attempt to get a robots.txt file from the specified server
>>     //
>>     String url;
>>     url = ssl ? "https://" : "http://";    <-- this is where it's getting an error
>>     url << host << ':' << port << "/robots.txt";
>>     Document doc(url, 0);
>>
>>     static int local_urls_only = config.Boolean("local_urls_only");
>>     time_t timeZero = 0;
>>     Document::DocStatus status;
>>
>> Here's the diff
>>
>>     @@ -40,7 +40,8 @@
>>      //
>>      // Attempt to get a robots.txt file from the specified server
>>      //
>>     -String url = "http://";
>>     +String url;
>>     +url = ssl ? "https://" : "http://";
>>      url << host << ':' << port << "/robots.txt";
>>      Document doc(url, 0);
>>
>> Any ideas??
>
> I'd try the following construct to see if it would keep your compiler happy:
>
>     String url = "http://";
>     if (ssl) url = "https://";
>
> This seems to be a compiler bug to me, as there shouldn't be a difference in the type of a single string literal or a ternary operator with two string literals as result. We have had reports before of some C++ compilers choking on ternary operators, though, so we have tried to use them sparingly.
>
> --
> Gilles R. Detillieux
Re: [htdig] ssl patch for ht://dig
According to Jeremy Lyon:
> Gilles, Thank you so much. That worked. Now I have a new problem. I am indexing from the local file system. Now when I do a search everything works fine except the urls that are returned for the ssl sites appear like this.
>
>     http://ecom.uswest:443/path
>
> It's storing as a regular http:// instead of https:// and it's cutting off the .com. Any ideas.

Not a clue, but then I haven't had a good long look at the SSL patch to see what it's doing. You should probably ask the developer of the original SSL patch (for 3.1.3, I think), as the current one is supposedly a straight port of it to 3.1.5.

--
Gilles R. Detillieux
Re: [htdig] Problem using local_urls
According to Jeremy Lyon:
> I am running htdig 3.1.5 and I'm trying to use local_urls. It seems to be working fine, except it is not crawling through the pages. I only get the index.html. Any ideas??? Here is my conf file.
>
>     local_urls: https://ecom.uswest.com/redaction/test/=/apps/redaction/current /docs/test/
>     local_urls_only: true
>     start_url: `/export/web/esearch/current/sites/ecom/redaction/site.lst`
>     #limit_urls_to: `/export/web/esearch/current/sites/ecom/redaction/site.lst`
>     limit_normalized: .uswest.com .uswest.net .qwest.com .qwest.net

I assume this was because you were trying to use https URLs before applying the SSL patch. I don't know how well the SSL patch fits in with the local_urls handling, but theoretically the two should be pretty independent of each other.

--
Gilles R. Detillieux
Re: [htdig] Problem using local_urls
I got it working. I had to remove the old database and make a new one.

Thanks

Gilles Detillieux wrote:
> According to Jeremy Lyon:
>> I am running htdig 3.1.5 and I'm trying to use local_urls. It seems to be working fine, except it is not crawling through the pages. I only get the index.html. Any ideas??? Here is my conf file.
>>
>>     local_urls: https://ecom.uswest.com/redaction/test/=/apps/redaction/current /docs/test/
>>     local_urls_only: true
>>     start_url: `/export/web/esearch/current/sites/ecom/redaction/site.lst`
>>     #limit_urls_to: `/export/web/esearch/current/sites/ecom/redaction/site.lst`
>>     limit_normalized: .uswest.com .uswest.net .qwest.com .qwest.net
>
> I assume this was because you were trying to use https URLs before applying the SSL patch. I don't know how well the SSL patch fits in with the local_urls handling, but theoretically the two should be pretty independent of each other.
>
> --
> Gilles R. Detillieux
Re: [htdig] Additional variables for htsearch
According to Oliver Hoogvliet:
> The following line shows my "long.html" file:
> ...
> 4 <BR><FONT FACE="arial, helvetica,geneva" SIZE="-4">$(MODIFIED) ???</FONT><BR>
> ...
> At line 4 you can see three ???. At this place I would like to have one of two alternative messages: (a) News (b) Stories. Is there any possibility to get these alternatives with htdig by using any META-Tag? Can this additional information be used to produce an output by htsearch (i.e. with an user-defined variable)? If not, is there any alternative solution for that?

The template_patterns attribute allows selection of different template files based on patterns in the URLs of the documents, but there's no corresponding method of selecting based on the content of any meta tag. I can't think of an approach that wouldn't require extensive changes to the code to implement what you want, unless you can make a URL-based approach work.

--
Gilles R. Detillieux
[htdig] [off topic] -- how to reset STDOUT Assignment
I have a Perl script which invokes htdig and htmerge. It serves a similar purpose to rundig, but using shell scripts is not practicable in this specific environment.

I want to direct the STDOUT from htdig/htmerge to disk files. open (STDOUT,">diskfile"), followed by system commands to execute htdig or htmerge, does send the STDOUT to the disk file. The trouble is, I can't re-assign STDOUT to its original value. open (STDOUT,"-") appears to lose the output entirely. If I close one disk file and open another, any subsequent STDOUT does go to the new disk file.

I strongly suspect that some form of "dev/" is needed; how can I find the value (under Linux 6.1)?

Steven P Haver/602-242-9708
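For what it's worth, the usual way to get the original STDOUT back is to duplicate its file descriptor before redirecting, rather than reopening a device name. The sketch below shows that mechanism in sh only as an illustration, since the poster cannot use shell scripts in practice. Perl's documented equivalent is the dup form of open, such as open(SAVEOUT, ">&STDOUT") before the redirection and open(STDOUT, ">&SAVEOUT") afterwards. Here run_tool is a hypothetical stand-in for htdig and htmerge, and the temporary log files are made up for the example.

```shell
#!/bin/sh
# Hypothetical stand-in for htdig/htmerge: writes one line to stdout.
run_tool() {
    echo "output of $1"
}

log1=$(mktemp)
log2=$(mktemp)

exec 3>&1            # duplicate the current stdout onto descriptor 3
exec 1>"$log1"       # stdout now goes to the first log file
run_tool htdig
exec 1>"$log2"       # switch stdout to the second log file
run_tool htmerge
exec 1>&3 3>&-       # restore the original stdout, close the spare descriptor

# These lines appear on the original stdout again.
echo "back on the original stdout"
first=$(cat "$log1")
second=$(cat "$log2")
echo "log 1: $first"
echo "log 2: $second"

rm -f "$log1" "$log2"
```

The saved descriptor is what makes the restore possible: once STDOUT has been reopened onto a file without a saved copy, the process no longer holds any handle to where it pointed before.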
Re: [htdig] ssl patch for ht://dig
According to Jeremy Lyon:
> I'm trying to get this ssl patch to work. I finally installed all the patch, but now I'm getting an error when compiling.
>
>     Server.cc: In method `Server::Server(char *, int, int, StringList * = 0)':
>     Server.cc:44: passing `const char *' as argument 1 of `String::operator =(char *)' discards qualifiers
>     make[1]: *** [Server.o] Error 1
>     make[1]: Leaving directory `/tmp/work/htdig-3.1.5/htdig'
>     make: *** [all] Error 1
>
> This is a line that was modified by the patch. Here is the excerpt from Server.cc
>
>     //
>     // Attempt to get a robots.txt file from the specified server
>     //
>     String url;
>     url = ssl ? "https://" : "http://";    <-- this is where it's getting an error
>     url << host << ':' << port << "/robots.txt";
>     Document doc(url, 0);
>
>     static int local_urls_only = config.Boolean("local_urls_only");
>     time_t timeZero = 0;
>     Document::DocStatus status;
>
> Here's the diff
>
>     @@ -40,7 +40,8 @@
>      //
>      // Attempt to get a robots.txt file from the specified server
>      //
>     -String url = "http://";
>     +String url;
>     +url = ssl ? "https://" : "http://";
>      url << host << ':' << port << "/robots.txt";
>      Document doc(url, 0);
>
> Any ideas??

I'd try the following construct to see if it would keep your compiler happy:

    String url = "http://";
    if (ssl) url = "https://";

This seems to be a compiler bug to me, as there shouldn't be a difference in the type of a single string literal or a ternary operator with two string literals as result. We have had reports before of some C++ compilers choking on ternary operators, though, so we have tried to use them sparingly.

--
Gilles R. Detillieux