Re: Beta Testing a Robot
Casey West wrote:

> I'm beta-testing a robot that searches Google when new questions are
> posed to the beginners' lists. I have no idea if it will be useful. :-)
>
> I'm going to watch it closely and hope it is. I'll remove it if I find
> that it does a bad job.
>
> Casey West

Hi Casey,

I'm getting in on this sorta late, but here's my $0.02 worth:

I don't mind getting the bot responses. I guess I may be in a minority on this subject, though. One thing I do see is a fairly broad-spectrum search that sometimes shoots pretty wide of the mark. There are a couple of branches to this, in my view:

1. The search seems to respond to boilerplate with equal or greater weight than to the meat of the question. I see the same problem with the perldoc -q implementation on my computer. I've got some thoughts on approaches to this, but I'll defer them to later, because they are pretty speculative.

2. There may be benefit to using a prioritized search pattern with the significant content of the search string. I have been working on an archive manager for my record of this list [actually a generalized mailbox archive manager], and here is the approach I took.

I actually had three search options: precise phrase [case-insensitive], all words, and any words. The current search pattern seems to be more of an all words search. It might help to narrow that down to demand matches on multiple words.

Within my all words search, I also used a priority queue system for ordering responses by significance. Here I scan the file, keeping a count of total matches found and ensuring that each word was matched at least once. Note that each entry in the hash pointed to by $found_in, and loaded by iterative calls to this routine, has a 'count' element.
input:
  $regexes  -- anonymous array of search strings
  $file_key -- one message sequence number, taken from the anonymous array $file_keys
  $files    -- anonymous hash of filenames, keyed by the above $file_keys
  $found_in -- anonymous hash to be loaded with filenames, keys, and counts

    sub seek_all_words_in_file {
        my ($regexes, $file_key, $files, $found_in) = @_;
        my $file = $files->{$file_key};
        open my $in, '<', $file or die "Could not open $file: $!";

        my $matchcount = {};
        $matchcount->{$_} = 0 foreach @$regexes;

        # This gets me past a header section of the file I'm scanning:
        # read up to the first blank line, stopping at end of file if there is none.
        while (defined(my $header = <$in>)) {
            last if $header eq "\x0A";
        }

        while (defined(my $line = <$in>)) {
            foreach my $regex (@$regexes) {
                # get match counts per line of each regex
                if (my $line_match_count = () = $line =~ /$regex/gi) {
                    $matchcount->{$regex} += $line_match_count;
                }
            }
        }
        close $in;

        my $matched_all = 1;
        for (@$regexes) {
            $matched_all = 0 if not $matchcount->{$_};   # filters if any words are missing
        }
        return if not $matched_all;

        my $count = 0;
        $count += $matchcount->{$_} for @$regexes;
        $found_in->{$file_key}->{filename} = $file if not $found_in->{$file_key};
        $found_in->{$file_key}->{count} = $count;
    }

The calling function uses the above scanning routine thusly:

    ...
    while (my $file_key = shift @$file_keys) {
        seek_all_words_in_file($regexes, $file_key, $message_files, $found_in);
    }
    display_search_results($found_in, $search_dialog);
    ...

handing it off to the following sub.
Keep an eye on the hash pointed to by $best_bets, since that is the actual priority queue mechanism:

    sub display_search_results {
        my ($found_in, $search_dialog) = @_;
        our $message_viewer;
        our $message_list;

        # Bucket the file keys by their total match count
        my $best_bets = {};
        foreach my $file_key (keys %$found_in) {
            my $file = $found_in->{$file_key};
            my $line_count = $file->{count};
            $best_bets->{$line_count} = [] if not $best_bets->{$line_count};
            push @{$best_bets->{$line_count}}, $file_key;
        }

        $message_list->delete('all');
        my $match_count = 0;
        # Highest counts first; within a bucket, highest message number first
        foreach my $priority_level (sort {$b <=> $a} keys %$best_bets) {
            foreach my $file (sort {$b <=> $a} @{$best_bets->{$priority_level}}) {
                my $details = get_message_info($file);
                add_message_to_tree($file, $details, $message_list, $file);
            }
        }
        set_viewer_status('sort', 'none');
    }

Of course this still somewhat lacks subtlety. For one thing, there is no weighting for the balance of search words in the file being searched. It might be better to give extra points for files that had all words in roughly equal quantity.

Between precise phrase and all words there is also another standard that I hadn't really tried to explore. That would be 'words in order'. Something like this might be best with the record separator set to a period, so that it would scan text on a sentence-by-sentence basis, looking for all words in the same order as the search phrase, even if intermingled with other text. Unlike the above, I haven't built or tested this, but a general algorithm for the regex might be:

    my $regex = quotemeta shift @search_words;
    $regex .= '.*' . quotemeta($_) for @search_words;

Which should render a regex that will match any string
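
A minimal sketch of the 'words in order' idea described above, offered as one possible reading rather than the archive manager's actual code. The sub names words_in_order_regex and sentences_matching are hypothetical. It splits an in-memory string into rough sentences instead of setting the record separator on a filehandle, and it matches substrings, so 'open' would also match inside 'reopen'.

```perl
use strict;
use warnings;

# Build a case-insensitive pattern in which every word must appear,
# in the given order, with any other text allowed in between.
# (Hypothetical helper, not part of the code posted in this thread.)
sub words_in_order_regex {
    my @words = @_;
    return unless @words;
    my $pattern = join '.*', map { quotemeta } @words;
    return qr/$pattern/is;    # /s lets .* cross line breaks inside a sentence
}

# Split text on a period followed by whitespace or end of text, and
# return the sentences that contain all the words in order.
sub sentences_matching {
    my ($text, @words) = @_;
    my $regex = words_in_order_regex(@words) or return;
    return grep { /$regex/ } split /\.(?:\s+|\z)/, $text;
}

my $text = "Open the file before you read it. Read the manual, then open a beer.";
print "$_\n" for sentences_matching($text, 'open', 'read');
# prints: Open the file before you read it
```

Only the first sentence is reported, because the second has 'read' before 'open'. Using quotemeta on each word keeps a search term such as 'a.b' from being treated as a pattern.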
Re: Beta Testing a Robot
> It was Wednesday, December 03, 2003 when Casey West took the soap box, saying:
> : I'm beta-testing a robot that searches Google when new questions are
> : posed to the beginners' lists. I have no idea if it will be useful.
> : :-)
>
> I should like to make an important note. This bot is not intended to deter people from answering questions. Indeed, the goal of this list is still to give detailed, useful answers to questions. Specific answers that really help people.
>
> This robot is intended as a First Response Service, something that can help the questioner between the time they ask a question and the time they receive a good answer, from a human. My goal is that the archives from this list and others, and the web itself will be of further assistance, perhaps even adding a spark of creativity to some code or thought process. Therefore I hope this robot will bring that information a small step closer to the list.
>
> So please, keep answering those questions. The robot doesn't do such a great job at answering questions, only guesses at where the technical information might be, which is still a very useful service as far as I can see.
>
> Finally, the subject lines from the robot have been altered to show what it really is, a First Response System. Think of a person in cardiac arrest. First Response does some very general, well tested attempts at saving the patient. Sometimes they're successful, but the patient still needs to see the doctor, maybe even ER. That's what we're here for. :-)
>
> Casey West
>
> PS: For all who asked, I'll show the source code in due course.

I don't quite understand why the first response is sent back to the list rather than just the OP though? Using your analogy it seems like the first response is *also* trying to save the doctors at the same time, which would seem to get in their way. Another concern is that the response comes from your address, or at least uses your name, which will eventually (though maybe the subject will stem this, since it just went in) cause me to ignore all posts from Casey West which would be a bad thing.

Personally I use the list through SMTP and don't have a thread/filter option for about half the time I am reading it (don't worry early next year I intend to write a web mail client that I can use that will do these things ;-)) which means the automated messages get in the way more than they help *me*.

I do like the idea of sending help back to the OP, and I appreciate your time and efforts as admin of the list(s), just throwing out my $.02

http://danconia.org

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: Beta Testing a Robot
Casey West wrote:

> I'm beta-testing a robot that searches Google when new questions are
> posed to the beginners' lists. I have no idea if it will be useful. :-)
>
> I'm going to watch it closely and hope it is. I'll remove it if I find
> that it does a bad job.
>
> Casey West

This does seem like a good idea, but for some of us who cannot access the web from work, this just makes it worse. I signed up for the list to see others' questions and responses, as well as to post my questions and receive answers through email. With the bot giving responses with links that I cannot access, it just makes it more frustrating, since others will not feel the need to respond once the bot has given links.

If it wasn't for the corp nazis here, I would love this bot!

Chris

--
Just Your Friendly Neighborhood
_SPIDEY_
Re: Beta Testing a Robot
It was Wednesday, December 03, 2003 when Casey West took the soap box, saying:
: I'm beta-testing a robot that searches Google when new questions are
: posed to the beginners' lists. I have no idea if it will be useful.
: :-)

I should like to make an important note. This bot is not intended to deter people from answering questions. Indeed, the goal of this list is still to give detailed, useful answers to questions. Specific answers that really help people.

This robot is intended as a First Response Service, something that can help the questioner between the time they ask a question and the time they receive a good answer, from a human. My goal is that the archives from this list and others, and the web itself will be of further assistance, perhaps even adding a spark of creativity to some code or thought process. Therefore I hope this robot will bring that information a small step closer to the list.

So please, keep answering those questions. The robot doesn't do such a great job at answering questions, only guesses at where the technical information might be, which is still a very useful service as far as I can see.

Finally, the subject lines from the robot have been altered to show what it really is, a First Response System. Think of a person in cardiac arrest. First Response does some very general, well tested attempts at saving the patient. Sometimes they're successful, but the patient still needs to see the doctor, maybe even ER. That's what we're here for. :-)

Casey West

PS: For all who asked, I'll show the source code in due course.

--
I have traveled the length and breadth of this country and talked with the best people, and I can assure you that data processing is a fad that won't last out the year.
 -- The editor in charge of business books for Prentice Hall, 1957
Re: Beta Testing a Robot
It was Thursday, December 04, 2003 when Wiggins d Anconia took the soap box, saying:
: > It was Wednesday, December 03, 2003 when Casey West took the soap box,
: > saying:
: > : I'm beta-testing a robot that searches Google when new questions are
: > : posed to the beginners' lists. I have no idea if it will be useful.
: > : :-)
: >
: > I should like to make an important note. This bot is not intended to
: > deter people from answering questions. Indeed, the goal of this list is
: > still to give detailed, useful answers to questions. Specific answers
: > that really help people.
:
: I don't quite understand why the first response is sent back to the list
: rather than just the OP though? Using your analogy it seems like the
: first response is *also* trying to save the doctors at the same time,
: which would seem to get in their way. Another concern is that the
: response comes from your address, or at least uses your name, which will
: eventually (though maybe the subject will stem this, since it just went
: in) cause me to ignore all posts from Casey West which would be a bad
: thing.
:
: Personally I use the list through SMTP and don't have a thread/filter
: option for about half the time I am reading it (don't worry early next
: year I intend to write a web mail client that I can use that will do
: these things ;-)) which means the automated messages get in the way more
: than they help *me*.
:
: I do like the idea of sending help back to the OP, and I appreciate your
: time and efforts as admin of the list(s), just throwing out my $.02

Thanks for the input, it's most valuable. That's the point of testing, after all. :-)

Casey West

--
When a cat is dropped, it always lands on its feet; when toast is dropped, it always lands with the buttered side facing down. I propose to strap buttered toast to the back of a cat; the two will hover, spinning inches above the ground. With a giant buttered-cat array, we could power entire metropolitan areas.
RE: Beta Testing a Robot
My $0.02 on this:

While it may be a worthwhile personal pursuit to write a script that provides relevant results from google based on the text of someone's email/news posting/etc, I don't think this is the forum for it. These are very busy lists to start with, and this essentially will result in doubling the 'new' traffic to the list with information that is irrelevant to anyone but the original poster. The irrelevancy of the auto responder will increase when faced with questions by folks that are so far off that they're not even asking the right question to begin with (seems to be somewhat common in these groups). For these people, investigative questioning is the only correct response, and an autoresponder (with a lot of text - read: information overload) will only confuse them more.

If anything, the response should go directly to the poster, not to the list. IE:

    Your question has been posted to the XXX List, while you're waiting
    for someone to respond, try out these links from google.com that may
    or may not be relevant to your query:

    1. link
       Google Desc
    ...
    10. link
        Google Desc

Also, googling should be the first resort of any technical person. I think the list FAQs should be updated in Section 2, Question 9 'Other Tips Before Posting To The List' with something like:

    _1._ Try google.com with your question first! Google has extremely
    relevant search results and may preclude asking your question to
    the list.

Tossing his copper in the pot,

Matt Shaw
Technical Architect
xwave, An Aliant Company
Bus: 506-389-4641 (Mctn)
Bus: 506-444-9639 (Fred)
Cel: 506-863-8949
[EMAIL PROTECTED]

-----Original Message-----
From: Casey West [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 03, 2003 3:41 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Beta Testing a Robot

I'm beta-testing a robot that searches Google when new questions are posed to the beginners' lists. I have no idea if it will be useful. :-)

I'm going to watch it closely and hope it is. I'll remove it if I find that it does a bad job.

Casey West

--
Good Idea: Kissing a loved one.
Bad Idea: Kissing a total stranger.
Re: Beta Testing a Robot
It was Wednesday, December 03, 2003 when Casey West took the soap box, saying:
: I'm beta-testing a robot that searches Google when new questions are
: posed to the beginners' lists. I have no idea if it will be useful.
: :-)
:
: I'm going to watch it closely and hope it is. I'll remove it if I
: find that it does a bad job.

Thank you for your timely and useful responses, they're under consideration. Until a decision has been reached (and re-coded), the bot will be temporarily suspended.

Casey West

--
I'm just glad it'll be Clark Gable who's falling on his face and not Gary Cooper.
 -- Gary Cooper on his decision not to take the leading role in Gone With The Wind.

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/ http://learn.perl.org/first-response
Re: Beta Testing a Robot
Casey,

I would like to chime in on the side of sending the search results directly to the poster. In most cases, the poster is at the mercy of the search engine they choose. Whereas, you have the advantage of knowing where to search. Please do not abandon this work. A digest -- to which, one may subscribe -- of this activity may also prove useful in the long run to the lurkers like myself.

Thanks for the cool and interesting approach to getting an answer out to the questioner.

Chuck

[EMAIL PROTECTED] wrote:

> It was Wednesday, December 03, 2003 when Casey West took the soap box, saying:
> : I'm beta-testing a robot that searches Google when new questions are
> : posed to the beginners' lists. I have no idea if it will be useful.
> : :-)
> :
> : I'm going to watch it closely and hope it is. I'll remove it if I
> : find that it does a bad job.
>
> Thank you for your timely and useful responses, they're under consideration. Until a decision has been reached (and re-coded), the bot will be temporarily suspended.
>
> Casey West
[REBUILT] Re: Beta Testing a Robot
It was Thursday, December 04, 2003 when Chuck Fox took the soap box, saying:
: Casey,
:
: I would like to chime in on the side of sending the search results
: directly to the poster. In most cases, the poster is at the mercy of
: the search engine they choose. Whereas, you have the advantage of
: knowing where to search. Please do not abandon this work. A digest --
: to which, one may subscribe -- of this activity may also prove useful in
: the long run to the lurkers like myself.

Fear not! I've received a lot of good feedback, and here are the results.

  http://bfr.caseywest.com

The First Response System is no longer going to the mailing list. It's also not going to the OP. If you notice, the new footer on all the list postings points to a redirection URL.

  http://learn.perl.org/first-response

You can subscribe to the RSS feed.

  http://bfr.caseywest.com/index.rdf

You can search the site and read all the archives if you like, as well. We'll see how this approach works out. I've yet to get the FAQ updated, but I will.

: Thanks for the cool and interesting approach to getting an answer out to
: the questioner.

I hope it works. The goal is to be helpful, after all. :-)

: --
: To unsubscribe, e-mail: [EMAIL PROTECTED]
: For additional commands, e-mail: [EMAIL PROTECTED]
: http://learn.perl.org/ http://learn.perl.org/first-response

Aah, there it is.

Casey West

--
Usenet is like Tetris for people who still remember how to read.
 -- Button from the Computer Museum, Boston, MA