Re: Beta Testing a Robot

2003-12-05 Thread R. Joseph Newton
Casey West wrote:

> I'm beta-testing a robot that searches Google when new questions are
> posed to the beginners' lists.  I have no idea if it will be useful.
> :-)
>
> I'm going to watch it closely and hope it is.  I'll remove it if I
> find that it does a bad job.
>
>   Casey West

Hi Casey,

I'm getting in on this sorta late, but here's my $0.02 worth:

I don't mind getting the bot responses.  I guess I may be in a minority
on this subject though.

One thing I do see is a fairly broad spectrum search that sometimes
shoots pretty wide of the mark.  There are a couple branches to this, in
my view:

1.  The search seems to respond to boilerplate with equal or greater
weight than to the meat of the question.  I see the same problem with the
perldoc -q implementation on my computer.  I've got some thoughts on
approaches to this, but I'll defer them to later, because they are
pretty speculative.

2.  There may be benefit to using a prioritized search pattern with the
significant content of the search string.  I have been working on an
archive manager for my record of this list [actually a generalized
mailbox archive manager], and here is the approach I took.

I actually had three search options:  Precise phrase [case-insensitive],
all words, and any words.  The current search pattern seems to be more of
an all words search.  It might help to narrow that down to demand
matches on multiple words.
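
The three options can be sketched roughly as follows.  This is only an
illustration, not the archive manager's actual code: text_matches and
its mode names are hypothetical, and the search words are treated as
literal strings rather than as regexes.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $text satisfies the chosen search
# mode ('phrase', 'all', or 'any') for the given words, else 0.
sub text_matches {
    my ($mode, $text, @words) = @_;

    if ($mode eq 'phrase') {
        # Precise phrase: the words must appear together, case-insensitively.
        my $phrase = join ' ', @words;
        return $text =~ /\Q$phrase\E/i ? 1 : 0;
    }

    # Count how many of the search words occur somewhere in the text.
    my $hits = grep { $text =~ /\Q$_\E/i } @words;
    return $hits == @words ? 1 : 0 if $mode eq 'all';
    return $hits > 0 ? 1 : 0;      # 'any' words
}

my $question = 'How do I open a file for writing?';
for my $mode (qw(phrase all any)) {
    printf "%-6s %s\n", $mode,
        text_matches($mode, $question, qw(open file)) ? 'yes' : 'no';
}
```

The point of the sketch is only that the three modes differ in how many
of the words must be found, and whether they must be adjacent.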

Within my all words search, I also used a priority queue system for
ordering responses by significance.

Here I scan the file keeping a count of total matches found, and
ensuring that each word was matched at least once.  Note that each entry
in the hash pointed to by $found_in, and loaded by iterative calls to
this routine, has a 'count' element.

input:
$regexes--anonymous array of search strings
$file_key--message sequence number of the file to scan
$files--anonymous hash of filenames, keyed by the above $file_key values
$found_in--anonymous hash to be loaded with filenames, keys, and counts

sub seek_all_words_in_file {
  my ($regexes, $file_key, $files, $found_in) = @_;

  my $file = $files->{$file_key};
  open IN, $file or die "Could not open $file $!";
  my $matchcount = {};
  $matchcount->{$_} = 0 foreach @$regexes;
  my $line;
  #  This gets me past a header section of the file I'm scanning
  #  (the eof test stops the skip if there is no blank line at all)
  $line = <IN> until eof(IN) or ($line and $line eq "\x0A");
  while (defined ($line = <IN>)) {
    #  get match counts per line of each regex
    foreach my $regex (@$regexes) {
      if (my $line_match_count = () = $line =~ /$regex/gi) {
        $matchcount->{$regex} += $line_match_count;
      }
    }
  }
  close IN;
  my $matched_all = 1;
  for (@$regexes) {
    #  filters if any words are missing
    $matched_all = 0 if not $matchcount->{$_};
  }
  return if not $matched_all;
  my $count;
  $count += $matchcount->{$_} for @$regexes;
  $found_in->{$file_key}->{filename} = $file if not $found_in->{$file_key};
  $found_in->{$file_key}->{count} = $count;
}

The calling function uses the above scanning routine thusly:

...
  while (my $file_key = shift @$file_keys) {
    seek_all_words_in_file($regexes, $file_key, $message_files, $found_in);
  }
  display_search_results($found_in, $search_dialog);
...
handing it off to the following sub.  Keep an eye on the hash pointed to
by $best_bets, since that is the actual priority queue mechanism:


sub display_search_results {
  my ($found_in, $search_dialog) = @_;

  our $message_viewer;
  our $message_list;
  my $best_bets = {};
  #  bucket the file keys by their match count
  foreach my $file_key (keys %$found_in) {
    my $file = $found_in->{$file_key};
    my $line_count = $file->{count};
    $best_bets->{$line_count} = [] if not $best_bets->{$line_count};
    push @{$best_bets->{$line_count}}, $file_key;
  }
  $message_list->delete('all');
  my $match_count = 0;
  #  highest counts first, then highest file keys within each count
  foreach my $priority_level (sort {$b <=> $a} keys %$best_bets) {
    foreach my $file (sort {$b <=> $a} @{$best_bets->{$priority_level}}) {
      my $details = get_message_info($file);
      add_message_to_tree($file, $details, $message_list, $file);
    }
  }
  set_viewer_status('sort', 'none');
}
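
Stripped of the GUI calls, the bucketing idea can be tried on its own.
The following is a minimal sketch under assumptions: rank_by_count and
the sample data are made up for illustration, and only the shape of the
$found_in hash (a 'count' element per file key) is taken from the
routine above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns file keys ordered by match count, highest
# first; keys that share a count come out in descending numeric order.
sub rank_by_count {
    my ($found_in) = @_;

    # Bucket the keys by count: count => [ file keys with that count ]
    my %best_bets;
    push @{ $best_bets{ $found_in->{$_}{count} } }, $_ for keys %$found_in;

    my @ranked;
    for my $level (sort { $b <=> $a } keys %best_bets) {
        push @ranked, sort { $b <=> $a } @{ $best_bets{$level} };
    }
    return @ranked;
}

# Made-up sample data: message number => match count
my $found_in = {
    101 => { filename => 'msg101.txt', count => 3 },
    205 => { filename => 'msg205.txt', count => 7 },
    150 => { filename => 'msg150.txt', count => 3 },
     42 => { filename => 'msg42.txt',  count => 1 },
};

print "$_ ($found_in->{$_}{count} matches)\n" for rank_by_count($found_in);
```

A single sort on count and then key would give the same order; the
buckets just keep each priority level available as a group.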

Of course this still somewhat lacks subtlety.  For one thing there is no
weighting for the balance of search words in the file being searched.
It might be better to give extra points for files that had all words in
roughly equal quantity.  Between precise phrase and all words is also
another standard, that I hadn't really tried to explore.  That would be
'words in order'.  Something like this might be best with the record
separator set to a period, so that it would scan text on a
sentence-by-sentence basis, looking for all words in the same order as
the search phrase, even if intermingled with other text.  Unlike the
above, I haven't built or tested this but a general algorithm for the
regex might be:

my $regex = quotemeta shift @search_words;
$regex .= '.*' . quotemeta(shift @search_words) while @search_words;

Which should render a regex that will match any string
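
A self-contained way to try the 'words in order' idea is sketched below.
It is an assumption-laden illustration rather than tested archive-manager
code: words_in_order_regex is a hypothetical name, and the words are
matched as substrings, so "open" would also match inside "reopen".

```perl
use strict;
use warnings;

# Hypothetical helper: build a case-insensitive pattern that matches the
# words in the given order, allowing any text (including newlines)
# between them.
sub words_in_order_regex {
    my @words = @_;
    my $pattern = join '.*', map { quotemeta($_) } @words;
    return qr/$pattern/is;
}

my $regex = words_in_order_regex(qw(open file write));
for my $sentence (
    'How do I open a file and write to it',
    'I want to write to a file I already have open',
) {
    printf "%-48s %s\n", $sentence,
        $sentence =~ $regex ? 'match' : 'no match';
}
```

Only the first sentence has the three words in the requested order, so
only it should match.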

Re: Beta Testing a Robot

2003-12-04 Thread Wiggins d Anconia


> It was Wednesday, December 03, 2003 when Casey West took the soap box,
> saying:
> : I'm beta-testing a robot that searches Google when new questions are
> : posed to the beginners' lists.  I have no idea if it will be useful.
> : :-)
>
> I should like to make an important note. This bot is not intended to
> deter people from answering questions. Indeed, the goal of this list is
> still to give detailed, useful answers to questions. Specific answers
> that really help people.
>
> This robot is intended as a First Response Service, something that can
> help the questioner between the time they ask a question and the time
> they receive a good answer, from a human. My goal is that the archives
> from this list and others, and the web itself will be of further
> assistance, perhaps even adding a spark of creativity to some code or
> thought process. Therefore I hope this robot will bring that information
> a small step closer to the list.
>
> So please, keep answering those questions. The robot doesn't do such
> a great job at answering questions, only guesses at where the
> technical information might be, which is still a very useful service
> as far as I can see.
>
> Finally, the subject lines from the robot have been altered to show what
> it really is, a First Response System. Think of a person in cardiac
> arrest. First Response does some very general, well tested attempts at
> saving the patient. Sometimes they're successful, but the patient still
> needs to see the doctor, maybe even ER. That's what we're here for. :-)
>
>   Casey West
>
> PS: For all who asked, I'll show the source code in due course.
 

I don't quite understand why the first response is sent back to the list
rather than just the OP though?  Using your analogy it seems like the
first response is *also* trying to save the doctors at the same time,
which would seem to get in their way. Another concern is that the
response comes from your address, or at least uses your name, which will
eventually (though maybe the subject will stem this, since it just went
in) cause me to ignore all posts from Casey West which would be a bad
thing.

Personally I use the list through SMTP and don't have a thread/filter
option for about half the time I am reading it (don't worry early next
year I intend to write a web mail client that I can use that will do
these things ;-)) which means the automated messages get in the way more
than they help *me*. 

I do like the idea of sending help back to the OP, and I appreciate your
time and efforts as admin of the list(s), just throwing out my $.02

http://danconia.org


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Beta Testing a Robot

2003-12-04 Thread Christopher G Tantalo
Casey West wrote:

> I'm beta-testing a robot that searches Google when new questions are
> posed to the beginners' lists.  I have no idea if it will be useful.
> :-)
> I'm going to watch it closely and hope it is.  I'll remove it if I
> find that it does a bad job.
>  Casey West

This does seem like a good idea, but for some of us who cannot access
the web from work, this just makes it worse.  I signed up for the list
to see others' questions and responses, as well as to post my questions
and receive answers through email.  With the bot giving responses with
links that I cannot access, it just makes it more frustrating, since
others will not feel the need to respond since the bot gave links.
If it weren't for the corp nazis here, I would love this bot!
Chris

--
---
Just Your Friendly Neighborhood
_SPIDEY_




Re: Beta Testing a Robot

2003-12-04 Thread Casey West
It was Wednesday, December 03, 2003 when Casey West took the soap box, saying:
: I'm beta-testing a robot that searches Google when new questions are
: posed to the beginners' lists.  I have no idea if it will be useful.
: :-)

I should like to make an important note. This bot is not intended to
deter people from answering questions. Indeed, the goal of this list is
still to give detailed, useful answers to questions. Specific answers
that really help people.

This robot is intended as a First Response Service, something that can
help the questioner between the time they ask a question and the time
they receive a good answer, from a human. My goal is that the archives
from this list and others, and the web itself will be of further
assistance, perhaps even adding a spark of creativity to some code or
thought process. Therefore I hope this robot will bring that information
a small step closer to the list.

So please, keep answering those questions. The robot doesn't do such
a great job at answering questions, only guesses at where the
technical information might be, which is still a very useful service
as far as I can see.

Finally, the subject lines from the robot have been altered to show what
it really is, a First Response System. Think of a person in cardiac
arrest. First Response does some very general, well tested attempts at
saving the patient. Sometimes they're successful, but the patient still
needs to see the doctor, maybe even ER. That's what we're here for. :-)

  Casey West

PS: For all who asked, I'll show the source code in due course.

-- 
I have traveled the length and breadth of this country and talked
with the best people, and I can assure you that data processing is a
fad that won't last out the year.
 -- The editor in charge of business books for Prentice Hall, 1957





Re: Beta Testing a Robot

2003-12-04 Thread Casey West
It was Thursday, December 04, 2003 when Wiggins d Anconia took the soap box, saying:
: 
: 
:  It was Wednesday, December 03, 2003 when Casey West took the soap box,
: saying:
:  : I'm beta-testing a robot that searches Google when new questions are
:  : posed to the beginners' lists.  I have no idea if it will be useful.
:  : :-)
:  
:  I should like to make an important note. This bot is not intended to
:  deter people from answering questions. Indeed, the goal of this list is
:  still to give detailed, useful answers to questions. Specific answers
:  that really help people.
: 
: I don't quite understand why the first response is sent back to the list
: rather than just the OP though?  Using your analogy it seems like the
: first response is *also* trying to save the doctors at the same time,
: which would seem to get in their way. Another concern is that the
: response comes from your address, or at least uses your name, which will
: eventually (though maybe the subject will stem this, since it just went
: in) cause me to ignore all posts from Casey West which would be a bad
: thing.
: 
: Personally I use the list through SMTP and don't have a thread/filter
: option for about half the time I am reading it (don't worry early next
: year I intend to write a web mail client that I can use that will do
: these things ;-)) which means the automated messages get in the way more
: than they help *me*. 
: 
: I do like the idea of sending help back to the OP, and I appreciate your
: time and efforts as admin of the list(s), just throwing out my $.02

Thanks for the input, it's most valuable.  That's the point of
testing, after all.  :-)

  Casey West

-- 
When a cat is dropped, it always lands on its feet; when toast is
dropped, it always lands with the buttered side facing down. I propose
to strap buttered toast to the back of a cat; the two will hover,
spinning inches above the ground. With a giant buttered-cat array, we
could power entire metropolitan areas.





RE: Beta Testing a Robot

2003-12-04 Thread Shaw, Matthew
My $0.02 on this:

While it may be a worthwhile personal pursuit to write a script that
provides relevant results from google based on the text of someone's
email/news posting/etc, I don't think this is the forum for it. These
are very busy lists to start with and this essentially will result in
doubling the 'new' traffic to the list with information that is
irrelevant to anyone but the original poster. The irrelevancy of the
auto responder will increase when faced with questions by folks that are
so far off that they're not even asking the right question to begin
with (seems to be somewhat common in these groups). For these people,
investigative questioning is the only correct response and an
autoresponder (with a lot of text - read: information overload) will
only confuse them more.

 If anything, the response should go directly to the poster, not to the
list. IE: 

Your question has been posted to the XXX List, while you're waiting for
someone to respond, try out these links from google.com that may or may
not be relevant to your query:

1. link Google Desc
...
10. link Google Desc

Also, googling should be the first resort of any technical person. I
think the list FAQs should be updated in Section 2, Question 9 'Other
Tips Before Posting To The List' with something like:

_1._ Try google.com with your question first! Google has extremely
relevant search results and may make asking your question on the list
unnecessary.

Tossing his copper in the pot,
Matt Shaw
Technical Architect
xwave, An Aliant Company
Bus: 506-389-4641 (Mctn)
Bus: 506-444-9639 (Fred)
Cel: 506-863-8949
[EMAIL PROTECTED]


 -Original Message-
 From: Casey West [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, December 03, 2003 3:41 PM
 To: [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]
 Subject: Beta Testing a Robot
 
 
 I'm beta-testing a robot that searches Google when new 
 questions are posed to the beginners' lists.  I have no idea 
 if it will be useful.
 :-)
 
 I'm going to watch it closely and hope it is.  I'll remove it 
 if I find that it does a bad job.
 
   Casey West
 
 -- 
 Good Idea: Kissing a loved one.
 Bad Idea:  Kissing a total stranger. 
 
 



Re: Beta Testing a Robot

2003-12-04 Thread Casey West
It was Wednesday, December 03, 2003 when Casey West took the soap box, saying:
: I'm beta-testing a robot that searches Google when new questions are
: posed to the beginners' lists.  I have no idea if it will be useful.
: :-)
: 
: I'm going to watch it closely and hope it is.  I'll remove it if I
: find that it does a bad job.

Thank you for your timely and useful responses, they're under
consideration.  Until a decision has been reached (and re-coded), the
bot will be temporarily suspended.

  Casey West

-- 
I'm just glad it'll be Clark Gable who's falling on his face and not
Gary Cooper.
 -- Gary Cooper on his decision not to take the leading role in Gone
With The Wind.


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/ http://learn.perl.org/first-response




Re: Beta Testing a Robot

2003-12-04 Thread Chuck Fox
Casey,

I would like to chime in on the side of sending the search results 
directly to the poster.  In most cases, the poster is at the mercy of 
the search engine they choose. Whereas, you have the advantage of  
knowing where to search.  Please do not abandon this work.  A digest -- 
to which, one may subscribe -- of this activity may also prove useful in 
the long run to the lurkers like myself.

Thanks for the cool and interesting approach to getting an answer out to 
the questioner.

Chuck

[EMAIL PROTECTED] wrote:

> It was Wednesday, December 03, 2003 when Casey West took the soap box, saying:
> : I'm beta-testing a robot that searches Google when new questions are
> : posed to the beginners' lists.  I have no idea if it will be useful.
> : :-)
> :
> : I'm going to watch it closely and hope it is.  I'll remove it if I
> : find that it does a bad job.
>
> Thank you for your timely and useful responses, they're under
> consideration.  Until a decision has been reached (and re-coded), the
> bot will be temporarily suspended.
>  Casey West

 






[REBUILT] Re: Beta Testing a Robot

2003-12-04 Thread Casey West
It was Thursday, December 04, 2003 when Chuck Fox took the soap box, saying:
: Casey,
: 
: I would like to chime in on the side of sending the search results 
: directly to the poster.  In most cases, the poster is at the mercy of 
: the search engine they choose. Whereas, you have the advantage of  
: knowing where to search.  Please do not abandon this work.  A digest -- 
: to which, one may subscribe -- of this activity may also prove useful in 
: the long run to the lurkers like myself.

Fear not!  I've received a lot of good feedback, and here are the
results.

  http://bfr.caseywest.com

The First Response System is no longer going to the mailing list.
It's also not going to the OP.  If you notice, the new footer on all
the list postings points to a redirection URL.

  http://learn.perl.org/first-response

You can subscribe to the RSS feed.

  http://bfr.caseywest.com/index.rdf

You can search the site and read all the archives if you like, as
well.  We'll see how this approach works out.  I've yet to get the FAQ
updated, but I will.

: Thanks for the cool and interesting approach to getting an answer out to 
: the questioner.

I hope it works. The goal is to be helpful, after all. :-)

: -- 
: To unsubscribe, e-mail: [EMAIL PROTECTED]
: For additional commands, e-mail: [EMAIL PROTECTED]
: http://learn.perl.org/ http://learn.perl.org/first-response

Aah, there it is.

  Casey West

-- 
Usenet is like Tetris for people who still remember how to read. 
  -- Button from the Computer Museum, Boston, MA

