On 1/21/13 3:34 PM, Douglas Roberts wrote:
Uh, I'm gonna guess that Nick is not a Unix user.
Unix tools are not the best for this kind of task. One would be better served by a programmable editor or e-mail client that can traverse lines and work in terms of sentences and paragraphs. One editor that is well suited to this, and can work in batch mode, is Emacs.

Consider someone writing on a portable device who enters narrow text in this form:

Let's extract a
sentence that crosses
several
lines.

Line-oriented (Unix) tools require assembling lines until the "." is reached. In Emacs, appending a sentence to a list `l' is easy:

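(defvar l nil
  "List of sentences grabbed so far, most recent first.")

(defun grab-sentence ()
  "Push the sentence after point onto `l', joining wrapped lines."
  (interactive)
  ;; Skip the whitespace left over from the previous sentence.
  (skip-chars-forward " \t\n")
  (let ((start (point)))
    (forward-sentence)
    ;; Collapse line breaks so the sentence comes out as one line.
    (push (replace-regexp-in-string
           "[ \t\n]+" " "
           (buffer-substring-no-properties start (point)))
          l)))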

Yielding a list with one string:

("Let's extract a sentence that crosses several lines.")

On a second call, the list grows to two elements, and so on.

But that's just a baby step. Then one needs to categorize text by its owner. One way to do this is to read chronologically from top to bottom, and assume that the first occurrence of a passage, especially one with no leading ">", belongs to the author named in the most recent From line. Even the quoting conventions change depending on which mail program is used. Some people tag with names or initials as delimiters, others quote passages with quotation marks, others (like Emacs users) put a tag on each line like Nick>, others simply use ">", ">>" as quoting levels, and still others use rich-text cues like boldface.
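
As a rough sketch, recognizing just two of those conventions (plain ">" levels and per-line name tags) might look like the following. The name quote-prefix-info is made up for illustration, and the regexps are my guesses at the common cases, not a survey of what mail programs actually emit:

(defun quote-prefix-info (line)
  "Return (DEPTH . TAG) describing the quote prefix of LINE.
DEPTH counts leading \">\" marks; TAG is a name such as \"Nick\"
when the line starts with \"Nick>\", else nil."
  (cond
   ;; Name tag: a short word directly followed by ">".
   ((string-match "\\`[ \t]*\\([A-Za-z][A-Za-z0-9]*\\)>" line)
    (cons 1 (match-string 1 line)))
   ;; Plain quoting: a run of ">" marks, possibly space separated.
   ((string-match "\\`\\(?:[ \t]*>\\)+" line)
    (let ((prefix (match-string 0 line))
          (depth 0)
          (pos 0))
      ;; Count the ">" marks inside the matched prefix.
      (while (string-match ">" prefix pos)
        (setq depth (1+ depth)
              pos (match-end 0)))
      (cons depth nil)))
   ;; No recognizable prefix: treat as the current author's own text.
   (t (cons 0 nil))))

A line starting with ">> " comes back with depth 2 and no tag, and one starting with "Nick> " with depth 1 and the tag "Nick". Quotation marks, initials, and boldface are not handled at all, and an unquoted line that happens to begin with something like "a>b" would be misread as tagged.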

What's needed is a sort of content-addressable memory (e.g. a hash from text to author).
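
A minimal sketch of such a hash, assuming passages are compared by exact text once whitespace is collapsed, could be the following. The names text-authors, normalize-passage, and note-author are invented for this example:

(defvar text-authors (make-hash-table :test 'equal)
  "Map from normalized passage text to the author first seen with it.")

(defun normalize-passage (text)
  "Collapse whitespace in TEXT so re-wrapped quotes compare equal."
  (let ((s (replace-regexp-in-string "[ \t\n]+" " " text)))
    ;; Drop the single leading and trailing space that may remain.
    (setq s (replace-regexp-in-string "\\` " "" s))
    (replace-regexp-in-string " \\'" "" s)))

(defun note-author (text author)
  "Record AUTHOR for TEXT unless an earlier author is already known.
Return the author on record."
  (let ((key (normalize-passage text)))
    (or (gethash key text-authors)
        (puthash key author text-authors))))

Reading the messages in chronological order and calling note-author on each sentence means the first author seen keeps the attribution, so a later quoted copy of the same sentence, even one re-wrapped onto different lines, still maps back to the original writer. Quote prefixes would have to be stripped before the lookup, and an exact-match key will miss passages that were trimmed or edited when quoted.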

And for goodness' sake, stop pretending that the mailing list is anything like a collaborative essay. It is not; there is no central purpose. No one has agreed on anything. A mailing list is a set of people approaching a topic from their own points of view, and in the process redefining what the topic is. It's not a blog, which is one person's point of view with some attached comments.

Marcus
============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com
