Howdy, Holger:

> For instance if the last character of a URL before
> whitespace in an email is a "?" then this is most likely a
> real question mark, not part of the URL, regardless of
> whether a "?" at the end of a URL is valid. Same thing for
> ",", "." etc.

Hey, go check out this web site: http://www.example.com/? It's
great.  Also, this one is neat, too: http://www.example.com/,

> That kind of determination is heuristic in nature though,
> and cannot be derived from URL grammar rules. The exact set
> of heuristics would depend on the context, e.g. the language
> of the surrounding text. This can actually get very
> complicated.  Look at the two examples

  I don't know that a heuristic approach would ever be really
  adequate; you really need a full natural language grammar
  to make solid distinctions.
 
> "Have a look at http://www.example.com/? for a great time."
> 
> "Have you seen http://www.example.com/? Looks cool."
> 
> In the first case the "?" appears to be part of the URL, in
> the second it does not.

  You can detect the difference between the above two sentences
  because, looking at the first sentence, a decent natural
  language grammar won't allow the second PP as a complete
  sentence (but will recognize "have" as a main verb and thus
  complete the VP with the PP), whereas with the second
  sentence, the grammar will recognize "Have" as an auxiliary
  for "seen" and make a match (using a gap and fill scheme,
  for example) based on the fact that this is a yes/no question
  equivalent of its declarative form (You have seen
  http://www.example.com.) and therefore it will correctly
  determine http://www.example.com is the end of the sentence
  and the question mark is the sentence terminator.

  Which is to say, as you said, that it can get quite
  complicated, but it is also to say that heuristics may
  not be sufficient for a lot of cases. :-)  
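
  To make the limitation concrete, here is a minimal Python sketch
  of the "strip trailing punctuation" heuristic discussed above. It
  is only an illustration under stated assumptions: the names
  URL_CANDIDATE, TRAILING_PUNCTUATION and extract_urls are
  hypothetical, and the regex is a crude stand-in for real URL
  grammar. Note that it returns the same URL for both example
  sentences, so it cannot tell the two readings apart.

```python
import re

# Assumption: a URL candidate is any run of non-whitespace
# characters that starts with http:// or https://.
URL_CANDIDATE = re.compile(r"https?://\S+")

# Assumption: these characters are treated as sentence punctuation
# whenever they appear at the very end of a candidate.
TRAILING_PUNCTUATION = "?,.!;:"


def extract_urls(text):
    """Return URL candidates with trailing punctuation stripped."""
    return [
        match.group(0).rstrip(TRAILING_PUNCTUATION)
        for match in URL_CANDIDATE.finditer(text)
    ]


first = "Have a look at http://www.example.com/? for a great time."
second = "Have you seen http://www.example.com/? Looks cool."

# Both calls print ['http://www.example.com/']: the heuristic always
# drops the "?", whether or not it was meant as part of the URL.
print(extract_urls(first))
print(extract_urls(second))
```

  Deciding that the "?" belongs to the URL in the first sentence
  would need information about the surrounding sentence, which a
  character-level rule like this does not have.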

  -jeff 
