[twitter-dev] Re: URLification

2009-12-17 Thread dbasch
Periods and parentheses are valid url characters. Assuming that an
adjacent period or closing parenthesis is not part of the url is a
gamble. The most sensible urlification includes all valid characters
until it finds one that clearly delimits the url such as a space.

http://www.ietf.org/rfc/rfc1738.txt
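
A minimal Python sketch of that whitespace-delimited approach. The pattern and the function name are illustrative assumptions, not anything Twitter is known to use. It shows why a sentence-ending period or a closing parenthesis ends up inside the link:

```python
import re

# Hypothetical pattern: start at the scheme and take every
# non-whitespace character until the next space (or end of text).
NAIVE_URL = re.compile(r"https?://\S+")

def naive_urls(text):
    """Return every run of text that the naive rule would turn into a link."""
    return NAIVE_URL.findall(text)

print(naive_urls("See (http://example.com/foo). Next"))
# → ['http://example.com/foo).']
```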

On Dec 17, 7:13 am, Ken Dobruskin  wrote:
> When adding a URL surrounded by parentheses or followed by a period, these 
> marks are included in the resulting link. Is a trailing whitespace the only 
> workaround? It's ugly and wastes a character.


[twitter-dev] Re: URLification

2009-12-17 Thread dbasch
You can get pretty sophisticated and have lots of heuristics to guess
what the user actually meant. For example, a period followed by a
space and a word that starts with uppercase almost certainly means
that the period was the end of a sentence and not part of the url.
Twitter probably should do this, as it's quite conservative.
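
A rough Python sketch of that one heuristic, assuming URLs are first grabbed up to the next whitespace. The names and the pattern are made up for illustration and cover only the "period, space, capitalized word" case:

```python
import re

# Hypothetical first pass: take everything up to the next whitespace.
URL_RUN = re.compile(r"https?://\S+")

def urls_with_sentence_heuristic(text):
    """Drop a trailing period when the following word starts with uppercase."""
    urls = []
    for match in URL_RUN.finditer(text):
        url = match.group(0)
        rest = text[match.end():]
        # Period + whitespace + uppercase letter: treat the period as
        # the end of a sentence rather than part of the URL.
        if url.endswith(".") and re.match(r"\s+[A-Z]", rest):
            url = url[:-1]
        urls.append(url)
    return urls

print(urls_with_sentence_heuristic("Read http://example.com/a. Then reply."))
# → ['http://example.com/a']
```

A period followed by a lowercase word is left attached, since that case is ambiguous.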

Diego

On Dec 17, 11:10 am, Ken Dobruskin  wrote:
> True, but Yahoo! Mail and others do get it right.
> It's been a few years since I stopped worrying about sending an email
> with a URL at the end of a sentence. I wonder how they do it.


[twitter-dev] Re: URLification

2009-12-17 Thread dbasch
I agree. I searched the issues db and didn't find it. Not sure if it
belongs as an API issue but I submitted it anyway.

http://code.google.com/p/twitter-api/issues/detail?id=1298

On Dec 17, 2:49 pm, Ken Dobruskin  wrote:
> A closing parenthesis followed by a space seems like a pretty safe bet too. 
> I'm sure those rules have been worked out long ago - the RFC was published in 
> '94.


[twitter-dev] Re: URLification

2009-12-18 Thread dean.j.robinson
I've recently switched to using this regex for pulling out links, and
haven't spotted any issues with extra characters surrounding the
links as yet.

/(?i)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d?[.])(?:[^\s()<>]+|\([^\s()<>]+\))+(?:\([^\s()<>]+\)|[^`!()\[\]{};:\'".,<>?«»“”‘’\s]))/

It was posted by @gruber to his twitter feed a couple of days after
his post that Chad linked to above.
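
For anyone wanting to try it, here is a hedged Python adaptation of the pattern quoted above. The delimiters and the inline (?i) are replaced with re.IGNORECASE, and the function name is an illustrative assumption. The nested quantifiers can backtrack heavily on long pathological input, so treat this as a sketch rather than production code:

```python
import re

# The regex from this message, split over three raw strings for readability.
GRUBER_URL = re.compile(
    r"""\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d?[.])"""
    r"""(?:[^\s()<>]+|\([^\s()<>]+\))+"""
    r"""(?:\([^\s()<>]+\)|[^`!()\[\]{};:'".,<>?«»“”‘’\s]))""",
    re.IGNORECASE,
)

def find_urls(text):
    """Return the URLs matched by the pattern (capture group 1)."""
    return [m.group(1) for m in GRUBER_URL.finditer(text)]

print(find_urls("See (http://example.com/foo). Next"))
# → ['http://example.com/foo']
print(find_urls("http://en.wikipedia.org/wiki/Rome_(disambiguation)"))
# → ['http://en.wikipedia.org/wiki/Rome_(disambiguation)']
```

The last group forces the match to end on a balanced parenthesized chunk or a non-punctuation character, which is what drops the trailing period and the unmatched closing parenthesis.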



On Dec 19, 3:48 am, Chad Etzel  wrote:
> This might be relevant to your interests:
> http://daringfireball.net/2009/11/liberal_regex_for_matching_urls
>
> Something definitely changed in the twitter web front-end code which
> is borking url matching as of a month or so ago...
>
> -Chad


RE: [twitter-dev] Re: URLification

2009-12-17 Thread Ken Dobruskin

True, but Yahoo! Mail and others do get it right.
It's been a few years since I stopped worrying about sending an email with a
URL at the end of a sentence. I wonder how they do it.

> Date: Thu, 17 Dec 2009 05:48:31 -0800
> Subject: [twitter-dev] Re: URLification
> From: dba...@gmail.com
> To: twitter-development-talk@googlegroups.com
> 
> Periods and parentheses are valid url characters. Assuming that an
> adjacent period or closing parenthesis is not part of the url is a
> gamble. The most sensible urlification includes all valid characters
> until it finds one that clearly delimits the url such as a space.
> 
> http://www.ietf.org/rfc/rfc1738.txt

RE: [twitter-dev] Re: URLification

2009-12-17 Thread Ken Dobruskin

A closing parenthesis followed by a space seems like a pretty safe bet too. I'm 
sure those rules have been worked out long ago - the RFC was published in '94.

> Date: Thu, 17 Dec 2009 07:55:14 -0800
> Subject: [twitter-dev] Re: URLification
> From: dba...@gmail.com
> To: twitter-development-talk@googlegroups.com
> 
> You can get pretty sophisticated and have lots of heuristics to guess
> what the user actually meant. For example, a period followed by a
> space and a word that starts with uppercase almost certainly means
> that the period was the end of a sentence and not part of the url.
> Twitter probably should do this, as it's quite conservative.
> 
> Diego

Re: [twitter-dev] Re: URLification

2009-12-17 Thread Raffi Krikorian
it's not an API issue -- the API doesn't do any auto-URLification. however,
i'll pass this thread off to the web client team.

On Thu, Dec 17, 2009 at 1:13 PM, dbasch  wrote:

> I agree. I searched the issues db and didn't find it. Not sure if it
> belongs as an API issue but I submitted it anyway.
>
> http://code.google.com/p/twitter-api/issues/detail?id=1298



-- 
Raffi Krikorian
Twitter Platform Team
http://twitter.com/raffi


Re: [twitter-dev] Re: URLification

2009-12-17 Thread Harshad RJ
Although it's not an API issue, it might be good to track it as one, because
Twitter clients can then follow exactly the same policies that the Twitter web
interface does.

If there is a standard regular expression that can be used for detecting a
URL, it could be published as a guideline in the API documentation for
consistency between all clients.

cheers,
Harshad

On Fri, Dec 18, 2009 at 2:45 AM, Raffi Krikorian  wrote:

> it's not an API issue -- the API doesn't do any auto-URLification. however,
> i'll pass this thread off to the web client team.
>
-- 
Harshad RJ
http://hrj.wikidot.com


Re: [twitter-dev] Re: URLification

2009-12-18 Thread Chad Etzel
This might be relevant to your interests:
http://daringfireball.net/2009/11/liberal_regex_for_matching_urls

Something definitely changed in the twitter web front-end code which
is borking url matching as of a month or so ago...

-Chad

On Fri, Dec 18, 2009 at 2:44 AM, Harshad RJ  wrote:
> Although it's not an API issue, it might be good to track it as one, because
> Twitter clients can then follow exactly the same policies that the Twitter web
> interface does.
> If there is a standard regular expression that can be used for detecting a
> URL, it could be published as a guideline in the API documentation for
> consistency between all clients.
> cheers,
> Harshad