Re: [twitter-dev] Re: parsing out entities from tweets (a.k.a. parsing out hashtags is hard!)

2010-05-14 Thread Adam Green
Disambiguating short URLs and delivering the true URL and title would
be a real plus, not just for developers, but for the target of a URL.
While it does add a load to twitter's servers, it will save many, many
useless hits to the target.

Imagine 100,000 Twitter apps resolving each short URL found in a
tweet. All of them doing it within seconds of the tweet arriving via
the streaming API. It would be an automatic DoS against every site
mentioned in a tweet.

If this sounds hyperbolic, read the APIwiki docs that say 2,000
followers is an expected max. Ha!
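
To put a rough shape on the argument above: if a short URL is resolved once and the answer reused, the target sees one hit instead of one per app per tweet. A minimal sketch in Python, assuming a hypothetical `resolver` callable that follows the redirect and returns the long URL (the class name, the `lookups` counter, and the example URLs are illustrative, not part of any Twitter API):

```python
from typing import Callable, Dict


class ShortUrlCache:
    """Resolve each short URL at most once and reuse the answer."""

    def __init__(self, resolver: Callable[[str], str]) -> None:
        # `resolver` is a hypothetical callable that follows redirects
        # and returns the long URL. It is injected so the cache itself
        # makes no network calls.
        self._resolver = resolver
        self._cache: Dict[str, str] = {}
        self.lookups = 0  # how many times the resolver was actually called

    def resolve(self, short_url: str) -> str:
        # Only the first request for a given short URL reaches the resolver.
        if short_url not in self._cache:
            self.lookups += 1
            self._cache[short_url] = self._resolver(short_url)
        return self._cache[short_url]
```

Whether this cache lives in each app or on Twitter's side is the real question; only the latter takes the load off the target across all 100,000 apps.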


On Fri, May 14, 2010 at 9:15 AM, Zhami wrote:
> +1 for it being optional as well -- keep the bandwidth to a minimum
> for scenarios where it's not needed.
>
> +1 for having short URLs' original (long) URL provided (perhaps also
> an option?)
>


Re: [twitter-dev] Re: parsing out entities from tweets (a.k.a. parsing out hashtags is hard!)

2010-05-14 Thread Raffi Krikorian
>
> Besides, if this is the library used for web, you're not doing it
> right. :)
> For example, to mention URL parsing only, you don't check for valid
> domain names (e.g. www.test.failure is matched as URL),
> some characters are not recognized as part of a link (e.g. "|" in
> "http://translate.google.com/?hl=en#auto|en|bonjour")...
>

all we're trying to do is help people standardize on how they parse stuff.
making sure you can represent what is a hash tag, a url, a username, etc.,
in the same way that twitter.com does it, can be difficult.
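
For illustration, the two cases raised in the quoted message (a made-up top-level domain and a "|" inside a link) could be handled along these lines. This is only a sketch: `KNOWN_TLDS` is a tiny illustrative subset, `find_urls` is a hypothetical helper, and the pattern is far simpler than whatever twitter.com actually uses.

```python
import re
from typing import List

# Illustrative subset only; a real check would need the full TLD list.
KNOWN_TLDS = {"com", "net", "org"}

# Optional scheme, dotted host name, then an optional path that runs
# until the next whitespace (so characters like "|" stay in the link).
CANDIDATE_RE = re.compile(
    r"(?:https?://)?(?:[a-z0-9-]+\.)+([a-z]{2,})(?:/\S*)?",
    re.IGNORECASE,
)


def find_urls(text: str) -> List[str]:
    """Return candidate URLs whose top-level domain is in KNOWN_TLDS."""
    found = []
    for match in CANDIDATE_RE.finditer(text):
        tld = match.group(1).lower()
        if tld in KNOWN_TLDS:
            found.append(match.group(0))
    return found
```

With these assumptions, `www.test.failure` is rejected because "failure" is not a known TLD, while the translate.google.com link is kept whole, "|" included.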

-- 
Raffi Krikorian
Twitter Platform Team
http://twitter.com/raffi


Re: [twitter-dev] Re: parsing out entities from tweets (a.k.a. parsing out hashtags is hard!)

2010-05-13 Thread Raffi Krikorian
yeah - i'm extremely sensitive to that not happening again.  i'll keep that
in mind.  i expect there may be another draft floated around before we start
to roll this out.

On Thu, May 13, 2010 at 11:14 PM, Rich wrote:

> I can see the <text> inside some of the entities tag causing some
> developers some problems as it's the same tag name as the status.  Of
> course all of us should be able to handle it, but just look what
> happened with the extra user id tag inside a status
>
> On May 13, 11:11 pm, Raffi Krikorian wrote:
> > hey glenn.
> >
> > i think something went wrong in the copy and paste -- there should have been
> > a space between the URL and the hashtag.
> >
> > On Thu, May 13, 2010 at 11:02 PM, glenn gillen wrote:
> > > Raffi,
> >
> > > This follows on nicely from the presentation at Warblecamp last week
> > > discussing how difficult it is to do this right, and I think a
> > > consistent approach across all clients (including twitter.com,
> > > mobile.twitter, and 3rd party apps) should be priority number 1.
> > > However looking at your example:
> >
> > > On May 13, 10:25 pm, Raffi Krikorian wrote:
> > > > {
> > > >  "text" : "hey @raffi tell @noradio to check out http://dev.twitter.com#hot",
> > > >
> > > > {
> > > >   "url" : "http://dev.twitter.com",
> > > >   "indices" : [38, 64]
> > > > },
> > > >   ],
> > > >   "hashtags" : [
> > > > {
> > > >   "text" : "#hot",
> > > >   "indices" : [66, 69]
> > > >   "url" : "http://search.twitter.com/search?q=%23hot"
> > > > }
> > > >   ]
> > > >  }
> >
> > > Without looking at how twitter.com would currently handle that
> > > example, I would have expected the url to be "http://dev.twitter.com/
> > > #hot" and for the tweet to contain no hashtag. If the hashtag always
> > > takes precedence I'd have no way to link to the following without
> > > using a URL shortener: http://oauth.net/core/1.0a/#anchor41
> > > --
> > > Glenn Gillen
> > > http://glenngillen.com/
> >
> > --
> > Raffi Krikorian
> > Twitter Platform Team
> > http://twitter.com/raffi
>



-- 
Raffi Krikorian
Twitter Platform Team
http://twitter.com/raffi


Re: [twitter-dev] Re: parsing out entities from tweets (a.k.a. parsing out hashtags is hard!)

2010-05-13 Thread Raffi Krikorian
hey glenn.

i think something went wrong in the copy and paste -- there should have been
a space between the URL and the hashtag.

On Thu, May 13, 2010 at 11:02 PM, glenn gillen wrote:

> Raffi,
>
> This follows on nicely from the presentation at Warblecamp last week
> discussing how difficult it is to do this right, and I think a
> consistent approach across all clients (including twitter.com,
> mobile.twitter, and 3rd party apps) should be priority number 1.
> However looking at your example:
>
> On May 13, 10:25 pm, Raffi Krikorian wrote:
> > {
> >  "text" : "hey @raffi tell @noradio to check out http://dev.twitter.com#hot",
> >
> > {
> >   "url" : "http://dev.twitter.com",
> >   "indices" : [38, 64]
> > },
> >   ],
> >   "hashtags" : [
> > {
> >   "text" : "#hot",
> >   "indices" : [66, 69]
> >   "url" : "http://search.twitter.com/search?q=%23hot"
> > }
> >   ]
> >  }
>
> Without looking at how twitter.com would currently handle that
> example, I would have expected the url to be "http://dev.twitter.com/
> #hot" and for the tweet to contain no hashtag. If the hashtag always
> takes precedence I'd have no way to link to the following without
> using a URL shortener: http://oauth.net/core/1.0a/#anchor41
> --
> Glenn Gillen
> http://glenngillen.com/
>



-- 
Raffi Krikorian
Twitter Platform Team
http://twitter.com/raffi


RE: [twitter-dev] Re: parsing out entities from tweets (a.k.a. parsing out hashtags is hard!)

2010-05-13 Thread Brian Smith
Glenn Gillen wrote:
> Without looking at how twitter.com would currently handle that example, I
> would have expected the url to be "http://dev.twitter.com/ #hot" and for the
> tweet to contain no hashtag. If the hashtag always takes precedence I'd have no
> way to link to the following without using a URL shortener:
> http://oauth.net/core/1.0a/#anchor41

I think you are overlooking the space between the last slash and "#hot".
URLs cannot contain (un-encoded) spaces.
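
A rough sketch of that rule in Python, in case it helps: a URL runs until the next whitespace, and a "#" only starts a hashtag when it is not glued to preceding text. The patterns and the `extract_entities` helper are my own simplification, not how twitter.com parses, and the indices here are [start, end) character offsets, which is an assumption about the draft format.

```python
import re
from typing import Dict, List

# Simplified patterns for illustration only.
URL_RE = re.compile(r"https?://\S+")      # a URL ends at the first whitespace
HASHTAG_RE = re.compile(r"(?<!\S)#\w+")   # "#" must follow whitespace or start the text


def extract_entities(text: str) -> Dict[str, List[dict]]:
    """Return URLs and hashtags with [start, end) character offsets."""
    urls = [
        {"url": m.group(0), "indices": [m.start(), m.end()]}
        for m in URL_RE.finditer(text)
    ]
    hashtags = [
        {"text": m.group(0), "indices": [m.start(), m.end()]}
        for m in HASHTAG_RE.finditer(text)
    ]
    return {"urls": urls, "hashtags": hashtags}
```

Under these assumptions "http://dev.twitter.com #hot" yields one URL and one hashtag, while "http://oauth.net/core/1.0a/#anchor41" yields a single URL with its fragment intact and no hashtag.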

Regards,
Brian