[twitter-dev] Re: How to count to 140: or, characters vs. bytes vs. entities, third strike

2009-05-20 Thread sillyt...@googlemail.com

I think everyone would like an open system for doing this :)

On May 15, 9:58 pm, Eric Martin wrote:
> I'd be interested to see a document that details the standards for
> this as well.
>
> On May 15, 12:01 pm, leoboiko  wrote:
>
> > > On May 15, 2:03 pm, leoboiko  wrote:
> > > while one with 71 UTF-8
> > > bytes might not (if they’re all non-GSM, say, ‘ç’ repeated 71 times).
>
> > Sorry, that was a bad example: 71 ‘ç’s take up 142 bytes in UTF-8, not
> > 71.
>
> > Consider instead 71 ‘^’ (or ‘\’, ‘[’ &c.).  These take one byte in
> > UTF-8, but their shortest encoding in SMS is two-byte (in GSM).  So
> > the 71-byte UTF-8 string would take more than 140 bytes as SMS and not
> > fit an SMS.
>
> > Why does that matter? Consider a Twitter update like this:
>
> >     @d00d: in the console, type "cat ~/file.sql | tr [:upper:]
> > [:lower:] | less".  then you cand read the sql commands without the
> > annoying caps
>
> > That looks like a perfectly reasonable 140-character UTF-8 string, so
> > Twitter won't truncate it or warn about sending a short version.  But
> > its SMS encoding would take some 147 bytes, so the last words would be
> > truncated.
>
> > --
> > Leonardo Boiko http://namakajiri.net
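
The counting in the quoted message can be checked with a short Python
sketch. It assumes the GSM 03.38 default alphabet with its standard
extension table, where each extension character costs two septets
(escape plus code). The names GSM_BASIC, GSM_EXTENDED, gsm_septets and
EXAMPLE are made up for this illustration. The function counts
septets, the 7-bit units of the GSM alphabet; it does not model how
septets are packed into the message payload or any limit Twitter
itself applies.

```python
# GSM 03.38 default alphabet (basic table, without the escape code).
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: each of these is sent as ESC + code, i.e. two septets.
GSM_EXTENDED = "^{}\\[~]|€\f"

# The example update from the quoted message, typo included.
EXAMPLE = (
    '@d00d: in the console, type "cat ~/file.sql | tr [:upper:] '
    '[:lower:] | less".  then you cand read the sql commands without the '
    'annoying caps'
)


def gsm_septets(text):
    """Return the GSM 7-bit length of text in septets.

    Returns None if any character is outside the GSM alphabet, in which
    case the message would need a different encoding (e.g. UCS-2).
    """
    total = 0
    for ch in text:
        if ch in GSM_BASIC:
            total += 1
        elif ch in GSM_EXTENDED:
            total += 2
        else:
            return None
    return total


if __name__ == "__main__":
    for label, text in [("71 x ^", "^" * 71), ("71 x ç", "ç" * 71),
                        ("example", EXAMPLE)]:
        print(label, len(text), len(text.encode("utf-8")), gsm_septets(text))
```

For the example update this gives 140 characters and 147 GSM units,
since it contains seven extension characters (one ~, two |, two [ and
two ]), which agrees with the "some 147" figure above.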


[twitter-dev] Re: Anti Spam

2009-05-20 Thread sillyt...@googlemail.com

Very true and exactly what we plan to do, although on balance a
scoring system is best handled on a server rather than a client,
i.e. by Twitter Search. In the meantime there is plenty we can do in
real time in JavaScript for our 'filtered' feed.

In our case we're dealing with events/trending topics which can have
large amounts of data so run-time web app processing time becomes an
issue.

On May 19, 2:49 pm, Abraham Williams <4bra...@gmail.com> wrote:
> On Tue, May 19, 2009 at 06:06, sillyt...@googlemail.com <
>
> I know these are just examples but both of those metrics are available to
> you and nothing is stopping you from restricting data from accounts based on
> those metrics.
>
> --
> Abraham Williams | http://the.hackerconundrum.com
> Hacker | http://abrah.am | http://twitter.com/abraham
> Web608 | Community Evangelist | http://web608.org
> This email is: [ ] blogable [x] ask first [ ] private.
> Sent from Mountain View, CA, United States


[twitter-dev] Re: Anti Spam

2009-05-19 Thread sillyt...@googlemail.com

We had a chat about Twitter spam yesterday and would like a
points-based approach to user ranking or spam rating. For those of us
working on 3rd party applications, a spam score would be very useful
for making quick decisions about search results.

For example, a new user would have a higher 'spam-rating' than a
long-time user, as would someone with a huge follow:follower ratio.
Given how spam is used on Twitter, there are several categories which
could be dealt with at run-time on a server but less easily in a live
application.
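
To make the idea concrete, here is a rough Python sketch of a
points-based score built from the two signals mentioned above. The
Account fields, the thresholds (7 days, a ratio of 10) and the
one-point weights are all invented for illustration; nothing here
reflects how Twitter rates accounts.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical fields; a real client would fill these from user data.
    age_days: float
    following: int
    followers: int


def spam_score(account, new_account_days=7, ratio_threshold=10.0):
    """Return a small integer score; higher means more spam-like.

    Thresholds and weights are placeholder assumptions, not tuned values.
    """
    score = 0
    # Signal 1: very new accounts get a point.
    if account.age_days < new_account_days:
        score += 1
    # Signal 2: following far more users than follow back gets a point.
    # max(..., 1) avoids dividing by zero for accounts with no followers.
    ratio = account.following / max(account.followers, 1)
    if ratio > ratio_threshold:
        score += 1
    return score
```

A client could then drop or dim search results whose author scores
above some cut-off, which is the kind of quick decision described
above.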

BTW I worry that to join the abuse team one has to "have what it
takes". Does that mean they hand out large amounts of abuse ?-)

On May 18, 7:12 pm, Doug Williams wrote:
> We have a team dedicated to controlling the number of spam messages and
> accounts in the system. The number of accounts, sophistication, and
> techniques are constantly growing. The team is doing a great job of
> isolating known attack vectors. Obviously there is still work to be
> done. The abuse team is hiring. If you think you have what it takes, please
> apply: http://twitter.com/jobs
> Thanks,
> Doug
> --
>
> Doug Williams
> Twitter Platform Support http://twitter.com/dougw
>
> On Sat, May 16, 2009 at 8:14 PM, sillyt...@googlemail.com
> <sillyt...@googlemail.com> wrote:
>
> > I'm working as part of the #twumpet team and as part of our project
> > we're developing an application as well as running some Twitter events
> > - the first having been Eurovision earlier today.
>
> > As we hit the top trend, #twumpet got - and is still getting -
> > enormous amounts of spam. Spammers are signing up, blitzing messages
> > through one immediately after another, and then moving on to the next
> > account.
>
> > Does anyone know if Twitter are going to stop users firing tweets off
> > one after another so blatantly like this? I just checked on a couple
> > of top trends and all I can see is spammers tonight.
>
> > Also, as a developer working on a project which will be dealing with
> > trending topics and popular searches, I need a quick way to throw out
> > spam messages.
>
> > I have a couple of ideas for strategies but would be interested in
> > discussing them, and perhaps a group effort which used Twitter itself
> > for rapid short term spam classification & reporting [through Twitter
> > search or a further API]. The one thing about spammers is they appear
> > and disappear extremely quickly so any lists would be very short and
> > 'live', at least for now...
>
> > @newretro


[twitter-dev] Anti Spam

2009-05-16 Thread sillyt...@googlemail.com

I'm working as part of the #twumpet team and as part of our project
we're developing an application as well as running some Twitter events
- the first having been Eurovision earlier today.

As we hit the top trend, #twumpet got - and is still getting -
enormous amounts of spam. Spammers are signing up, blitzing messages
through one immediately after another, and then moving on to the next
account.
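
As a rough illustration of how a client might flag this pattern, the
Python sketch below checks whether one account posted more than a
given number of messages inside a sliding time window. The function
name is_burst and the defaults (5 posts in 60 seconds) are arbitrary
assumptions, and timestamps are taken to be plain seconds.

```python
def is_burst(timestamps, max_posts=5, window_seconds=60):
    """Return True if more than max_posts timestamps fall in any window.

    timestamps: iterable of post times in seconds for a single account.
    The defaults are illustrative guesses, not recommended values.
    """
    times = sorted(timestamps)
    start = 0
    for end, t in enumerate(times):
        # Shrink the window from the left until it spans <= window_seconds.
        while t - times[start] > window_seconds:
            start += 1
        if end - start + 1 > max_posts:
            return True
    return False
```

Since these accounts appear and disappear quickly, a check like this
only needs the last few minutes of messages per account.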

Does anyone know if Twitter are going to stop users firing tweets off
one after another so blatantly like this? I just checked on a couple
of top trends and all I can see is spammers tonight.

Also, as a developer working on a project which will be dealing with
trending topics and popular searches, I need a quick way to throw out
spam messages.

I have a couple of ideas for strategies but would be interested in
discussing them, and perhaps a group effort which used Twitter itself
for rapid short term spam classification & reporting [through Twitter
search or a further API]. The one thing about spammers is they appear
and disappear extremely quickly so any lists would be very short and
'live', at least for now...

@newretro