[twitter-dev] Re: Search queries not working

2009-04-02 Thread feedbackmine

Hi Matt,

I have tried the language parameter of Twitter search and found the
results very unreliable. For example,
http://search.twitter.com/search?lang=all&q=tweetjobsearch returns 10
results (all in English), but
http://search.twitter.com/search?lang=en&q=tweetjobsearch returns
only 3.

I searched this list and it seems you are using an n-gram based
algorithm
(http://groups.google.com/group/twitter-development-talk/msg/565313d7b36e8d65).
I have found that n-gram algorithms work very well for language
detection, but the quality of the training data can make a big
difference.

I recently developed a language detector (in Ruby) myself:
http://github.com/feedbackmine/language_detector/tree/master
It uses Wikipedia data for training, and in my limited experience it
works well. Using Wikipedia data was not my idea; all credit goes to
Kevin Burton
(http://feedblog.org/2005/08/19/ngram-language-categorization-source/).
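
For anyone unfamiliar with the approach, here is a minimal sketch of
rank-order character n-gram language detection (the "out-of-place"
profile comparison described by Cavnar and Trenkle). It is written in
Python for brevity and is not the code from language_detector or from
Twitter search. The function names, the two tiny training strings
(stand-ins for Wikipedia text), and the top_k cutoff are all
assumptions made for illustration.

```python
from collections import Counter


def ngram_profile(text, n_max=3, top_k=300):
    """Map the top_k most frequent character n-grams (n = 1..n_max) to their rank."""
    # Normalize case and whitespace, and pad so word boundaries form n-grams too.
    text = " " + " ".join(text.lower().split()) + " "
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    # Most frequent first; ties are broken alphabetically so ranks are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gram: rank for rank, (gram, _) in enumerate(ranked[:top_k])}


def out_of_place(doc_profile, lang_profile, max_penalty):
    """Sum of rank differences; n-grams unknown to the language cost max_penalty."""
    total = 0
    for gram, rank in doc_profile.items():
        if gram in lang_profile:
            total += abs(rank - lang_profile[gram])
        else:
            total += max_penalty
    return total


def detect(text, profiles, top_k=300):
    """Return the language whose profile is closest to the text's profile."""
    doc = ngram_profile(text, top_k=top_k)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang], top_k))


# Toy training data (assumption): a real detector would train on far more text.
TRAINING = {
    "en": ("the quick brown fox jumps over the lazy dog while the other "
           "animals walk through the woods and look for something to eat "
           "every night"),
    "es": ("el gato negro duerme en la casa grande mientras los perros "
           "corren por el campo verde y buscan algo para comer todas las "
           "noches"),
}
PROFILES = {lang: ngram_profile(text) for lang, text in TRAINING.items()}

if __name__ == "__main__":
    print(detect("the dog walks through the woods", PROFILES))
```

With training text this small the result on short tweets will be
noisy, which is the point about training data quality made above.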

Just thought you may be interested.

@feedbackmine
http://twitter.com/feedbackmine

On Mar 31, 11:22 am, Matt Sanford wrote:
> Hi there,
>
>      Can you provide an example URL where since_id isn't working so I
> can try and reproduce the issue? As for language, the language
> identifier is not 100% accurate and sometimes makes mistakes.
> Hopefully not too many mistakes, but it definitely does.
>
> Thanks;
>    — Matt Sanford / @mzsanford
>
> On Mar 31, 2009, at 08:14 AM, codepuke wrote:
>
> > Hi all;
>
> > I see a few people complaining about the since_id not working.  I too
> > have the same issue - I am currently storing the last executed id and
> > having to check new tweets to make sure their id is greater than my
> > last processed id as a temporary workaround.
>
> > I have also noticed that the filter by language param also doesn't
> > seem to be working 100% - I notice a few Chinese tweets, as well as
> > tweets having a null value for language...
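
The client-side workaround codepuke describes in the quoted message
(remember the last processed id and discard anything that is not
newer) might look roughly like the sketch below. The function name and
the shape of a tweet as a dict with an "id" key are assumptions for
illustration, not part of the Search API.

```python
def filter_new_tweets(tweets, last_processed_id):
    """Drop tweets at or below last_processed_id; return (fresh, new_last_id)."""
    # Re-check ids locally instead of trusting since_id on the server side.
    fresh = [t for t in tweets if t["id"] > last_processed_id]
    # Advance the stored id only if something newer actually arrived.
    new_last_id = max((t["id"] for t in fresh), default=last_processed_id)
    return fresh, new_last_id


if __name__ == "__main__":
    batch = [{"id": 5}, {"id": 12}, {"id": 9}]
    fresh, last_id = filter_new_tweets(batch, 9)
    print(fresh, last_id)
```

Persisting the returned id between polls keeps the check cheap even
when the server returns older results again.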


[twitter-dev] Data mining feed is not working today

2009-03-09 Thread feedbackmine

Sometimes I get no results, and sometimes I get the same old tweets
(1300721813-1300738909). It has been like this for a few hours.

For example:
GET /statuses/public_timeline_partners_nrab481.xml HTTP/1.0
User-Agent: Wget/1.10.2
Accept: */*
Host: twitter.com
Connection: Keep-Alive

HTTP/1.1 200 OK
Date: Mon, 09 Mar 2009 22:15:47 GMT
Server: hi
Content-Encoding: UTF-8
Content-Type: application/xml
Content-Length: 83
Set-Cookie: JSESSIONID=19EBF930711384DEA614920DC9D6A213; Path=/
Cache-Control: max-age=1800
Expires: Mon, 09 Mar 2009 22:45:48 GMT
Connection: close

Is this a known problem?

Thanks.
feedbackmine
http://twitter.com/feedbackmine


[twitter-dev] Open source twitter job search engine

2009-03-08 Thread feedbackmine

Hello all,
  I built a job search engine for Twitter over two weekends, and
thought someone might be interested.

 A few highlights:
 1. uses the Twitter data mining feed to collect data
 2. uses a libsvm classifier and a few hardcoded Twitter IDs to identify
job posts
 3. it is a Rails web app and uses Sphinx for search
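
 To illustrate item 2, here is a rough sketch of how a hardcoded ID
list and a classifier could be combined. It is in Python rather than
the project's Ruby, and it is not the tweetjobsearch code: the ID
values are placeholders, the "user_id" and "text" keys are assumed,
and keyword_stand_in is a deliberately trivial substitute for a
trained libsvm model.

```python
# Hypothetical placeholder ids of accounts assumed to post only jobs.
KNOWN_JOB_ACCOUNT_IDS = {111, 222}


def is_job_post(tweet, classifier, known_ids=KNOWN_JOB_ACCOUNT_IDS):
    """Accept tweets from known job accounts, else defer to the classifier."""
    if tweet["user_id"] in known_ids:
        return True
    return bool(classifier(tweet["text"]))


def keyword_stand_in(text):
    """Stand-in for a trained model (assumption): flags a few obvious words."""
    words = text.lower().split()
    return any(word in ("hiring", "job", "jobs") for word in words)


if __name__ == "__main__":
    tweet = {"user_id": 5, "text": "We are hiring a Rails developer"}
    print(is_job_post(tweet, keyword_stand_in))
```

 Passing the classifier in as a callable keeps the ID shortcut
separate from whatever model does the real work.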

 Lessons learned:
 There are lots of recruiters out there (way more than I expected!)
using Twitter to re-publish jobs that they have already published
somewhere else. Originally I just wanted to identify job posts that are
ONLY available on Twitter.

 Demo site is at: http://tweetjobsearch.com/
 Source code is at: http://github.com/feedbackmine/tweetjobsearch/tree/master
 I have documented the journey at: http://twitter.com/feedbackmine

 Thanks!