[twitter-dev] Re: Languages available on Twitter.
Adam:

On Dec 5, 5:59 am, a.przewo...@yahoo.pl wrote: Hello, I'd like to ask: will Twitter be available in other languages? I think Twitter might gain more popularity if it were available in many languages. I hope you will think about that.

Twitter is gradually translating the UI of its web site into other languages. See their blog posts: "Growing Around the World" http://blog.twitter.com/2010/04/growing-around-world.html and "Coming Soon: Twitter in More Languages" http://blog.twitter.com/2009/10/coming-soon-twitter-in-more-languages.html. Perhaps you are interested in Twitter.com localised into Polish. I'm only an outside developer, not a Twitter decision maker, but I expect that Twitter agrees on the value of translating Twitter.com into many languages; it's only a matter of when, and with what priority relative to other projects.

If you use a Twitter client other than Twitter.com, then of course the developer of that client is responsible for making it available in other languages. And, of course, the Twitter message stream has been open to messages in most languages from the very start.

I hope this is helpful for you.

--
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: http://groups.google.com/group/twitter-development-talk
[twitter-dev] Re: parsing out entities from tweets (a.k.a. parsing out hashtags is hard!)
Raffi:

On May 13, 2:25 pm, Raffi Krikorian ra...@twitter.com wrote: as shown above, we'll be parsing out all mentioned users, all lists, all included URLs, and all hashtags

This is an interesting step forward. The internationalisation considerations can be sticky, though. I did some entity parsing from tweets as part of my Twanguages project (a language census of Twitter). One discovery was that people are in fact using hashtags with non-Latin scripts. Another is that some people are using the '#' character without intending to create a hashtag (e.g. "we are #2 in line"). How will your entity parsing handle non-Latin hashtags, Latin-character hashtags with accented characters, and strings starting with '#' that are not intended as hashtags?

Also note that URLs can now have non-Latin top-level domain names, as well as second-level domain names and other path parts. For instance, http://وزارة-الأتصالات.مصر is a valid URL in the .مصر top-level domain. Will your entity-parsing code handle such URLs?

In any case, it would be very helpful if the Platform team would document exactly what regular expressions govern the entities you recognise. I might not agree with your definition of hashtag syntax, but at least I want to know what it is. See, for example, the running questions about how to measure the length of a status message.

On May 13, 2:25 pm, Raffi Krikorian ra...@twitter.com wrote: matt sanford (@mzsanford) on our internationalization team released the twitter-text library (http://github.com/mzsanford/twitter-text-rb) to help make parsing easier and standardized (in fact, we use this library ourselves), but we on the Platform team wondered if we could make this even easier for our developers. ...

I wasn't aware of this, and I'll take a look. Thank you for the tip!

— Jim DeLaHunt, Vancouver, Canada
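As an illustration of why hashtag syntax is tricky, here is a minimal sketch in Python (my own naive pattern, NOT the twitter-text rules): a Unicode-aware pattern happily matches non-Latin and accented hashtags, but also wrongly captures strings like "#2".

```python
import re

# Naive, illustrative hashtag pattern: '#' followed by one or more
# word characters. In Python 3, \w matches letters from any script,
# plus digits and underscore -- which is exactly the problem below.
HASHTAG = re.compile(r"#(\w+)")

def hashtags(text):
    """Return the hashtag bodies found in text."""
    return HASHTAG.findall(text)

print(hashtags("Tweeting in #日本語 and #français"))  # non-Latin and accented tags match
print(hashtags("we are #2 in line"))                  # but '#2' is (wrongly) captured too
```

A real tokenizer would need extra rules to reject all-digit tags and to decide which scripts and combining marks count as hashtag characters; that is precisely the specification I'm asking the Platform team to publish.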
[twitter-dev] Re: Empty reply from server on Streaming API?
Just to close the loop on my issue: I got some off-list help, Twitter investigated, and it turns out my IP address had been blacklisted in error. The blacklisting is removed, and I'm back in business.

I must say it's nice that I could ask a question on this list and get pretty much immediate attention from the proper Twitter person, and over a weekend at that. Thanks, John Kalucki, and thanks, Twitter.

--Jim DeLaHunt, Vancouver, Canada @jdlh http://jdlh.com/en/pr/twanguages.html
Twanguages: a language census of Twitter @twanguages

On Nov 15, 6:52 am, John Kalucki jkalu...@gmail.com wrote: There are two levels of blacklisting. One is a temporary ban that resets every few minutes. This one gives you 401 errors. Then there's an IP black hole that is removed by an operator. Currently the IP black hole sends a TCP RST, but we might also null route you. You can verify an IP block by attempting to connect from a different network. If you provide an account name, I can look through the logs and see what happened. An IP address can also be helpful. In the absence of these keys, I can only speculate as to what occurred. -John Kalucki http://twitter.com/jkalucki Services, Twitter Inc.

On Nov 15, 12:54 am, Jim DeLaHunt jim.delah...@gmail.com wrote: John: Thanks very much for the reply.

On Nov 14, 8:30 pm, John Kalucki jkalu...@gmail.com wrote: This sounds like you were ignoring HTTP error codes and eventually got blacklisted. Consider: http://apiwiki.twitter.com/Streaming-API-Documentation#Connecting

Hmm... I was launching single curl requests, making one connection, then breaking it after at most 3 seconds. I would then wait 6 minutes before trying to connect again. I didn't record the HTTP result code I got back, but it seems that, according to Streaming-API-Documentation#Connecting, I was being tremendously conservative. That doc recommends backing off for 10 to 240 seconds on a non-200 HTTP error code; I always backed off for 360 seconds immediately, whether the HTTP error code was good or bad. How would backing off by *more* than the docs call for get me blacklisted?

You can tell for sure by turning off --silent and using -v to see what's going on. You should be getting some sort of message back, or absolutely nothing back. Those codes are not HTTP error codes; they must be some curl artifact.

Correct, the codes 6 and 52 are defined by curl. See http://curl.haxx.se/docs/manpage.html. Using -v and other curl options, I see clearly that what I'm getting back is absolutely nothing: 0 bytes in response to my HTTP query. (That's the meaning of the code 52.) For the last 6 hours, I've polled once per hour (once per 3600 seconds), and this null response has not changed.

The docs don't say how to confirm that I've been blacklisted. Any suggestions for how to confirm that? Nor do they say what to do if I am in fact blacklisted. They say that the blacklist lasts an indeterminate period of time, so maybe they are implying I should just wait and the system will lift the blacklist itself. The biggest issue, though, is to understand why I could have become blacklisted, when I backed off for 360 seconds after each attempt. Because right now, I don't know what I should do differently. Thanks again for the guidance.

--Jim DeLaHunt, Vancouver, Canada @jdlh Twanguages: a language census of Twitter @twanguages http://jdlh.com/en/pr/twanguages.html

Tcpdump is also sometimes useful. -John Kalucki http://twitter.com/jkalucki Services, Twitter Inc.

On Nov 14, 6:13 pm, Jim DeLaHunt jim.delah...@gmail.com wrote: Am I the only one seeing this? I call the Streaming API 10x/hour. For the last 23 hours or so, I've been getting bad responses every time. I use a cron job to call from the Linux shell:

curl --user myid:mypassword --silent --fail --max-time 3 --retry 0 http://stream.twitter.com/1/statuses/sample.xml

and I usually get curl return code (52) "Empty reply from server", though sometimes (6) "name lookup timed out". The same thing happens when I ask for .json instead of .xml. The failures started at the rate of 1-2/hour on 2009/11/13 09:00h UTC (Friday early morning PST), became continuous as of 2009/11/14 03:24h UTC (Friday evening PST), and remain continuous. Is anyone else calling this API and failing? Or succeeding? In the last 24 hours? Thank you,

--Jim DeLaHunt, Vancouver, Canada @jdlh Twanguages: a language census of Twitter @twanguages http://jdlh.com/en/pr/twanguages.html
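For later readers: the reconnect behaviour the streaming docs describe (back off between attempts, roughly doubling, within the 10-to-240-second range) can be sketched like this. This is a Python illustration of that schedule only; the function name and exact doubling rule are my own assumptions, not Twitter's published algorithm.

```python
def backoff_delays(initial=10.0, cap=240.0):
    """Yield successive reconnect delays in seconds: start at `initial`
    and double after each failure, never exceeding `cap`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, cap)

# Example: the first six delays after repeated HTTP errors.
gen = backoff_delays()
print([next(gen) for _ in range(6)])  # [10.0, 20.0, 40.0, 80.0, 160.0, 240.0]
```

In a real client you would sleep for each yielded delay between failed connection attempts, and reset the generator after a successful connection.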
[twitter-dev] Re: Empty reply from server on Streaming API?
John: Thanks very much for the reply.

On Nov 14, 8:30 pm, John Kalucki jkalu...@gmail.com wrote: This sounds like you were ignoring HTTP error codes and eventually got blacklisted. Consider: http://apiwiki.twitter.com/Streaming-API-Documentation#Connecting

Hmm... I was launching single curl requests, making one connection, then breaking it after at most 3 seconds. I would then wait 6 minutes before trying to connect again. I didn't record the HTTP result code I got back, but it seems that, according to Streaming-API-Documentation#Connecting, I was being tremendously conservative. That doc recommends backing off for 10 to 240 seconds on a non-200 HTTP error code; I always backed off for 360 seconds immediately, whether the HTTP error code was good or bad. How would backing off by *more* than the docs call for get me blacklisted?

You can tell for sure by turning off --silent and using -v to see what's going on. You should be getting some sort of message back, or absolutely nothing back. Those codes are not HTTP error codes; they must be some curl artifact.

Correct, the codes 6 and 52 are defined by curl. See http://curl.haxx.se/docs/manpage.html. Using -v and other curl options, I see clearly that what I'm getting back is absolutely nothing: 0 bytes in response to my HTTP query. (That's the meaning of the code 52.) For the last 6 hours, I've polled once per hour (once per 3600 seconds), and this null response has not changed.

The docs don't say how to confirm that I've been blacklisted. Any suggestions for how to confirm that? Nor do they say what to do if I am in fact blacklisted. They say that the blacklist lasts an indeterminate period of time, so maybe they are implying I should just wait and the system will lift the blacklist itself. The biggest issue, though, is to understand why I could have become blacklisted, when I backed off for 360 seconds after each attempt. Because right now, I don't know what I should do differently. Thanks again for the guidance.

--Jim DeLaHunt, Vancouver, Canada @jdlh Twanguages: a language census of Twitter @twanguages http://jdlh.com/en/pr/twanguages.html

Tcpdump is also sometimes useful. -John Kalucki http://twitter.com/jkalucki Services, Twitter Inc.

On Nov 14, 6:13 pm, Jim DeLaHunt jim.delah...@gmail.com wrote: Am I the only one seeing this? I call the Streaming API 10x/hour. For the last 23 hours or so, I've been getting bad responses every time. I use a cron job to call from the Linux shell:

curl --user myid:mypassword --silent --fail --max-time 3 --retry 0 http://stream.twitter.com/1/statuses/sample.xml

and I usually get curl return code (52) "Empty reply from server", though sometimes (6) "name lookup timed out". The same thing happens when I ask for .json instead of .xml. The failures started at the rate of 1-2/hour on 2009/11/13 09:00h UTC (Friday early morning PST), became continuous as of 2009/11/14 03:24h UTC (Friday evening PST), and remain continuous. Is anyone else calling this API and failing? Or succeeding? In the last 24 hours? Thank you,

--Jim DeLaHunt, Vancouver, Canada @jdlh Twanguages: a language census of Twitter @twanguages http://jdlh.com/en/pr/twanguages.html
[twitter-dev] Empty reply from server on Streaming API?
Am I the only one seeing this? I call the Streaming API 10x/hour. For the last 23 hours or so, I've been getting bad responses every time. I use a cron job to call from the Linux shell:

curl --user myid:mypassword --silent --fail --max-time 3 --retry 0 http://stream.twitter.com/1/statuses/sample.xml

and I usually get curl return code (52) "Empty reply from server", though sometimes (6) "name lookup timed out". The same thing happens when I ask for .json instead of .xml. The failures started at the rate of 1-2/hour on 2009/11/13 09:00h UTC (Friday early morning PST), became continuous as of 2009/11/14 03:24h UTC (Friday evening PST), and remain continuous. Is anyone else calling this API and failing? Or succeeding? In the last 24 hours?

Thank you,
--Jim DeLaHunt, Vancouver, Canada @jdlh
Twanguages: a language census of Twitter @twanguages http://jdlh.com/en/pr/twanguages.html
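A note for anyone hitting the same confusion later in this thread: the numbers 6 and 52 above are curl exit codes, not HTTP status codes; an HTTP status exists only when the server actually sends a response. A tiny Python sketch to keep the two ideas separate (the descriptions paraphrase curl's man page, and the helper name is my own):

```python
# curl exit codes describe transport-level outcomes of the request.
CURL_EXIT_CODES = {
    0: "success",
    6: "couldn't resolve host (name lookup failed)",
    52: "empty reply from server (0 bytes received)",
}

def describe_curl_exit(code):
    """Translate a curl exit code into a human-readable hint."""
    return CURL_EXIT_CODES.get(code, "other curl error; see `man curl`")

print(describe_curl_exit(52))
print(describe_curl_exit(6))
```

Logging both the curl exit code and, when a response arrives, the HTTP status (curl's `-w '%{http_code}'`) makes incidents like this one much easier to diagnose.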
[twitter-dev] Re: Japanese Characters in Image URLs
Happy Canadian Thanksgiving, Kyle:

On Oct 12, 12:00 am, Kyle Mulka repalvigla...@yahoo.com wrote: ... Twitter API developers have to deal with non-ASCII characters in image URLs because Twitter doesn't change the name the user gave their image file to something cleaner. The PHP code below is giving me the standard Amazon S3 "access denied" error message, but if I copy the URL of the image and paste it into my browser, that doesn't happen. What do I need to do to get this to work?

I expect the URLs you are getting back are UTF-8 encoded strings. I believe the authoritative spec is RFC 3986, http://tools.ietf.org/html/rfc3986. My understanding of its contents is:

a) the path part of the URL is an octet stream, and
b) the web server may interpret that octet stream as it pleases, so it may be in any encoding, but
c) it's good practice for agents to present and interpret path parts of URLs as UTF-8 encoded text, and
d) any octets in the path part of the URL which aren't in the subset of ASCII permitted in URLs should be percent-encoded, but
e) it would be nice for agents to accept unpermitted byte values in the path part of the URL, and
f) it would be nice for agents to interpret path parts of URLs as being encoded in UTF-8 unless they know otherwise.

As usual, Wikipedia also has a nice writeup. See http://en.wikipedia.org/wiki/Percent-encoding and linked articles.

I did an experiment with Firefox 3 which showed it respecting the above spec. I pasted a URL with non-ASCII UTF-8 characters in it, and it blithely accepted them, perhaps percent-encoded them, and successfully requested the page. Then I found a URL with percent-encoded characters (non-English versions of Wikipedia are a bounty of such URLs), pasted one of those URLs into the Firefox location field, and Firefox removed the percent-encoding and displayed the URL as a UTF-8 string.

Thus you might want to experiment with revising your code which handles URLs and other strings from the Twitter API to use UTF-8 encoded strings, or byte strings with no encoding interpretation. Be ready to apply your own percent-encoding to received URLs per d) above. I don't know the PHP incantations for string encoding, sorry; I do know they differ between PHP 4, PHP 5, and PHP 6. (I just brushed up on the corresponding Python incantations last night, as it happens.)

$ch = curl_init('http://twitter.com/users/show.json?screen_name=rennri');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$json = curl_exec($ch);
curl_close($ch);
$data = json_decode($json, true);
$ch2 = curl_init($data['profile_image_url']);
curl_exec($ch2);
curl_close($ch2);

-- Kyle Mulka http://twilk.com - put your friends' faces on your Twitter background

Hope this helps!
—Jim DeLaHunt, Vancouver, Canada, http://jdlh.com/ multilingual websites consultant.
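To make point d) above concrete: percent-encode the non-ASCII path part after encoding it as UTF-8. Since I mentioned brushing up on the Python incantations, here is a small sketch (the file name is made up for illustration; in PHP, rawurlencode() performs the analogous encoding):

```python
from urllib.parse import quote, unquote

# A hypothetical image file name containing non-ASCII characters.
path = "画像ファイル.jpg"

# quote() encodes the string as UTF-8, then percent-encodes every octet
# outside the URL-safe ASCII subset, per RFC 3986 good practice.
encoded = quote(path)
print(encoded)                     # each non-ASCII octet becomes %XX
print(unquote(encoded) == path)    # decoding round-trips to the original
```

Applying this kind of encoding to a received profile_image_url before fetching it should produce a request that S3 accepts, just as the browser did.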