[twitter-dev] Re: Twitter4j / showStatus method
Hi Arnaud

Currently we are using Twitter4j with the following parameters:
• lang is omitted, nl or en
• query has several values
• since is the day of the last retrieval
• rpp is 100

We loop through the results, getting up to 15 pages (of 100 rpp each, so up to 1500 results), starting at page 1. If we find a tweet that is older than the actual previous retrieval date we stop. If a page returns fewer than 100 results we stop.

We can replace since with since_id, which will limit the number of results slightly, but we will still need the same number of pages (up to 15).

I agree we could use the streaming API and then backfill missing data if required, but we would still miss the language constraints available from the search API.

Thanks again for your help.
Luc

On May 2, 6:01 pm, Arnaud Meunier arn...@twitter.com wrote:

Hey Luc,

1) You should not, indeed. But this tweet was probably posted before its author turned protected. In that case, it can take some time to be reflected in search results.

2) That unfortunately doesn't help me to reproduce your issue. What is the exact request you've been making? When? Full HTTP request/response would help.

Concerning tweet IDs, they are not sequential but they are always increasing (on a 1 second interval). So you could use the since_id parameter with the last tweet ID you received.

3) If for some reason you lost connection with the Streaming API, you could still backfill with the same process you're using today :)

Arnaud / @rno
http://twitter.com/rno

On Mon, May 2, 2011 at 3:35 AM, LucMartinPro captain.drinksa...@gmail.com wrote:

Hi Arnaud

Thank you for your fast reply! I have some more info for you:

1) If the tweet is protected, why does it show up in the tweet list? Is there a way to filter these out? Because if they’re not public and we cannot read / retrieve them, why are we even notified they exist?

2) We started polling every 15 minutes, but these problems started occurring when we started polling every 5 minutes.
We do not use a “since_id” constraint, just the date-limited “since” constraint. That’s because, as far as we understand, the tweet IDs are not incremental, so using “since_id” could cause tweets to be missed. Using “since” with the last retrieval date, then looping through the results until we find an older result, should give us all new tweets.

We currently do not use the streaming API because we can’t miss any tweets. If our retrieval application goes down for 15 minutes for maintenance, we won’t get any notifications. By using a date-limited query (why isn’t there a date/time limited query?) through the search API we can get everything since our last retrieval run, as long as the number of tweets does not exceed 1500 (Twitter’s maximum number of results for a query).

Thanks in advance!
Luc

On Apr 29, 6:36 pm, Arnaud Meunier arn...@twitter.com wrote:

Hey Luc,

You're not getting rate limited. Let's take a closer look at these two errors:

1) First error: *Sorry, you are not authorized to see this status*
That means the tweet you're trying to get is protected. Only people approved by its author (following him) can read / retrieve it.

2) Second error: *since_id too recent, poll less frequently*
Looks like the since_id tweet you provided hasn't been indexed yet. Would be interesting to know the value you used and if you can reproduce this error (using the same since_id value).

On an unrelated note, what kind of search requests are you doing? Have you considered the statuses/filter method of the Streaming API? That could be a much more efficient way for you :)
- More info on: http://dev.twitter.com/pages/streaming_api_methods#statuses-filter
- Twitter4j doc: http://twitter4j.org/en/javadoc/twitter4j/TwitterStream.html#filter(t...)
Arnaud / @rno
http://twitter.com/rno

On Fri, Apr 29, 2011 at 6:53 AM, LucMartinPro captain.drinksa...@gmail.com wrote:

Hello

We are using our application CommCenter to search Twitter and get full tweet + user data based on query results, using the Twitter4j query and showStatus methods. We need both, as the query method does not return enough information (like retweet count and follower count), and we can't search using showStatus. Therefore, we first use query to get basic results, then for each result call showStatus to get the detailed information.

However, we're getting 403 errors such as

403:The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits (http://support.twitter.com/forums/10711/entries/15364).
error - Sorry, you are not authorized to see this status.
request - /1/statuses/show/62741653433225216.json?include_entities=true
Relevant discussions can be on
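The polling loop described at the top of this message (up to 15 pages of 100 results, stopping on an older tweet or on a short page) could be sketched roughly as follows. This is only an illustration under assumptions: `TweetStub`, `SearchPager` and the `fetchPage` function are hypothetical names standing in for the real Twitter4j search call, and results are assumed to arrive newest first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical minimal tweet record; not a Twitter4j type. */
final class TweetStub {
    final long id;
    final long createdAtMillis;

    TweetStub(long id, long createdAtMillis) {
        this.id = id;
        this.createdAtMillis = createdAtMillis;
    }
}

final class SearchPager {
    static final int RPP = 100;       // results per page, as in the message
    static final int MAX_PAGES = 15;  // 15 pages x 100 rpp = 1500 results

    /**
     * Collects tweets that are not older than the previous retrieval time.
     * fetchPage is a stand-in for the real search request: it maps a page
     * number (starting at 1) to that page's results, assumed newest first.
     */
    static List<TweetStub> collectNew(IntFunction<List<TweetStub>> fetchPage,
                                      long lastRetrievalMillis) {
        List<TweetStub> fresh = new ArrayList<>();
        for (int page = 1; page <= MAX_PAGES; page++) {
            List<TweetStub> results = fetchPage.apply(page);
            for (TweetStub t : results) {
                if (t.createdAtMillis < lastRetrievalMillis) {
                    return fresh; // older than the previous run: stop
                }
                fresh.add(t);
            }
            if (results.size() < RPP) {
                break; // short page: nothing further to fetch
            }
        }
        return fresh;
    }
}
```

In a real client, `fetchPage` would wrap the search request for the given page and map each result into whatever record type the application uses.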
[twitter-dev] Re: Twitter4j / showStatus method
Hi Arnaud

Thank you for your fast reply! I have some more info for you:

1) If the tweet is protected, why does it show up in the tweet list? Is there a way to filter these out? Because if they’re not public and we cannot read / retrieve them, why are we even notified they exist?

2) We started polling every 15 minutes, but these problems started occurring when we started polling every 5 minutes. We do not use a “since_id” constraint, just the date-limited “since” constraint. That’s because, as far as we understand, the tweet IDs are not incremental, so using “since_id” could cause tweets to be missed. Using “since” with the last retrieval date, then looping through the results until we find an older result, should give us all new tweets.

We currently do not use the streaming API because we can’t miss any tweets. If our retrieval application goes down for 15 minutes for maintenance, we won’t get any notifications. By using a date-limited query (why isn’t there a date/time limited query?) through the search API we can get everything since our last retrieval run, as long as the number of tweets does not exceed 1500 (Twitter’s maximum number of results for a query).

Thanks in advance!
Luc

On Apr 29, 6:36 pm, Arnaud Meunier arn...@twitter.com wrote:

Hey Luc,

You're not getting rate limited. Let's take a closer look at these two errors:

1) First error: *Sorry, you are not authorized to see this status*
That means the tweet you're trying to get is protected. Only people approved by its author (following him) can read / retrieve it.

2) Second error: *since_id too recent, poll less frequently*
Looks like the since_id tweet you provided hasn't been indexed yet. Would be interesting to know the value you used and if you can reproduce this error (using the same since_id value).

On an unrelated note, what kind of search requests are you doing? Have you considered the statuses/filter method of the Streaming API?
That could be a much more efficient way for you :)
- More info on: http://dev.twitter.com/pages/streaming_api_methods#statuses-filter
- Twitter4j doc: http://twitter4j.org/en/javadoc/twitter4j/TwitterStream.html#filter(t...)

Arnaud / @rno
http://twitter.com/rno

On Fri, Apr 29, 2011 at 6:53 AM, LucMartinPro captain.drinksa...@gmail.com wrote:

Hello

We are using our application CommCenter to search Twitter and get full tweet + user data based on query results, using the Twitter4j query and showStatus methods. We need both, as the query method does not return enough information (like retweet count and follower count), and we can't search using showStatus. Therefore, we first use query to get basic results, then for each result call showStatus to get the detailed information.

However, we're getting 403 errors such as

403:The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits (http://support.twitter.com/forums/10711/entries/15364).
error - Sorry, you are not authorized to see this status.
request - /1/statuses/show/62741653433225216.json?include_entities=true
Relevant discussions can be on the Internet at:
http://www.google.co.jp/search?q=6b80c41c or
http://www.google.co.jp/search?q=1b284c1e
TwitterException{exceptionCode=[6b80c41c-1b284c1e], statusCode=403, retryAfter=0, rateLimitStatus=RateLimitStatusJSONImpl{remainingHits=339, hourlyLimit=350, resetTimeInSeconds=1304029, secondsUntilReset=900, resetTime=Fri Apr 29 00:19:18 CEST 2011}, version=2.2.1}

and

403:The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits (http://support.twitter.com/forums/10711/entries/15364).
error - since_id too recent, poll less frequently
Relevant discussions can be on the Internet at:
http://www.google.co.jp/search?q=d35baff5 or
http://www.google.co.jp/search?q=0886c892
TwitterException{exceptionCode=[d35baff5-0886c892], statusCode=403, retryAfter=0, rateLimitStatus=null, version=2.2.1}

It seems retrieving the full tweet + user data for each tweet (showStatus method call) counts as a hit, so we're hitting our limit really quickly.

Do you think there's a way around this or an obvious solution we don't see? Whitelisting?

Thanks in advance.
Luc Martin

--
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: http://groups.google.com/group/twitter-development-talk
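Both errors quoted in this message arrive with statusCode=403, and per Arnaud's explanation neither is a rate limit. A client could tell them apart by the accompanying error text, roughly as sketched below. `ForbiddenClassifier` and its `Cause` enum are hypothetical names, and matching on message text is an assumption that would break if the wording changed; the two strings are the ones quoted in this thread.

```java
/** Hypothetical helper; not part of Twitter4j. */
final class ForbiddenClassifier {
    enum Cause { PROTECTED_STATUS, SINCE_ID_TOO_RECENT, OTHER }

    /**
     * Maps an HTTP status code plus the accompanying error message to a cause.
     * The matched substrings are taken from the two errors quoted in the thread.
     */
    static Cause classify(int statusCode, String errorMessage) {
        if (statusCode != 403 || errorMessage == null) {
            return Cause.OTHER;
        }
        if (errorMessage.contains("not authorized to see this status")) {
            return Cause.PROTECTED_STATUS;     // protected tweet: skip it
        }
        if (errorMessage.contains("since_id too recent")) {
            return Cause.SINCE_ID_TOO_RECENT;  // not indexed yet: retry later
        }
        return Cause.OTHER;
    }
}
```

A caller might skip the tweet on `PROTECTED_STATUS`, retry the search on the next polling run on `SINCE_ID_TOO_RECENT`, and log anything classified as `OTHER`.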
[twitter-dev] Re: Twitter4j / showStatus method
An additional point: our searches are constrained to language, which is not possible in the streaming API :)

On Apr 29, 6:36 pm, Arnaud Meunier arn...@twitter.com wrote:

Hey Luc,

You're not getting rate limited. Let's take a closer look at these two errors:

1) First error: *Sorry, you are not authorized to see this status*
That means the tweet you're trying to get is protected. Only people approved by its author (following him) can read / retrieve it.

2) Second error: *since_id too recent, poll less frequently*
Looks like the since_id tweet you provided hasn't been indexed yet. Would be interesting to know the value you used and if you can reproduce this error (using the same since_id value).

On an unrelated note, what kind of search requests are you doing? Have you considered the statuses/filter method of the Streaming API? That could be a much more efficient way for you :)
- More info on: http://dev.twitter.com/pages/streaming_api_methods#statuses-filter
- Twitter4j doc: http://twitter4j.org/en/javadoc/twitter4j/TwitterStream.html#filter(t...)

Arnaud / @rno
http://twitter.com/rno

On Fri, Apr 29, 2011 at 6:53 AM, LucMartinPro captain.drinksa...@gmail.com wrote:

Hello

We are using our application CommCenter to search Twitter and get full tweet + user data based on query results, using the Twitter4j query and showStatus methods. We need both, as the query method does not return enough information (like retweet count and follower count), and we can't search using showStatus. Therefore, we first use query to get basic results, then for each result call showStatus to get the detailed information.

However, we're getting 403 errors such as

403:The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits (http://support.twitter.com/forums/10711/entries/15364).
error - Sorry, you are not authorized to see this status.
request - /1/statuses/show/62741653433225216.json?include_entities=true
Relevant discussions can be on the Internet at:
http://www.google.co.jp/search?q=6b80c41c or
http://www.google.co.jp/search?q=1b284c1e
TwitterException{exceptionCode=[6b80c41c-1b284c1e], statusCode=403, retryAfter=0, rateLimitStatus=RateLimitStatusJSONImpl{remainingHits=339, hourlyLimit=350, resetTimeInSeconds=1304029, secondsUntilReset=900, resetTime=Fri Apr 29 00:19:18 CEST 2011}, version=2.2.1}

and

403:The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits (http://support.twitter.com/forums/10711/entries/15364).
error - since_id too recent, poll less frequently
Relevant discussions can be on the Internet at:
http://www.google.co.jp/search?q=d35baff5 or
http://www.google.co.jp/search?q=0886c892
TwitterException{exceptionCode=[d35baff5-0886c892], statusCode=403, retryAfter=0, rateLimitStatus=null, version=2.2.1}

It seems retrieving the full tweet + user data for each tweet (showStatus method call) counts as a hit, so we're hitting our limit really quickly.

Do you think there's a way around this or an obvious solution we don't see? Whitelisting?

Thanks in advance.
Luc Martin
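To put a number on the concern that each showStatus call counts as a hit: with the hourlyLimit=350 shown in the exception above and one polling run every 5 minutes, the arithmetic works out as sketched below. `HitBudget` is a hypothetical helper, and the sketch assumes only showStatus calls draw on this limit; whether search requests count against the same limit is not stated in the thread.

```java
/** Hypothetical helper for rate-limit arithmetic; not part of Twitter4j. */
final class HitBudget {
    /**
     * Returns how many per-tweet lookups each polling run can afford
     * if the hourly limit is spread evenly over all runs in an hour.
     */
    static int lookupsPerPoll(int hourlyLimit, int pollIntervalMinutes) {
        if (hourlyLimit < 0 || pollIntervalMinutes <= 0 || pollIntervalMinutes > 60) {
            throw new IllegalArgumentException("unsupported limit or interval");
        }
        // Round the number of runs per hour up, so the budget stays conservative.
        int pollsPerHour = (60 + pollIntervalMinutes - 1) / pollIntervalMinutes;
        return hourlyLimit / pollsPerHour;
    }
}
```

Under these assumptions, `HitBudget.lookupsPerPoll(350, 5)` gives 29 lookups per run and `HitBudget.lookupsPerPoll(350, 15)` gives 87, far below the up to 1500 results a single search run can return.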
Re: [twitter-dev] Re: Twitter4j / showStatus method
Hey Luc,

1) You should not, indeed. But this tweet was probably posted before its author turned protected. In that case, it can take some time to be reflected in search results.

2) That unfortunately doesn't help me to reproduce your issue. What is the exact request you've been making? When? Full HTTP request/response would help.

Concerning tweet IDs, they are not sequential but they are always increasing (on a 1 second interval). So you could use the since_id parameter with the last tweet ID you received.

3) If for some reason you lost connection with the Streaming API, you could still backfill with the same process you're using today :)

Arnaud / @rno
http://twitter.com/rno

On Mon, May 2, 2011 at 3:35 AM, LucMartinPro captain.drinksa...@gmail.com wrote:

Hi Arnaud

Thank you for your fast reply! I have some more info for you:

1) If the tweet is protected, why does it show up in the tweet list? Is there a way to filter these out? Because if they’re not public and we cannot read / retrieve them, why are we even notified they exist?

2) We started polling every 15 minutes, but these problems started occurring when we started polling every 5 minutes. We do not use a “since_id” constraint, just the date-limited “since” constraint. That’s because, as far as we understand, the tweet IDs are not incremental, so using “since_id” could cause tweets to be missed. Using “since” with the last retrieval date, then looping through the results until we find an older result, should give us all new tweets.

We currently do not use the streaming API because we can’t miss any tweets. If our retrieval application goes down for 15 minutes for maintenance, we won’t get any notifications. By using a date-limited query (why isn’t there a date/time limited query?) through the search API we can get everything since our last retrieval run, as long as the number of tweets does not exceed 1500 (Twitter’s maximum number of results for a query).

Thanks in advance!
Luc

On Apr 29, 6:36 pm, Arnaud Meunier arn...@twitter.com wrote:

Hey Luc,

You're not getting rate limited. Let's take a closer look at these two errors:

1) First error: *Sorry, you are not authorized to see this status*
That means the tweet you're trying to get is protected. Only people approved by its author (following him) can read / retrieve it.

2) Second error: *since_id too recent, poll less frequently*
Looks like the since_id tweet you provided hasn't been indexed yet. Would be interesting to know the value you used and if you can reproduce this error (using the same since_id value).

On an unrelated note, what kind of search requests are you doing? Have you considered the statuses/filter method of the Streaming API? That could be a much more efficient way for you :)
- More info on: http://dev.twitter.com/pages/streaming_api_methods#statuses-filter
- Twitter4j doc: http://twitter4j.org/en/javadoc/twitter4j/TwitterStream.html#filter(t...)

Arnaud / @rno
http://twitter.com/rno

On Fri, Apr 29, 2011 at 6:53 AM, LucMartinPro captain.drinksa...@gmail.com wrote:

Hello

We are using our application CommCenter to search Twitter and get full tweet + user data based on query results, using the Twitter4j query and showStatus methods. We need both, as the query method does not return enough information (like retweet count and follower count), and we can't search using showStatus. Therefore, we first use query to get basic results, then for each result call showStatus to get the detailed information.

However, we're getting 403 errors such as

403:The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits (http://support.twitter.com/forums/10711/entries/15364).
error - Sorry, you are not authorized to see this status.
request - /1/statuses/show/62741653433225216.json?include_entities=true
Relevant discussions can be on the Internet at:
http://www.google.co.jp/search?q=6b80c41c or
http://www.google.co.jp/search?q=1b284c1e
TwitterException{exceptionCode=[6b80c41c-1b284c1e], statusCode=403, retryAfter=0, rateLimitStatus=RateLimitStatusJSONImpl{remainingHits=339, hourlyLimit=350, resetTimeInSeconds=1304029, secondsUntilReset=900, resetTime=Fri Apr 29 00:19:18 CEST 2011}, version=2.2.1}

and

403:The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits (http://support.twitter.com/forums/10711/entries/15364).
error - since_id too recent, poll less frequently
Relevant discussions can be on the Internet at:
http://www.google.co.jp/search?q=d35baff5 or
http://www.google.co.jp/search?q=0886c892
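Arnaud's suggestion at the top of this message (pass the last tweet ID received as since_id, and backfill through search after a streaming disconnect) could be tracked with something like the sketch below. `SinceIdCursor` is a hypothetical class, not a Twitter4j type; it relies on the statement in this message that tweet IDs are always increasing, and it keeps every seen ID in memory, which a real application would need to bound.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical tracker for since_id and duplicate suppression. */
final class SinceIdCursor {
    private long sinceId;                                // highest tweet ID seen so far
    private final Set<Long> seenIds = new HashSet<>();   // unbounded in this sketch

    SinceIdCursor(long initialSinceId) {
        this.sinceId = initialSinceId;
    }

    /** Value to pass as since_id on the next search request. */
    long sinceId() {
        return sinceId;
    }

    /**
     * Accepts tweet IDs from either source (streaming or search backfill)
     * and returns how many of them had not been seen before.
     */
    int accept(long... tweetIds) {
        int added = 0;
        for (long id : tweetIds) {
            if (seenIds.add(id)) {   // add() returns false for duplicates
                added++;
                sinceId = Math.max(sinceId, id);
            }
        }
        return added;
    }
}
```

After a streaming outage, the application would run its usual search with `sinceId()` as the since_id value and feed the results through `accept`, so tweets already received from the stream are not counted twice.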