Re: [CODE4LIB] twarc and 30-day limitation

Eric Lease Morgan Tue, 05 May 2020 08:43:34 -0700

On May 5, 2020, at 8:22 AM, Edward Summers <e...@pobox.com> wrote:

> Like Francis and Darnelle said, Twitter's primary free search API is limited 
> to the last 7 days of activity. The so-called "Standard" search API is what 
> twarc uses to gather data when you run `twarc search …`.
> 
> However a couple years ago Twitter added the Premium Search API [1] which is 
> a hybrid approach that lets you search two endpoints (30 day and full 
> archive), and is engineered to move you from collecting data for free to 
> paying Twitter as you (inevitably) want to gather more.
> 
> From your email it sounds like you want to use the Full Archive endpoint? We 
> have had this on the Documenting the Now roadmap to add premium support to 
> twarc but haven't quite got around to it yet.
> 
> I went ahead and created a GitHub issue for you to track our progress [2]. It 
> actually shouldn't be too difficult to add, so if you have a present need let 
> us know so we can prioritize it higher.
> 
> //Ed
> 
> PS. As Francis mentioned, twint gets around Twitter's API constraints by 
> scraping Twitter's search results web page. Scraping comes with its own set 
> of complexities; the biggest one is that Twitter actively works to prevent it, 
> which (in my experience) can make twint a bit unpredictable to use at times.
> 
> [1] https://developer.twitter.com/en/docs/tweets/search/overview/premium
> [2] https://github.com/DocNow/twarc/issues/326



  Ed, this makes sense, and makes me feel better; it makes me feel as if I am 
not really doing anything incorrectly. Thank you. --Eric Morgan
