Hi Uncharted, I read the post you mentioned, and I noticed that several of the issues it reports have since been fixed or mitigated. I also think their use of MongoDB was different and much heavier: they were storing and querying scraped items that could number in the millions. In this project, MongoDB is used only to queue/dequeue the tasks to be run by scrapyd (which was previously handled with SQLite).
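To make the queue/dequeue use concrete, here is a rough sketch of the kind of SQLite-backed priority queue the project used before (table and column names here are my assumptions, not the project's actual schema). The point is that each pop takes a database-wide write lock in SQLite, whereas MongoDB can do the same pop as a single atomic operation per document (e.g. `find_one_and_delete` in pymongo):

```python
import sqlite3
import json

# In-memory stand-in for the on-disk SQLite file scrapyd would use.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue (id INTEGER PRIMARY KEY, priority REAL, message TEXT)"
)

def enqueue(msg, priority=0.0):
    # Tasks are stored as JSON blobs with a priority for ordering.
    conn.execute(
        "INSERT INTO queue (priority, message) VALUES (?, ?)",
        (priority, json.dumps(msg)),
    )
    conn.commit()

def dequeue():
    # Pop the highest-priority task. The SELECT + DELETE pair needs the
    # database write lock, which is where contention appears when several
    # processes poll the same queue; a MongoDB backend replaces this with
    # one atomic find-and-remove on a single document.
    row = conn.execute(
        "SELECT id, message FROM queue ORDER BY priority DESC LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("DELETE FROM queue WHERE id = ?", (row[0],))
    conn.commit()
    return json.loads(row[1])

enqueue({"spider": "example"}, priority=1.0)
enqueue({"spider": "other"})
print(dequeue())  # highest priority first: {'spider': 'example'}
```

This is only an illustration of the pattern, not the library's actual code.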
But it was good to learn that they are using HBase. I'll take a look and try to add an interface to the library for those who prefer HBase. Thank you for the advice!

On Tuesday, May 3, 2016 at 10:24:01 AM UTC-3, Uncharted wrote:
>
> Hi
>
> I'm currently starting to work on the same kind of use case.
> I found this article, which does not recommend MongoDB:
> https://blog.scrapinghub.com/2013/05/13/mongo-bad-for-scraped-data/
>
> They say that you'll have the same lock contention with MongoDB; the
> article was written in 2013, so maybe that's no longer the case.
>
> They migrated to HBase, which seems to be the right backend. It is also
> used in the Apache Nutch project.
