Thank you that helps.
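
For the archive, here is a rough sketch of what storing items in HBase from a Scrapy item pipeline could look like. It assumes the third-party happybase Thrift client; the host/port, table name, `data` column family, and the `url` item field are all illustrative, not anything the thread prescribes:

```python
import json


def item_to_columns(item):
    """Map item fields to HBase cells under an assumed 'data' column family."""
    return {
        b"data:" + key.encode("utf-8"): json.dumps(value).encode("utf-8")
        for key, value in dict(item).items()
    }


class HBasePipeline:
    """Sketch of a Scrapy pipeline that stores each item as one HBase row."""

    def open_spider(self, spider):
        # Deferred import so the sketch stays importable without happybase.
        import happybase
        # Connect to the HBase Thrift server; host/port/table are examples.
        self.connection = happybase.Connection(host="localhost", port=9090)
        self.table = self.connection.table("scraped_items")

    def close_spider(self, spider):
        self.connection.close()

    def process_item(self, item, spider):
        # Row key: the item's URL (assumes every item carries a "url" field).
        self.table.put(item["url"].encode("utf-8"), item_to_columns(item))
        return item
```

You would enable it the usual way, e.g. `ITEM_PIPELINES = {"myproject.pipelines.HBasePipeline": 300}` in the project settings (the module path is hypothetical).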

On Saturday, May 24, 2014 at 23:26:39 UTC+2, Jordi Llonch wrote:
>
> Christoph,
>
> You can scale Scrapy up using several approaches, e.g. scrapyd or Celery. 
> Which one fits depends on many factors.
>
> I suggest you work out what suits you best in terms of storage, network 
> and computational requirements, because there are big performance 
> differences between the options.
>
> To me, one of the best fits for storage is HBase, or even better, 
> Hypertable. 
>
> Cheers,
> Jordi
>
>
> 2014-05-24 1:00 GMT+10:00 christoph <[email protected]>:
>
>> Hi,
>>
>> I wonder what a good choice of environment would be for a scalable Scrapy 
>> project similar to Scrapinghub: starting with a single vserver/root server 
>> for crawling and data storage, with the possibility of adding servers 
>> when I need more scraping power or database space. 
>> According to a blog entry (
>> http://blog.scrapinghub.com/2013/07/26/introducing-dash/), scrapinghub 
>> is using Cloudera CDH (run on which OS?) and they store their data in 
>> HBase. Is this a good choice?
>>
>> Is there any information on how to set up Scrapy in a CDH environment 
>> and save data into HBase?
>>
>> Thank you,
>> Christoph
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "scrapy-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To post to this group, send email to [email protected]<javascript:>
>> .
>> Visit this group at http://groups.google.com/group/scrapy-users.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
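
For later readers, the scrapyd route Jordi mentions boils down to running a scrapyd instance per crawl box and driving it over its HTTP JSON API. A minimal sketch, with made-up project/spider names and the default port:

```shell
# Deploy the project to a scrapyd instance (needs scrapyd-client installed;
# "default" target and "myproject"/"myspider" names are examples)
scrapyd-deploy default -p myproject

# Schedule a crawl via scrapyd's schedule.json endpoint
curl http://localhost:6800/schedule.json -d project=myproject -d spider=myspider

# Inspect pending/running/finished jobs on that box
curl "http://localhost:6800/listjobs.json?project=myproject"
```

Adding scraping power is then a matter of bringing up more scrapyd boxes and distributing schedule calls across them.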
