Hello all. I'm new here and this is my first post to this group.
I was studying the Scrapy docs (it's a slow day at work...;-) when I came 
across this:

DjangoItem caveats 

DjangoItem is a rather convenient way to integrate Scrapy projects with 
Django models, but bear in mind that Django ORM may not scale well if you 
scrape a lot of items (ie. millions) with Scrapy. This is because a 
relational backend is often not a good choice for a write intensive 
application (such as a web crawler), specially if the database is highly 
normalized and with many indices.

http://doc.scrapy.org/en/latest/topics/djangoitem.html 

Say what? Explain, please!

Now, I did keep looking, and found the thread "Using DjangoItem step-by-step 
guide" 
(https://groups.google.com/forum/#!searchin/scrapy-users/DjangoItem/scrapy-users/HsDJ-jM7LvM/ESRlGF6QXcIJ) 
and the Stack Overflow (SO) post it comes from. Is the max_locks issue that 
was brought up there the reason for the caveat? 
(Note: I can't access GitHub from work; it's blocked. Go figure.)

My project is text-heavy (government documents) and I need a solid database 
to store my results in. And yes, of course I want to scale. A good database 
is *always* normalized, so what are we talking about here? If you are 
saying "don't use an RDBMS for big projects", aren't you in effect also 
saying "don't use the Django ORM for big projects"? Because the way the 
caveat is worded, it talks about the Django ORM per se, not DjangoItem. 
(And no, I am not inviting a debate about nonrel.)

Or should I just not use DjangoItem and follow Chris' advice on SO: "I 
ended up not using DjangoItem at all which solved all my problems"?
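
To make the question concrete, here is a rough sketch of what I imagine "not 
using DjangoItem" could look like: a plain item pipeline that buffers items 
and writes them to the database in batches, one transaction per batch, 
instead of saving row by row through an ORM. Everything specific here is my 
own assumption for illustration and not taken from the docs or from Chris' 
post: the class name BatchedSQLitePipeline, the documents table and its 
url/title/body columns, the batch size, and the use of SQLite as a stand-in 
for a real database server.

```python
import sqlite3


class BatchedSQLitePipeline:
    """Hypothetical item pipeline: buffer items, then insert them in batches."""

    def __init__(self, db_path=":memory:", batch_size=1000):
        # db_path and batch_size are illustrative defaults, not recommendations.
        self.db_path = db_path
        self.batch_size = batch_size
        self.buffer = []
        self.conn = None

    def open_spider(self, spider):
        # Called once when the spider starts: open the connection, create the table.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS documents (url TEXT, title TEXT, body TEXT)"
        )

    def process_item(self, item, spider):
        # Called for every scraped item: queue it, write only when the batch is full.
        self.buffer.append((item["url"], item["title"], item["body"]))
        if len(self.buffer) >= self.batch_size:
            self.flush()
        return item

    def flush(self):
        # Insert all buffered rows in a single transaction.
        if not self.buffer:
            return
        with self.conn:
            self.conn.executemany(
                "INSERT INTO documents VALUES (?, ?, ?)", self.buffer
            )
        self.buffer = []

    def close_spider(self, spider):
        # Called once when the spider finishes: write the remainder and clean up.
        self.flush()
        self.conn.close()
```

As far as I understand, a class like this would be enabled through the 
ITEM_PIPELINES setting. Is batching along these lines what the caveat is 
steering people toward, or is the concern something else entirely?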

As much clarity, detail, and yes, caveats as you can enlighten me with 
would be GREATLY appreciated. 
