I am running into the same thing on OSX and I am not running two crawlers 
at the same time. Any ideas?

On Tuesday, November 22, 2011 at 2:56:54 PM UTC-5, Pablo Hoffman wrote:
>
> Not really; the persistent scheduler feature is intended to support 
> stopping and resuming a crawl, not to distribute the crawl among 
> different nodes/processes (perhaps this should be clarified in the docs).
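>
> For reference, the supported stop/resume usage is to pass a JOBDIR so the 
> scheduler state persists on disk across runs, e.g. (the directory name is 
> just an example):
>
>     scrapy crawl somespider -s JOBDIR=crawls/somespider-1
>     # hit Ctrl-C to stop, then resume later by re-running the same command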
>
> To distribute the crawl, you would have to do it yourself by storing the 
> urls to visit, splitting the list into multiple chunks, and sending each 
> chunk to be crawled in a separate process.
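>
> A rough sketch of that approach (untested; it assumes a hypothetical spider 
> named "myspider" that accepts a urls_file argument and reads its start urls 
> from that file, which is not something Scrapy provides out of the box):
>
>     # split_and_crawl.py - split a url list into N chunks and launch one
>     # scrapy process per chunk.
>     import subprocess
>     import sys
>
>     def chunks(lst, n):
>         # split lst into n roughly equal consecutive chunks
>         size = (len(lst) + n - 1) // n
>         return [lst[i:i + size] for i in range(0, len(lst), size)]
>
>     def main(urls_path, n_procs):
>         urls = [line.strip() for line in open(urls_path) if line.strip()]
>         procs = []
>         for i, chunk in enumerate(chunks(urls, n_procs)):
>             part = "urls_part_%d.txt" % i
>             with open(part, "w") as f:
>                 f.write("\n".join(chunk))
>             # each chunk gets its own process (and should get its own
>             # JOBDIR, if you also want each one to be resumable)
>             procs.append(subprocess.Popen(
>                 ["scrapy", "crawl", "myspider", "-a", "urls_file=" + part]))
>         for p in procs:
>             p.wait()
>
>     if __name__ == "__main__":
>         main(sys.argv[1], int(sys.argv[2]))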
>
> Alternatively, there is a scrapy-redis extension [1] that (I think) allows 
> you to do what you want - worth checking, I guess.
>
> [1] https://github.com/darkrho/scrapy-redis
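>
> If you try scrapy-redis, the wiring is roughly a few lines in your project's 
> settings.py (the option names may differ between versions, so check the 
> project README):
>
>     # settings.py - sketch of scrapy-redis configuration; every process
>     # pointed at the same Redis server shares one request queue and one
>     # duplicates filter, so several crawlers can split the work.
>     SCHEDULER = "scrapy_redis.scheduler.Scheduler"
>     DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"
>     REDIS_HOST = "localhost"
>     REDIS_PORT = 6379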
>
> On 11/22/2011 05:22 PM, Алексей Масленников wrote:
> > Oh. So what can I do? Is there a hack I could use to work around it?
> >
> > On 22 Nov, 20:51, Pablo Hoffman <[email protected]>  wrote:
> >> Are you running two instances of the spider *at the same time*? Because
> >> that's not supported.
> >>
> >> On 11/22/2011 11:18 AM, Алексей Масленников wrote:
> >>
> >>> When I run two instances of the spider:
> >>
> >>> ---<exception caught here>    ---
> >>>       File "/usr/local/lib/python2.7/dist-packages/Twisted-11.0.0-py2.7-linux-i686.egg/twisted/internet/base.py", line 793, in runUntilCurrent
> >>>         call.func(*call.args, **call.kw)
> >>>       File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.0-py2.7.egg/scrapy/utils/reactor.py", line 41, in __call__
> >>>         return self._func(*self._a, **self._kw)
> >>>       File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.0-py2.7.egg/scrapy/core/engine.py", line 103, in _next_request
> >>>         if not self._next_request_from_scheduler(spider):
> >>>       File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.0-py2.7.egg/scrapy/core/engine.py", line 125, in _next_request_from_scheduler
> >>>         request = slot.scheduler.next_request()
> >>>       File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.0-py2.7.egg/scrapy/core/scheduler.py", line 55, in next_request
> >>>         return self.mqs.pop() or self._dqpop()
> >>>       File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.0-py2.7.egg/scrapy/core/scheduler.py", line 81, in _dqpop
> >>>         d = self.dqs.pop()
> >>>       File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.0-py2.7.egg/scrapy/utils/pqueue.py", line 38, in pop
> >>>         m = q.pop()
> >>>       File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.0-py2.7.egg/scrapy/squeue.py", line 18, in pop
> >>>         s = super(SerializableQueue, self).pop()
> >>>       File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.0-py2.7.egg/scrapy/utils/queue.py", line 160, in pop
> >>>         size, = struct.unpack(self.SIZE_FORMAT, self.f.read())
> >>>     struct.error: unpack requires a string argument of length 4
> >
>
>
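For what it's worth, the traceback above is consistent with something else 
consuming or truncating the on-disk request queue: the disk queue stores each 
serialized request behind a fixed 4-byte size header, and if that header read 
comes up short, struct.unpack fails exactly like this. A minimal illustration 
under Python 2.7 (the ">L" format is an assumption; the real constant is 
SIZE_FORMAT in scrapy/utils/queue.py):

    import struct
    SIZE_FORMAT = ">L"               # assumed 4-byte big-endian length prefix
    struct.unpack(SIZE_FORMAT, "")   # struct.error: unpack requires a string
                                     # argument of length 4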
