Hello, I'm having an odd issue with one of my projects.
I have implemented a custom middleware that rotates the User-Agent for each request. When the middleware is initialized, it reads a file and stores its contents in an in-memory list. As far as I can tell this should work, but a large share of my crawl requests come back as 400 Bad Request. The odd thing is that it works fine if I put the agents in a list directly in the code instead of reading them from the file. What could cause this error?

Here is my middleware:

    import os

    from scrapy import log  # legacy scrapy.log module, as used by the calls below


    class UserAgentPool():
        def __init__(self):
            # Load agents.txt from the directory that contains this module.
            basepath = os.path.dirname(__file__)
            filepath = os.path.abspath(os.path.join(basepath, "agents.txt"))
            with open(filepath, 'r') as f:
                self.agents = f.readlines()

        def rotate(self):
            # Take the first agent and move it to the back of the list.
            log.msg("Rotating user agent", level=log.DEBUG)
            agent = self.agents.pop(0)
            log.msg("Agent popped %s" % agent, level=log.DEBUG)
            log.msg("[%s]" % ", ".join(map(str, self.agents)), level=log.DEBUG)
            self.agents.append(agent)
            return agent


    class UserAgentRotationMiddleware(object):
        def __init__(self):
            self.pool = UserAgentPool()

        def process_request(self, request, spider):
            # Only rotate for spiders that set agent_rotation.
            if getattr(spider, 'agent_rotation', None):
                agent = self.pool.rotate()
                request.headers.setdefault('User-Agent', agent)
                log.msg("Setting User-Agent to %s" % request.headers["User-Agent"])
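One difference between the two setups that may be worth checking (a guess, not a confirmed cause): `readlines()` keeps the trailing newline on every entry, while a hard-coded list has none. A User-Agent value that ends in a newline could plausibly be treated as a malformed header by some servers. The sketch below only illustrates that difference. The temporary file, its contents, and the `load_agents_stripped` helper are assumptions made up for the example, not part of the middleware above.

```python
import os
import tempfile


def load_agents_raw(path):
    # Same call as in the middleware: each entry keeps its trailing newline.
    with open(path, 'r') as f:
        return f.readlines()


def load_agents_stripped(path):
    # Hypothetical variant: strip surrounding whitespace and skip blank lines.
    with open(path, 'r') as f:
        return [line.strip() for line in f if line.strip()]


# Throwaway file standing in for agents.txt (the contents are made up).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("Mozilla/5.0 (X11; Linux x86_64)\nMozilla/5.0 (Windows NT 6.1)\n\n")
    sample_path = tmp.name

raw_agents = load_agents_raw(sample_path)
clean_agents = load_agents_stripped(sample_path)
os.remove(sample_path)

# repr() makes the trailing newline visible in the first case.
print(repr(raw_agents[0]))
print(repr(clean_agents[0]))
```

If the raw entries do end in a newline, logging the agent with `%r` instead of `%s` in `rotate()` should show the same thing in the crawl log.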