The actual source of the spider was generated by the Portia UI. Also posted at http://stackoverflow.com/questions/33606080/portia-spider-logs-showing-partial-during-crawling
On Monday, November 2, 2015 at 8:08:05 PM UTC+5:30, Travis Leleu wrote:
>
> Hi Prabhakar,
>
> Thank you for providing the input URL you are using. However, it's not
> really possible to provide assistance without the relevant section of your
> code, the input data [source URLs, which you just provided], and debug
> output.
>
> You may want to read the Stack Overflow guide to asking good questions.
> If you put in the effort to provide all the information we need to
> reproduce the problem, then people are much more likely to put in the
> effort (and their time) to help you.
>
> Check out the SO post for full details:
> <http://stackoverflow.com/help/how-to-ask>
>
> The most relevant part: without the relevant parts of your source (note:
> please do not just paste your full program in here!), it's not possible to
> debug. Right now, it looks like you aren't getting any errors -- things
> are working as you coded them, but your code logic likely has issues.
>
> Make sure to check out the linked anchor below: "How to create a minimal,
> complete, and verifiable example".
>
> ---- FROM SO POST LINKED ABOVE ----
>
> Help others reproduce the problem
>
> Not all questions benefit from including code. But if your problem is
> *with* code you've written, you should include some. But *don't just copy
> in your entire program!* Not only is this likely to get you in trouble if
> you're posting your employer's code, it likely includes a lot of irrelevant
> details that readers will need to ignore when trying to reproduce the
> problem. Here are some guidelines:
>
> - Include just enough code to allow others to reproduce the problem.
>   For help with this, read How to create a Minimal, Complete, and
>   Verifiable example <http://stackoverflow.com/help/mcve>.
> - If it is possible to create a live example of the problem that you
>   can *link* to (for example, on http://sqlfiddle.com/ or
>   http://jsbin.com/) then do so - but also include the code in your
>   question itself. Not everyone can access external sites, and the
>   links may break over time.
>
> On Mon, Nov 2, 2015 at 10:22 AM, Prabhakar <[email protected]> wrote:
>
>> Let me share the URL I am having the issue with:
>>
>> https://www1.apply2jobs.com/EdwardJonesCareers/ProfExt/index.cfm?fuseaction=mExternal.searchJobs
>>
>> On Monday, November 2, 2015 at 12:27:40 PM UTC+5:30, Prabhakar wrote:
>>>
>>> I have set the domain, and I had hidden the original URL in my previous
>>> post.
>>>
>>> On Friday, October 30, 2015 at 7:51:56 PM UTC+5:30, Travis Leleu wrote:
>>>>
>>>> You need to set the domain you want to crawl. Example.com isn't a real
>>>> domain...
>>>>
>>>> On Fri, Oct 30, 2015 at 9:45 AM, Prabhakar <[email protected]> wrote:
>>>>
>>>>> Yes, the scrapy spider prints this:
>>>>>
>>>>> 2015-10-30 17:41:35+0530 [myspider] DEBUG: Crawled (200) <GET
>>>>> https://example.com/searchid?ss=1&id=1000> (referer:
>>>>> https://example.com/search?ss=1) ['partial']
>>>>> 2015-10-30 17:41:35+0530 [myspider] DEBUG: Crawled (200) <GET
>>>>> https://example.com/searchid?ss=2&id=2000> (referer:
>>>>> https://example.com/search?ss=2) ['partial']
>>>>> 2015-10-30 17:41:35+0530 [myspider] DEBUG: Crawled (200) <GET
>>>>> https://example.com/searchid?ss=3&id=3000> (referer:
>>>>> https://example.com/search?ss=3) ['partial']
>>>>>
>>>>> And how can I fix this?
>>>>>
>>>>> On Friday, October 30, 2015 at 2:15:02 PM UTC+5:30, Nikolaos-Digenis
>>>>> Karagiannis wrote:
>>>>>>
>>>>>> You may be experiencing data loss in your proxy.
>>>>>> How do you know that your crawl fails? Does it print a traceback?
>>>>>>
>>>>>> On Saturday, 4 January 2014 15:16:45 UTC+2, Shivkumar Agrawal wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have created a crawler in Scrapy. It uses a list of proxies to
>>>>>>> crawl. I have 3-4 sites to crawl daily. During crawling, the Scrapy
>>>>>>> logs show a ['partial'] message and my crawl fails on those
>>>>>>> requests. I have spent a lot of time googling with no luck.
>>>>>>> Can anybody help in this matter?

-- 
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.
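For reference, the `['partial']` entry in the crawl log is the response flag Scrapy sets when the connection closed before the full body arrived (for example, through a flaky proxy), so the page was only partially downloaded. One way people work around it is to check `response.flags` in the callback and re-yield the request a limited number of times. Below is a minimal, framework-agnostic sketch of that decision logic; the helper name, the `partial_retries` meta key, and the retry cap are all hypothetical, not built-in Scrapy settings:

```python
def should_retry_partial(flags, meta, max_retries=3):
    """Decide whether a partially downloaded response should be re-fetched.

    `flags` mirrors Scrapy's response.flags (a list of strings) and `meta`
    mirrors response.meta, which persists across retries of a request.
    `max_retries` is a hypothetical cap to avoid retrying forever.
    """
    attempts = meta.get("partial_retries", 0)
    return "partial" in flags and attempts < max_retries
```

In a spider callback this would look roughly like: if `should_retry_partial(response.flags, response.meta)` is true, yield `response.request.replace(dont_filter=True, meta={"partial_retries": response.meta.get("partial_retries", 0) + 1})` instead of parsing; `dont_filter=True` lets the repeated URL through the duplicate filter. Note that more recent Scrapy versions report this condition with a `'dataloss'` flag and expose a `DOWNLOAD_FAIL_ON_DATALOSS` setting, so check which flag your version emits.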
