The actual source of the spider was generated by the Portia UI. Also posted at http://stackoverflow.com/questions/33606080/portia-spider-logs-showing-partial-during-crawling
On Monday, November 2, 2015 at 8:08:05 PM UTC+5:30, Travis Leleu wrote:
>
> Hi Prabhakar,
>
> Thank you for providing the input URL you are using. However, it's not
> really possible to provide assistance without the relevant section of your
> code, the input data [source URLs, which you just provided], and debug
> output.
>
> You may want to read the Stack Overflow guide to asking good questions.
> If you put in the effort to provide all the information we need to
> reproduce the problem, then people are much more likely to put in the
> effort (and their time) to help you.
>
> Check out the SO post for full details:
> <http://stackoverflow.com/help/how-to-ask>
>
> The most relevant part: without the relevant parts of your source (note:
> please do not just paste your full program in here!), it's not possible to
> debug. Right now, it looks like you aren't getting any errors -- things
> are working as you coded them, but your code logic likely has issues.
>
> Make sure to check out the linked anchor below: "How to create a minimal,
> complete, and verifiable example".
>
> ---- FROM SO POST LINKED ABOVE ----
>
> Help others reproduce the problem
>
> Not all questions benefit from including code. But if your problem is
> *with* code you've written, you should include some. But *don't just copy
> in your entire program!* Not only is this likely to get you in trouble if
> you're posting your employer's code, it likely includes a lot of irrelevant
> details that readers will need to ignore when trying to reproduce the
> problem. Here are some guidelines:
>
> - Include just enough code to allow others to reproduce the problem.
>   For help with this, read How to create a Minimal, Complete, and
>   Verifiable example <http://stackoverflow.com/help/mcve>.
> - If it is possible to create a live example of the problem that you
>   can *link* to (for example, on http://sqlfiddle.com/ or
>   http://jsbin.com/) then do so - but also include the code in your
>   question itself. Not everyone can access external sites, and the
>   links may break over time.
>
> On Mon, Nov 2, 2015 at 10:22 AM, Prabhakar <[email protected]> wrote:
>
>> Let me share the URL I am having the issue with:
>>
>> https://www1.apply2jobs.com/EdwardJonesCareers/ProfExt/index.cfm?fuseaction=mExternal.searchJobs
>>
>> On Monday, November 2, 2015 at 12:27:40 PM UTC+5:30, Prabhakar wrote:
>>>
>>> I have set the domain, and I had hidden the original URL in my previous
>>> post.
>>>
>>> On Friday, October 30, 2015 at 7:51:56 PM UTC+5:30, Travis Leleu wrote:
>>>>
>>>> You need to set the domain you want to crawl. Example.com isn't a real
>>>> domain...
>>>>
>>>> On Fri, Oct 30, 2015 at 9:45 AM, Prabhakar <[email protected]> wrote:
>>>>
>>>>> Yes, the scrapy spider prints this:
>>>>>
>>>>> 2015-10-30 17:41:35+0530 [myspider] DEBUG: Crawled (200) <GET
>>>>> https://example.com/searchid?ss=1&id=1000> (referer:
>>>>> https://example.com/search?ss=1) ['partial']
>>>>> 2015-10-30 17:41:35+0530 [myspider] DEBUG: Crawled (200) <GET
>>>>> https://example.com/searchid?ss=2&id=2000> (referer:
>>>>> https://example.com/search?ss=2) ['partial']
>>>>> 2015-10-30 17:41:35+0530 [myspider] DEBUG: Crawled (200) <GET
>>>>> https://example.com/searchid?ss=3&id=3000> (referer:
>>>>> https://example.com/search?ss=3) ['partial']
>>>>>
>>>>> And how can I fix this?
>>>>>
>>>>> On Friday, October 30, 2015 at 2:15:02 PM UTC+5:30, Nikolaos-Digenis
>>>>> Karagiannis wrote:
>>>>>>
>>>>>> You may be experiencing data loss in your proxy.
>>>>>> How do you know that your crawl fails? Does it print a traceback?
>>>>>>
>>>>>> On Saturday, 4 January 2014 15:16:45 UTC+2, Shivkumar Agrawal wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have created a crawler in Scrapy. It uses a list of proxies to
>>>>>>> crawl. I have 3-4 sites to crawl daily. During crawling, the Scrapy
>>>>>>> logs show a ['partial'] message and my crawl fails on those
>>>>>>> requests. I have spent a lot of time googling with no luck.
>>>>>>> Can anybody help in this matter?

-- 
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.
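For reference, the `['partial']` entry in the crawl log is the response flag Scrapy sets when the connection closed before the full body arrived (for example, through a flaky proxy), so the page was only partially downloaded. One way people work around it is to check `response.flags` in the callback and re-yield the request a limited number of times. Below is a minimal, framework-agnostic sketch of that decision logic; the helper name, the `partial_retries` meta key, and the retry cap are all hypothetical, not built-in Scrapy settings:

```python
def should_retry_partial(flags, meta, max_retries=3):
    """Decide whether a partially downloaded response should be re-fetched.

    `flags` mirrors Scrapy's response.flags (a list of strings) and `meta`
    mirrors response.meta, which persists across retries of a request.
    `max_retries` is a hypothetical cap to avoid retrying forever.
    """
    attempts = meta.get("partial_retries", 0)
    return "partial" in flags and attempts < max_retries
```

In a spider callback this would look roughly like: if `should_retry_partial(response.flags, response.meta)` is true, yield `response.request.replace(dont_filter=True, meta={"partial_retries": response.meta.get("partial_retries", 0) + 1})` instead of parsing; `dont_filter=True` lets the repeated URL through the duplicate filter. Note that more recent Scrapy versions report this condition with a `'dataloss'` flag and expose a `DOWNLOAD_FAIL_ON_DATALOSS` setting, so check which flag your version emits.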
