Bruce speaks the truth. Not only might your app not know the reason for an error, it might not even know a problem has happened. Whatever site you're scraping may change so slightly that your full xpath starts returning the wrong data, or data you don't care about.
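Here is a rough sketch of that silent failure. Everything in it is made up for illustration: the two page snippets, the paths, and the helper names. It uses the standard library's ElementTree instead of Scrapy so it runs on its own. ElementTree only supports a small XPath subset (no contains(), for example), so the relative path matches on @class instead. In Scrapy the same paths would go through response.xpath(), where contains(@href, 'image') is available.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before and after a small redesign: a banner <div> is added.
PAGE_V1 = """<html><body>
<div id="nav"><a href="/home">Home</a></div>
<div id="content"><a class="image" href="/images/cat.jpg">Cat</a></div>
</body></html>"""

PAGE_V2 = """<html><body>
<div id="banner"><a href="/sale">Sale!</a></div>
<div id="nav"><a href="/home">Home</a></div>
<div id="content"><a class="image" href="/images/cat.jpg">Cat</a></div>
</body></html>"""

# "Full" path from the root element: depends on every step along the way.
FULL_PATH = "./body/div[2]/a"
# Short relative path keyed on an attribute close to the data we want.
RELATIVE_PATH = ".//a[@class='image']"


def extract_href(page, path):
    """Return the href of the first match, or None if nothing matches."""
    root = ET.fromstring(page)
    link = root.find(path)
    return link.get("href") if link is not None else None


def looks_like_image(href):
    """Sanity check based on what we expect the value to look like."""
    return bool(href) and href.startswith("/images/") and href.endswith(".jpg")


for name, page in (("v1", PAGE_V1), ("v2", PAGE_V2)):
    for label, path in (("full", FULL_PATH), ("relative", RELATIVE_PATH)):
        href = extract_href(page, path)
        status = "ok" if looks_like_image(href) else "SUSPECT"
        print(name, label, href, status)
```

On the second page the full path raises no error at all. It just hands back the nav link, and only the sanity check flags it. The relative path keeps working on both pages, though it would break too if the class name changed.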
The documentation is right on this one: using relative paths and being specific (contains(@href, 'image'), etc.) is the way to go here. Even so, there's always the potential for something to be incorrect. If you know what the values should look like going in, you can check them with things like len(), or test whether they begin or end with specific values. Those are ways to be sure you're getting the right data from the right spot.

"Fragile" is a good word here.

On Wed, Sep 16, 2015 at 11:14 AM, bruce <[email protected]> wrote:
> Hi.
>
> When dealing with scraping, and using xpath/dom operations, you need
> to always keep in mind, that the overall structure of the content is
> subject to change. It's fragile. If you use a "complete" xpath from
> the root(top) to the item in question, any "change" along the way, can
> result in an error. Unless you have sufficient error checking, your
> app might not "know" the reason for the error.
>
> If you create an xpath, that has the "minimum" of attributes to get
> you to where/what you need, it's more robust. But, it's still fragile,
> just not as fragile as using the complete xpath...
>
> On Wed, Sep 16, 2015 at 5:50 AM, michio basya <[email protected]> wrote:
> > Hi,
> >
> > I have a question that why never use full xpath.
> > I have been developing crawler with full xpath, and I notice this sentence
> > in the documents.
> >
> > Are these reasons a tbody problem and a live browser dom problem? Or any
> > other reasons?
> > If the reasons are only two problems, I will keep developing with full
> > xpath.
> > So please teach me any other reason to prevent a future problem.
> > Thanks,
> >
> > http://doc.scrapy.org/en/1.0/topics/firefox.html#caveats-with-inspecting-the-live-browser-dom
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "scrapy-users" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to [email protected].
> > To post to this group, send email to [email protected].
> > Visit this group at http://groups.google.com/group/scrapy-users.
> > For more options, visit https://groups.google.com/d/optout.
