Answer accepted, SO karma sent :-)
On Thursday, November 13, 2014 1:38:13 PM UTC-5, Travis Leleu wrote:
>
> Stephane,
>
> What steps did you take to determine that there is no JS involved? I loaded
> the page with JavaScript disabled, and that area of the page held only the
> stub content ("Visitas"); the actual number is written there by an AJAX
> request.
>
> You can still load that data using scrapy, it'll just take an additional
> request. The server returns the number of visits in XML, via the script at
> http://www.fincaraiz.com.co/WebServices/Statistics.asmx/GetAdvertVisits?idAdvert=1337688&idASource=40&idType=1001
>
> (try loading that script and you'll see the # of visits for the page you
> provided in the original email).
>
> There is another ajax request that returns "True" for that page, but I'm
> not sure what the data's actual meaning is. Still, it may be useful:
>
>
> http://www.fincaraiz.com.co/WebServices/Statistics.asmx/DetailAdvert?idAdvert=1337688&idType=1001&idASource=40&strCookie=13/11/2014:19-05419&idSession=10hx5wsfbqybyxsywezx0n1r&idOrigin=44
>
> (I cross-posted this answer to your SO question. If you don't mind,
> please send me some sweet sweet SO karma by accepting the answer.)
>
> Thanks,
> Travis
>
>
>
> On Thu, Nov 13, 2014 at 9:56 AM, Stephane Leonard <[email protected]> wrote:
>
>> I already posted this on Stack Overflow without getting an answer. I think
>> it's a very relevant question, though.
>>
>> The story: all wanted fields but one get scraped perfectly. The content
>> of the missing field simply doesn't show up in the Scrapy response (as
>> checked in the Scrapy shell), while it does show up when I use my browser
>> (actually any browser) to visit the page. In the Scrapy response, the
>> expected tags are there, but not the text between the tags.
>>
>> There's no JavaScript involved, but it is a variable that is provided by
>> the server (it's the current number of visits to that particular page). No
>> iframe involved either.
>>
>> I already set the user agent (in the settings file) to match my browser.
>> I already set the download delay (in the settings file) to 5 seconds.
>>
>> The page:
>>
>> http://www.fincaraiz.com.co/apartamento-en-venta/bogota/salitre-det-1337688.aspx
>>
>> XPath to the wanted element: //*[@id="numAdvertVisits"]
>>
>> What could be the cause of this mystery?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "scrapy-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/scrapy-users.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
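For anyone landing on this thread later, here is a minimal sketch of the extra request Travis describes above. It is not tested against the live site and rests on several assumptions: that the advert id can be read from the "-det-<id>.aspx" part of the page URL, that idASource=40 and idType=1001 are the same for every listing, and that the XML reply is a single root element whose text is the visit count. The helper names are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Endpoint taken from the URL quoted in this thread.
STATS_ENDPOINT = (
    "http://www.fincaraiz.com.co/WebServices/Statistics.asmx/GetAdvertVisits"
)


def advert_id_from_url(page_url):
    """Return the advert id from a detail-page URL.

    Assumes the URL ends in '-det-<digits>.aspx', as in the example page.
    """
    match = re.search(r"-det-(\d+)\.aspx", page_url)
    if match is None:
        raise ValueError("no advert id found in %r" % page_url)
    return match.group(1)


def visits_url(advert_id, source_id=40, type_id=1001):
    """Build the statistics URL for one advert.

    The default source_id and type_id are copied from the quoted URL;
    whether they hold for other listings is an assumption.
    """
    query = urlencode(
        {"idAdvert": advert_id, "idASource": source_id, "idType": type_id}
    )
    return STATS_ENDPOINT + "?" + query


def parse_visits(xml_body):
    """Read the visit count from the XML reply (bytes).

    Assumes the root element's text is the number, e.g. <int ...>57</int>.
    """
    root = ET.fromstring(xml_body)
    return int(root.text.strip())
```

In a spider, the page callback could yield scrapy.Request(visits_url(advert_id_from_url(response.url)), callback=..., meta={"item": item}), and the second callback could set the item's visit count from parse_visits(response.body) before yielding the item. If the real XML is shaped differently, only parse_visits needs to change.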