Re: How use Scrapy encoding

Rico A Mada Fri, 20 Mar 2015 06:34:02 -0700

The problem came from my DB encoding. It's now utf8_unicode_ci and it works 
great.
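
For anyone who finds this thread later, here is a rough sketch in plain Python of why the database encoding matters. It is not the actual MySQL code path: it only assumes that the stored bytes are UTF-8 and that something reads them back with a different charset (latin-1 is used here as an example; the thread does not say what the old setting was). The helper name read_back is hypothetical.

```python
# -*- coding: utf-8 -*-
# The title from the original message, written with Unicode escapes.
title = u"Namontana \u2013 Une attaque \u00e0 main arm\u00e9e avort\u00e9e"


def read_back(stored_bytes, charset):
    """Decode stored bytes the way a client using `charset` would (hypothetical helper)."""
    return stored_bytes.decode(charset)


# What the pipeline writes: UTF-8 encoded bytes.
stored = title.encode("utf-8")

# Matching charset on both sides: the text survives.
good = read_back(stored, "utf-8")

# Mismatched charset (assumed latin-1 here): each UTF-8 byte becomes
# its own character, which is the usual source of garbled accents.
bad = read_back(stored, "latin-1")

print(good)
print(repr(bad))
```

When the charset matches on the way in and on the way out, no replacement function is needed on the PHP side.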


On Thursday, 19 March 2015 18:59:04 UTC+3, Rico A Mada wrote:
>
> Hi all,
>
> I'm stuck on an encoding issue when using Scrapy and hope someone can help 
> me.
>
>    - In my spider: item['title'] = html.xpath('.//h5/text()')
>    - In my pipeline: item['title'] = 
>    item['title'].extract()[0].encode('utf-8', 'replace')
>
> This results in a string like Namontana \xe2\x80\x93 Une attaque \xc3\xa0 main 
> arm\xc3\xa9e avort\xc3\xa9e. I save all items to a database (MySQL for now).
>
> Now I want to show all these items on a website, but my problem is that I 
> can't convert *\xe2* (for example) into a visible character.
>
> I've already tried:
>
>    - Adding # -*- coding: utf-8 -*- at the beginning of every .py file
>    - Using the htmlentities or utf8_decode functions when displaying with PHP
>    - Adding unicode(response.body.decode(response.encoding)).encode('utf-8') to 
>    my spider
>    - Adding <meta http-equiv="content-type" content="text/html; 
>    charset=utf-8" /> to my HTML page
>    - Checking and converting all files to UTF-8 without BOM
>
> For now, my only alternative is to use a custom function to replace all the 
> characters (explained here 
> <http://stackoverflow.com/questions/9736949/how-to-substitute-non-sgml-characters-in-string-using-php>)
>  
> but I think there is a better solution.
>
> Thanks in advance for your help.
>
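
A note on the \xe2\x80\x93 sequences quoted above: they look like the escaped display of UTF-8 bytes, not corrupted data. A minimal sketch, assuming the value really is a UTF-8 byte string (Python 3 syntax; the variable names are only for illustration):

```python
# The escaped string from the message, as a UTF-8 byte string.
raw = b"Namontana \xe2\x80\x93 Une attaque \xc3\xa0 main arm\xc3\xa9e avort\xc3\xa9e"

# Decoding the bytes as UTF-8 gives back the readable title:
# \xe2\x80\x93 is one character (an en dash), \xc3\xa0 is "a" with a
# grave accent, and \xc3\xa9 is "e" with an acute accent.
text = raw.decode("utf-8")

print(text)
```

So the bytes produced by the pipeline are valid; what matters is that every later step (database, connection, page) also treats them as UTF-8.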

