Hi Brian

I've now set the mysqldb default charset to utf8, and everything else
(collation etc.) is utf8 too.

I think I know what the problem is, and it's a really old one and I feel
foolish now for not realising it earlier.

Our content people are copying and pasting sh*t from Word into the content.

:-)

Now that the database is utf8, I'd like to write something to change the
crap from Word into a readable value before it gets into the database.
I'm using Python, so I suppose this is more of a Python question than a
Solr one.
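
Something along these lines is roughly what I have in mind. It is only a
sketch, assuming Python 3 and assuming the pasted text arrives either as
text or as bytes that are utf8 or Windows-1252 (what Word usually produces).
The names WORD_CHARS, to_text and clean_word_text are made up for
illustration, and the mapping is nowhere near complete:

```python
# Hypothetical mapping of common Word "smart" characters to plain ASCII.
# Extend it with whatever else turns up in the content.
WORD_CHARS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # non-breaking space
    "\u2022": "*",    # bullet
}
_TABLE = str.maketrans(WORD_CHARS)


def to_text(raw):
    """Return raw as str, guessing the encoding if it is bytes."""
    if isinstance(raw, str):
        return raw
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid utf8 came from Word
        # as Windows-1252. Undefined bytes become U+FFFD.
        return raw.decode("cp1252", errors="replace")


def clean_word_text(raw):
    """Replace Word punctuation with ASCII equivalents."""
    return to_text(raw).translate(_TABLE)
```

The idea would be to run clean_word_text() over each field before the
insert, so the database only ever sees clean utf8. Other non-ASCII
characters (accented letters and so on) pass through untouched.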

Anyone got any tips anyway? 



Brian Whitman wrote:
> 
> Post the line of code this is breaking on. Are you pulling the data  
> from mysql as utf8? Are you setting the encoding of Mysqldb?
> 
> Solr has no problems with proper utf8 and you don't need to do  
> anything special to get it to work. Check out the newer solr.py in JIRA.
> 

-- 
View this message in context: 
http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11118400
Sent from the Solr - User mailing list archive at Nabble.com.
