Hi Brian,

I've now set the MySQL database's default charset to utf8, and everything else (collation, etc.) is utf8 as well.
I think I know what the problem is. It's a really old one, and I feel foolish now for not realising it earlier: our content people are copying and pasting sh*t from Word into the content. :-)

Now that the database is utf8, I'd like to write something that converts the crap from Word into readable values before it gets into the database. I'm using Python, so I suppose this is more of a Python question than a Solr one. Anyone got any tips anyway?

Brian Whitman wrote:
>
> Post the line of code this is breaking on. Are you pulling the data
> from MySQL as utf8? Are you setting the encoding of MySQLdb?
>
> Solr has no problems with proper utf8 and you don't need to do
> anything special to get it to work. Check out the newer solr.py in JIRA.
>

--
View this message in context: http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11118400
Sent from the Solr - User mailing list archive at Nabble.com.
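
For the Word clean-up question above, one possible approach is sketched below in Python 3. It is only a rough illustration: the names `WORD_CHARS` and `clean_word_text` are made up for this example, the mapping is not exhaustive, and the fallback to cp1252 assumes the pasted bytes came from a Windows machine. It decodes the input if needed, then swaps the usual "smart" punctuation for plain ASCII.

```python
# Hypothetical mapping (an assumption, not exhaustive): the "smart"
# punctuation Word commonly inserts, mapped to plain ASCII equivalents.
WORD_CHARS = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # ellipsis
    "\u00a0": " ",    # non-breaking space
}
_TABLE = str.maketrans(WORD_CHARS)


def clean_word_text(raw):
    """Return text with common Word punctuation replaced by ASCII.

    Accepts str or bytes. Bytes are tried as UTF-8 first; if that fails,
    they are assumed (a guess) to be Windows-1252.
    """
    if isinstance(raw, bytes):
        try:
            raw = raw.decode("utf-8")
        except UnicodeDecodeError:
            # cp1252 leaves a few bytes undefined, hence errors="replace".
            raw = raw.decode("cp1252", errors="replace")
    return raw.translate(_TABLE)


if __name__ == "__main__":
    print(clean_word_text("\u201cHello\u201d \u2013 it\u2019s Word\u2026"))
```

Running this on each field before it is written to MySQL (or posted to Solr) keeps the stored text valid UTF-8. Whether to flatten the characters to ASCII at all, rather than storing them as proper UTF-8, is a judgment call.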