Cheers Mike, I read the page; it's starting to get into my brain now.

Django was giving me unicode strings, so I did some encoding and decoding and
now the data is getting into solr. It's simply not passing the
characters that were causing problems, which is great.
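
For anyone following along, here is a minimal sketch of the "drop the problem
characters" approach. It is written for Python 3, where str is unicode text
and bytes is encoded data (the setup in this thread was Python 2), and the
helper name to_ascii is made up for illustration; it is not part of Django or
the solr client.

```python
def to_ascii(value):
    """Return an ASCII-only str, silently dropping anything that is not ASCII."""
    if isinstance(value, bytes):
        # Encoded bytes: decode, ignoring any byte outside the ASCII range.
        return value.decode("ascii", "ignore")
    # Unicode text: encode to ASCII (dropping other characters), then back to str.
    return value.encode("ascii", "ignore").decode("ascii")


print(to_ascii("caf\u00e9 menu"))      # unicode text input
print(to_ascii(b"caf\xc3\xa9 menu"))   # UTF-8 encoded bytes input
```

Note that this loses data: accented characters are removed, not transliterated.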

I'm going to follow the same sort of principle in my python code when I'm
adding the items, so I can keep my solr index up to date as and when things
are entered.

Here's the code I'm using to enter the data.

http://pastie.textmate.org/71367

Two little things:

1.  I'm getting an error when it tries to optimise the index:

AttributeError: SolrConnection instance has no attribute 'optimise'

You don't know what that is about, do you?
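
One way to narrow this down is to check which spelling of the method the
connection object actually exposes; the client may well use the American
spelling "optimize", though that is a guess for this version. A rough sketch,
where FakeConnection is a hypothetical stand-in and NOT the real
SolrConnection class, and find_method is a made-up helper:

```python
class FakeConnection:
    """Hypothetical stand-in for a client object; not the real SolrConnection."""

    def optimize(self):
        return "optimized"


def find_method(obj, *candidates):
    """Return the first candidate name that is a callable attribute of obj, else None."""
    for name in candidates:
        if callable(getattr(obj, name, None)):
            return name
    return None


conn = FakeConnection()
print(find_method(conn, "optimise", "optimize"))
```

Running dir() on the real connection object would show the same information
directly.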

I'm still on Solr 1.1, as we were having trouble getting this sort of
interaction to work with 1.2. I'm not sure if it's related.

2.  I've used your suggestions to force the output into ascii, but if I try
to force it into utf8, which I thought solr would accept, it fails.  I'm not
sure why though.
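
A possible cause, offered as a guess since the failing line isn't shown: in
Python 2, calling .encode('utf8') on a byte string first decodes it implicitly
as ascii, which raises UnicodeDecodeError on any non-ascii byte. The sketch
below (Python 3 syntax) keeps the two types apart explicitly. The helper name
to_utf8_bytes and the default assumption that incoming bytes are UTF-8 are
illustrative only.

```python
def to_utf8_bytes(value, source_encoding="utf-8"):
    """Return value as UTF-8 encoded bytes.

    source_encoding is only used when value is already bytes; it names the
    encoding those bytes are assumed to be in.
    """
    if isinstance(value, bytes):
        # Decode to unicode text first, using the declared source encoding.
        value = value.decode(source_encoding)
    # Encode unicode text as UTF-8.
    return value.encode("utf-8")


print(to_utf8_bytes("caf\u00e9"))             # unicode text in
print(to_utf8_bytes(b"caf\xe9", "latin-1"))   # Latin-1 bytes in
```

The key point is to decode with the encoding the data is really in before
encoding to utf8, rather than encoding a byte string directly.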

Mike Klaas wrote:
> 
> Hi,
> 
> To diagnose this properly, you're going to have to figure out if  
> you're dealing with encoded bytes or unicode, and what django does.   
> See http://www.joelonsoftware.com/articles/Unicode.html.
> 
> As a short-term solution, you can force things to ascii using:
> 
> str(s.decode('ascii', 'ignore')) # assuming s is a bytestring
> u.encode('ascii', 'ignore') # assuming u is a unicode string
> 
> -Mike
> 

-- 
View this message in context: 
http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11174969
Sent from the Solr - User mailing list archive at Nabble.com.
