On 18-Jun-07, at 6:27 AM, vanderkerkoff wrote:


Cheers Mike, read the page, it's starting to get into my brain now.

Django was giving me unicode strings, so I did some encoding and decoding and
now the data is getting into solr, and it's simply not passing the
characters that are causing problems, which is great.

Glad to hear that it is working.

2 little things. 1. I'm getting an error when it's trying to optimise the index:

AttributeError: SolrConnection instance has no attribute 'optimise'

You don't know what that is about do you?

Er, it means that SolrConnection has no optimise method. Optimizing is requested as part of a commit, so instead do:

conn.commit(optimize=True)

I'm still on solr1.1 as we were having trouble getting this sort of
interaction to work with 1.2, not sure if it's related.

2. I've used your suggestions to force the output into ascii, but if I try to force it into utf8, which I thought solr would accept, it fails. I'm not
sure why though.

Perhaps this is why: solr.py expects unicode. You can pass it an ascii byte string, and it will transparently convert to unicode fine because ascii is the default codec. If you hand it a utf-8 encoded string that contains non-ascii bytes, it will still try to convert to unicode using the ascii codec, and fail.

So, you could completely skip the .encode('ascii', 'ignore') line and pass the unicode string straight through. Of course, you'd then still have the characters in the text. I'm not quite sure what you're after, since leaving it in utf-8 would also leave the funny characters that you wanted to strip.
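
To make the codec behaviour concrete, here is a rough sketch. It does not touch solr.py at all, the sample text is made up, and it is written in Python 3 syntax, so the ascii decode that Python 2 performs implicitly is spelled out explicitly here:

```python
# Sketch only: sample text and variable names are illustrative assumptions.
text = u"caf\u00e9 na\u00efve"  # a unicode string, like the ones Django returns

# Forcing to ascii with 'ignore' silently drops the non-ascii characters.
ascii_bytes = text.encode("ascii", "ignore")

# Encoding to utf-8 keeps them, as multi-byte sequences.
utf8_bytes = text.encode("utf-8")

# Pure ascii bytes convert back to unicode with the ascii codec without trouble.
from_ascii = ascii_bytes.decode("ascii")

# utf-8 bytes holding non-ascii characters fail under the ascii codec.
try:
    utf8_bytes.decode("ascii")
    ascii_decode_failed = False
except UnicodeDecodeError:
    ascii_decode_failed = True

# Decoding with the matching codec round-trips to the original text.
round_trip = utf8_bytes.decode("utf-8")
```

So the utf-8 bytes are fine in themselves; the failure comes from decoding them with the wrong codec.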

-Mike
