Re: problems getting data into solr index

Mike Klaas Mon, 18 Jun 2007 20:02:11 -0700

On 18-Jun-07, at 6:27 AM, vanderkerkoff wrote:


Cheers Mike, read the page, it's starting to get into my brain now.

Django was giving me unicode strings, so I did some encoding and decoding and
now the data is getting into solr, and it's simply not passing the
characters that are causing problems, which is great.


Glad to hear that it is working.

2 little things, I'm getting an error when it's trying to optimise the index
AttributeError: SolrConnection instance has no attribute 'optimise'

You don't know what that is about do you?


Er, it means that SolrConnection has no optimise command.  Instead do

conn.commit(optimize=True)

I'm still on solr1.1 as we were having trouble getting this sort of
interaction to work with 1.2, not sure if it's related.
2. I've used your suggestions to force the output into ascii, but if I try
to force it into utf8, which I thought solr would accept, it fails. I'm not
sure why though.

Perhaps this is why: solr.py expects unicode. You can pass it ascii, and it will transparently convert to unicode fine because that is the default codec. If you end up with utf-8, it will try to convert to unicode using the ascii codec and fail.
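To make that failure concrete, here is a rough sketch that spells the implicit conversion out as an explicit call. It assumes the implicit str-to-unicode coercion behaves like .decode('ascii'); the helper name to_unicode_like_py2 and the sample strings are made up for illustration, and this is not solr.py's actual code.

```python
def to_unicode_like_py2(raw):
    """Mimic the implicit byte-string -> unicode coercion (ascii codec).

    Hypothetical helper for illustration only; not part of solr.py.
    """
    return raw.decode('ascii')

ascii_bytes = u'cafe'.encode('ascii')      # pure ASCII byte string
utf8_bytes = u'caf\xe9'.encode('utf-8')    # contains a non-ASCII byte sequence

# ASCII bytes convert cleanly.
ok = to_unicode_like_py2(ascii_bytes)

# UTF-8 bytes with non-ASCII content cannot be decoded with the ascii codec.
try:
    to_unicode_like_py2(utf8_bytes)
    failed = False
except UnicodeDecodeError:
    failed = True
```

The way around it is to hand over unicode objects rather than byte strings that have already been encoded as utf-8.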

So, you could completely skip the .encode('ascii', 'ignore') line. Of course, you'd have the characters in the text. I'm not quite sure what you're after, since leaving it in utf-8 would leave the funny characters that you wanted to strip.
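For comparison, a small sketch of what the two options do to the same text. The sample string is invented, and the variable names are only for illustration:

```python
text = u'caf\xe9 cr\xe8me'   # made-up sample with two accented characters

# Option 1: strip anything that is not ASCII (the accented letters vanish).
stripped = text.encode('ascii', 'ignore')

# Option 2: encode as utf-8, which keeps every character.
utf8 = text.encode('utf-8')

# Decoding the utf-8 bytes with the right codec restores the original text.
round_trip = utf8.decode('utf-8')
```

Here stripped holds only the ASCII letters, while the utf-8 bytes still carry the accented characters and decode back to the original string.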


-Mike
