solr locked itself out
Hello everyone.

I've been reading some posts on this forum and thought it best to start my own thread, as our situation is different from everyone else's. Isn't it always :-)

We've got a Django-powered website that uses Solr as its search engine. We're using the example Solr application and starting Java at boot time with

    java -jar start.jar

in the example directory.

We'd had no problems at all until this morning, when I started getting an error saying that Solr was locked. I checked the /tmp directory and found a file called

    lucene-75248553b96c7f175a8217320c9b8471-write.lock

It's not a very busy website at all and doesn't have a lot of data in it. Can someone get me started on how to make sure this doesn't happen again?

Some more information: ulimit is unlimited, and

    cat /proc/sys/fs/file-max
    11769

In the /tmp directory there are 18 directories all called Jetty_8983__solr, and 17 of them have numbers at the end of the directory name.

Sorry I'm such a newbie at this, but any help will be greatly appreciated.

--
View this message in context: http://www.nabble.com/solr-locked-itself-out-tf4466377.html#a12734891
Sent from the Solr - User mailing list archive at Nabble.com.
DeleteByQuery python syntax for delete all
Hello everyone

Loving Solr, got an idiot question for you.

I have been manually deleting our index in the Python interpreter when testing:

    from solr import SolrConnection
    c = SolrConnection(host='localhost:8983', persistent=False)
    allgone = '[ * : * ]'
    c.deleteByQuery(query=allgone)
    c.commit(optimize=True)

I've forgotten the exact syntax for this line:

    allgone = '[ * : * ]'

I can't seem to get it right. Does anyone know what it should be? Is it '[all:all]' or something?

Any help greatly appreciated.

--
View this message in context: http://www.nabble.com/DeleteByQuery-python-syntax-for-delte-all-tf4109267.html#a11685509
Re: DeleteByQuery python syntax for delete all
roopesh, thank you very much.

roopesh-2 wrote:
> This should work:
>
>     c.deleteByQuery('id:[* TO *]')
>     c.commit()
>
> Regards
> Roopesh
>
> vanderkerkoff wrote:
>> Hello everyone
>>
>> Loving Solr, got an idiot question for you.
>>
>> I have been manually deleting our index in the Python interpreter when testing:
>>
>>     from solr import SolrConnection
>>     c = SolrConnection(host='localhost:8983', persistent=False)
>>     allgone = '[ * : * ]'
>>     c.deleteByQuery(query=allgone)
>>     c.commit(optimize=True)
>>
>> I've forgotten the exact syntax for this line:
>>
>>     allgone = '[ * : * ]'
>>
>> I can't seem to get it right. Does anyone know what it should be? Is it '[all:all]' or something?
>>
>> Any help greatly appreciated.
>
> --
> DigitalGlue, India

--
View this message in context: http://www.nabble.com/DeleteByQuery-python-syntax-for-delte-all-tf4109267.html#a11687320
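For later readers, here is a minimal sketch of the update message that a delete-by-query boils down to. It assumes Solr's XML update format (`<delete><query>...</query></delete>`); `build_delete_by_query` is a hypothetical helper and not part of solr.py. The query `id:[* TO *]` matches every document that has an `id` field. Newer Solr versions are generally reported to accept `*:*` as a match-all query as well, but check that against the version you run.

```python
from xml.sax.saxutils import escape


def build_delete_by_query(query):
    """Return the XML body for a Solr delete-by-query update.

    Hypothetical helper for illustration; not part of solr.py.
    """
    # Escape &, < and > so the query text cannot break the XML.
    return '<delete><query>' + escape(query) + '</query></delete>'


print(build_delete_by_query('id:[* TO *]'))
```

The message still has to be followed by a commit before the deletion becomes visible to searches.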
Re: Deleting from index via web
Done some more digging about this. Here's my delete code:

    def delete(self):
        from solr import SolrConnection
        c = SolrConnection(host='localhost:8983', persistent=False)
        e_url = '/news/' + self.created_at.strftime('%Y/%m/%d') + '/' + self.slug
        e_url = e_url.encode('ascii', 'ignore')
        c.delete(id=e_url)
        c.commit(optimize=True)

I get this back from Jetty:

    INFO: delete(id '/news/2007/07/12/pilly') 0 1

It's not deleting the record from the index though, even if I restart Jetty. I'm now wondering whether I can use URLs as IDs at all.

--
View this message in context: http://www.nabble.com/Deleting-from-index-via-web-tf4066903.html#a11558048
Re: Deleting from index via web
Different tactic now. Adding like this:

    idstring = 'news:%s' % self.id
    c.add(id=idstring, url_t=e_url, body_t=body4solr, title_t=title4solr, summary_t=summary4solr, contact_name_t=contactname4solr)
    c.commit(optimize=True)

It goes in fine, and search results show an ID of news:36.

Deleting like this:

    delidstring = 'news:%s' % self.id
    c.delete(id=delidstring)
    c.commit(optimize=True)

Still no joy.

--
View this message in context: http://www.nabble.com/Deleting-from-index-via-web-tf4066903.html#a11559113
Re: Deleting from index via web
My boss and I worked it out.

The delete function in solr.py looks like this:

    def delete(self, id):
        xstr = '<delete><id>'+self.escapeVal(`id`)+'</id></delete>'
        return self.doUpdateXML(xstr)

As we're not passing an integer it gets all c*nty booby, technical term.

So if I rewrite the delete to be like this:

    def delete(self, id):
        xstr = '<delete><id>' + id + '</id></delete>'
        print xstr
        return self.doUpdateXML(xstr)

it works fine.

There's no need for escapeVal, as I know the words I'll be sending before the ID. In fact, I'm not sure why escapeVal is in there at all if you can't send it non-integer values. Maybe someone can enlighten us.

--
View this message in context: http://www.nabble.com/Deleting-from-index-via-web-tf4066903.html#a11560068
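A likely explanation, sketched in modern Python: the backticks around `id` in the original `delete` are Python 2 shorthand for `repr()`, and `repr()` of a string includes the surrounding quotes, so Solr is asked to delete an ID that literally contains quote characters. That would also fit the quoted ID in the Jetty log line earlier in this thread. Escaping is still worth keeping, because it protects the XML from `&` and `<` inside an ID. In the sketch below `build_delete_by_id` is a hypothetical helper, and `xml.sax.saxutils.escape` stands in for solr.py's `escapeVal`, whose exact behaviour is not shown in this thread.

```python
from xml.sax.saxutils import escape

doc_id = 'news:36'

# repr() wraps a string in quotes; an integer gets none.
print(repr(doc_id))
print(repr(36))


def build_delete_by_id(doc_id):
    """Build a delete-by-id message (hypothetical helper, not solr.py)."""
    # str() keeps the ID as-is; escape() protects &, < and >.
    return '<delete><id>' + escape(str(doc_id)) + '</id></delete>'


print(build_delete_by_id(doc_id))
print(build_delete_by_id('a&b'))
```

Dropping `repr()` but keeping the escaping gives the behaviour of the rewritten `delete` above without trusting the ID to be XML-safe.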
Re: problems getting data into solr index
Hi Mike, Brian

Thanks for helping with this, and for clearing up my misunderstanding. Solr the Python module and Solr the package are two different things; I've got it now.

The issues I have are compounded by the fact that we're hovering between using the Unicode branch of Django and the older branch that has newforms, both of which have an impact on what I'm trying to do.

It's getting closer to being resolved, and that's down to your advice, so thanks again.

--
View this message in context: http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11230922
Re: problems getting data into solr index
Cheers Mike, I've read the page and it's starting to get into my brain now.

Django was giving me a unicode string, so I did some encoding and decoding and now the data is getting into Solr. It's simply not passing the characters that were causing problems, which is great.

I'm going to follow the same sort of principle in my Python code when I'm adding the items, so I can keep my Solr index up to date as and when things are entered.

Here's the code I'm using to enter the data:
http://pastie.textmate.org/71367

Two little things:

1. I'm getting an error when it's trying to optimise the index:

    AttributeError: SolrConnection instance has no attribute 'optimise'

You don't know what that is about, do you? I'm still on Solr 1.1 as we were having trouble getting this sort of interaction to work with 1.2; not sure if it's related.

2. I've used your suggestions to force the output into ASCII, but if I try to force it into UTF-8, which I thought Solr would accept, it fails. I'm not sure why though.

Mike Klaas wrote:
> Hi,
>
> To diagnose this properly, you're going to have to figure out if you're dealing with encoded bytes or unicode, and what Django does. See http://www.joelonsoftware.com/articles/Unicode.html.
>
> As a short-term solution, you can force things to ASCII using:
>
>     str(s.decode('ascii', 'ignore'))  # assuming s is a bytestring
>     u.encode('ascii', 'ignore')  # assuming u is a unicode string
>
> -Mike

--
View this message in context: http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11174969
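Mike's two one-liners are Python 2. The same idea in Python 3, where bytes and text are separate types, looks roughly like this; the sample string is made up for illustration.

```python
text = 'caf\u00e9 \u2019quoted\u2019'   # str: text with non-ASCII characters
raw = text.encode('utf-8')               # bytes: the same text encoded as UTF-8

# Forcing to ASCII silently drops everything outside the ASCII range.
ascii_from_text = text.encode('ascii', 'ignore').decode('ascii')
ascii_from_bytes = raw.decode('ascii', 'ignore')

print(ascii_from_text)
print(ascii_from_bytes)

# Round-tripping through UTF-8 loses nothing.
print(raw.decode('utf-8') == text)
```

Forcing to ASCII throws data away, so it only makes sense as the short-term workaround Mike describes; sending correctly encoded UTF-8 keeps the accented and typographic characters searchable.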
Re: problems getting data into solr index
I think I've resolved this.

I've edited that solr.py file to use optimize=True on commit, and moved the commit outside of the loop:
http://pastie.textmate.org/71392

The data is going in and it's optimizing once, but it's showing as commit = 0 in the stats page of my Solr.

There are no errors that I can see, and the data is definitely in the index, as I can now search for it.

vanderkerkoff wrote:
> Two little things. I'm getting an error when it's trying to optimise the index:
>
>     AttributeError: SolrConnection instance has no attribute 'optimise'
>
> You don't know what that is about, do you? I'm still on Solr 1.1 as we were having trouble getting this sort of interaction to work with 1.2; not sure if it's related.

--
View this message in context: http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11176732
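The change described here (add inside the loop, commit once afterwards) might look like the sketch below. It is illustrative only: `index_all` and `RecordingConnection` are made-up names, and the stand-in connection just records calls so that the example runs without a Solr server. `add(**fields)` and `commit(optimize=True)` mirror the calls used elsewhere in these threads.

```python
class RecordingConnection:
    """Stand-in for a Solr connection; records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def add(self, **fields):
        self.calls.append(('add', fields))

    def commit(self, optimize=False):
        self.calls.append(('commit', optimize))


def index_all(conn, docs):
    """Add every document, then commit (and optimize) a single time."""
    for doc in docs:
        conn.add(**doc)
    # One commit after the loop instead of one per document.
    conn.commit(optimize=True)


conn = RecordingConnection()
index_all(conn, [{'id': 'news:1', 'title_t': 'First'},
                 {'id': 'news:2', 'title_t': 'Second'}])
print(conn.calls)
```

Committing once keeps the expensive commit and optimize work out of the per-document path.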
Re: problems getting data into solr index
Hello Hoss

Thanks for replying. I tried what you suggested as the initial step of my troubleshooting, and it outputs fine. It was what I suspected initially as well, but thanks for the advice.

hossman_lucene wrote:
> : I'm running solr1.2 and Jetty, I'm having problems looping through a mysql
> : database with python and putting the data into the solr index.
> :
> : Here's the error
> :
> : UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 369:
> : ordinal not in range(128)
>
> I may be missing something here, but I don't think that error is coming from Solr ... UnicodeDecodeError appears to be a Python error message, so I suspect the problem is between MySQL and your Python script ... I bet if you change your script to comment out the lines where you talk to Solr, and just read the data from MySQL and throw it to /dev/null, you'd still see that message.
>
> http://wiki.wxpython.org/UnicodeDecodeError
>
> -Hoss

--
View this message in context: http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a5954
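Hoss's test (read the rows without talking to Solr) can be narrowed down further by reporting which rows fail to decode. This is a sketch that assumes the rows arrive as byte strings; `find_undecodable` is a hypothetical helper and the sample rows are invented.

```python
def find_undecodable(rows, encoding='ascii'):
    """Return (row_index, byte_position) for each row that fails to decode."""
    problems = []
    for index, raw in enumerate(rows):
        try:
            raw.decode(encoding)
        except UnicodeDecodeError as err:
            # err.start is the offset of the first offending byte.
            problems.append((index, err.start))
    return problems


rows = [b'plain text', b'it\xe2\x80\x99s pasted from Word']
print(find_undecodable(rows))             # checked as ASCII
print(find_undecodable(rows, 'utf-8'))    # checked as UTF-8
```

If every row decodes cleanly as UTF-8 but not as ASCII, the data itself is fine and the problem is an implicit ASCII conversion somewhere in the script.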
Re: problems getting data into solr index
Hi Yonik

Here's the output from netcat:

    POST /solr/update HTTP/1.1
    Host: localhost:8983
    Accept-Encoding: identity
    Content-Length: 83
    Content-Type: text/xml; charset=utf-8

That looks OK to me, but I am a bit twp you see. :-)

Yonik Seeley wrote:
> On 6/13/07, vanderkerkoff [EMAIL PROTECTED] wrote:
>> I'm running solr1.2 and Jetty, I'm having problems looping through a mysql database with python and putting the data into the solr index.
>>
>> Here's the error
>>
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 369: ordinal not in range(128)
>
> There are two issues: what char encoding you tell Solr to use, via Content-Type in the HTTP headers (Solr defaults to UTF-8), and then whether what you send matches that encoding.
>
> If you can get the complete message (including HTTP headers) that is being sent to Solr, that would help people debug the problem. One easy way is to use netcat to pretend to be Solr:
>
> 1) shut down Solr
> 2) start up netcat on Solr's port:
>        nc -l -p 8983
> 3) send your update message from the client as you normally would
>
> -Yonik

--
View this message in context: http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a6020
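Yonik's second point, that the bytes sent must match the declared charset, can also be checked on the client side before anything goes over the wire. This is a sketch under assumptions: `build_update_request` is a hypothetical helper, the document field is invented, and the header names are copied from the netcat output in this message.

```python
def build_update_request(xml_text):
    """Encode an update message as UTF-8 and build matching headers."""
    body = xml_text.encode('utf-8')
    headers = {
        'Content-Type': 'text/xml; charset=utf-8',
        # Content-Length counts bytes, not characters.
        'Content-Length': str(len(body)),
    }
    return headers, body


xml_text = '<add><doc><field name="title_t">It\u2019s here</field></doc></add>'
headers, body = build_update_request(xml_text)
print(len(xml_text), headers['Content-Length'])
```

The two printed numbers differ because the curly apostrophe is one character but three bytes in UTF-8; a client that computes the length from the text rather than the encoded body would send a wrong Content-Length.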
Re: problems getting data into solr index
Hi Brian

I've now set the MySQL database to a default charset of utf8, and everything else is utf8 too: collation, etc.

I think I know what the problem is. It's a really old one, and I feel foolish now for not realising it earlier.

Our content people are copying and pasting sh*t from Word into the content. :-)

Now that the database is utf8, I'd like to write something to change the crap from Word into a readable value before it gets into the database.

I'm using Python, so I suppose this is more of a Python question than a Solr one.

Anyone got any tips anyway?

Brian Whitman wrote:
> Post the line of code this is breaking on. Are you pulling the data from MySQL as utf8? Are you setting the encoding of MySQLdb?
>
> Solr has no problems with proper utf8 and you don't need to do anything special to get it to work. Check out the newer solr.py in JIRA.

--
View this message in context: http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a8400
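One common approach, sketched here rather than offered as the definitive fix, is to map the usual Word "smart" punctuation to plain ASCII before saving. The table is a small, assumed selection and `clean_word_text` is a made-up name; extend the mapping to cover whatever actually gets pasted in.

```python
WORD_CHARS = {
    '\u2018': "'",    # left single quote
    '\u2019': "'",    # right single quote / apostrophe
    '\u201c': '"',    # left double quote
    '\u201d': '"',    # right double quote
    '\u2013': '-',    # en dash
    '\u2014': '-',    # em dash
    '\u2026': '...',  # ellipsis
}
_TABLE = str.maketrans(WORD_CHARS)


def clean_word_text(text):
    """Replace common Word punctuation with ASCII equivalents."""
    return text.translate(_TABLE)


print(clean_word_text('\u201cIt\u2019s done\u201d \u2013 almost\u2026'))
```

This works on decoded text, so the input has to be turned from bytes into text first using the encoding it really has.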
problems getting data into solr index
Hello everyone

I'm running Solr 1.2 and Jetty, and I'm having problems looping through a MySQL database with Python and putting the data into the Solr index.

Here's the error:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 369: ordinal not in range(128)

I think that means there is a UTF-8 character in the data that is outside the ASCII range. Please let me know if I'm wrong.

So Solr can't decode the character and therefore stops committing any more data to the index.

Is there a simple way to tell Solr to accept UTF-8 characters? I've read about this topic on your site and on others, and so far I'm more confused than when I started.

--
View this message in context: http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11102282
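The message itself comes from Python rather than Solr, as the replies in this thread point out, and it is easy to reproduce. Here is a small sketch with an invented sample string: byte 0xe2 is the first byte of the three-byte UTF-8 encoding of characters such as the right single quote (U+2019).

```python
raw = 'it\u2019s'.encode('utf-8')
print(raw)

try:
    raw.decode('ascii')
except UnicodeDecodeError as err:
    message = str(err)
    print(message)

# Decoding with the encoding the bytes were written in works.
print(raw.decode('utf-8') == 'it\u2019s')
```

So the reading in the post is right about the character being outside the ASCII range; the failing step is an ASCII decode in the client script, before Solr ever sees the data.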