solr locked itself out

2007-09-17 Thread vanderkerkoff

Hello everyone.

I've been reading some posts on this forum and I thought it best to start my
own post as our situation is different from everyone else's, isn't it always
:-)

We've got a Django-powered website that uses Solr as its search engine.

We're using the example Solr application and starting it at boot time by
running

java -jar start.jar

in the example directory.

We've had no problem at all until this morning when I started getting an
error saying that solr was locked.

I checked the /tmp directory and in there was a file called
lucene-75248553b96c7f175a8217320c9b8471-write.lock

It's not a very busy website at all and doesn't have a lot of data in it. Can
someone get me started on how to make sure this doesn't happen again?

Some more information:

ulimit reports unlimited, and cat /proc/sys/fs/file-max returns 11769.

In the /tmp directory there are 18 directories all called Jetty_8983__solr,
and 17 of them have numbers at the end of the directory name.
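
A minimal sketch for checking which Lucene write locks are lying around, assuming the lock files follow the lucene-<hash>-write.lock naming seen above and live in the system temp directory. The function name find_lucene_write_locks is made up for illustration. It only lists the files; whether a lock is safe to remove depends on whether a Solr process is still using the index, which this does not check.

```python
import glob
import os
import tempfile
import time


def find_lucene_write_locks(directory):
    """Return (path, age_in_seconds) for each Lucene write lock in directory."""
    pattern = os.path.join(directory, "lucene-*-write.lock")
    now = time.time()
    return [(path, now - os.path.getmtime(path))
            for path in sorted(glob.glob(pattern))]


if __name__ == "__main__":
    # Look in the system temp directory (/tmp on most Linux boxes).
    for path, age in find_lucene_write_locks(tempfile.gettempdir()):
        print("%s (last modified %.0f seconds ago)" % (path, age))
```

A lock that is much older than the last index update is a hint that the process holding it went away without cleaning up.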

Sorry I'm such a newbie at this, but any help will be greatly appreciated.
-- 
View this message in context: 
http://www.nabble.com/solr-locked-itself-out-tf4466377.html#a12734891
Sent from the Solr - User mailing list archive at Nabble.com.



DeleteByQuery python syntax for delete all

2007-07-19 Thread vanderkerkoff

Hello everyone

Loving solr, got an idiot question for you.

I have been manually deleting our index in the python interpreter when
testing

from solr import SolrConnection
c = SolrConnection(host='localhost:8983', persistent=False)
allgone = '[ * : * ]'
c.deleteByQuery(query=allgone)
c.commit(optimize=True)

I've forgotten the exact syntax for this line
allgone = '[ * : * ]'

Can't seem to get it right, anyone know what it should be?

Is it '[all:all]' or something?

Any help, greatly appreciated




-- 
View this message in context: 
http://www.nabble.com/DeleteByQuery-python-syntax-for-delte-all-tf4109267.html#a11685509



Re: DeleteByQuery python syntax for delete all

2007-07-19 Thread vanderkerkoff

roopesh, thank you very much

roopesh-2 wrote:
 
 This should work :
 c.deleteByQuery('id:[* TO *]')
 c.commit()
 
 Regards
 Roopesh
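
For reference, a small sketch of the XML message that a delete-by-query ends up posting to Solr's update handler. It is written for Python 3 and does not use solr.py; the helper name delete_by_query_xml is made up for illustration, and xml.sax.saxutils.escape is used here as a stand-in for whatever escaping the client library does.

```python
from xml.sax.saxutils import escape


def delete_by_query_xml(query):
    """Build the XML body of a Solr delete-by-query update message."""
    # escape() protects &, < and > so the query cannot break the XML.
    return "<delete><query>" + escape(query) + "</query></delete>"


# The open-ended range matches every document that has an id field.
print(delete_by_query_xml("id:[* TO *]"))
# prints <delete><query>id:[* TO *]</query></delete>
```

The delete only becomes visible to searches after a commit, which is why the answer above follows it with c.commit().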
 
 vanderkerkoff wrote:
 Hello everyone

 Loving solr, got an idiot question for you.

 I have been manually deleting our index in the python interpreter when
 testing

 from solr import SolrConnection
 c = SolrConnection(host='localhost:8983', persistent=False)
 allgone = '[ * : * ]'
 c.deleteByQuery(query=allgone)
 c.commit(optimize=True)

 I've forgotten the exact syntax for this line
 allgone = '[ * : * ]'

 Can't seem to get it right, anyone know what it should be?

 Is it '[all:all]' or something?

 Any help, greatly appreciated




   
 
 
 --
 DigitalGlue, India
 
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/DeleteByQuery-python-syntax-for-delte-all-tf4109267.html#a11687320



Re: Deleting from index via web

2007-07-12 Thread vanderkerkoff

Done some more digging about this

Here's my delete code:

def delete(self):
    from solr import SolrConnection
    c = SolrConnection(host='localhost:8983', persistent=False)
    # Build the URL that is used as the document id, e.g. /news/2007/07/12/pilly
    e_url = '/news/' + self.created_at.strftime('%Y/%m/%d') + '/' + self.slug
    e_url = e_url.encode('ascii', 'ignore')
    c.delete(id=e_url)
    c.commit(optimize=True)

I get this back from jetty

INFO: delete(id '/news/2007/07/12/pilly') 0 1

It's not deleting the record from the index though, even if I restart jetty.

I'm wondering if I can use URLs as IDs now.

-- 
View this message in context: 
http://www.nabble.com/Deleting-from-index-via-web-tf4066903.html#a11558048



Re: Deleting from index via web

2007-07-12 Thread vanderkerkoff

Different tactic now

adding like this
idstring = "news:%s" % self.id
c.add(id=idstring,url_t=e_url,body_t=body4solr,title_t=title4solr,summary_t=summary4solr,contact_name_t=contactname4solr)
c.commit(optimize=True)

Goes in fine, search results show an ID of news:36

Delete like this
delidstring = "news:%s" % self.id
c.delete(id=delidstring)
c.commit(optimize=True)

still no joy
-- 
View this message in context: 
http://www.nabble.com/Deleting-from-index-via-web-tf4066903.html#a11559113



Re: Deleting from index via web

2007-07-12 Thread vanderkerkoff

My boss and I worked it out.

The delete function in solr.py looks like this

def delete(self, id):
    xstr = '<delete><id>'+self.escapeVal(`id`)+'</id></delete>'
    return self.doUpdateXML(xstr)

As we're not passing an integer it gets all c*nty booby, technical term.

So if I rewrite the delete to be like this

def delete(self, id):
    xstr = '<delete><id>' + id + '</id></delete>'
    print xstr
    return self.doUpdateXML(xstr)

It works fine.

There's no need for escapeVal, as I know the words I'll be sending prior to
the ID, in fact, I'm not sure why escapeVal is in there at all if you can't
send it non integer values.

Maybe someone can enlighten us.
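
A possible explanation, sketched in Python 3: in Python 2 the backticks around id are shorthand for repr(). For an integer that gives the plain digits, but for a string it gives the text wrapped in quote characters, so Solr is asked to delete an id that includes the quotes and finds nothing. escapeVal is presumably there to escape XML special characters, which still matters for string ids. The helper name delete_by_id_xml is made up, and xml.sax.saxutils.escape stands in for escapeVal as an assumption.

```python
from xml.sax.saxutils import escape


def delete_by_id_xml(doc_id):
    """Build a Solr delete-by-id message, escaping the id as XML text."""
    # str() leaves a string id untouched and turns an integer into digits.
    return "<delete><id>" + escape(str(doc_id)) + "</id></delete>"


# repr() of a string keeps the quote characters, so the id no longer matches.
print(repr("news:36"))              # prints 'news:36' including the quotes
print(delete_by_id_xml("news:36"))  # prints <delete><id>news:36</id></delete>
print(delete_by_id_xml("a&b"))      # prints <delete><id>a&amp;b</id></delete>
```

Dropping the escaping altogether works for ids like news:36, but an id containing & or < would then produce malformed XML.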
-- 
View this message in context: 
http://www.nabble.com/Deleting-from-index-via-web-tf4066903.html#a11560068



Re: problems getting data into solr index

2007-06-21 Thread vanderkerkoff

Hi Mike, Brian

Thanks for helping with this, and for clearing up my misunderstanding.  Solr
the python module and Solr the package are two different things, I've got
it now.

The issues I have are compounded by the fact that we're hovering between
using the Unicode branch of Django and the older branch that has newforms,
both of which have an impact on what I'm trying to do.

It's getting closer to being resolved, and it's down to your advice, so
thanks again.






-- 
View this message in context: 
http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11230922



Re: problems getting data into solr index

2007-06-18 Thread vanderkerkoff

Cheers Mike, read the page, it's starting to get into my brain now.

Django was giving me unicode strings, so I did some encoding and decoding and
now the data is getting into solr, and it's simply not passing the
characters that are causing problems, which is great.

I'm going to follow the same sort of principle in my python code when I'm
adding the items, so I can keep my solr index up to date as and when things
are entered.

Here's the code I'm using to enter the data.

http://pastie.textmate.org/71367

Two little things.

1.  I'm getting an error when it's trying to optimise the index

AttributeError: SolrConnection instance has no attribute 'optimise'

You don't know what that is about do you?

I'm still on solr1.1 as we were having trouble getting this sort of
interaction to work with 1.2, not sure if it's related.

2.  I've used your suggestions to force the output into ascii, but if I try
to force it into utf8, which I thought solr would accept, it fails.  I'm not
sure why though.

 



Mike Klaas wrote:
 
 Hi,
 
 To diagnose this properly, you're going to have to figure out if  
 you're dealing with encoded bytes or unicode, and what django does.   
 See http://www.joelonsoftware.com/articles/Unicode.html.
 
 As a short-term solution, you can force things to ascii using:
 
 str(s.decode('ascii', 'ignore')) # assuming s is a bytestring
 u.encode('ascii', 'ignore') # assuming u is a unicode string
 
 -Mike
 

-- 
View this message in context: 
http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11174969



Re: problems getting data into solr index

2007-06-18 Thread vanderkerkoff

I think I've resolved this.

I've edited that solr.py file to optimize=True on commit and moved the
commit outside of the loop

http://pastie.textmate.org/71392
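
The shape of that change, as a minimal sketch: add every document inside the loop and commit once afterwards. RecordingConnection is a hypothetical stand-in that only records calls, so the example runs without Solr; it is not the solr.py class, and index_all is a made-up helper name.

```python
class RecordingConnection:
    """Hypothetical stand-in for a Solr client; it only records calls."""

    def __init__(self):
        self.added = []
        self.commits = 0

    def add(self, **fields):
        self.added.append(fields)

    def commit(self, optimize=False):
        self.commits += 1


def index_all(conn, docs):
    """Add every document, then commit a single time at the end."""
    for doc in docs:
        conn.add(**doc)
    conn.commit(optimize=True)  # one commit (and optimize) after the loop


conn = RecordingConnection()
index_all(conn, [{"id": "news:1"}, {"id": "news:2"}])
print(len(conn.added), conn.commits)  # prints 2 1
```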

The data is going in, it's optimizing once but it's showing as commit = 0 in
the stats page of my solr.

There are no errors that I can see, and the data is definitely in the index as
I can now search for it.



vanderkerkoff wrote:
 
 
 2 little things, I'm getting an error when it's trying to optimise the
 index
 
 AttributeError: SolrConnection instance has no attribute 'optimise'
 
 You don't know what that is about do you?
 
 I'm still on solr1.1 as we were having trouble getting this sort of
 interaction to work with 1.2, not sure if it's related.
 
 

-- 
View this message in context: 
http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11176732



Re: problems getting data into solr index

2007-06-14 Thread vanderkerkoff

Hello Hoss

Thanks for replying, I tried what you suggested as the initial step of my
troubleshooting and it outputs it fine.

It was what I suspected initially as well, but thanks for the advice.



hossman_lucene wrote:
 
 
 : I'm running solr1.2 and Jetty, I'm having problems looping through a mysql
 : database with python and putting the data into the solr index.
 :
 : Here's the error
 :
 : UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 369:
 : ordinal not in range(128)
 
 I may be missing something here, but I don't think that error is coming
 from Solr ... UnicodeDecodeError appears to be a python error message,
 so I suspect the problem is between MySQL and your python script .. I bet
 if you change your script to comment out the lines where you talk to solr,
 and just read the data from mysql and throw it to /dev/null you'd still
 see that message.
 
 http://wiki.wxpython.org/UnicodeDecodeError
 
 
 -Hoss
 
 
 

-- 
View this message in context: 
http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a5954



Re: problems getting data into solr index

2007-06-14 Thread vanderkerkoff

Hi Yonik

Here's the output from netcat

POST /solr/update HTTP/1.1
Host: localhost:8983
Accept-Encoding: identity
Content-Length: 83
Content-Type: text/xml; charset=utf-8

that looks Ok to me, but I am a bit twp you see.
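
A small sketch of the two things those headers have to agree on: the charset declared in Content-Type and the bytes actually sent. It is written for Python 3; build_update_request is a made-up helper name, and the title_t field is borrowed from elsewhere in these threads purely as sample data.

```python
def build_update_request(xml_text):
    """Return (headers, body) for a POST to /solr/update, encoded as UTF-8."""
    body = xml_text.encode("utf-8")
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        "Content-Length": str(len(body)),  # byte count, not character count
    }
    return headers, body


# \u2019 is a curly apostrophe, one character but three bytes in UTF-8.
doc = '<add><doc><field name="title_t">it\u2019s</field></doc></add>'
headers, body = build_update_request(doc)
print(len(doc), headers["Content-Length"])
```

If the body were encoded with a different codec than the one named in the header, Solr would be told one thing and sent another, which is the mismatch Yonik describes below.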

:-)

Yonik Seeley wrote:
 
 On 6/13/07, vanderkerkoff [EMAIL PROTECTED] wrote:
 I'm running solr1.2 and Jetty, I'm having problems looping through a
 mysql
 database with python and putting the data into the solr index.

 Here's the error

 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 369:
 ordinal not in range(128)
 
 There are two issues... what char encoding you tell solr to use, via
 Content-type in the HTTP headers (solr defaults to UTF-8), and then if
 what you send matches that coding.
 
 If you can get the complete message (including HTTP headers) that is
 being sent to Solr, that would help people debug the problem.
 
 One easy way is to use netcat to pretend to be solr:
 1) shut down solr
 2) start up netcat on solr's port
   nc -l -p 8983
 3) send your update message from the client as you normally would
 
 -Yonik
 
 

-- 
View this message in context: 
http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a6020



Re: problems getting data into solr index

2007-06-14 Thread vanderkerkoff

Hi Brian

I've now set the MySQL database's default charset to utf8, and everything
else (collation etc.) is utf8 too.

I think I know what the problem is, and it's a really old one and I feel
foolish now for not realising it earlier.

Our content people are copying and pasting sh*t from word into the content.

:-)

Now that the database is utf8, I'd like to write something to change the
crap from word into a readable value before it gets into the database. 
Using python, so I suppose this is more of a python question than a solr
one.

Anyone got any tips anyway? 
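
One possible approach, as a minimal Python 3 sketch: map the punctuation that Word typically inserts onto plain ASCII equivalents before saving. The table below is an assumption about which characters are involved (curly quotes, dashes, ellipsis) and is not exhaustive; clean_word_text and WORD_PUNCTUATION are made-up names.

```python
# Code points commonly produced by Word's autocorrect, mapped to ASCII.
WORD_PUNCTUATION = {
    0x2018: "'",    # left single quote
    0x2019: "'",    # right single quote / apostrophe
    0x201C: '"',    # left double quote
    0x201D: '"',    # right double quote
    0x2013: "-",    # en dash
    0x2014: "-",    # em dash
    0x2026: "...",  # ellipsis
}


def clean_word_text(text):
    """Replace Word-style punctuation in a unicode string with ASCII."""
    return text.translate(WORD_PUNCTUATION)


print(clean_word_text("\u201cHi\u201d \u2013 it\u2019s\u2026"))
# prints "Hi" - it's...
```

This works on decoded text, so the input has to be a unicode string rather than raw bytes from the database.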



Brian Whitman wrote:
 
 Post the line of code this is breaking on. Are you pulling the data  
 from mysql as utf8? Are you setting the encoding of Mysqldb?
 
 Solr has no problems with proper utf8 and you don't need to do  
 anything special to get it to work. Check out the newer solr.py in JIRA.
 

-- 
View this message in context: 
http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a8400



problems getting data into solr index

2007-06-13 Thread vanderkerkoff

Hello everyone

I'm running solr1.2 and Jetty, I'm having problems looping through a mysql
database with python and putting the data into the solr index.

Here's the error

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 369:
ordinal not in range(128)

I think that means that there is a UTF8 character in the data that is out of
the ascii range.  Please let me know if I'm wrong.

So solr can't decode the character and therefore stops committing any more
data to the index.

Is there a simple way to tell solr to accept UTF8 characters?

I've read about this topic on your site and on others, so far I'm more
confused than when I started.
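
A minimal sketch that reproduces this kind of error in Python 3, with no Solr involved. The byte 0xe2 is the first byte of a three-byte UTF-8 sequence, and curly quotes encode as 0xe2 0x80 followed by one more byte. The sample bytes are an assumption about what the data might contain, not taken from the actual database.

```python
# UTF-8 bytes for "it's" written with a curly apostrophe (U+2019).
raw = b"it\xe2\x80\x99s"

try:
    raw.decode("ascii")  # the ascii codec rejects any byte above 0x7f
except UnicodeDecodeError as err:
    print(err)

text = raw.decode("utf-8")  # decoding with the right codec works
print(len(raw), len(text))  # prints 6 4
```

The exception is raised by Python's own codec machinery while handling the string, before anything reaches Solr.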


-- 
View this message in context: 
http://www.nabble.com/problems-getting-data-into-solr-index-tf3915542.html#a11102282