Re: Commented: (SOLR-216) Improvements to solr.py

Jason Cater Tue, 29 May 2007 12:42:02 -0700

Mike,

I've had my solr.py in production use internally for about a month now. So, as you can imagine, I've worked through a few oddball bugs that occasionally pop up. It seems pretty stable for me.

I'm planning to upload a new file attachment to this issue containing my changes, plus fixes for the bug reports that were filed against my open ticket. But first, a few quick questions....

I would prefer to have a complete directory structure (i.e., setup.py, unit tests, samples, etc.) instead of just the solr.py file. Would anyone see a problem with this?


Also, on some of your comments:

 - list comprehensions solely to perform looped execution are harder to parse 
and slower than explicitly writing a for loop

List comprehensions seem to be a matter of contention for some. However, it's a battle I'm not interested in fighting, so I changed it to a for loop.
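To make the pattern under discussion concrete, here is a minimal sketch. The `send` function and the sample documents are hypothetical stand-ins, not code taken from solr.py.

```python
sent = []

def send(doc):
    # Hypothetical stand-in for a call made only for its side effect.
    sent.append(doc)

docs = [{"id": 1}, {"id": 2}]

# Comprehension used only for looping: it builds a throwaway list of
# None values that nobody reads.
discarded = [send(doc) for doc in docs]

# Equivalent explicit loop: same side effects, no throwaway list.
for doc in docs:
    send(doc)
```

Both forms call `send` once per document; the loop simply states that intent without producing an unused result.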

 - shadowing builtins is generally a bad idea

Any shadowing of builtins was unintentional. Did you see specific examples? I run the code through pychecker and pylint to try to avoid such cases.

- SolrConnection is an old-style class, but Response is new-style

This was a holdover from the old SolrConnection class I copied from. I'm fixing this.

functionality:

 - why are 'status'/'QTime' returned as floats?

This was just a misunderstanding on my part of what QTime was actually returning. Fixing.

 - all NamedList's appearing in the output are converted to dicts--this loses 
information (in particular, it will be unnecessarily hard for the user to use 
highlighting/debug data).  Using the python/json response format would prevent 
this.  Not returning highlight/debug data in the standard response format (and 
yet providing said parameters in the query() method) seems odd.  Am I missing 
something?  Oh, they are set as dynamic attributes of Response, I see.  
Definitely needs documentation.

Yes, this needs to be documented. (Please see my question above about allowing a complete directory structure.)

 - passing fields='' to query() will return all fields, when the desired return 
is likely no fields

I've changed the default for fields= to be '*', instead of None or "". This way, passing in 'fields=""' will result in 'fl=' being passed to the backend. However, I still don't see the point, as passing either 'fl=' or 'fl=*' returns the exact same set of fields (i.e., "all") on my test setup.
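As a rough illustration of the behaviour described here, the sketch below maps a `fields` argument onto the `fl` request parameter. The helper name `build_query_string` is hypothetical and is not the actual solr.py implementation; it only assumes that `fields` defaults to '*' and that an empty string is passed through unchanged.

```python
from urllib.parse import urlencode

def build_query_string(q, fields="*"):
    """Hypothetical helper: map the fields argument onto Solr's fl parameter."""
    # Accept a list or tuple of field names as a convenience (an assumption).
    if isinstance(fields, (list, tuple)):
        fields = ",".join(fields)
    # An empty string is passed through unchanged, so the backend sees "fl=".
    return urlencode([("q", q), ("fl", fields)])

default_query = build_query_string("solr")
empty_fields_query = build_query_string("solr", fields="")
```

With this shape, the caller's intent (all fields versus an explicitly empty list) reaches the backend intact, and it is then up to the server how to interpret an empty `fl`.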

 - it might be better to settle on an api that permits doc/field boosts.  How 
about using a tuple as the field name in the field dict?

conn.add_many([{'id': 1, ('field2', 2.33): u"some text"}])

doc boosts could be handled by optionally providing the fielddict as a 
(<fielddict>, boost) tuple.


I agree. I was not aware of field boosts at the time. I'll code this change.
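A minimal sketch of how the proposed convention could be serialized, assuming the standard Solr XML update message format with `boost` attributes on `<doc>` and `<field>`. The function names `doc_to_xml` and `add_many_xml` are hypothetical and are not part of solr.py.

```python
from xml.sax.saxutils import escape, quoteattr

def doc_to_xml(doc):
    """Hypothetical serializer for one document in the proposed format."""
    doc_boost = None
    # A (fielddict, boost) tuple carries an optional document boost.
    if isinstance(doc, tuple):
        doc, doc_boost = doc
    if doc_boost is None:
        parts = ["<doc>"]
    else:
        parts = ["<doc boost=%s>" % quoteattr(str(doc_boost))]
    for key, value in doc.items():
        # A (name, boost) tuple used as the key carries an optional field boost.
        if isinstance(key, tuple):
            name, boost = key
            attrs = "name=%s boost=%s" % (quoteattr(name), quoteattr(str(boost)))
        else:
            attrs = "name=%s" % quoteattr(key)
        parts.append("<field %s>%s</field>" % (attrs, escape(str(value))))
    parts.append("</doc>")
    return "".join(parts)

def add_many_xml(docs):
    """Wrap the serialized documents in a single <add> message."""
    return "<add>%s</add>" % "".join(doc_to_xml(d) for d in docs)

field_boost_xml = add_many_xml([{'id': 1, ('field2', 2.33): u"some text"}])
doc_boost_xml = add_many_xml([({'id': 2}, 1.5)])
```

Plain string keys and bare dicts keep working unchanged, so existing callers would not need to know about boosts at all.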

- for 2.5+, a cool addition might be:

import sys

if sys.version_info >= (2, 5):
    import contextlib

    def batched(solrconn):
        solrconn.begin_batch()
        yield solrconn
        solrconn.end_batch()
    batched = contextlib.contextmanager(batched)

Use as:

with batched(solrconn):
    solrconn.add(...)
    solrconn.add(...)
    solrconn.add(...)


Adding...
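One possible refinement, sketched here under assumptions: wrapping the `yield` in try/finally so that `end_batch()` still runs if the body raises. `FakeConnection` is a hypothetical stand-in that only records calls; the real `begin_batch`/`end_batch` behaviour in solr.py may differ, and whether a failed batch should be ended or discarded is a design decision this sketch does not settle.

```python
import contextlib

class FakeConnection(object):
    """Hypothetical stand-in for SolrConnection; it only records calls."""

    def __init__(self):
        self.calls = []

    def begin_batch(self):
        self.calls.append("begin_batch")

    def add(self, **fields):
        self.calls.append("add")

    def end_batch(self):
        self.calls.append("end_batch")

@contextlib.contextmanager
def batched(solrconn):
    solrconn.begin_batch()
    try:
        yield solrconn
    finally:
        # Runs even if the with-body raises, so the batch is always closed.
        solrconn.end_batch()

conn = FakeConnection()
with batched(conn):
    conn.add(id=1)
    conn.add(id=2)
```

On Python 2.5 itself the `with` statement additionally needs `from __future__ import with_statement` at the top of the calling module.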
