[ https://issues.apache.org/jira/browse/SOLR-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499913 ]
Mike Klaas commented on SOLR-216: --------------------------------- On 29-May-07, at 12:41 PM, Jason Cater wrote: I've had my solr.py in production use internally for about a month now. So, as you can imagine, I've worked through a few oddball bugs that occasionally pop up. It seems pretty stable for me. Yes, I agree that it is looking good. Since we would be replacing the existing implementation completely, I think that it is worth taking extra care and examining the api choices carefully so we won't have to replace it or deprecate things in the near future. I would prefer to have a complete directory structure (i.e., setup.py, unit tests, samples, etc) instead of just the solr.py file. Would anyone see a problem with this? +1. This would be great--a unittest that could be run against the solr example would be spectacular! Also, on some of your comments: - list comprehensions solely to perform looped execution are harder to parse and slower than explicitly writing a for loop List comprehensions seem to be a matter of contention for some. However, it's a battle I'm not interested in fighting, so changed it to a for loop. It is not a matter of contention for me for use in creating a list, but ISTM less clear and less efficient if the purpose is _solely_ to perform a loop: $ python -m timeit '[i+i for i in xrange(10000)]' 100 loops, best of 3: 1.95 msec per loop $ python -m timeit 'for i in xrange(10000): i+i' 100 loops, best of 3: 1.38 msec per loop - shadowing builtins is generally a bad idea Any shadowing of builtins was unintentional. Did you see specific examples? I run the code through pychecker and pylint to try to avoid such cases. `id` is shadowed in a few places. - all NamedList's appearing in the output are converted to dicts--this loses information (in particular, it will be unnecessarily hard for the user to use highlighting/debug data). Using the python/json response format would prevent this. Not returning highlight/debug data in the standard response format (and yet providing said parameters in the query() method) seems odd. Am I missing something? Oh, they are set as dynamic attributes of Response, I see. Definitely needs documentation. Yes, this needs to be documented. (Please c.f. to my question about allowing a complete directory structure.) - passing fields='' to query() will return all fields, when the desired return is likely no fields I've changed the default for fields= to be '*', instead of None or "". This way, passing in 'fields=""' will result in 'fl=' being passed to the backend. However, I still don't see the point, as passing both 'fl=' and 'fl=*' return the exact same set of fields (i.e., "all") on my test setup. Hmm, what if you pass fields='', score=True? Ideally tha would pass fl=score to the backend, bypassing all stored fields. - it might be better to settle on an api that permits doc/field boosts. How about using a tuple as the field name in the field dict? conn.add_many([{'id': 1, ('field2', 2.33): u"some text"}]) doc boosts could be handled by optionally providing the fielddict as a (<fielddict>, boost) tuple. I agree. I was not aware of field boosts at the time. I'll code this change. Unfortunately, it is still somewhat awkward. In my python client I end up passing (<name>, <value>, <field boost or None>) everywhere, but that clutters up the api considerably. It might be worth taking a look at the ruby client to see what Eric's done for the api. - for 2.5+, a cool addition might be: if sys.version > 2.5 import contextlib def batched(solrconn): solrconn.begin_batch() yield solrconn solrconn.end_batch() batched = contextlib.contextmanager(batched) Use as: with batched(solrconn): solrconn.add(...) solrconn.add(...) solrconn.add(...) Adding... Unfortunately, it does push the required python version to 2.4. Personally, I think that requiring 2.4 is not unreasonable, but I'm somewhat of a bleeding edge guy... [incidently, it would be best to keep comments in JIRA, for posterity] > Improvements to solr.py > ----------------------- > > Key: SOLR-216 > URL: https://issues.apache.org/jira/browse/SOLR-216 > Project: Solr > Issue Type: Improvement > Components: clients - python > Affects Versions: 1.2 > Reporter: Jason Cater > Assignee: Mike Klaas > Priority: Trivial > Attachments: solr.py > > > I've taken the original solr.py code and extended it to include higher-level > functions. > * Requires python 2.3+ > * Supports SSL (https://) schema > * Conforms (mostly) to PEP 8 -- the Python Style Guide > * Provides a high-level results object with implicit data type conversion > * Supports batching of update commands -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.