On Jul 28, 2009, at 2:25 PM, Chris Hostetter wrote:


: > OK, color me confused about how naming should be done for params. There
: > clearly seem to be two camps in Solr-land: 1. those who abbreviate params
: > and 2. those who don't.  Pick your sides, please!  ;-)
:
: Tend towards brevity, but not at the expense of readability.

agreed.

for me it's primarily an issue of huffman encoding:

1) params that are going to be used all the freaking time, by lots of
people, frequently when constructing URLs (which will get sent over the wire millions of times), should be on the shorter side (q, fl, sort, rows,
etc...).

2) params that are going to be used extremely infrequently, and typically hardcoded into a config in the rare cases where they are used, should be longer and more verbose (the verbosity being an issue of self documenting since people won't be used to seeing them and won't immediately recognize
them)
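
To make the Huffman analogy concrete, here is a minimal sketch. The usage frequencies and the verbose spellings are invented for illustration (they are not measurements from any Solr install, and the long names are hypothetical). It compares the average number of bytes spent on parameter names per request when the frequently used names are short versus spelled out.

```python
# Hypothetical fraction of requests that include each param.
# These numbers are made up for illustration only.
usage = {"q": 1.0, "fl": 0.8, "rows": 0.6, "sort": 0.3}

# Hypothetical verbose spellings of the same params.
verbose = {"q": "query", "fl": "fieldList", "rows": "rowCount", "sort": "sortOrder"}


def expected_name_bytes(names):
    """Average bytes of parameter names per request, weighted by usage."""
    return sum(freq * len(names[param]) for param, freq in usage.items())


short_cost = expected_name_bytes({param: param for param in usage})
long_cost = expected_name_bytes(verbose)

print(f"short names: {short_cost:.1f} bytes/request")
print(f"long names:  {long_cost:.1f} bytes/request")
```

The same weighting applied to a param that appears in almost no requests gives a contribution near zero whatever its length, which is the argument for letting rarely used params be verbose.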


(Disclaimer: i freely admit that i screwed the pooch on all those dismax params. i came up with those back before it was possible to put defaults in solrconfig.xml, so the "keep things that will be in millions of URLs
going over the wire" mantra kicked in)

OK, sounds reasonable, although I suspect your frequency-based convention is going to be in the eye of the beholder.

[somewhat of an aside, and a rant  :-)  ]

Has anyone actually documented/tested the "cost" of a URL of 100 chars versus one of 200 chars? I don't know much about how NICs work, but I have a hard time believing that it makes much difference when it comes to packets, buffers, etc., especially in comparison to optimizing the response side of things.
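
A back-of-the-envelope sketch of the packet question, not a measurement. It assumes a standard Ethernet MTU of 1500 bytes, IPv4 and TCP headers of 20 bytes each with no options, one byte per URL character, and a hypothetical 300 bytes for the rest of the HTTP request (request-line overhead plus headers). Under those assumptions it counts how many TCP segments a GET request needs.

```python
import math

# Assumptions: Ethernet MTU of 1500 bytes, 20-byte IPv4 header,
# 20-byte TCP header with no options -> 1460 bytes of payload per segment.
MTU = 1500
MSS = MTU - 20 - 20


def segments_needed(url_chars, other_request_bytes=300):
    """TCP segments needed to carry a GET request with a URL of this length.

    other_request_bytes is a hypothetical allowance for everything in the
    request besides the URL (method, HTTP version, Host and other headers).
    """
    return math.ceil((url_chars + other_request_bytes) / MSS)


short_url_segments = segments_needed(100)
long_url_segments = segments_needed(200)

print(short_url_segments, long_url_segments)
```

With these assumptions both requests fit in a single segment, so the extra 100 characters do not add a packet. This counts segments only; it says nothing about parsing cost on the server or about the size of the response.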

While I'm not for wasting bits (except in email, where I waste them all the time), I find it curious that so much effort is put into shortening params in a 200 char URL (if that) to the point of near unreadability, in some cases, and yet the responses (up until the binary response format was added) are so verbose, especially for XML (but even JSON isn't all that succinct) and especially when you throw in highlighting, MLT, etc. Seems to me if we are optimizing for over the wire, we'd be a whole lot better off making sure all the various clients supported the binary response format than making sure "event" is abbreviated to "evt". It's just that I've seen over and over the cost savings one gets from not having to deal with XML (or even JSON, which we just learned over in Mahout). Don't get me wrong, Solr needs an XML response, it's just that the only client that should be using it is the Human one (i.e. via the browser or curl, debugging, etc.), as it is a waste of time when you control both sides of the pipe and are a program.

-Grnt (trying to save bits, one 'a' at a time, oops, there goes more)
