On Jul 28, 2009, at 2:25 PM, Chris Hostetter wrote:
: > OK, color me confused about how naming should be done for params. There
: > clearly seem to be two camps in Solr-land: 1. those who abbreviate params
: > and 2. those who don't. Pick your sides, please! ;-)
:
: Tend towards brevity, but not at the expense of readability.
agreed.
for me it's primarily an issue of huffman encoding:
1) params that are going to be used all the freaking time, by lots of
people, frequently when constructing URLs (which will get sent over the
wire millions of times), should be on the shorter side (q, fl, sort, rows,
etc...).
2) params that are going to be used extremely infrequently, and typically
hardcoded into a config in the rare cases where they are used, should be
longer and more verbose (the verbosity being an issue of self-documentation,
since people won't be used to seeing them and won't immediately recognize
them).
(Disclaimer: i freely admit that i screwed the pooch on all those dismax
params. i came up with those back before it was possible to put defaults
in solrconfig.xml, so the "keep things that will be in millions of URLs
going over the wire short" mantra kicked in)
OK, sounds reasonable, although I suspect your frequency-based
convention is going to be in the eye of the beholder.
[somewhat of an aside, and a rant :-) ]
Has anyone actually documented/tested the "cost" of a URL of 100 chars
versus one of 200 chars? I don't know much about how NICs work, but I have
a hard time believing that makes much difference when it comes to
packets, buffers, etc., especially in comparison to optimizing the
response side of things.
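A rough way to put numbers on that question is sketched below. It is only a
back-of-the-envelope estimate, not a measurement: it assumes a 1500-byte
Ethernet MTU with plain 20-byte IPv4 and TCP headers (so about 1460 payload
bytes per segment), a made-up host name, and a request that carries nothing
but a Host header. The helper names are hypothetical.

```python
import math

# Assumption: 1500-byte Ethernet MTU minus 20-byte IPv4 and 20-byte TCP headers.
MSS = 1460


def request_bytes(path_and_query: str, host: str = "localhost:8983") -> int:
    """Size in bytes of a minimal HTTP/1.1 GET request with only a Host header."""
    request = f"GET {path_and_query} HTTP/1.1\r\nHost: {host}\r\n\r\n"
    return len(request.encode("ascii"))


def segments_needed(n_bytes: int, mss: int = MSS) -> int:
    """Number of TCP segments needed to carry n_bytes of payload."""
    return math.ceil(n_bytes / mss)


# Pad the query so the URLs are exactly 100 and 200 characters long.
prefix = "/solr/select?q="
url_100 = prefix + "x" * (100 - len(prefix))
url_200 = prefix + "x" * (200 - len(prefix))

for url in (url_100, url_200):
    size = request_bytes(url)
    print(f"{len(url)}-char URL: {size} request bytes, "
          f"{segments_needed(size)} segment(s)")
```

Under those assumptions both requests fit in a single segment. Real requests
carry more headers, so this only illustrates the scale involved.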
While I'm not for wasting bits (except in email, where I waste them
all the time), I find it curious that so much is put into shortening
params in a 200 char URL (if that) to the point of near unreadability,
in some cases, and yet the responses (up until the binary response
format was added) are so verbose, especially for XML (but even JSON
isn't all that succinct) and especially when you throw in
highlighting, MLT, etc. Seems to me if we are optimizing for over the
wire, we'd be a whole lot better off making sure all the various
clients supported the binary response format than making sure "event"
is abbreviated to "evt". It's just that I've seen over and over the
cost savings one gets from not having to deal with XML (or even JSON,
which we just learned over in Mahout). Don't get me wrong, Solr needs
an XML response, it's just that the only client that should be using
it is the Human one (i.e. via the browser or curl, debugging, etc.) as
it is a waste of time when you control both sides of the pipe and are
a program.
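To give a feel for the response-side point, here is a small sketch that
serializes the same ten made-up documents as XML and as compact JSON and
compares the byte counts with the 100 chars one might save on a URL. The
field names and the response shapes are assumptions for illustration only;
they are loosely modeled on a search response and are not Solr's actual
output formats, and the binary format is not reproduced here.

```python
import json
import xml.etree.ElementTree as ET

# Made-up documents; the field names are illustrative only.
docs = [
    {"id": str(i), "name": f"document {i}", "price": str(10 + i)}
    for i in range(10)
]


def to_xml(documents) -> bytes:
    """Serialize documents as <result><doc><str name="...">...</str></doc></result>."""
    result = ET.Element("result", numFound=str(len(documents)))
    for document in documents:
        doc_el = ET.SubElement(result, "doc")
        for field, value in document.items():
            field_el = ET.SubElement(doc_el, "str", name=field)
            field_el.text = value
    return ET.tostring(result, encoding="utf-8")


def to_json(documents) -> bytes:
    """Serialize documents as compact JSON with no extra whitespace."""
    payload = {"numFound": len(documents), "docs": documents}
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


xml_size = len(to_xml(docs))
json_size = len(to_json(docs))
url_saving = 200 - 100  # chars saved by halving a 200-char URL

print(f"XML response:  {xml_size} bytes")
print(f"JSON response: {json_size} bytes")
print(f"URL saving:    {url_saving} bytes")
```

Even with only ten tiny documents and no highlighting or MLT, the response
in either text format is several times larger than the URL saving.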
-Grnt (trying to save bits, one 'a' at a time, oops, there goes more)