Re: Namespaces in response (SOLR-1586)

Grant Ingersoll Wed, 09 Dec 2009 07:13:50 -0800

Inline...

On Dec 9, 2009, at 9:33 AM, Mattmann, Chris A (388J) wrote:


> Hi Grant, and others,
> 
> My 2 cents (and of course I'm biased, having prepared the patch):
> 
>> In SOLR-1586, the proposed patch introduces the concept that a Solr response
>> can declare a namespace for part of the response (in this case, it is using
>> the tags defined by georss.org to specify a point, etc.).
> 
> The patch doesn't introduce this concept -- it makes use of it.
> XMLWriter#writePrim took care of that for me, see Hostetter's comment:
> 
> http://www.lucidimagination.com/search/document/be6fb7ce53c2922d/jira_created_solr_1592_refactor_xmlwriter_starttag_to_allow_arbitrary_attributes_to_be_writ
> 
> 
> Since that method is public, anyone could have done this in the past, they
> just chose not to. Moreover, they chose not to in the committed source for
> SOLR, but others who took SOLR, prepared their own XML response writers,
> etc., may have done this same thing as well.
> 
>> 
>> Discussion points:
>> 1. If there are standard namespaces, then people can use them to do fun XML
>> things
> 
> +1. This includes things like validation,

Yeah, but the rest of Solr's response doesn't have it, so...

> strong typing (see SOLR-912 for
> others who also believe that the NamedList BagOfObjects structure, while
> robust, introduces type confusion when unraveling the response), and
> plugging in to other tools. Imagine a GIS tool that required a
> "georss:point" to be returned back somehow. You could argue XSLT could do
> this, but as you note below, it's an extra step. It also _implicitly_ ties
> the representation and typing of a FieldType to something that isn't really
> tied to a field type at all (an XSLT file?)

Agreed.

> 
>> 2. If we allow them, we get all of the other benefits of namespaces...
> 
> For sure -- see above for some examples.
> 
>> 3. The indexing side doesn't support them, so it seems odd to put in
>> something
>> like <field name="point">55.3 27.9</field> and get back <georss:point
>> name="point"> 55.3 27.9</georss:point>.  At the same time, it seems equally
>> weird to get back <str name="point">...</str> when there is in fact more
>> semantic information available about this particular field that would
>> otherwise require more work by an application to make sense of.
> 
> You got it. I'm not sure why it seems weird -- the translation from
> docs/fields to external representation (via response writers or field type
> representation) is one of the benefits of SOLR IMHO.

It's weird b/c no XML type was specified upfront, but a type was given out on 
the back end.  It's not a show stopper or anything, just an interesting point, 
I think.
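
To make the contrast in point 3 concrete, here is a minimal client-side sketch
in Python (not Solr code). The two response fragments are modeled on the
examples quoted above, and the namespace URI is an assumption for illustration;
the thread only says the tags come from georss.org.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the georss prefix (not stated in the thread).
GEORSS_NS = "http://www.georss.org/georss"

# Illustrative fragments modeled on the two shapes discussed above.
plain = '<doc><str name="point">55.3 27.9</str></doc>'
namespaced = (
    '<doc xmlns:georss="' + GEORSS_NS + '">'
    '<georss:point name="point"> 55.3 27.9</georss:point>'
    '</doc>'
)

def read_point(xml_text):
    """Return (lat, lon) from either response shape, or None if absent."""
    root = ET.fromstring(xml_text)
    # Namespaced form: the element name itself says "this is a point".
    elem = root.find("{%s}point" % GEORSS_NS)
    if elem is None:
        # Plain form: the client has to know, by convention only, that
        # the field called "point" holds a "lat lon" string.
        elem = root.find("str[@name='point']")
    if elem is None or elem.text is None:
        return None
    lat, lon = elem.text.split()
    return float(lat), float(lon)
```

Both shapes carry the same value; the difference is whether the client learns
the type from the tag or has to hard-code the field name.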

> 
>> 4. If we let in other namespaces, we then are opening ourselves to longer
>> responses, etc.  It is also likely the case that there isn't just one
>> standard.  This likely could mean slower responses, etc.
> 
> How does adding in some characters (e.g., an "ns" tag and an associated URL)
> add anything other than noise? We're talking the difference between O(n)
> versus O(n+20) here. Also it's perfectly legit IMHO to say, well if you
> introduce 10,000 namespaces, well, that's on you, and be prepared for
> slower client/server interactions.

You'd be surprised how slow XML parsing often is. Especially for larger
responses, XML processing can be quite expensive, and most of the information
is verbose at best. I've seen this on a number of occasions; it is why we
switched to a binary response format in SolrJ and why I think all clients
should speak the binary protocol.
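
For the size side of this, a rough sketch in Python of where the extra
characters come from. It only counts characters; it says nothing about parse
time. The `<result>`/`<doc>` wrapper and the namespace URI are assumptions
for illustration, not Solr's actual response layout.

```python
# Assumed namespace URI for the georss prefix (not stated in the thread).
GEORSS_NS = "http://www.georss.org/georss"

def response(n_docs, namespaced):
    """Build a toy response with one point field per document."""
    if namespaced:
        open_tag = '<result xmlns:georss="%s">' % GEORSS_NS
        field = '<georss:point name="point">55.3 27.9</georss:point>'
    else:
        open_tag = '<result>'
        field = '<str name="point">55.3 27.9</str>'
    docs = ''.join('<doc>%s</doc>' % field for _ in range(n_docs))
    return open_tag + docs + '</result>'

def overhead(n_docs):
    """Extra characters in the namespaced response for n_docs documents."""
    return len(response(n_docs, True)) - len(response(n_docs, False))

# One-time cost: the xmlns declaration on the root element.
declaration_cost = overhead(0)
# Recurring cost: the longer tag name, paid on every open and close tag.
per_field_cost = overhead(1) - overhead(0)
```

In this toy layout the declaration is paid once per response, while the longer
qualified tag name is paid once per field, so the growth is per element rather
than a fixed constant.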


> 
>> 5. If people wanted them, they could just do XSLT, but that is an extra step
>> too.
> 
> Yep, that's an extra step, and it's not explicit, like the patch I attached
> is. I tried to take advantage of one of SOLR's extension points in the
> architecture to explicitly tie a representation of a Field to its external
> and internal representation (aka, the point of a FieldType, no?)
>> 
>> An alternative is that we could refactor things a bit and allow the FieldType
>> to specify the tag name instead of it being hardcoded in the writers.  This
>> way people writing FieldTypes could define them.  For instance, we could have
>> FieldType.getTagName() that could be overridden and clients could have tools
>> for introspecting this.
> 
> This is basically what I did, right? I did an inline namespace using a
> variant of #writePrim in XMLWriter (#writeCdata) and had the
> FieldType#toExternal method set the tag name, which is allowed by the API.

As Hoss points out on the thread, I think the longer-term goal is to be more
agnostic of the FieldType, which would argue against my proposal.
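
For reference, the getTagName() idea quoted above could look roughly like the
sketch below. It is written in Python rather than Solr's Java, and every class
and function in it is a hypothetical stand-in, not Solr's API or the SOLR-1586
patch.

```python
from xml.sax.saxutils import escape, quoteattr

class FieldType:
    """Hypothetical stand-in for a field type; not Solr's FieldType class."""

    def get_tag_name(self):
        # Default tag, standing in for what the writer hardcodes today.
        return "str"

    def to_external(self, value):
        return str(value)

class PointType(FieldType):
    """Hypothetical field type that names its own response tag."""

    def get_tag_name(self):
        return "georss:point"

    def to_external(self, value):
        lat, lon = value
        return "%s %s" % (lat, lon)

def write_field(name, value, field_type):
    """Toy writer: asks the field type for the tag instead of hardcoding it."""
    tag = field_type.get_tag_name()
    text = escape(field_type.to_external(value))
    return "<%s name=%s>%s</%s>" % (tag, quoteattr(name), text, tag)
```

The writer no longer knows which tag it emits, which is exactly the coupling
to the FieldType that the longer-term goal would move away from.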

-Grant
