: > : I think the initial geosearch feature can start off with
: > : <str>10,20</str> for a point.
: > 
: > +1.
: 
: Fundamentally, how is a string a point?

Fundamentally a string is not a point, and a point is not a string -- but 
if you want to express the concept of a point in a manner that only uses very 
simple primitive types, then a string containing comma separated numbers 
is a pretty decent way to do it.  If you'd prefer, a pair of numbers would 
work just as well...

   <arr><float>10</float><float>20</float></arr>
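
Either encoding is trivial for a client to turn back into a point.  A 
minimal Python sketch, assuming the value arrives as the text of a <str> 
element (parse_point is a made-up helper for illustration, not anything 
that ships with Solr):

```python
def parse_point(text):
    """Split a comma separated "x,y" string into a pair of floats."""
    parts = text.split(",")
    if len(parts) != 2:
        raise ValueError("expected exactly two comma separated numbers: %r" % text)
    # float() tolerates surrounding whitespace, so "10, 20" also works
    x, y = (float(p) for p in parts)
    return (x, y)


point = parse_point("10,20")
```

No knowledge of a special "point" type is needed on the client side, only 
string splitting and number parsing.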

: > The current XML format Solr uses was designed to be extremely simple, very
: > JSON-esque, and easily parsable by *anyone* in any language, without
: > needing special knowledge of types.
: 
: Whoah. I'm totally confused now. Why have FieldTypes then? Why not just use
: Lucene? The use case for FieldTypes is _not_ just for indexing, or querying.
: It's also for representation?

No, actually the use case for FieldTypes is entirely about the internal 
logic of how Solr should deal with those fields, and how various 
operations should work on them.  FieldTypes can dictate the internal 
representation within the confines of a Lucene index, but they should not 
circumvent the contracts of the response writers in dictating what 
is/isn't a legal response.

XMLWriter.writePrim may be public, which means there is a loophole that 
plugin writers can exploit to add new tag names to the Solr XML response that
violate the contract (and no we don't have a formal XSD or DTD for our 
XML response format, but we still have a very well advertised contract) -- 
but that doesn't mean that code which ships with Solr should exploit those 
loopholes to violate that contract.  People should expect that if they use 
Solr as is without any custom code that the XMLResponseWriter won't all of 
a sudden start including new, non-primitive-ish, XML tags/attributes 
that weren't there before.

That's the entire point of the format as it was designed: break down 
whatever complex data might be involved in a response into easily 
digestible maps/lists of maps/lists of very primitive types that can 
easily be used in any programming language.
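
To make that concrete, here is a rough Python sketch of the idea: nested 
maps and lists of primitives go in, and only a few of the primitive-ish 
tag names (lst, arr, str, int, float, bool) come out.  The to_element 
function is hypothetical and is not how Solr's writer is actually 
implemented.

```python
import xml.etree.ElementTree as ET


def to_element(value, name=None):
    """Render nested dicts/lists of primitives using only primitive-ish tags."""
    # bool must be checked before int: bool is a subclass of int in Python
    if isinstance(value, bool):
        el = ET.Element("bool")
        el.text = "true" if value else "false"
    elif isinstance(value, int):
        el = ET.Element("int")
        el.text = str(value)
    elif isinstance(value, float):
        el = ET.Element("float")
        el.text = repr(value)
    elif isinstance(value, str):
        el = ET.Element("str")
        el.text = value
    elif isinstance(value, dict):
        # a map becomes a named list of named children
        el = ET.Element("lst")
        for key, child in value.items():
            el.append(to_element(child, key))
    elif isinstance(value, (list, tuple)):
        # a list becomes an array of unnamed children
        el = ET.Element("arr")
        for child in value:
            el.append(to_element(child))
    else:
        raise TypeError("no primitive representation for %r" % (value,))
    if name is not None:
        el.set("name", name)
    return el


data = {"id": "doc1", "location": [10.0, 20.0]}
xml_text = ET.tostring(to_element(data, "example"), encoding="unicode")
```

Anything that can't be expressed with those tags is rejected rather than 
given a new tag name, which is the whole design constraint.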

: allowed for a while I think), why prevent it? Allowing namespaces does _not_
: break anything. 
        ...
: > introducing a new "point" concept, whether as <point> or as
: > <georss:point/>, is going to break things for people.
: 
: Show me an example, I fundamentally disagree with this.

Ok. Let's start with SolrJ then: take a look at the KnownType enum (line 
151) in XMLResponseParser...

http://svn.apache.org/viewvc/lucene/solr/trunk/src/solrj/org/apache/solr/client/solrj/impl/XMLResponseParser.java?revision=819403&view=markup

...or let's do a random google code search for "solr xml lst" -- check out 
ResponseContentHandler in solrpy...

http://code.google.com/p/solrpy/source/browse/trunk/solr/core.py#841

...I can't write Python code to save my life, but I have a pretty good idea 
what that code will do if it sees an unexpected tag.

This is how a *LOT* of Solr client libraries are implemented ... it's not 
an issue of broken XML parsers freaking out about namespaces, it's an 
issue of having a long standing, heavily advertised "schema" for the XML 
response that promises to only ever use a handful of types.  Adding any 
new tags to this format (regardless of how easy it may be because of that 
stupid fucking "public" modifier on XMLWriter.writePrim) will absolutely 
break things for people.
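
The failure mode looks roughly like the Python sketch below.  It is a 
simplified illustration of the general pattern (a fixed lookup table of 
known tag names), not the actual SolrJ or solrpy code; KNOWN_TYPES and 
parse_value are names invented for this example.

```python
import xml.etree.ElementTree as ET

# Converters keyed by tag name; this small table is an assumption of the
# sketch, standing in for the fixed set of types a client knows about
KNOWN_TYPES = {
    "str": str,
    "int": int,
    "float": float,
    "bool": lambda text: text == "true",
}


def parse_value(el):
    """Convert one response element into a plain Python value."""
    if el.tag == "arr":
        return [parse_value(child) for child in el]
    if el.tag == "lst":
        return {child.get("name"): parse_value(child) for child in el}
    # Any tag outside the table raises KeyError right here
    return KNOWN_TYPES[el.tag](el.text)


# A point expressed with existing tags parses fine
ok = parse_value(ET.fromstring(
    '<arr name="location"><float>10</float><float>20</float></arr>'))

# The same point with a brand new tag blows up the client
try:
    parse_value(ET.fromstring('<point name="location">10,20</point>'))
    broke = False
except KeyError:
    broke = True
```

The XML is perfectly well formed in both cases; the client breaks because 
the tag is not one it was promised it would ever see.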

: And why is that? Isn't the point of SOLR to expand to use cases brought up
: by users of the system? As long as those use cases can be principally
: supported, without breaking backwards compatibility (or in that case,  if
: they do, with large blinking red text that says it), then you're shutting
: people out for 0 benefit? It's aesthetics we're talking about here.

I don't know if i'd say that's the point of Solr, but yes we should 
absolutely try to grow the capabilities of the system as new use cases 
come along.

I am 100% in agreement that the existing "simple" XMLResponseWriter is 
not for everyone -- Historically we've tried to maintain a sense of equality 
between all of the Response writers, so that they all contained the same 
data just with different markup -- but there are clearly cases where it 
would be nice to have a response writer that is allowed to "know more" 
about the real structure of the data and represent it in a manner that 
more closely represents its purpose.  This was the entire point behind 
adding FieldType.toObject, and UUIDField w/the BinaryResponseWriter is a 
good example of the model we should follow in the future.
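
As a loose analogy only (Python, not Solr code): the field type can offer 
both a plain string form and a richer object form, and each writer picks 
the one its contract allows.  UUIDFieldSketch, write_simple and 
write_rich are all hypothetical names made up for this sketch.

```python
import uuid


class UUIDFieldSketch:
    """Hypothetical stand-in for a field type; not Solr's UUIDField."""

    def to_external(self, stored):
        # The string form that any "simple" writer can emit
        return str(stored)

    def to_object(self, stored):
        # The richer form, for writers allowed to "know more"
        return uuid.UUID(stored)


def write_simple(field_type, stored):
    # Stays inside the primitive contract: always a <str>
    return "<str>%s</str>" % field_type.to_external(stored)


def write_rich(field_type, stored):
    # A different writer, with a different contract, may expose the object
    return field_type.to_object(stored)


stored = "550e8400-e29b-41d4-a716-446655440000"
simple = write_simple(UUIDFieldSketch(), stored)
rich = write_rich(UUIDFieldSketch(), stored)
```

The simple writer's output never changes shape; the extra knowledge lives 
in a separate writer that clients opt into.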

There is a clear push for Solr to natively be able to generate responses that 
incorporate more "industry standard" XML schemas, and i would love to see 
us start adding functionality to do that, but bastardizing the existing 
XMLResponseWriter format is not the way to do it.

Bottom Line: I am a big fat -1 on any patch to Solr that adds new xml tags 
to the output generated by the XMLResponseWriter.  Feel free to call me 
stubborn, call me obstinate, call me pedantic -- but there is no way in 
hell i'm going to support a patch that does that.



-Hoss
