On Sat, 20 Jan 2007, Ryan McKinley wrote:

: Date: Sat, 20 Jan 2007 19:17:16 -0800
: From: Ryan McKinley <[EMAIL PROTECTED]>
: Reply-To: solr-dev@lucene.apache.org
: To: solr-dev@lucene.apache.org
: Subject: Re: Update Plugins (was Re: Handling disparate data sources in
:     Solr)
:
: >
: > ...what if we bring that idea back, and let people configure it in the
: > solrconfig.xml, using path like names...
: >
: >   <requestParser name="/raw" class="solr.RawPostRequestParser" />
: >   <requestParser name="/multi" class="solr.MultiPartRequestParser" />
: >   <requestParser name="/nostream" class="solr.SimpleRequestParser" />
: >   <requestParser name="/guess" class="solr.UseContentTypeRequestParser" />
: >
: > ...but don't make it a *public* interface ... make it package protected,
: > or maybe even a private static interface of the Dispatch Filter .. either
: > way, don't instantiate instances of it using the plugin-lib ClassLoader,
: > make sure it comes from the WAR so it only uses the ones provided out of
: > the box.

: I'm on board as long as the URL structure is:
: ${path/from/solr/config}?stream.type=raw

actually the URL i was suggesting was...

   ${parser/path/from/solr/config}${handler/path/from/solr/config}?param=val

...i was trying to keep the parser name out of the query string, so we
don't have to do any hack parsing of HttpServletRequest.getQueryString()
to get it.

basically if you have this...

  <requestParser name="/raw" class="solr.RawPostRequestParser" />
  <requestParser name="/multi" class="solr.MultiPartRequestParser" />
  <requestParser name="/nostream" class="solr.SimpleRequestParser" />
  <requestHandler name="/update/commit" class="solr.CommitRequestHandler"/>
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
  <requestHandler name="/xml" class="solr.XmlQueryRequestHandler" />

...then these urls are all valid...

  http://localhost:9999/solr/raw/update?param=val
     ..uses raw post body for update
  http://localhost:9999/solr/multi/update?param=val
     ..uses multipart mime for update
  http://localhost:9999/solr/update?param=val
     ..no requestParser matched the path prefix, so the default is chosen
       and Content-Type is used to decide where streams come from.

but if instead my config looks like this...

  <requestParser name="" class="solr.MultiPartRequestParser" />
  <requestParser name="/raw" class="solr.RawPostRequestParser" />
  <requestHandler name="/update/commit" class="solr.CommitRequestHandler"/>
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
  <requestHandler name="/xml" class="solr.XmlQueryRequestHandler" />

...then these URLs would fail...

  http://localhost:9999/solr/raw/update?param=val
  http://localhost:9999/solr/multi/update?param=val

...because the empty string would match as a parser, but "/raw/update" and
"/multi/update" wouldn't match as requestHandlers (the registration of
"/raw" as a parser would be useless)

this URL would work however...

  http://localhost:9999/solr/update?param=val
     ..treat all requests as if they have multi-part mime streams

...i use this only as an example of what i'm describing ... not as an
example of something we should recommend.

The key to all of this being that we'd check parser names against the URL
prefix in order from shortest to longest, then check the rest of the path
as a requestHandler ... if either of those fail, then the filter would
skip the request.

What we would probably recommend is that people map the "guess" request
parser to "/" so that they could put in all of the options they want on
buffer sizes and such, then map their requestHandlers without a "/"
prefix, and use content types correctly.

if they really had a reason to want to force one type of parsing, they
could register it with a different prefix.

  * default URLs stay clean
  * no need for an extra "stream.type" param
  * urls only get ugly if people want them to get ugly, because they don't
    want to make their clients set the mime type correctly.


-Hoss
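
The lookup order described in this message can be illustrated with a small
sketch. This is not Solr code: the function name `resolve`, the dict-based
registries, and the use of solr.UseContentTypeRequestParser as the fallback
are all assumptions made for illustration, and the path is assumed to be
whatever follows "/solr" in the URL. Prefix matching here is a plain string
prefix test, which the message does not specify either way.

```python
# Assumed fallback when no registered parser name matches the path prefix.
DEFAULT_PARSER = "solr.UseContentTypeRequestParser"


def resolve(path, parsers, handlers):
    """Return (parser_class, handler_class), or None if the filter would skip."""
    # Check parser names against the path prefix, shortest name first.
    # The first match wins; there is no backtracking to a longer name.
    for name in sorted(parsers, key=len):
        if path.startswith(name):
            parser, rest = parsers[name], path[len(name):]
            break
    else:
        # No registered parser matched, so the default parser is used.
        parser, rest = DEFAULT_PARSER, path
    # The remainder of the path must be a registered requestHandler.
    handler = handlers.get(rest)
    if handler is None:
        return None
    return parser, handler


HANDLERS = {
    "/update/commit": "solr.CommitRequestHandler",
    "/update": "solr.UpdateRequestHandler",
    "/xml": "solr.XmlQueryRequestHandler",
}

# First config from the message: three path-named parsers.
FIRST_PARSERS = {
    "/raw": "solr.RawPostRequestParser",
    "/multi": "solr.MultiPartRequestParser",
    "/nostream": "solr.SimpleRequestParser",
}

# Second config: the empty name matches every path before "/raw" is tried.
SECOND_PARSERS = {
    "": "solr.MultiPartRequestParser",
    "/raw": "solr.RawPostRequestParser",
}

print(resolve("/raw/update", FIRST_PARSERS, HANDLERS))
print(resolve("/update", FIRST_PARSERS, HANDLERS))
print(resolve("/raw/update", SECOND_PARSERS, HANDLERS))  # None: filter skips
print(resolve("/update", SECOND_PARSERS, HANDLERS))
```

With the second config, "/raw/update" resolves to nothing because the empty
parser name consumes no prefix and leaves "/raw/update" as the handler path,
which matches the failure case described above.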