Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-22 Thread Chris Hostetter
: > > 3) there's a comment in RequestHandlerBase.init about "indexOf" that : > > comes from the existing impl in DismaxRequestHandler -- but doesn't match : > > the new code ... i also wasn't certain that the change you made matches : > I just copied the code from DismaxRequestHandler and made s

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-22 Thread Chris Hostetter
: > throw new SolrException( 400, "missing parameter: "+p ); : > : > This will return 400 with a message "missing parameter: " + p. : > : > Exceptions or SolrExceptions with code=500 || code<100 are sent to : > client with status code 500 and a full stack trace. : : That all seems ideal to me, b
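A minimal sketch of the error-reporting rule quoted above, assuming nothing beyond what the message states: an exception carrying a code other than 500 (and not below 100) is returned with that status and its message, while anything else becomes a 500 with a full stack trace. `CodedException` is a hypothetical stand-in for SolrException so the example compiles on its own; `statusFor` and `bodyFor` are illustrative names, not part of the SOLR-104 patch.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class ErrorMapping {
    /** Hypothetical stand-in for SolrException: an exception that carries a status code. */
    static class CodedException extends RuntimeException {
        final int code;
        CodedException(int code, String message) {
            super(message);
            this.code = code;
        }
    }

    /** HTTP status to send: the exception's own code, unless it is 500 or below 100. */
    static int statusFor(Throwable t) {
        if (t instanceof CodedException) {
            int code = ((CodedException) t).code;
            if (code != 500 && code >= 100) {
                return code;
            }
        }
        return 500; // plain exceptions, code=500, and code<100 all end up here
    }

    /** Response body: just the message for client errors, a full stack trace for 500s. */
    static String bodyFor(Throwable t) {
        if (statusFor(t) != 500) {
            return t.getMessage();
        }
        StringWriter trace = new StringWriter();
        t.printStackTrace(new PrintWriter(trace, true));
        return trace.toString();
    }
}
```

With this shape, a missing-parameter error surfaces to the client as a short 400 message, and only unexpected failures expose a stack trace.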

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Erik Hatcher
On Jan 21, 2007, at 2:39 PM, Yonik Seeley wrote: On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > > So is everyone happy with the way that errors are currently reported? > If not, now (or right after this is committed), is the time to change > that. /solr/select/qt="myhandler" shou

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Yonik Seeley
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > > I don't think i'll have time to look at your new patch today, design wise > i think you are right, but there was still stuff that needed to be > refactored out of core.update and into the UpdateHandler wasn't there? > Yes, I avoided doing

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Ryan McKinley
I don't think i'll have time to look at your new patch today, design wise i think you are right, but there was still stuff that needed to be refactored out of core.update and into the UpdateHandler wasn't there? Yes, I avoided doing that in an effort to minimize refactoring and focus just on a

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Yonik Seeley
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > > So is everyone happy with the way that errors are currently reported? > If not, now (or right after this is committed), is the time to change > that. /solr/select/qt="myhandler" should be backward compatible, but > /solr/myhandler doesn't

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Ryan McKinley
So is everyone happy with the way that errors are currently reported? If not, now (or right after this is committed), is the time to change that. /solr/select/qt="myhandler" should be backward compatible, but /solr/myhandler doesn't need to be. Same for the update stuff. In SOLR-104, all ex

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Yonik Seeley
On 1/21/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : > The bugaboo is if the POST data is NOT in fact : > application/x-www-form-urlencoded but the user agent says it is -- as : > both of you have indicated can be the case when using curl. Could that : > be why Yonik thought POST params was

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Chris Hostetter
: > The bugaboo is if the POST data is NOT in fact : > application/x-www-form-urlencoded but the user agent says it is -- as : > both of you have indicated can be the case when using curl. Could that : > be why Yonik thought POST params was broken? : : Correct. That's the format that post.sh in

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Chris Hostetter
: Great! I just posted an update to SOLR-104 that I hope will make you happy. Dude ... i can *not* keep up with you. : If i'm following our discussion correctly, I *think* this takes care : of all the major issues we have. I don't think i'll have time to look at your new patch today, design wi

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Yonik Seeley
On 1/21/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: At the bottom of this email is a quick and dirty servlet i just tried to prove to myself that posting with params in the URL and the body worked fine ... I tried that by simply posting to the Solr standard request handler (it echoes params

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Yonik Seeley
On 1/21/07, J.J. Larrea <[EMAIL PROTECTED]> wrote: The bugaboo is if the POST data is NOT in fact application/x-www-form-urlencoded but the user agent says it is -- as both of you have indicated can be the case when using curl. Could that be why Yonik thought POST params was broken? Correct

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Ryan McKinley
The nut shell being: i'm totally on board with Ryan's simple URL scheme, having a single RequestParser/SolrRequestBuilder, going with an entirely "inspection" based approach for deciding where the streams come from, and leaving all mention of parsers or "stream.type" out of the URL. (because i h

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread J.J. Larrea
At 1:20 AM -0800 1/21/07, Chris Hostetter wrote: >: We need code to do that anyway since getParameterMap() doesn't support >: getting params from the URL if it's a POST (I believe I tried this in >: the past and it didn't work). > >Uh ... i'm pretty sure you are mistaken ... yep, i've just checked

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-21 Thread Chris Hostetter
: > ...i was trying to avoid keeping the parser name out of the query string, : > so we don't have to do any hack parsing of : > HttpServletRequest.getQueryString() to get it. : : We need code to do that anyway since getParameterMap() doesn't support : getting params from the URL if it's a POST (I

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Yonik Seeley
On 1/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : I'm on board as long as the URL structure is: : ${path/from/solr/config}?stream.type=raw actually the URL i was suggesting was... ${parser/path/from/solr/config}${handler/path/from/solr/config}?param=val ...i was trying to avoid ke

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Chris Hostetter
On Sat, 20 Jan 2007, Ryan McKinley wrote: : Date: Sat, 20 Jan 2007 19:17:16 -0800 : From: Ryan McKinley <[EMAIL PROTECTED]> : Reply-To: solr-dev@lucene.apache.org : To: solr-dev@lucene.apache.org : Subject: Re: Update Plugins (was Re: Handling disparate data sources in :

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Ryan McKinley
...what if we bring that idea back, and let people configure it in the solrconfig.xml, using path like names... ...but don't make it a *public* interface ... make it package protected, or maybe even a private static interface of the Dispatch Filter .. either way, don't instantiate i

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Chris Hostetter
(the three of us are online way too much ... for crying out loud it's a saturday night folks!) : In my opinion, I don't think we need to worry about it for the : *default* handler. That is not a very difficult constraint and, there : is no one out there expecting to be able to post parameters in

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Yonik Seeley
On 1/20/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: > It would be: > http://${context}/${path}?stream.type=post Yes! Feels like a much more natural place to me than as part of the path of the URL. Just need to hash out meaningful param names/values? Oh, and I'm more interested in the semantics

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Yonik Seeley
On 1/20/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > >- put everyone > > understands how to put something in a URL. if nothing else, think of > > putting the "parsetype" in the URL as a checksum that the RequestParser > > can use to validate its assumptions -- if it's not there, then it can

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Ryan McKinley
>- put everyone > understands how to put something in a URL. if nothing else, think of > putting the "parsetype" in the URL as a checksum that the RequestParser > can use to validate its assumptions -- if it's not there, then it can do > all of the intelligent things you think it should do, bu

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Ryan McKinley
> consider the example you've got on your test.html page: "POST - with query > string" ... that doesn't obey the typical semantics of a POST with a query > string ... if you used the methods on HttpServletRequest to get the params > it would give you all the params it found both in the query stri

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Yonik Seeley
On 1/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: but the HTTP Client libraries in various languages don't always make it easy to set Content-type -- and even if they do that doesn't mean the person using that library knows how to use it properly - I think we have to go with common usages.

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Yonik Seeley
On 1/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: Ryan: this patch truly does kick ass ... we can probably simplify a lot of the Legacy stuff by leveraging your new StandardRequestBuilder -- but that can be done later. Much is already done by the looks of it. i'm still really not liking

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Chris Hostetter
: To be clear, (with the current implementation in SOLR-104) you would : have to put this in your solrconfig.xml : : : : Notice the preceding '/'. I think this is a strong indication that : someone *wants* /select to behave distinctly. crap ... i totally misread that ... so if people have a req

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Chris Hostetter
: I just posted a new patch on SOLR-104. I think it addresses most of : the issues we have discussed. (It's a little difficult to know as it : has been somewhat circular) I was going to reply to your points one : by one, but i think that would just make the discussion more confusing : than it a

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Ryan McKinley
easy thing to deal with just by scoping the URLs .. put something, ANYTHING, in front of these urls, that isn't "select" or "update" and I'll let you and Yonik decide this one. I'm fine either way, but I really don't see a problem letting people easily override URLs. I actually think it is a

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Chris Hostetter
: > that scares me ... not only does it rely on the client code sending the : > correct content-type : : Not really... that would perhaps be the default, but the parser (or a : handler) can make intelligent decisions about that. : : If you put the parser in the URL, then there's *that* to be messe

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Chris Hostetter
: > A user should be confident that they can pick any name they possibly want : > for their plugin, and it won't collide with any future addition we might : > add to Solr. : : But that doesn't seem possible unless we make user plugins : second-class citizens by scoping them differently. In the even

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Ryan McKinley
I just posted a new patch on SOLR-104. I think it addresses most of the issues we have discussed. (It's a little difficult to know as it has been somewhat circular) I was going to reply to your points one by one, but i think that would just make the discussion more confusing than it already is!

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Ryan McKinley
> > I'm not sure what "it" is in the above sentence ... i believe from the > context of the rest of the message you are referring to > using a ServletFilter instead of a Servlet -- i honestly have no opinion > about that either way. I thought a filter required you to open up the WAR file and c

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-20 Thread Alan Burlison
Chris Hostetter wrote: : 1) I think it should be a ServletFilter applied to all requests that : will only process requests with a registered handler. I'm not sure what "it" is in the above sentence ... i believe from the context of the rest of the message you are referring to using a Servlet

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Yonik Seeley
On 1/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : I have imagined the single default parser handles *all* the cases you : just mentioned. A ... a lot of confusing things make more sense now. .. but some things are more confusing: If there is only one parser, and it decides wha

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Yonik Seeley
On 1/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: the thing about Solr, is there really aren't a lot of "defaults" in the sense you mean ... there is just an example -- people might copy the example, but if they don't have something in their solrconfig, most things just aren't there I

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Ryan McKinley
i would really feel a lot happier with something like these that you mentioned... If it will make you happier, then I think it's a good idea! (even if i don't see it as a Problem) : /solr/dispatch/update/xml : /solr/cmd/update/xml : /solr/handle/update/xml : /solr/do/update/xml http

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Chris Hostetter
: I have imagined the single default parser handles *all* the cases you : just mentioned. A ... a lot of confusing things make more sense now. .. but some things are more confusing: If there is only one parser, and it decides what to do based entirely on param names and HTTP headers,

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Ryan McKinley
: : This would drop the ':' from my proposed URL and change the scheme to look like: : /parser/path/the/parser/knows/how/to/extract/?params i was totally okay with the ":" syntax (although we should double check if ":" is actually a legal unescaped URL character) .. but i'm confused by this new

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Chris Hostetter
: > then all is fine and dandy ... but what happens if someone tries to : > configure a plugin with the name "admin" ... now all of the existing admin : that is exactly what you would expect to happen if you map a handler : to /admin. The person configuring solrconfig.xml is saying "Hey, use :

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Yonik Seeley
On 1/20/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > > what!? .. really? ... you don't think the ones i mentioned before are > things we should support out of the box? > > - no stream parser (needed for simple GETs) > - single stream from raw post body (needed for current updates > -

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Chris Hostetter
: > This would give people a relatively easy way to implement 'restful' : > URLs if they need to. (but they would have to edit web.xml) : : A handler could alternately get the rest of the path (absent params), right? only if the RequestParser adds it to the SolrRequest as a SolrParam. : > Unit t

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Ryan McKinley
what!? .. really? ... you don't think the ones i mentioned before are things we should support out of the box? - no stream parser (needed for simple GETs) - single stream from raw post body (needed for current updates - multiple streams from multipart mime in post body (needed for SOLR

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Chris Hostetter
: The RequestParser would not be part of the core API - It would be a : helper function for Servlets and Filters that call the core API. It : could be configured in web.xml rather than solrconfig.xml. A : RequestDispatcher (Servlet or Filter) would be configured with a : single RequestParser. : : T

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Ryan McKinley
On 1/19/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : First Ryan, thank you for your patience on this *very* long hash I could not agree more ... as i was leaving work this afternoon, it occurred to me "I really hope Ryan realizes i like all of his ideas, i'm just wondering if they can be bet

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Chris Hostetter
: First Ryan, thank you for your patience on this *very* long hash I could not agree more ... as i was leaving work this afternoon, it occurred to me "I really hope Ryan realizes i like all of his ideas, i'm just wondering if they can be better" -- most people I work with don't have the stamina to

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Yonik Seeley
First Ryan, thank you for your patience on this *very* long hash session. Most wouldn't last that long unless it were a flame war ;-) And thanks to Hoss, who seems to have the highest read+response bandwidth of anyone I've ever seen (I'll admit I've only been selectively reading this thread, with

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Ryan McKinley
(Note: this is different than what i have suggested before. Treat it as brainstorming on how to take what i have suggested and mesh it with your concerns) What if: The RequestParser would not be part of the core API - It would be a helper function for Servlets and Filters that call the core API.

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Yonik Seeley
On 1/19/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: All that said, this could just as cleanly map everything to: /solr/dispatch/update/xml /solr/cmd/update/xml /solr/handle/update/xml /solr/do/update/xml thoughts? That was my original assumption (because I was thinking of using servle

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Ryan McKinley
then all is fine and dandy ... but what happens if someone tries to configure a plugin with the name "admin" ... now all of the existing admin pages break. that is exactly what you would expect to happen if you map a handler to /admin. The person configuring solrconfig.xml is saying "Hey, use

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Chris Hostetter
: > On 1/19/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : > > whoa ... hold on a minute, even if we use a ServletFilter to do all of the : > > dispatching instead of a Servlet we still need a base path right? : > I thought that's what the filter gave you... the ability to filter any : > URL to

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Ryan McKinley
On 1/19/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 1/19/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: > whoa ... hold on a minute, even if we use a ServletFilter to do all of the > dispatching instead of a Servlet we still need a base path right? I thought that's what the filter gave you...

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Yonik Seeley
On 1/19/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: whoa ... hold on a minute, even if we use a ServletFilter to do all of the dispatching instead of a Servlet we still need a base path right? I thought that's what the filter gave you... the ability to filter any URL to the /solr webapp, and

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Chris Hostetter
: 1) I think it should be a ServletFilter applied to all requests that : will only process requests with a registered handler. I'm not sure what "it" is in the above sentence ... i believe from the context of the rest of the message you are referring to using a ServletFilter instead of a Servl

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-19 Thread Ryan McKinley
Ok, now i think I get what you are suggesting. The differences are that: 1) I think it should be a ServletFilter applied to all requests that will only process requests with a registered handler. 2) I think the RequestParser should take care of parsing ContentStreams *and* SolrParams - not just

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Chris Hostetter
: > Ah ... this is the one problem with high volume on an involved thread ... : > i'm sending replies to messages you write after you've already read other : > replies to other messages you sent and changed your mind :) : Should we start a new thread? I don't think it would make a difference ...

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Ryan McKinley
: I was... then you talked me out of it! You are correct, the client : should determine the RequestParser independent of the RequestHandler. Ah ... this is the one problem with high volume on an involved thread ... i'm sending replies to messages you write after you've already read other repli

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Ryan McKinley
Cool. I think i need more examples... concrete is good :-) I don't quite grok your format below... is it one line or two? /path/defined/in/solrconfig:parser?params /${handler}:${parser} Is that simply /${handler}:${parser}?params yes. the ${} is just to show what is extracted from the req
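To make the proposed `/${handler}:${parser}?params` scheme concrete, here is a small sketch of how a dispatcher might pull the two pieces out of the request path. It assumes the query string has already been stripped (as a servlet container does for the path) and that the parser name follows the last ':'. `HandlerParserPath`, `Route`, and `split` are hypothetical names for illustration, not code from the patch.

```java
public class HandlerParserPath {
    /** The two pieces extracted from a path such as "/update:xml". */
    static final class Route {
        final String handler; // e.g. "/update", the path registered in solrconfig.xml
        final String parser;  // e.g. "xml"; null when the path has no ":parser" suffix

        Route(String handler, String parser) {
            this.handler = handler;
            this.parser = parser;
        }
    }

    /** Splits "/handler/path:parser" at the last ':'; the query string is not part of the input. */
    static Route split(String path) {
        int colon = path.lastIndexOf(':');
        if (colon < 0) {
            return new Route(path, null); // no parser named, caller falls back to a default
        }
        return new Route(path.substring(0, colon), path.substring(colon + 1));
    }
}
```

Under this reading, `/update:xml?commit=true` would arrive as the path `/update:xml` and resolve to the handler `/update` and the parser `xml`, leaving `commit=true` to normal parameter handling.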

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Chris Hostetter
: However, I'm not yet convinced the benefits are worth the costs. If : the number of RequestParsers remain small, and within the scope of : being included in the core, that functionality could just be included : in a single non-pluggable RequestParser. : : I'm not convinced is a bad idea either,

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Chris Hostetter
: I was... then you talked me out of it! You are correct, the client : should determine the RequestParser independent of the RequestHandler. Ah ... this is the one problem with high volume on an involved thread ... i'm sending replies to messages you write after you've already read other replie

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Yonik Seeley
On 1/18/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: On 1/18/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: > On 1/18/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > > Yes, this proposal would fix the URL structure to be > > /path/defined/in/solrconfig:parser?params > > /${handler}:${parser} > > > >

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Ryan McKinley
On 1/18/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 1/18/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > Yes, this proposal would fix the URL structure to be > /path/defined/in/solrconfig:parser?params > /${handler}:${parser} > > I *think* this cleanly handles most cases cleanly and simply. Th

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Yonik Seeley
On 1/18/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: Yes, this proposal would fix the URL structure to be /path/defined/in/solrconfig:parser?params /${handler}:${parser} I *think* this cleanly handles most cases cleanly and simply. The only exception is where you want to extract variables from

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Ryan McKinley
I'm confused by your sentence "A RequestParser converts a HttpServletRequest to a SolrRequest." .. i thought you were advocating that the servlet parse the URL to pick a RequestHandler, and then the RequestHandler dictates the RequestParser? I was... then you talked me out of it! You are corr

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Yonik Seeley
OK, trying to catch up on this huge thread... I think I see why it's become more complicated than I originally envisioned. What I originally thought: 1) add a way to get a Reader or InputStream from SolrQueryRequest, and then reuse it for updates too 2) use the plugin name in the URL 3) write cod

RE: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Chris Hostetter
: > With all this talk about plugins, registries etc., /me can't help : > thinking that this would be a good time to introduce the Spring IoC : > container to manage this stuff. I don't have a lot of familiarity with spring except for the XML configuration file used for telling the spring context

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-18 Thread Chris Hostetter
: I think the confusion is that (in my view) the RequestParser is the : *only* object able to touch the stream. I don't think anything should : happen between preProcess() and process(); A RequestParser converts a : HttpServletRequest to a SolrRequest. Nothing else will touch the : servlet requ

RE: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-17 Thread Cook, Jeryl
.. Jeryl Cook -Original Message- From: Alan Burlison [mailto:[EMAIL PROTECTED] Sent: Tuesday, January 16, 2007 10:52 AM To: solr-dev@lucene.apache.org Subject: Re: Update Plugins (was Re: Handling disparate data sources in Solr) Bertrand Delacretaz wrote: > With all this talk about p

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-17 Thread Ryan McKinley
On 1/17/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : I'm not sure i understand preProcess( ) and what it gets us. it gets us the ability for a RequestParser to be able to pull out the raw InputStream from the HTTP POST body, and make it available to the RequestHandler as a ContentStream a

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-17 Thread Chris Hostetter
: I'm not sure i understand preProcess( ) and what it gets us. it gets us the ability for a RequestParser to be able to pull out the raw InputStream from the HTTP POST body, and make it available to the RequestHandler as a ContentStream and/or it can wait until the servlet has parsed the URL t

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-17 Thread Ryan McKinley
I'm not sure i understand preProcess( ) and what it gets us. I like the model that 1. The URL path selects the RequestHandler 2. RequestParser = RequestHandler.getRequestParser() (typically from its default params) 3. SolrRequest = RequestParser.parse( HttpServletRequest ) 4. handler.handleRe
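The numbered model in the message above can be sketched roughly as follows. Every type here is a hypothetical stand-in (a plain `Map` replaces HttpServletRequest, and `Handler`, `Parser`, and `Request` replace the Solr classes under discussion), and since step 4 is cut off in the archive, the final `handle` call is an assumption about how the flow ends rather than the author's actual signature.

```java
import java.util.HashMap;
import java.util.Map;

public class DispatchSketch {
    /** Stand-in for the parsed request handed to a handler. */
    interface Request {
        String param(String name);
    }

    /** Stand-in for RequestParser: turns raw request data into a Request. */
    interface Parser {
        Request parse(Map<String, String> rawParams);
    }

    /** Stand-in for RequestHandler: names its own parser, then does the work. */
    interface Handler {
        Parser getRequestParser();
        String handle(Request request);
    }

    /** Handlers keyed by URL path, as they might be registered from configuration. */
    static final Map<String, Handler> HANDLERS = new HashMap<>();

    static String dispatch(String path, Map<String, String> rawParams) {
        Handler handler = HANDLERS.get(path);          // 1. the URL path selects the handler
        if (handler == null) {
            throw new IllegalArgumentException("no handler registered for " + path);
        }
        Parser parser = handler.getRequestParser();    // 2. the handler supplies its parser
        Request request = parser.parse(rawParams);     // 3. the parser builds the request
        return handler.handle(request);                // 4. (assumed) the handler processes it
    }
}
```

The point of the ordering is that the handler is chosen before any parsing happens, which is the property later messages in the thread push back on.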

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-17 Thread Chris Hostetter
Actually, i have to amend that ... it occurred to me in my sleep last night that calling HttpServletRequest.getInputStream() wasn't safe unless we *know* the RequestParser wants it, and will close it if it's non-null, so the API for preProcess would need to look more like this... interface Po

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-17 Thread J.J. Larrea
At 11:48 PM -0800 1/16/07, Chris Hostetter wrote: >yeah ... once we have a RequestHandler doing that work, and populating a >SolrQueryResponse with its result info, it >would probably be pretty trivial to make an extremely bare-bones >LegacyUpdateOutputWriter that only expected that simple mount o

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-17 Thread Alan Burlison
Ryan McKinley wrote: In addition, consider the case where you want to index a SVN repository. Yes, this could be done in SolrRequestParser that logs in and returns the files as a stream iterator. But this seems like more 'work' than the RequestParser is supposed to do. Not to mention you woul

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-17 Thread Alan Burlison
Chris Hostetter wrote: i'm totally on board now ... the RequestParser decides where the streams come from if any (post body, file upload, local file, remote url, etc...); the RequestHandler decides what it wants to do with those streams, and has a library of DocumentProcessors it can pick from t

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-17 Thread Chris Hostetter
talking about the URL structure made me realize that the Servlet should dictate the URL structure and the param parsing, but it should do it after giving the RequestParser a crack at any streams it wants (actually i think that may be a direct quote from JJ ... can't remember now) ... *BUT* the Requ

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-17 Thread Ryan McKinley
data and wrote it out in the current update response format .. so the current SolrUpdateServlet could be completely replaced with a simple url mapping... /update --> /select?qt=xmlupdate&wt=legacyxmlupdate Using the filter method above, it could (and i think should) be mapped to: /update

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Ryan McKinley
On 1/16/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : >I left out "micro-plugins" because i don't quite have a good answer : >yet :) This may be a place where a custom dispatcher servlet/filter : >defined in web.xml is the most appropriate solution. : : If the issue is munging HTTPServletReques

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Ryan McKinley
kind of like a binary stream equivalent to the way analyzers can be customized -- is that kind of what you had in mind? exactly. interface SolrDocumentParser { void init(NamedList args); Document parse(SolrParams p, ContentStream content); } yes

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Chris Hostetter
: > - Revise the XML-based update code (broken out of SolrCore into a : > RequestHandler) to use all the above. : : +++1, that's been needed forever. yeah ... once we have a RequestHandler doing that work, and populating a SolrQueryResponse with its result info, it would probably be pretty trivi

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Erik Hatcher
On Jan 17, 2007, at 1:41 AM, Chris Hostetter wrote: : The number of people writing update plugins will be small compared to : the number of users using the external HTTP API (the URL + query : parameters, and the relationship URL-wise between different update : formats). My main concern is ma

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Chris Hostetter
: >I left out "micro-plugins" because i don't quite have a good answer : >yet :) This may be a place where a custom dispatcher servlet/filter : >defined in web.xml is the most appropriate solution. : : If the issue is munging HTTPServletRequest information, then a proper : separation of concerns sug

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Chris Hostetter
: > : In addition to RequestProcessors, maybe there should be a general : > : DocumentProcessor : > : interface SolrDocumentParser : > : { : > : Document parse(ContentStream content); : > : } : > what else would the RequestProcessor do if it was delegating all of the : > parsing to something e

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Chris Hostetter
: > So to understand better: : > : > user request -> micro-plugin -> RequestHandler -> ResponseHandler : or: : : HttpServletRequest -> SolrRequestParser -> SolrRequestProcessor -> : SolrResponse -> SolrResponseWriter specifically what i had in mind was something like this... class SolrUberSe

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Chris Hostetter
: The number of people writing update plugins will be small compared to : the number of users using the external HTTP API (the URL + query : parameters, and the relationship URL-wise between different update : formats). My main concern is making *that* as nice and utilitarian as : possible, and a

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Yonik Seeley
On 1/16/07, J.J. Larrea <[EMAIL PROTECTED]> wrote: - Revise the XML-based update code (broken out of SolrCore into a RequestHandler) to use all the above. +++1, that's been needed forever. If one has the time, I'd also advocate moving to StAX (via woodstox for Java5, but it's built into Java6)

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Yonik Seeley
On 1/16/07, J.J. Larrea <[EMAIL PROTECTED]> wrote: >POST: > if( multipart ) { > read all form fields into parameter map. This should use the same req.getParameterMap as for GET, which Servlet 2.4 says is supposed to be populated automatically by the servlet container if the payload is application/x-www-

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Yonik Seeley
On 1/15/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : The most important issue is to nail down the external HTTP interface. I'm not sure if i agree with that statement .. i would think that figuring out the "model" or how updates should be handled in a generic way, what all of the "Plugin" ty

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread J.J. Larrea
I'm in frantic deadline mode so I'm just going to throw in some (hopefully) short comments... At 11:02 PM -0800 1/15/07, Ryan McKinley wrote: >>the one thing that still seems missing is those "micro-plugins" i was >> [SNIP] >> >> interface SolrRequestParser { >> SolrRequest process( HttpServ

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Alan Burlison
Bertrand Delacretaz wrote: With all this talk about plugins, registries etc., /me can't help thinking that this would be a good time to introduce the Spring IoC container to manage this stuff. More info at http://www.springframework.org/docs/reference/beans.html for people who are not familiar

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Alan Burlison
Yonik Seeley wrote: Brainstorming: - for errors, use HTTP error codes instead of putting it in the XML as now. That doesn't work so well if there are multiple documents to be indexed in a single request. -- Alan Burlison --

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Erik Hatcher
On Jan 16, 2007, at 3:20 AM, Bertrand Delacretaz wrote: On 1/16/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: ...I think a DocumentParser registry is a good way to isolate this top level task... With all this talk about plugins, registries etc., /me can't help thinking that this would be a g

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Ryan McKinley
So to understand better: user request -> micro-plugin -> RequestHandler -> ResponseHandler Right? or: HttpServletRequest -> SolrRequestParser -> SolrRequestProcessor -> SolrResponse -> SolrResponseWriter
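The second chain above can be read as four small stages wired together by a dispatcher. What follows is only a minimal sketch of that reading, not code from any patch in this thread: `RawRequest` is a hypothetical stand-in for `HttpServletRequest` (so the example runs without the servlet API), and the `handle` and `write` method names, the `Dispatcher` class, and the field layout are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for HttpServletRequest, so the sketch needs no servlet jar.
class RawRequest {
    final String path;
    final Map<String, String> params;
    RawRequest(String path, Map<String, String> params) {
        this.path = path;
        this.params = params;
    }
}

// Simplified request/response holders (assumed shapes, not Solr's real classes).
class SolrRequest {
    final Map<String, String> params;
    SolrRequest(Map<String, String> params) { this.params = params; }
}

class SolrResponse {
    final Map<String, Object> values = new LinkedHashMap<>();
}

// The stages named in the thread; method names other than process() are assumptions.
interface SolrRequestParser { SolrRequest process(RawRequest raw); }
interface SolrRequestProcessor { void handle(SolrRequest req, SolrResponse rsp); }
interface SolrResponseWriter { String write(SolrResponse rsp); }

// Hypothetical dispatcher: runs the stages in the order given in the thread.
class Dispatcher {
    private final SolrRequestParser parser;
    private final SolrRequestProcessor processor;
    private final SolrResponseWriter writer;

    Dispatcher(SolrRequestParser parser, SolrRequestProcessor processor, SolrResponseWriter writer) {
        this.parser = parser;
        this.processor = processor;
        this.writer = writer;
    }

    String dispatch(RawRequest raw) {
        SolrRequest req = parser.process(raw);   // raw HTTP request -> SolrRequest
        SolrResponse rsp = new SolrResponse();
        processor.handle(req, rsp);              // fills in the response
        return writer.write(rsp);                // serializes the response
    }
}

class PipelineDemo {
    static String run() {
        Dispatcher d = new Dispatcher(
            raw -> new SolrRequest(raw.params),
            (req, rsp) -> rsp.values.put("echo", req.params.get("q")),
            rsp -> rsp.values.toString());
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "solr");
        return d.dispatch(new RawRequest("/select", params));
    }
}
```

Keeping the parser as its own stage means the processor never sees the servlet request, which seems to be the separation the "micro-plugin" discussion is after.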

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Thorsten Scherler
On Mon, 2007-01-15 at 12:23 -0800, Chris Hostetter wrote: > : > Right, you're getting at issues of why I haven't committed my CSV handler > yet. > : > It currently handles reading a local file (this is more like an SQL > : > update handler... only a reference to the data is passed). But I also >

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Bertrand Delacretaz
On 1/16/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: ...I think a DocumentParser registry is a good way to isolate this top level task... With all this talk about plugins, registries etc., /me can't help thinking that this would be a good time to introduce the Spring IoC container to manage t

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-16 Thread Ryan McKinley
: In addition to RequestProcessors, maybe there should be a general : DocumentProcessor : : interface SolrDocumentParser : { : Document parse(ContentStream content); : } : : solrconfig could register "text/html" -> HtmlDocumentParser, and : RequestProcessors could share the same parser. what e
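The idea of registering "text/html" -> HtmlDocumentParser could look roughly like the sketch below. It is an illustration under stated assumptions: `ContentStream` is reduced to the two methods this example needs, a `Map` stands in for a Lucene `Document`, `DocumentParserRegistry` is a hypothetical name, and the tag-stripping `HtmlDocumentParser` is a toy rather than a real HTML parser. It uses `InputStream.readAllBytes`, so it assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical, reduced ContentStream: just a content type and the bytes.
interface ContentStream {
    String getContentType();
    InputStream getStream();
}

// Mirrors the interface quoted in the thread; Map stands in for a Lucene Document.
interface SolrDocumentParser {
    Map<String, String> parse(ContentStream content);
}

// Toy parser: strips tags with a regex and collapses whitespace. Not real HTML parsing.
class HtmlDocumentParser implements SolrDocumentParser {
    @Override
    public Map<String, String> parse(ContentStream content) {
        try (InputStream in = content.getStream()) {
            String html = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            String text = html.replaceAll("<[^>]*>", " ").trim().replaceAll("\\s+", " ");
            Map<String, String> doc = new HashMap<>();
            doc.put("body", text);
            return doc;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Hypothetical registry: content type -> parser, as solrconfig might declare it.
class DocumentParserRegistry {
    private final Map<String, SolrDocumentParser> parsers = new HashMap<>();

    void register(String contentType, SolrDocumentParser parser) {
        parsers.put(normalize(contentType), parser);
    }

    SolrDocumentParser lookup(String contentType) {
        SolrDocumentParser parser = parsers.get(normalize(contentType));
        if (parser == null) {
            throw new IllegalArgumentException("no parser registered for: " + contentType);
        }
        return parser;
    }

    // Drops parameters such as "; charset=UTF-8" and compares case-insensitively.
    private static String normalize(String contentType) {
        int semi = contentType.indexOf(';');
        String base = semi >= 0 ? contentType.substring(0, semi) : contentType;
        return base.trim().toLowerCase(Locale.ROOT);
    }
}

class ParserRegistryDemo {
    static String run() {
        DocumentParserRegistry registry = new DocumentParserRegistry();
        registry.register("text/html", new HtmlDocumentParser());
        ContentStream upload = new ContentStream() {
            public String getContentType() { return "Text/HTML; charset=UTF-8"; }
            public InputStream getStream() {
                return new ByteArrayInputStream(
                    "<p>Hello <b>Solr</b></p>".getBytes(StandardCharsets.UTF_8));
            }
        };
        return registry.lookup(upload.getContentType()).parse(upload).get("body");
    }
}
```

Because the registry is keyed only by content type, several request processors could share one parser instance, provided the parser keeps no per-request state.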

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-15 Thread Chris Hostetter
: > (the trick being that the servlet would need to parse the "st" info out : > of the URL (either from the path or from the QueryString) directly without : > using any of the HttpServletRequest.getParameter*() methods... : : I haven't followed all of the discussion, but wouldn't it be easier to :

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-15 Thread Ryan McKinley
the one thing that still seems missing is those "micro-plugins" i was [SNIP] interface SolrRequestParser { SolrRequest process( HttpServletRequest req ); } I left out "micro-plugins" because i don't quite have a good answer yet :) This may be a place where a custom dispatcher servlet

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-15 Thread Bertrand Delacretaz
On 1/16/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: interface SolrRequestParser { SolrRequest process( HttpServletRequest req ); } (the trick being that the servlet would need to parse the "st" info out of the URL (either from the path or from the QueryString) directly without u

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-15 Thread Chris Hostetter
: Iterator getContentStreams(); : : Consider the case where you iterate through a local file system. right, a fixed size in memory array can be iterated, but an unbounded stream of objects from an external source can't always be read into an array effectively -- so when in doubt go with the Iter
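The local file system case can be sketched with a lazily evaluated iterator: nothing is read into an array up front, and each file is opened only when its stream is requested. This is an illustration, not code from the thread. `ContentStream` here is a hypothetical two-method stand-in, and `DirectoryContentStreams` and `LazyStreamsDemo` are made-up names; the JDK calls (`Files.newDirectoryStream`, `Files.newInputStream`) are real, and `readAllBytes` assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;

// Hypothetical, reduced ContentStream: a name plus a way to open the bytes.
interface ContentStream {
    String getName();
    InputStream getStream();
}

// Walks a directory lazily; the JDK's DirectoryStream fetches entries on demand.
class DirectoryContentStreams implements Iterable<ContentStream>, AutoCloseable {
    private final DirectoryStream<Path> dir;

    DirectoryContentStreams(Path root) {
        try {
            this.dir = Files.newDirectoryStream(root, Files::isRegularFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public Iterator<ContentStream> iterator() {
        // A DirectoryStream hands out its iterator only once.
        Iterator<Path> paths = dir.iterator();
        return new Iterator<ContentStream>() {
            public boolean hasNext() { return paths.hasNext(); }
            public ContentStream next() {
                Path p = paths.next();
                return new ContentStream() {
                    public String getName() { return p.getFileName().toString(); }
                    public InputStream getStream() {
                        try {
                            return Files.newInputStream(p); // opened only on request
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    }
                };
            }
        };
    }

    @Override
    public void close() {
        try {
            dir.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class LazyStreamsDemo {
    // Creates two small files in a temp directory and sums the bytes read from them.
    static long run() {
        try {
            Path root = Files.createTempDirectory("streams");
            Files.write(root.resolve("a.txt"), "hello".getBytes(StandardCharsets.UTF_8));
            Files.write(root.resolve("b.txt"), "solr".getBytes(StandardCharsets.UTF_8));
            long total = 0;
            try (DirectoryContentStreams streams = new DirectoryContentStreams(root)) {
                for (ContentStream cs : streams) {
                    try (InputStream in = cs.getStream()) {
                        total += in.readAllBytes().length;
                    }
                }
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An array-returning method would have to list (and possibly open) every file before the caller saw the first one; the iterator form leaves that cost to be paid one entry at a time.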

Re: Update Plugins (was Re: Handling disparate data sources in Solr)

2007-01-15 Thread Chris Hostetter
: : I hate to inundate you with more code, but it seems like the best way : to describe a possible interface. ... the one thing that still seems missing is those "micro-plugins" i was talking about that can act independent of the SolrRequestProcessor used to decide where the data streams come fro
