[
https://issues.apache.org/jira/browse/SOLR-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467755
]
Ryan McKinley commented on SOLR-104:
------------------------------------
Hymm, this is awkward. I'm not quite sure what happened. But this is what I
thought I sent. (I apologize that you already took the time to try to decipher
it). I'll include the original, for clarification
- - - - - - -
>
> Woot! ... i think we're really close to comiting this.
>
Thanks for going through this!
I'll comment on points i have answers or questions. The rest will go
on the TODO list.
> - SolrUpdateServlet used HttpServletRequest.getReader, the new
> UpdateRequestHandler uses an InputStreamReader arround
> HttpServletRequest.getInputStream() ... this seems bad for legacy
> update support from a char encoding standpoint.
Ok, so we should make sure to put the charset into
ContentStream.getContentType() and open the Reader with:
String charset = getCharset( stream.getContentType() );
new InputStreamReader( stream.getStream(), charset );
> - While i think it's important to refactor the XML Update
> parsing out of SolrCore - I'm still not clear what is gained by
> eliminating SolrServlet and SolrUpdate. The big advantage of
> the new dispatcher being a Filter is that it can pass requests on
> that it doesn't want to deal with, so why not leave the existing
> servlets arround with only the minimum neccessary changes...
Sounds reasonable. I took them out because (at the time) it seemed
clearer and has less duplicated code.
> Comments regarding functionality that i think we *may* want to address
> before commiting (but i won't fight over if i'm the only one that cares)...
>
> - UpdateRequestHandler should probably renamed XmlUpdateRequestHandler
> (particularly since i expect Yonik to commit a
> CsvUpdateRequestHandler real soon now)
yes. At some point it would also be good to make a stronger name
distinction between UpdateHandler (the thing that handles the nity
gritty lucene indexing) and the UpdateRequestHandler -- but lets save
that for another day!
> - StandardRequestParser can't assume that a POST which isn't
> multipart/* should be handled by a RawRequestParser ... if the
> content type is "application/x-www-form-urlencoded" then
> SimpleRequestParser should be used (so all params from query string
> and body are included)
As written, the StandardRequestParser:
1) checks if multipart
2) checks if it has parameters in the URL (?xxx=yyy)
if it has parameters (?xxx=yyy) then use the RawRequestParser
otherwise it pulls parameters from the map. (SimpleRequestParser)
To trigger raw request reading you *must* have a parameter on the URL.
This was my design in response to Yonik's observation that curl puts
"application/x-www-form-urlencoded" in the header even if it is not
form-urlencoded encoded.
As written, it does not rely on clients putting accurate headers
(except for multipart) - it relies on a URL convention.
> - I'm really not fond of having SolrParams.STREAM_TYPE. Can we please,
> please leave it out for now and rely on on content-type detection?
> We can add it back in if/when we make RequestParser a public
> interface and let people register them in solrconfig.
I only put it in there to make you happy! I'll take it out and we can
deal with it later if necessary.
> - I really don't think we want to open the pandoras box of putting
> the HttpServletRequest in the SolrQueryRequest ... i'd hate to put
> that in and then have to support it forever.
I didn't think i could get that past you! I'll take it out and save
the pleading for another time.
> - a use case we talked about that still isn't covered is opening local
> files as a stream ... this should be easy to add later right along
> side STREAM_URL.
for a local file, you can use stream.url=file:///C:/pathtofile.txt,
for remote ones, you use stream.url=http://...
We should have a good notice in the config warning people to have some
security running before enabling streaming.
> - the init semantics for (Xml)UpdateRequestHandler are odd: as a
> RequestHandler it's garunteed that init(NamedList) will be called, but
> instead it uses it's own private init() that's called lazily.
I had implemented it the normal way, BUT it broke many tests (since
they never call init). The better solution is to make sure the tests
call init a standard way, but that got me into editing many files I
don't quite understand, so i opted for lazy init.
> - it's not clear to me why the interface for SolrRequestParser is...
> public SolrParams parseParamsAndFillStreams(
> final HttpServletRequest, ArrayList<ContentStream>)
> ...instead of just...
> public SolrQueryRequest(final HttpServletRequest)
> ...with the param parsing loops in
> SolrRequestParsers.buildRequestFrom
> in static utility methods (in case a RequestParser doesn't want to
> support STREAM_URL or STREAM_BODY)
>
That sounds fine. Since it is a tentative private interface, i was not
too worried about it.
> Update Plugins
> --------------
>
> Key: SOLR-104
> URL: https://issues.apache.org/jira/browse/SOLR-104
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 1.2
> Reporter: Ryan McKinley
> Fix For: 1.2
>
> Attachments: commons-fileupload-20070107.jar, commons-io-1.2.jar,
> DispatchFilter.patch, DispatchFilter.patch, DispatchFilter.patch,
> DispatchFilter.patch, DispatchFilter.patch, DispatchFilter.patch,
> HandlerRefactoring-DRAFT-SRC.zip, HandlerRefactoring-DRAFT-SRC.zip,
> HandlerRefactoring.DRAFT.patch, HandlerRefactoring.DRAFT.patch,
> HandlerRefactoring.DRAFT.zip
>
>
> The plugin framework should work for 'update' actions in addition to 'search'
> actions.
> For more discussion on this, see:
> http://www.nabble.com/Re%3A-Handling-disparate-data-sources-in-Solr-tf2918621.html#a8305828
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.