[ https://issues.apache.org/jira/browse/SOLR-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778726#action_12778726 ]
Chris A. Mattmann commented on SOLR-1516:
-----------------------------------------

bq. This does not help the user of the API much because the real difficulty is in unmarshalling various types of objects. This patch does nothing to read the stored fields from the Document.

I agree with your statement above regarding "the real difficulty". That's precisely what this patch addresses. This patch deals with that real difficulty for users (of which there are plenty; please see my comment above RE: use cases, e.g., FGDC, RDF, etc.) who are mostly concerned with spitting out, for format compatibility, the resultant Documents from searches in a particular XML format. This patch isn't intended to do anything with the stored fields -- that's left up to the user who extends the abstract base classes by implementing #emitDoc or #emitDocList, where the user deals with Lucene Documents. As I stated above numerous times, it took me quite a bit of printing out and deducing the structure of the resultant SolrResponse to determine where in that list Documents were stored (and in fact they weren't; it is just the IDs). This isn't really documented anywhere per se (at least from what I could find in the online Javadocs or Wiki).

bq. That is really difficult. A lot of components write their output in a very arbitrary Object tree. The output is largely designed like a JSON object tree (with more primitives). The producer decides what the tree contains. The good thing about this approach is that we don't need to build custom classes for every type of output.

Why is this difficult? It would amount to components declaring what type of schema they return. Typed bags of objects, coupled with sparse documentation, aren't exactly the answer. I think we both agree that there is a larger issue to look at in terms of the SolrResponse and QueryResponseWriters; my point is that I don't think using this issue to solve those bigger-picture questions is the right answer.
I'd be happy to create further issues to discuss this.

bq. There is no reason why a GenericResponseWriter can't do that. I am not happy about putting these classes in and leading users to believe that this is all that they have to do.

How are we telling users that this is all they have to do? The patch specifically states (taken from the included Javadoc):

bq. This {@link QueryResponseWriter} allows a user to implement the {@link #emitDoc(Document, Writer)} function which acts as a callback function to process one Lucene {@link Document} returned from the SOLR Query at a time. Sub-classes should keep track of any global state as this class does not provide a means to access the entire set of returned {@link Document}s. If that functionality is required, see {@link DocumentListResponseWriter}.

bq. This {@link QueryResponseWriter} allows a user to implement the {@link #emitDocList(List, Writer)} function which acts as a callback function to process the entire {@link List} of Lucene {@link Document} returned from the SOLR Query at once. To process the {@link Document}s one-at-a-time (to conserve resources, or to speed up the processing/etc.), see {@link DocumentResponseWriter}.

I'm not sure I see the concern behind this ~250 line patch. The patch:

* adds functionality that would have simplified a number of use cases that I am leveraging SOLR for in the space and earth science data community, where formats are critical and metadata output is more important than the specific search meta-info (# hits, query time, start/end, etc.). See the 3-4 examples I stated above.
* does not introduce anything that is not backwards compatible
* includes javadoc on all public methods, as well as class-level javadoc
* should apply without trouble to the current SVN trunk

These have typically been the criteria for inclusion (modulo unit tests, which I'd be happy to include if there is concern there) -- are the criteria different here in SOLR?
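To make the callback design concrete, here is a minimal, self-contained sketch of the pattern the quoted Javadoc describes. It is not the code from the attached patch: the class names (PerDocumentWriter, DocumentListWriter, RecordXmlWriter, CountedListWriter, ResponseWriterSketch) are hypothetical, and a Map<String, String> stands in for a Lucene Document so that the example runs without Solr or Lucene on the classpath. Only the method names emitDoc, emitDocList, emitHeader and emitFooter come from the issue itself.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Demo entry point. Every type in this file is hypothetical; none of it is Solr API.
class ResponseWriterSketch {
    // Renders the documents one at a time and returns the resulting XML text.
    static String render(List<Map<String, String>> docs) {
        StringWriter out = new StringWriter();
        try {
            new RecordXmlWriter().write(docs, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Renders the whole list at once, which allows aggregate output such as a count.
    static String renderList(List<Map<String, String>> docs) {
        StringWriter out = new StringWriter();
        try {
            new CountedListWriter().write(docs, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("title", "sea surface temperature");
        System.out.println(render(List.of(doc)));
        System.out.println(renderList(List.of(doc)));
    }
}

// Stand-in for the per-document base class (assumption: Map replaces Lucene Document).
abstract class PerDocumentWriter {
    protected void emitHeader(Writer out) throws IOException { }
    protected void emitFooter(Writer out) throws IOException { }
    protected abstract void emitDoc(Map<String, String> doc, Writer out) throws IOException;

    // Template method: header, one callback per document, footer.
    final void write(List<Map<String, String>> docs, Writer out) throws IOException {
        emitHeader(out);
        for (Map<String, String> doc : docs) {
            emitDoc(doc, out);
        }
        emitFooter(out);
    }
}

// Stand-in for the list-at-once base class.
abstract class DocumentListWriter {
    protected void emitHeader(Writer out) throws IOException { }
    protected void emitFooter(Writer out) throws IOException { }
    protected abstract void emitDocList(List<Map<String, String>> docs, Writer out) throws IOException;

    // Template method: header, a single callback with every document, footer.
    final void write(List<Map<String, String>> docs, Writer out) throws IOException {
        emitHeader(out);
        emitDocList(docs, out);
        emitFooter(out);
    }
}

// Example subclass: one <record> element per document.
// Assumes field names are already valid XML element names.
class RecordXmlWriter extends PerDocumentWriter {
    @Override protected void emitHeader(Writer out) throws IOException { out.write("<records>"); }
    @Override protected void emitFooter(Writer out) throws IOException { out.write("</records>"); }

    @Override protected void emitDoc(Map<String, String> doc, Writer out) throws IOException {
        out.write("<record>");
        for (Map.Entry<String, String> field : doc.entrySet()) {
            out.write("<" + field.getKey() + ">" + escape(field.getValue()) + "</" + field.getKey() + ">");
        }
        out.write("</record>");
    }

    // Escapes the characters that would break element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

// Example subclass: needs the full list because it reports an aggregate.
class CountedListWriter extends DocumentListWriter {
    @Override protected void emitDocList(List<Map<String, String>> docs, Writer out) throws IOException {
        out.write("<records count=\"" + docs.size() + "\"/>");
    }
}
```

The point of the sketch is the division of labour: the base class owns the traversal, and a subclass only supplies the format-specific callbacks. A real implementation would also have to locate the documents inside the SolrResponse, which is the part the comment above describes as undocumented.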
> DocumentList and Document QueryResponseWriter
> ---------------------------------------------
>
> Key: SOLR-1516
> URL: https://issues.apache.org/jira/browse/SOLR-1516
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 1.3
> Environment: My MacBook Pro laptop.
> Reporter: Chris A. Mattmann
> Assignee: Noble Paul
> Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1516.Mattmann.101809.patch.txt
>
>
> I tried to implement a custom QueryResponseWriter the other day and was
> amazed at the level of unmarshalling and weeding through objects that was
> necessary just to format the output o.a.l.Document list. As a user, I wanted
> to be able to implement either 2 functions:
> * process a document at a time, and format it (for speed/efficiency)
> * process all the documents at once, and format them (in case an aggregate
> calculation is necessary for outputting)
> So, I've decided to contribute 2 simple classes that I think are sufficiently
> generic and reusable. The first is o.a.s.request.DocumentResponseWriter -- it
> handles the first bullet above. The second is
> o.a.s.request.DocumentListResponseWriter. Both are abstract base classes and
> require the user to implement either an #emitDoc function (in the case of
> bullet 1), or an #emitDocList function (in the case of bullet 2). Both
> classes provide an #emitHeader and #emitFooter function set that handles
> formatting and output before the Document list is processed.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.