[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616923#action_12616923 ]

Yonik Seeley commented on SOLR-269:
-----------------------------------

While I like the syntax of the config (getting rid of the explicit chained update processor), I'm not sure about the internal changes:

- I think that removing the factories does not simplify things... most processors that do interesting things will need to parse some request arguments and keep some state. So they will end up with a separate object that is looked up in the Context (and created if it's not there and stuffed into the Context). Same number of classes, but maybe even a little more complex.
- We lose power by removing the explicit calling of next by components. I actually have a component that needs to buffer up some documents and pass them down the chain in batches later. I think Ryan might have something like this too.

UpdateRequestProcessorFactory - process requests before submitting them
-----------------------------------------------------------------------

Key: SOLR-269
URL: https://issues.apache.org/jira/browse/SOLR-269
Project: Solr
Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Fix For: 1.3
Attachments: SOLR-269-simple.patch, SOLR-269-UpdateRequestProcessorFactory.patch, SOLR-269-UpdateRequestProcessorFactory.patch, SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch

A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit.

An UpdateRequestProcessor lets clients plug in logic after a document has been parsed and before it has been 'updated' with the index. This is a good place to add custom logic for:
* transforming the document fields
* fine grained authorization (can user X update document Y?)
* allow update, but not delete (by query?)

    <requestHandler name="/update" class="solr.StaxUpdateRequestHandler">
      <str name="update.processor.class">org.apache.solr.handler.UpdateRequestProcessor</str>
      <lst name="update.processor.args">
        ... (optionally pass in arguments to the factory init method) ...
      </lst>
    </requestHandler>

http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
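The buffering component mentioned in the comment above can be illustrated with a minimal sketch. It assumes a simplified chain in which each link holds a reference to its successor and decides for itself when to call it. The types `ChainLink`, `BatchingLink` and `RecordingLink` are hypothetical stand-ins invented for this example, with plain strings in place of documents; they are not Solr's classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified chain link; each link calls its successor explicitly.
abstract class ChainLink {
    protected final ChainLink next;
    ChainLink(ChainLink next) { this.next = next; }
    void processAdd(String doc) { if (next != null) next.processAdd(doc); }
    void finish() { if (next != null) next.finish(); }
}

// Holds documents back and forwards them down the chain in batches.
class BatchingLink extends ChainLink {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();

    BatchingLink(int batchSize, ChainLink next) {
        super(next);
        this.batchSize = batchSize;
    }

    @Override
    void processAdd(String doc) {
        buffer.add(doc);                    // do not forward yet
        if (buffer.size() >= batchSize) flush();
    }

    @Override
    void finish() {
        flush();                            // forward whatever is left
        super.finish();
    }

    private void flush() {
        for (String doc : buffer) super.processAdd(doc);
        buffer.clear();
    }
}

// Terminal link that records what actually reached the end of the chain.
class RecordingLink extends ChainLink {
    final List<String> seen = new ArrayList<>();
    RecordingLink() { super(null); }
    @Override
    void processAdd(String doc) { seen.add(doc); }
}

class BatchingDemo {
    public static void main(String[] args) {
        RecordingLink sink = new RecordingLink();
        ChainLink chain = new BatchingLink(2, sink);
        chain.processAdd("a");
        System.out.println(sink.seen);      // nothing forwarded yet
        chain.processAdd("b");
        chain.processAdd("c");
        chain.finish();
        System.out.println(sink.seen);      // all three documents, in order
    }
}
```

The point of the sketch is only that the link, not the framework, chooses when the successor is called; that is the capability the comment says would be lost.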
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616927#action_12616927 ]

Ryan McKinley commented on SOLR-269:
------------------------------------

I also like the simplified syntax, and I think the parent should always be a 'chain' -- this can get rid of some of the ugliness.

But the power of the chain model is that each link can take over control without the others needing to know. For example, I have a processor that validates everything in the request before passing it on to the next processors. To do this, it reads them all in without passing them down the chain and only continues when finish() is called.

I also don't see a problem with the factory model. Creating a factory is no more/less difficult than creating a special 'state' object that gets put into the context. But with the context option, the state is always a Map call away rather than being right there. Now you have to worry about what key you used etc...
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616932#action_12616932 ]

Ryan McKinley commented on SOLR-269:
------------------------------------

bq. A request scope will create the chain or individual processor for each request so that you may maintain state without using request's context. Otherwise, it will be created once and re-used for all requests. Will that solve this problem?

To me, that makes it more confusing than having each processor call next() explicitly...

bq. In Noble's patch, instead of calling super.processXXX method, you can return true/false to signal or avoid chaining.

But then how would a processor be able to continue the chain? Consider the buffering example... how would I be able to call all buffered functions on finish()? What if I want a processor to make sure only one document is sent at a time?
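To make the objection above concrete, here is a minimal sketch of a boolean-return contract. It is an assumption about how such a design could look, not a reproduction of the patch under discussion: `BoolStage`, `BoolChain`, `HoldingStage` and `SinkStage` are hypothetical names, and the driver loop is invented for illustration. A stage that returns false can hold a document back, but it has no handle on its successor when finish() arrives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical contract: true = continue to the next stage, false = stop here.
interface BoolStage {
    boolean processAdd(String doc);
    default void finish() {}
}

// Hypothetical driver that owns the chaining; stages never see their successor.
class BoolChain {
    private final List<BoolStage> stages;
    BoolChain(List<BoolStage> stages) { this.stages = stages; }

    void processAdd(String doc) {
        for (BoolStage stage : stages) {
            if (!stage.processAdd(doc)) return;   // document stops at this stage
        }
    }

    void finish() {
        for (BoolStage stage : stages) stage.finish();
    }
}

// Tries to buffer: it can withhold documents, but cannot release them later.
class HoldingStage implements BoolStage {
    final List<String> held = new ArrayList<>();
    @Override
    public boolean processAdd(String doc) { held.add(doc); return false; }
    @Override
    public void finish() { /* no 'next' to hand the held documents to */ }
}

// Last stage; records the documents that reach it.
class SinkStage implements BoolStage {
    final List<String> seen = new ArrayList<>();
    @Override
    public boolean processAdd(String doc) { seen.add(doc); return true; }
}

class BoolChainDemo {
    public static void main(String[] args) {
        HoldingStage hold = new HoldingStage();
        SinkStage sink = new SinkStage();
        BoolChain chain = new BoolChain(Arrays.asList(hold, sink));
        chain.processAdd("a");
        chain.processAdd("b");
        chain.finish();
        System.out.println("held=" + hold.held + " seen=" + sink.seen);
    }
}
```

Under these assumptions the held documents never reach the sink unless the driver grows an extra re-injection hook, which is roughly the complexity the explicit next() model avoids.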
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616949#action_12616949 ]

Shalin Shekhar Mangar commented on SOLR-269:
--------------------------------------------

bq. This gets overly complex too... do we add a special init() function? would everything need a factory, but it may or may not be used?

No, why would we need special methods or a factory? Just the init/inform will be fine. They would simply be called once in their scope. Am I missing something? I don't really care about sharing objects across requests. My motivation is only to help make the API simpler.

bq. Consider the buffering example... how would I be able to call all buffered functions on finish()? What if I want a processor to make sure only one document is sent at a time?

I see your point here. A 'next' UpdateProcessor or a Servlet FilterChain-like design will be necessary in that case. Let me think more on this since I've obviously under-estimated the use-cases for this API. I always thought that one should do heavy-duty processing like authentication etc. on the client side before sending documents to Solr, or else one should extend/write an UpdateHandler.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616966#action_12616966 ]

Noble Paul commented on SOLR-269:
---------------------------------

The idea is to make the API simple. If a Processor wishes to create a state object, it is easier to do it without a factory than with a factory. The user has to care about very few interfaces.

I can draw parallels with the Servlet Filter. Users write very complex filters and I have never seen people complaining about it not having a factory. SolrDispatchFilter is a very good example.

If it is simple enough, people will use it. If it is complex, only the 'very smart people' use it. Most of the users are not power users and they just want to get things done.

On Fri, Jul 25, 2008 at 10:27 PM, Shalin Shekhar Mangar (JIRA)

--
--Noble Paul
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616457#action_12616457 ]

Shalin Shekhar Mangar commented on SOLR-269:
--------------------------------------------

I've read through most of the discussion here and the wiki page at http://wiki.apache.org/solr/UpdateRequestProcessor but I couldn't understand the reasons behind the current design.

Looking at the configuration we have:
{code:xml}
<updateRequestProcessor>
  <factory name="standard" class="solr.ChainedUpdateProcessorFactory" default="true">
    <chain class="org.apache.solr.ConditionalCopyProcessorFactory" />
    <chain class="solr.RunUpdateProcessorFactory" />
    <chain class="solr.LogUpdateProcessorFactory" />
  </factory>
</updateRequestProcessor>
{code}

Why can't it be written as:
{code:xml}
<updateRequestProcessor name="standard" default="true">
  <processor class="com.MyUpdateProcessor" />
  <processor class="solr.RunUpdateProcessor" />
</updateRequestProcessor>

<!-- Another one -->
<updateRequestProcessor name="alternate">
  <processor class="org.apache.solr.ConditionalCopyProcessor" />
  <processor class="solr.RunUpdateProcessor" />
  <processor class="solr.LogUpdateProcessor" />
</updateRequestProcessor>
{code}

Why do we need factories here? It seems like no advantage is being added by multiple factories. If the only advantage is that the factory can choose between instantiating on each request and using an already instantiated processor, then one can argue for having factories for RequestHandlers or SearchComponents too. The Processors should be created once and re-used. Most of them are stateless and the others can use the init method and store state in instance variables. The same is done with RequestHandlers and SearchComponents at present.

Why should we have an explicit ChainedUpdateRequestProcessorFactory? It seems from the use-cases that processors will always be chained. Let us have the implementation do the chaining instead of asking users to add a factory in the configuration.

Not trying to be critical, but this seems too complex for the use-cases it needs to support.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616483#action_12616483 ]

Ryan McKinley commented on SOLR-269:
------------------------------------

bq. Not trying to be critical but seems like this is too complex for the use-cases it needs to support.

Nonsense -- the more review / feedback / critique we get, the better -- especially *before* a release :)

Why do we need factories here? -- the model came from how things work with Token/Filter factories. Many processors need to maintain state within a request. Check the 'log' processor. I have one that checks if the user has permission on *everything* in the request before executing the commands. We could have something that keeps track of what it did and backs out the changes if there is an error. If each processor were shared across all requests, any state access would need to be synchronized and have some Map<Request,State>, which seems to get ugly pretty fast.

Why ChainedUpdateRequestProcessorFactory? I see your point here. I think we can force everything to be 'chained' -- the original implementation was not chained, but then the functional parts got split into their own components and chained together. Removing the parent chained factory could simplify the whole thing.
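The per-request argument above can be sketched as follows. This is a minimal illustration under stated assumptions: `Req`, `Proc`, `ProcFactory`, `CountingProc` and `CountingProcFactory` are hypothetical simplified types, not Solr's API. The factory is created once, and each request gets its own processor instance, so request-scoped state can sit in plain fields without a shared map or locking.

```java
// Hypothetical request type; only carries a user name for the demo.
class Req {
    final String user;
    Req(String user) { this.user = user; }
}

// Hypothetical processor base type.
abstract class Proc {
    abstract void processAdd(String doc);
    void finish() {}
}

// Hypothetical factory: long-lived, hands out one processor per request.
interface ProcFactory {
    Proc getInstance(Req req);
}

// Request-scoped state lives in ordinary fields of the per-request instance.
class CountingProc extends Proc {
    final Req req;
    private int adds;

    CountingProc(Req req) { this.req = req; }

    @Override
    void processAdd(String doc) { adds++; }

    int adds() { return adds; }
}

class CountingProcFactory implements ProcFactory {
    @Override
    public CountingProc getInstance(Req req) {
        return new CountingProc(req);   // fresh state for every request
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        CountingProcFactory factory = new CountingProcFactory();
        CountingProc alice = factory.getInstance(new Req("alice"));
        CountingProc bob = factory.getInstance(new Req("bob"));
        alice.processAdd("doc1");
        bob.processAdd("doc2");
        alice.processAdd("doc3");
        System.out.println(alice.req.user + "=" + alice.adds()
                + " " + bob.req.user + "=" + bob.adds());
    }
}
```

The interleaved calls in the demo stand in for concurrent requests; because no instance is shared, the counts cannot interfere with each other.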
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616511#action_12616511 ]

Noble Paul commented on SOLR-269:
---------------------------------

bq. Many processors need to maintain state within a request. Check the 'log' processor. I have one that checks if the user has permission on everything in the request before executing the commands.

I do not think we need a factory where we need to maintain local state. Everything can be maintained in the method stack. Example:
{code}
// State object created on the method stack instead of by a factory.
class LocalState {
  private final SolrQueryRequest req;
  private final SolrQueryResponse rsp;
  private final UpdateRequestProcessor next;
  private final NamedList<Object> toLog;

  // constructor taking the params omitted

  void doSomething() {
    // do your thing
  }
}

public class LogUpdateProcessor extends UpdateRequestProcessor {
  @Override
  public void processAdd(AddUpdateCommand cmd) throws IOException {
    LocalState state = new LocalState(); // pass the params
    state.doSomething();
  }
}
{code}
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616520#action_12616520 ]

Shalin Shekhar Mangar commented on SOLR-269:
--------------------------------------------

bq. If each processor were shared across all requests, any state access would need to be synchronized and have some Map<Request,State> that seems to get ugly pretty fast.

But we do have SolrQueryRequest#getContext to handle those cases, don't we? IMHO, we should not force users to write a factory class for each processor when the benefit is minimal and easy workarounds exist. Please correct me if I'm misunderstanding something.

bq. Nonsense - the more review / feedback / critique we get, the better - especially before a release :)

Glad to hear that, though I realize that I'm a year late and that we are very close to a release :)

It's just that I set out to use this API and had to jump around for quite a while to figure out how to use it and how it works. I was quite surprised to find the actual chaining happening in a class which is named NoOpUpdateProcessor -- though it made sense to me later.

Also, it took me a while to find the wiki page for this feature because it is not linked off the main page (or the update xml/csv pages). I could find it only because I knew that a class named UpdateRequestProcessor existed. We should link it off the main page so that it can be found more easily.
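The context-based alternative raised above can be sketched like this, as a rough illustration only. `CtxRequest` is a hypothetical stand-in whose `getContext()` is modeled on the per-request map mentioned in the comment; `SharedCountingProcessor`, its `State` class and the key string are invented for the example. A single processor instance serves every request and fetches its state from the request's context on each call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical request type exposing a per-request context map.
class CtxRequest {
    private final Map<Object, Object> context = new HashMap<>();
    Map<Object, Object> getContext() { return context; }
}

// One shared instance for all requests; state lives in each request's context.
class SharedCountingProcessor {
    // Hypothetical key; every processor must pick one that does not collide.
    private static final String KEY = "SharedCountingProcessor.state";

    static final class State {
        int adds;
    }

    // Look the state up in the context, creating it on first use.
    private State state(CtxRequest req) {
        return (State) req.getContext().computeIfAbsent(KEY, k -> new State());
    }

    void processAdd(CtxRequest req, String doc) {
        state(req).adds++;                  // one map lookup per document
    }

    int adds(CtxRequest req) {
        return state(req).adds;
    }
}

class ContextDemo {
    public static void main(String[] args) {
        SharedCountingProcessor shared = new SharedCountingProcessor();
        CtxRequest first = new CtxRequest();
        CtxRequest second = new CtxRequest();
        shared.processAdd(first, "doc1");
        shared.processAdd(second, "doc2");
        shared.processAdd(first, "doc3");
        System.out.println(shared.adds(first) + " " + shared.adds(second));
    }
}
```

This avoids a factory class, at the cost of a key to manage and a lookup on every call, which is the trade-off the surrounding comments argue over.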
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616544#action_12616544 ]

Ryan McKinley commented on SOLR-269:
------------------------------------

I'm all for simplifying the API. If you guys want to take a crack at it, I'll review it ASAP.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616649#action_12616649 ]

Yonik Seeley commented on SOLR-269:
-----------------------------------

bq. But we do have SolrQueryRequest#getContext to handle those cases, don't we? IMHO, we should not force users to write a factory class for each processor when the benefit is minimal and easy workarounds exist.

Right... the alternative to a per-request instance would be to use the request context. In general, I think that would be more complex for a user though (if it's something they want to do per request-batch). I think bulk loading can be made more efficient by using factories too... context lookups and decisions don't have to be made for every document.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512502 ]

Yonik Seeley commented on SOLR-269:
-----------------------------------

bq. How do you all feel about the basic structure?

It's a go! It will get more complicated, I think, with document modification (SOLR-139).

While it would be nice to keep the base stuff package protected, I'm more concerned with the other parts of the API that this moves front-and-center... mainly UpdateCommand and friends... those were really quick hacks on my part since there were no custom update handlers at the time.

bq. One clever change is to have the LogUpdateProcessorFactory skip building a LogUpdateProcessor if the log level is not INFO rather than keep a flag.

Nice!

I also need SOLR-139 btw, is it easy for you to commit this first to limit the size and scope of that patch?
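The logging shortcut praised above might look roughly like the following. This is a hedged sketch rather than the code from the patch: `Step`, `LoggingStep`, `TerminalStep` and `LoggingStepFactory` are hypothetical simplified types, and returning the successor directly is one assumed way to "skip" the logging link. Only `java.util.logging` is a real API here. The factory checks the log level once per request, so no per-document flag test is needed.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical chain link that forwards to its successor by default.
abstract class Step {
    protected final Step next;
    Step(Step next) { this.next = next; }
    void processAdd(String doc) { if (next != null) next.processAdd(doc); }
}

// Logs every document, then passes it on.
class LoggingStep extends Step {
    private final Logger log;
    LoggingStep(Logger log, Step next) { super(next); this.log = log; }

    @Override
    void processAdd(String doc) {
        log.info("add " + doc);
        super.processAdd(doc);
    }
}

// End of the chain; counts the documents it receives.
class TerminalStep extends Step {
    int adds;
    TerminalStep() { super(null); }
    @Override
    void processAdd(String doc) { adds++; }
}

class LoggingStepFactory {
    private final Logger log;
    LoggingStepFactory(Logger log) { this.log = log; }

    // Decide once per request: if INFO is disabled, leave the logging link out.
    Step getInstance(Step next) {
        return log.isLoggable(Level.INFO) ? new LoggingStep(log, next) : next;
    }
}

class LoggingFactoryDemo {
    public static void main(String[] args) {
        Logger log = Logger.getLogger("update.demo");
        LoggingStepFactory factory = new LoggingStepFactory(log);
        TerminalStep end = new TerminalStep();

        log.setLevel(Level.WARNING);
        System.out.println(factory.getInstance(end) == end);                  // true

        log.setLevel(Level.INFO);
        System.out.println(factory.getInstance(end) instanceof LoggingStep);  // true
    }
}
```

A design like this only works when processors are built per request by a factory, which ties the optimization back to the factory debate in the rest of the thread.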
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512356 ]

Ryan McKinley commented on SOLR-269:
------------------------------------

bq. the RequestHandler has final say over what Processor gets used

Absolutely! The question is just what to do in the default /update case. I'm inclined to have the request say what processor to use. With 'invariants' that can be fixed to a single implementation, and it will let people configure processors without a custom handler.

bq. How do you all feel about the basic structure?

I like the structure, but am not sure how 'public' to make the configuration and implementation. While it would be nice to keep the base stuff package protected, we then can't have external configuration, and external classes could not reuse the other bits of the chain (defeating the 'chain' advantages).

I have a pending deadline that depends on input processing and SOLR-139 modifiable documents -- it would be great to work from a lightly patched trunk rather than a heavily patched one ;)
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511178 ]

Yonik Seeley commented on SOLR-269:

One issue: the current way of having a custom processor (CustomUpdateRequestHandler) seems less than ideal. First, CustomUpdateRequestHandler extends XMLUpdateRequestHandler, but what if I want one for CSV, etc.? If update processors are to be a first-class part of Solr, it seems like one should be able to specify the processor to use for any update handler (CSV, XML, etc.) without having to write extra classes for those. Perhaps something like:

    <requestHandler name="/customupdate" class="solr.XmlUpdateRequestHandler">
      <str name="update.processor">standard</str>
    </requestHandler>
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511195 ]

Ryan McKinley commented on SOLR-269:

Hmm. The behavior on trunk is:

    <requestHandler name="/customupdate" class="solr.XmlUpdateRequestHandler">
      <str name="update.processor.factory">class name for factory</str>
    </requestHandler>

The latest patch has the argument look up an XML-configured factory. Do you mean:

    <requestHandler name="/customupdate" class="solr.XmlUpdateRequestHandler">
      <lst name="invariants">
        <str name="update.processor">standard</str>
      </lst>
    </requestHandler>

Given the direction we are heading, it seems nice to be able to change the update behavior per request:

    /update?update.processor=do-fancy-document-cleanup
    /update?update.processor=go-quick-i-know-the-docs-are-clean

I made it a 1-1 relation (processor-handler) to avoid a hash lookup for each request, but reading it from a param would be ok.
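To make the 'invariants' idea concrete, here is a minimal sketch of how a handler might resolve the processor name. It assumes the usual defaults/request/invariants precedence and uses plain maps instead of Solr's parameter classes; `ProcessorSelector` and its methods are hypothetical names, not part of any patch on this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: picks the update processor name for one request.
final class ProcessorSelector {
    static final String PARAM = "update.processor";

    private final Map<String, String> defaults;
    private final Map<String, String> invariants;

    ProcessorSelector(Map<String, String> defaults, Map<String, String> invariants) {
        this.defaults = new HashMap<>(defaults);
        this.invariants = new HashMap<>(invariants);
    }

    // Invariants win over the request, and the request wins over defaults.
    String resolve(Map<String, String> requestParams) {
        String name = invariants.get(PARAM);
        if (name == null) {
            name = requestParams.get(PARAM);
        }
        if (name == null) {
            name = defaults.get(PARAM);
        }
        return name;
    }
}
```

With `update.processor=standard` under invariants, a request asking for `go-quick-i-know-the-docs-are-clean` still resolves to `standard`. With the same value under defaults only, the request parameter takes effect.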
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511198 ]

Yonik Seeley commented on SOLR-269:

> I made it a 1-1 relation (processor-handler) to avoid a hash lookup for each request,

That was my thinking too... I wasn't suggesting making it an overrideable parameter, but I'm not really against it either.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511206 ]

Hoss Man commented on SOLR-269:

the idea of letting people override the processor on a per request basis seems very scary and, depending on what kinds of stuff you were expecting the processor to do, could introduce some serious bugs ... but then again, if it's a param, it can be specified as an invariant if you want to ensure that doesn't happen.

i guess the key thing is just that the RequestHandler has final say over what Processor gets used ... we can provide handy tools/conventions to get that info from the config or the request, but a very simplistic RequestHandler should be able to hardcode it for absolute control.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510749 ]

Yonik Seeley commented on SOLR-269:

> I'm not sure we need to have XML configuration for this

If we have those multiple update processor factories, I agree we don't need XML config for the transformers.

> I need a custom UpdateRequestProcessor that checks all the requests before executing any of them. I plan to store the valid commands in a list and only execute them in the finish() call. I'm not sure how to map that plan to a chain. How would I pass the output from one processor to the next?

I had thought of that use-case too (bulk operations), which is why I added explicit flow control (explicit calling of next.handleAdd() in the processor). You can buffer up all the requests (you want to clone the UpdateCommands as they might be reused, though) and not call next. Then in finish, you can delegate all of the buffered commands.
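The buffering approach described above could look roughly like this. It is only a sketch under stated assumptions: `AddCommand`, `Processor`, `BufferingProcessor` and `RecordingProcessor` are hypothetical stand-ins, not the classes from the attached patches, and the validity check is a placeholder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an update command (not Solr's AddUpdateCommand).
final class AddCommand {
    String docId;
    AddCommand(String docId) { this.docId = docId; }
    AddCommand copy() { return new AddCommand(docId); }
}

// Hypothetical processor base class with explicit flow control.
abstract class Processor {
    protected final Processor next;
    Processor(Processor next) { this.next = next; }
    void processAdd(AddCommand cmd) { if (next != null) next.processAdd(cmd); }
    void finish() { if (next != null) next.finish(); }
}

// Checks and buffers every add; nothing goes down the chain until finish().
final class BufferingProcessor extends Processor {
    private final List<AddCommand> buffered = new ArrayList<>();

    BufferingProcessor(Processor next) { super(next); }

    @Override
    void processAdd(AddCommand cmd) {
        if (cmd.docId == null || cmd.docId.isEmpty()) {
            throw new IllegalArgumentException("document has no id");
        }
        // Clone, because the caller may reuse the same command object.
        buffered.add(cmd.copy());
        // Deliberately no call to next here.
    }

    @Override
    void finish() {
        if (next != null) {
            for (AddCommand cmd : buffered) {
                next.processAdd(cmd);
            }
        }
        buffered.clear();
        super.finish();
    }
}

// Terminal processor that records what reached the end of the chain.
final class RecordingProcessor extends Processor {
    final List<String> seen = new ArrayList<>();
    RecordingProcessor() { super(null); }
    @Override void processAdd(AddCommand cmd) { seen.add(cmd.docId); }
}
```

If any command fails the check, the exception is thrown before finish() runs, so none of the buffered commands reach the next processor.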
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510759 ]

Ryan McKinley commented on SOLR-269:

The only one I'm not sure about is:
 - explicit flow control between processors for greatest flexibility

I'm still trying to avoid the parent UpdateRequestProcessorFactory chain as a default behavior. It seems fine as a super-duper custom controller, but unruly in the default/slightly custom case.

Folding in:
 - removal of NamedList return (as you say, chaining those makes less sense anyway)
 - already extracted and optimized the complex (or rather bigger) logging logic from the simple index updating
 - passed in SolrQueryResponse as well, enabling a processor to change the response
is no problem.

If you like the general structure / flow of SOLR-269-UpdateRequestProcessorFactory.patch, I'll clean it up and work in this stuff. Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel more palatable.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510771 ]

Yonik Seeley commented on SOLR-269:

> The only one I'm not sure about is:
> - explicit flow control between processors for greatest flexibility

It's a single call per hook:

    if (next != null) next.processAdd();

And it's exactly what you need for your buffering situation. Chaining is the model that Lucene uses for its analyzers too (the only difference is that it's a pull instead of a push).

> I'm still trying to avoid the parent UpdateRequestProcessorFactory chain as a default behavior. It seems fine as a super-duper custom controller, but unruly in the default/slightly custom case.

I'm not clear on why... the configuration is more complex?

> If you like the general structure / flow of SOLR-269-UpdateRequestProcessorFactory.patch

I'm not sure about the named processors... are they needed? It seems like we need a standard one that is used by default everywhere, and then *maybe* we need to be able to change them per-handler. Do we need this up front, or could it be deferred?

It seems like there does need to be a method on SolrCore to get a RequestProcessor or Factory, since that becomes the new interface to do an index change (otherwise you miss the doc transformations, etc).

> Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel more palatable.

That could be wrapped in another UpdateRequestProcessorFactory if desired... it doesn't matter much if the impl is hidden by a class or a method IMO.
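As an illustration of wrapping a factory array behind a single factory, with each processor calling next explicitly, a sketch could look like the following. All names here (`ChainedProcessor`, `ProcessorFactory`, `ChainFactory`, `ChainDemo`) are hypothetical, and a `StringBuilder` stands in for the document.

```java
import java.util.Locale;

// Hypothetical processor: one hook, with the next link handed in at construction.
abstract class ChainedProcessor {
    protected final ChainedProcessor next;
    ChainedProcessor(ChainedProcessor next) { this.next = next; }
    abstract void processAdd(StringBuilder doc);
}

// Hypothetical factory: creates a per-request processor wired to its successor.
interface ProcessorFactory {
    ChainedProcessor getInstance(ChainedProcessor next);
}

// Hides an ordered array of factories behind the single factory interface.
final class ChainFactory implements ProcessorFactory {
    private final ProcessorFactory[] factories;

    ChainFactory(ProcessorFactory... factories) { this.factories = factories; }

    @Override
    public ChainedProcessor getInstance(ChainedProcessor next) {
        ChainedProcessor head = next;
        // Build back to front so each processor receives the one after it.
        for (int i = factories.length - 1; i >= 0; i--) {
            head = factories[i].getInstance(head);
        }
        return head;
    }
}

final class ChainDemo {
    // Runs a two-step chain: append a suffix, then upper-case the result.
    static String run(String input) {
        ProcessorFactory suffix = n -> new ChainedProcessor(n) {
            @Override void processAdd(StringBuilder doc) {
                doc.append("-x");
                if (next != null) next.processAdd(doc); // explicit flow control
            }
        };
        ProcessorFactory upper = n -> new ChainedProcessor(n) {
            @Override void processAdd(StringBuilder doc) {
                String s = doc.toString().toUpperCase(Locale.ROOT);
                doc.setLength(0);
                doc.append(s);
                if (next != null) next.processAdd(doc);
            }
        };
        StringBuilder doc = new StringBuilder(input);
        new ChainFactory(suffix, upper).getInstance(null).processAdd(doc);
        return doc.toString();
    }
}
```

Because the chain is itself just another factory, callers do not need to know whether they hold one processor or several.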
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510774 ]

Ryan McKinley commented on SOLR-269:

> It's a single call per hook: if (next != null) next.processAdd();

Ok. I'm convinced.

> I'm not sure about the named processors... are they needed? It seems like we need a standard one that is used by default everywhere, and then *maybe* we need to be able to change them per-handler. Do we need this up front, or could it be deferred?

I'm not sure. The only reason I think we *may* want to do it now is to keep the initialization standard and in a single place. If we declare a default processor and have each handler optionally initialize their own, the config may look different. RequestHandlers only have access to a NamedList while initialized; they can't (without serious changes) declare something like:

    <requestHandler ...>
      <updateProcessor class="..." />
    </requestHandler>

With that in mind, I think it best to build the updateProcessors using the standard PluginLoader framework and then have RequestHandlers access them by name.

> Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel more palatable. That could be wrapped in another UpdateRequestProcessorFactory if desired... it doesn't matter much if the impl is hidden by a class or a method IMO.

Ok, I'll start with UpdateProcessor.patch and fold in my changes.
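A rough sketch of the "load once, look up by name" arrangement follows. It assumes a simple registry in place of the real plugin-loading code; `FactoryRegistry`, `UpdateStepFactory`, `UpdateStep` and `SketchUpdateHandler` are hypothetical names, and falling back to the name "standard" is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-request processor.
interface UpdateStep {
    void processAdd(String doc);
}

// Hypothetical factory: one shared instance creates a fresh processor per request.
interface UpdateStepFactory {
    UpdateStep newInstance();
}

// Stand-in for whatever the plugin-loading code would populate at startup.
final class FactoryRegistry {
    private final Map<String, UpdateStepFactory> byName = new HashMap<>();

    void register(String name, UpdateStepFactory factory) {
        byName.put(name, factory);
    }

    UpdateStepFactory get(String name) {
        UpdateStepFactory factory = byName.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("unknown update processor: " + name);
        }
        return factory;
    }
}

// A handler resolves its factory once at init time, so no lookup happens per request.
final class SketchUpdateHandler {
    private final UpdateStepFactory factory;

    SketchUpdateHandler(FactoryRegistry registry, Map<String, String> initArgs) {
        String name = initArgs.containsKey("update.processor")
                ? initArgs.get("update.processor")
                : "standard"; // assumed default name
        this.factory = registry.get(name);
    }

    void handleAdd(String doc) {
        factory.newInstance().processAdd(doc);
    }
}
```

Every handler, whether XML or CSV, can share the same registry, which keeps processor initialization in a single place.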
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510425 ]

Yonik Seeley commented on SOLR-269:

FYI, I'm working up a prototype right now to handle multiple request processors.
Re: [jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
> FYI, I'm working up a prototype right now to handle multiple request processors.

Excellent. Check https://issues.apache.org/jira/browse/SOLR-139 to see how I was thinking about supporting multiple processors. Essentially a single parent processor that may loop through 'transformers' or whatever in the custom case. (Not that I think my design is *the* answer.)

ryan
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510484 ]

Ryan McKinley commented on SOLR-269:

I like the AddUpdateCommand changes.

What do you see as the common use case for wanting to chain request processors?

Is the LogUpdateRequestProcessor just an example?

The one compelling chained use case I can think of is for document transformation. In SOLR-139, I toyed with SolrInputDocumentTransformer. The default case does nothing, and a subclass may use something like:

    for( SolrInputDocumentTransformer t : transformers ) {
      doc = t.transform( doc );
    }
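For comparison with the chained-processor model, the transformer loop above can be fleshed out as a small runnable sketch. A plain `Map` stands in for the input document, and `DocTransformer`, `TransformingProcessor` and `TransformDemo` are hypothetical names rather than the SOLR-139 classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the transformer idea; a document is a plain map here.
interface DocTransformer {
    Map<String, Object> transform(Map<String, Object> doc);
}

// A single parent processor that loops through its transformers in order.
final class TransformingProcessor {
    private final List<DocTransformer> transformers = new ArrayList<>();

    TransformingProcessor add(DocTransformer t) {
        transformers.add(t);
        return this;
    }

    Map<String, Object> process(Map<String, Object> doc) {
        for (DocTransformer t : transformers) {
            doc = t.transform(doc);
        }
        return doc;
    }
}

final class TransformDemo {
    static Map<String, Object> run() {
        TransformingProcessor processor = new TransformingProcessor()
            .add(doc -> {
                // Trim whitespace from the title field.
                Map<String, Object> out = new LinkedHashMap<>(doc);
                out.put("title", String.valueOf(doc.get("title")).trim());
                return out;
            })
            .add(doc -> {
                // Copy title into title_sort only when title_sort is missing.
                Map<String, Object> out = new LinkedHashMap<>(doc);
                out.putIfAbsent("title_sort", out.get("title"));
                return out;
            });
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("title", "  Solr  ");
        return processor.process(input);
    }
}
```

The parent processor owns the loop here, so an individual transformer cannot buffer documents or skip the rest of the pipeline.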
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510487 ]

Yonik Seeley commented on SOLR-269:

> What do you see as the common use case for wanting to chain request processors?

 - conditional copyField, field transformations (between multiple fields too... something Analyzer can't do), loading certain fields from a database if missing, updating a related document, etc.

> Is the LogUpdateRequestProcessor just an example?

IMO, it's a default, since no logging is done by the ChangeUpdateRequestProcessor (can anyone think of a better name for that?). Then in a Benchmarking section of the Solr Wiki, we could advise removing logging altogether. Or you could remove the ChangeUpdateRequestProcessor to skip index changes to better benchmark hotspots in the parsing + doc creation phase, etc.

> The one compelling chained use case I can think of is for document transformation

Ah, I briefly looked at SOLR-139 when you mentioned it before, but missed the transformer stuff. In a way, multiple update processors are more generic and wide open... you could actually insert two documents into the index for each doc added, you could do transforms on the actual Lucene document (add Field options that Solr doesn't currently support, etc.).
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510504 ]

Yonik Seeley commented on SOLR-269:

Some other issues: how to configure processors for multiple update handlers? Perhaps allow configuration of a global default for update handlers with no processors specified? That would make it easy to make sure your custom processor was used everywhere.

We should probably have a base class for update handlers to implement initialization logic.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510506 ]

Ryan McKinley commented on SOLR-269:

> - conditional copyField, field transformations (between multiple fields too... something Analyzer can't do), loading certain fields from a database if missing, updating a related document, etc.

This all works nicely with a simple 'transform' chain.

> Is the LogUpdateRequestProcessor just an example?
> IMO, it's a default since no logging is done by the ChangeUpdateRequestProcessor (anyone think of a better name for that?). Then in a Benchmarking section of the Solr Wiki, we could advise to remove logging altogether. Or you could remove the ChangeUpdateRequestProcessor to skip index changes to better benchmark hotspots in the parsing + doc creation phase, etc.

Isn't logging best configured with standard java.util.logging settings? If necessary, the base processor could check whether the logging level is high enough to keep track of some things. For benchmarking, don't we just want a single no-op processor?

> The one compelling chained use case I can think of is for document transformation
> Ah, I briefly looked at SOLR-139 when you mentioned it before, but missed the transformer stuff. In a way multiple update processors are more generic and wide open... you could actually insert two documents into the index for each doc added, you could do transforms on the actual Lucene document (add Field options that Solr doesn't currently support, etc.

I see what you are getting at, but it makes the basic cases more complicated than they need to be. I have been considering UpdateRequestProcessor as an 'advanced' option where changing its behavior means writing custom code -- not text configuration. In the advanced case where you want to build multiple documents or munge the actual existing Lucene document, it may be more difficult to live in a chain than to have explicit control.

I think the cleanest design would be a single entry point, keeping the real functionality in easily subclassed functions or utility classes. The latest SOLR-139 tries that (but it could still use some cleanup).
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510510 ]

Ryan McKinley commented on SOLR-269:
------------------------------------

> Some other issues... how to configure processors for multiple update handlers? Perhaps allow configuration of a global default for update handlers with no processors specified? That would make it easy to make sure your custom processor was used everywhere.

SolrCore could have a single UpdateRequestProcessorFactory that handlers could use as the default. I'm reluctant to add another plugin layer, but this would make it easier to share with the CSV update handler and others. Since it's a factory, it will be thread safe across multiple handlers.

Again, I'm reluctant to think about configuring a processor chain in solrconfig.xml -- we should make the most sensible/extensible default implementation, but IMO tweaking RequestProcessor functionality should be done with custom code.
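The "global default" suggestion above could look roughly like the following sketch: the core owns one factory, and a handler uses its own only when one was configured. All names here (SketchFactory, SketchRequestProcessor, CoreSketch, HandlerSketch) are hypothetical; nothing in the thread specifies this API.

```java
// Hypothetical per-request processor; describe() exists only so the demo is observable.
interface SketchRequestProcessor {
    String describe();
}

// Hypothetical factory: shared across threads, hands out one processor per request.
interface SketchFactory {
    SketchRequestProcessor newProcessor();
}

// Stand-in for the core, holding the single default factory.
class CoreSketch {
    private final SketchFactory defaultFactory;

    CoreSketch(SketchFactory defaultFactory) {
        this.defaultFactory = defaultFactory;
    }

    SketchFactory getDefaultProcessorFactory() {
        return defaultFactory;
    }
}

// Stand-in for an update handler (XML, CSV, ...).
class HandlerSketch {
    private final SketchFactory factory;

    // 'configured' may be null when the handler declares no processor of its own;
    // in that case the handler falls back to the core-wide default.
    HandlerSketch(CoreSketch core, SketchFactory configured) {
        this.factory = (configured != null) ? configured : core.getDefaultProcessorFactory();
    }

    SketchRequestProcessor processorForRequest() {
        return factory.newProcessor();
    }
}
```

With this shape, a custom processor registered once on the core would be picked up by every handler that does not override it.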
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510563 ]

Ryan McKinley commented on SOLR-269:
------------------------------------

> perhaps the extra functionality of transformations and updating should be pushed into the UpdateHandler interface

That was the first SOLR-139 design! Having thought about it for a while, I think there are nice advantages to keeping the updating/modifying outside of the UpdateHandler - the biggest one is that various RequestHandlers *could* transform the document differently.

I'm putting together a hybrid example that (I hope) answers questions about chains/configuration/transformation, etc. I'll post it shortly.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510004 ]

Hoss Man commented on SOLR-269:
-------------------------------

> Different Q on usage: is this where my document mutator stuff should go??? If I want a transformation done on a field, regardless of where the data is coming from (XML update handler, CSV update handler, future REST update handler, etc), how should that be done? Is there a single place I can register a plugin to do this, and is UpdateRequestProcessor where you see it happening?

I believe that was actually the initial intent of UpdateRequestProcessor; note the javadocs...

 * This is a good place for subclassed update handlers to process the document before it is
 * indexed. You may wish to add/remove fields or check if the requested user is allowed to
 * update the given document...
 *
 * Perhaps you continue adding an error message (without indexing the document)...
 * perhaps you throw an error and halt indexing (remove anything already indexed??)
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510009 ]

Ryan McKinley commented on SOLR-269:
------------------------------------

> getCore(), getSchema(), getUpdateHandler() all just return instance variables.

OK, getCore() isn't a good candidate; it is annoying to start every function with a train wreck: req.getCore().getUpdateHandler()

Having a single class per request makes sense for a subclass I am working with -- it does some expensive initialization and stores the results. I could put this in req.getContext().

> instantiating and initializing all those request processors will get expensive.

Really? The default initialization is trivial - stuff that would happen at the beginning of every function anyway. I suppose GC could be an issue.

> I do see your usecase though, in the case of multiple docs per add and you have some expensive state you only want to calculate once.

In r552986, I changed the logging to match Solr 1.2 -- this required accumulating the ids and spitting them out at the end. In 1.2, with processing and parsing entwined, this was just a giant loop. To get the same behavior we need to stash it somewhere...

> Different Q on usage: is this where my document mutator stuff should go???

Yes. The intent is to have a simple place between document parsing and indexing where you can do whatever you need to do. Any parsing strategy (XML, JSON, etc) could share the same processor.

Looking at SOLR-139, I now think the most flexible/useful way to support modifiable documents is to build utility functions for the UpdateProcessor that can manipulate SolrInputDocuments.

- - -

I will take another crack at SOLR-139 implemented in the UpdateProcessor, then we should return to the question of singleton vs factory - trying to work with a more complex processor may make this choice more obvious.
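As an illustration of "accumulating the ids and spitting them out at the end", here is a small sketch of a per-request object that collects added ids and produces a single summary line when the request finishes. The class name, the finish() method, the message format, and the cap on how many ids are shown are all assumptions made for this example; they are not taken from r552986.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-request logging helper: one instance per update request.
class AggregatingLogProcessor {
    private final List<String> addedIds = new ArrayList<>();
    private final int maxIdsLogged; // assumed cap, so huge batches do not flood the log

    AggregatingLogProcessor(int maxIdsLogged) {
        this.maxIdsLogged = maxIdsLogged;
    }

    // Called once per added document; only remembers the id, no logging yet.
    void processAdd(String id) {
        addedIds.add(id);
    }

    // Called once at the end of the request; builds the single summary line.
    String finish() {
        int shown = Math.min(addedIds.size(), maxIdsLogged);
        StringBuilder sb = new StringBuilder("added=[");
        sb.append(String.join(",", addedIds.subList(0, shown)));
        if (addedIds.size() > shown) {
            sb.append(",...(").append(addedIds.size() - shown).append(" more)");
        }
        sb.append("]");
        return sb.toString();
    }
}
```

The state (the id list) lives on the per-request object, which is the "stash it somewhere" problem the comment describes; with a singleton processor the same list would have to live in the request context instead.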
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509725 ]

Yonik Seeley commented on SOLR-269:
-----------------------------------

Looking at UpdateRequestProcessor further, it seems like these should be singletons (instance per entry in solrconfig, no factory needed), and any extra state that is needed should be added to classes we already have (like AddCommand, etc), no?
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509732 ]

Yonik Seeley commented on SOLR-269:
-----------------------------------

I think the newly added incremental time should not be on by default, and neither should per-id logging for deletes and adds. Mike added the id aggregation code specifically because logging each add was taking so much time.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509733 ]

Ryan McKinley commented on SOLR-269:
------------------------------------

Maybe. I'm not sure I totally understand your suggestion though. I need something that is easily subclassed and can cleanly hold state across an entire request cycle. The alternative is to pass the SolrQueryRequest/Response into each action and maybe pull out the schema/updateHandler/logged-in user/etc. for each command (each document in the list of 100).

Is the factory a performance concern?

(to my tastes) it seems nicer to work with:

  // user and updateHandler are fields, initialized once per request
  void processDelete( DeleteUpdateCommand cmd )
  {
    if( user.isAdmin() ) {
      updateHandler.delete( cmd );
    }
    else {
      ...
    }
  }

than:

  // everything is looked up again for every command
  void processDelete( DeleteUpdateCommand cmd, SolrQueryRequest req, SolrQueryResponse rsp )
  {
    User user = (User) req.getContext().get( "user" );
    if( user.isAdmin() ) {
      SolrCore core = req.getCore();
      IndexSchema schema = core.getSchema();
      UpdateHandler updateHandler = core.getUpdateHandler();
      updateHandler.delete( cmd );
    }
    else {
      ...
    }
  }

I'm fine either way, but I like the easy one-per-request interface.
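The first variant above (a processor created per request that already holds the user and the update handler) can be sketched in a self-contained form as follows. UserSketch, UpdateHandlerSketch, PerRequestProcessor, and PerRequestProcessorFactory are hypothetical stand-ins rather than Solr classes, and rejecting the delete with a SecurityException is only one possible way to fill the elided else branch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical user object; the thread only assumes it offers isAdmin().
class UserSketch {
    private final boolean admin;

    UserSketch(boolean admin) {
        this.admin = admin;
    }

    boolean isAdmin() {
        return admin;
    }
}

// Stand-in for the update handler; records deletes instead of touching an index.
class UpdateHandlerSketch {
    final List<String> deleted = new ArrayList<>();

    void delete(String id) {
        deleted.add(id);
    }
}

// One instance per request: the state is resolved once, not once per command.
class PerRequestProcessor {
    private final UserSketch user;
    private final UpdateHandlerSketch updateHandler;

    PerRequestProcessor(UserSketch user, UpdateHandlerSketch updateHandler) {
        this.user = user;
        this.updateHandler = updateHandler;
    }

    void processDelete(String id) {
        if (user.isAdmin()) {
            updateHandler.delete(id);
        } else {
            // Assumption for this sketch: non-admin deletes are rejected outright.
            throw new SecurityException("delete not permitted: " + id);
        }
    }
}

// Shared factory: holds request-independent state, creates one processor per request.
class PerRequestProcessorFactory {
    private final UpdateHandlerSketch updateHandler;

    PerRequestProcessorFactory(UpdateHandlerSketch updateHandler) {
        this.updateHandler = updateHandler;
    }

    PerRequestProcessor getInstance(UserSketch user) {
        return new PerRequestProcessor(user, updateHandler);
    }
}
```

The cost being debated in the thread is the one extra object allocation per request in getInstance; the benefit is that processDelete needs no request or response parameters.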
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509734 ]

Ryan McKinley commented on SOLR-269:
------------------------------------

> I think the newly added incremental time should not be on by default, as well as logging per id for deletes and adds. Mike added the id aggregation code specifically because logging each add was taking so much time.

Sounds good. The testing I did showed that lots of time is spent in the logging phase. I will remove it from the default implementation.
[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them
[ https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509737 ]

Yonik Seeley commented on SOLR-269:
-----------------------------------

> I need something that is easily subclassed and can cleanly hold state across an entire request cycle.

Having a factory and separate object so that one can use core instead of req.getCore(), etc, seems like overkill for the normal case though, since getCore(), getSchema(), getUpdateHandler() all just return instance variables. I was thinking any state like that could be on the UpdateCommand.

I'd like to have potentially several request processors, but if people start doing single-doc add requests, instantiating and initializing all those request processors will get expensive.

I do see your use case though, in the case of multiple docs per add where you have some expensive state you only want to calculate once. If it's a relatively rare case, one could put it in the request context. The tradeoff would be an extra hash lookup per document of a multi-document add vs an extra object creation for single-doc adds.

Different Q on usage: is this where my document mutator stuff should go??? If I want a transformation done on a field, regardless of where the data is coming from (XML update handler, CSV update handler, future REST update handler, etc), how should that be done? Is there a single place I can register a plugin to do this, and is UpdateRequestProcessor where you see it happening?
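The request-context alternative mentioned above can be sketched as a singleton processor that caches its expensive state in a per-request map, paying one hash lookup per document. The class name, the context key, and the use of a plain Map as the request context are assumptions for illustration; initCount exists only to make the behavior observable.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical singleton processor: no per-request instance, state lives in the context.
class ContextCachingProcessor {
    static final String STATE_KEY = "sketch.expensiveState"; // assumed key name

    // Counts how often the expensive initialization actually ran.
    final AtomicInteger initCount = new AtomicInteger();

    // requestContext stands in for a per-request map such as the one the thread
    // refers to as req.getContext(); it is created fresh for each request.
    String processAdd(Map<Object, Object> requestContext, String docId) {
        // One hash lookup per document; the state is computed on the first miss only.
        Object state = requestContext.computeIfAbsent(STATE_KEY, k -> computeExpensiveState());
        return state + ":" + docId;
    }

    private String computeExpensiveState() {
        initCount.incrementAndGet();
        return "state";
    }
}
```

For a request that adds many documents the initialization runs once and each further document costs a map lookup, while a single-document request avoids allocating a separate processor object, which is the tradeoff described in the comment.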