[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259600#comment-14259600 ]

Noble Paul commented on SOLR-6892:
----------------------------------

I've updated the description with more meat this time. Please comment.

Make update processors toplevel components
------------------------------------------

Key: SOLR-6892
URL: https://issues.apache.org/jira/browse/SOLR-6892
Project: Solr
Issue Type: Bug
Reporter: Noble Paul
Assignee: Noble Paul

The current update processor chain is rather cumbersome, and we should be able to use update processors without a chain.

The scope of this ticket is:

* A new tag, {{updateProcessor}}, becomes a toplevel tag and will be equivalent to the {{processor}} tag inside {{updateRequestProcessorChain}}. The only difference is that it requires a {{name}} attribute. The {{updateRequestProcessorChain}} tag will continue to exist, and it should still be possible to define a processor inside it. It should also be possible to reference a named URP in a chain.
* Any update request will be able to pass a param {{processor=a,b,c}}, where a, b, c are names of update processors. A just-in-time chain will be created with those URPs.
* Some built-in update processors (wherever possible) will be predefined with standard names and can be used directly in requests.
* What happens when I say processor=a,b,c in a request? It will execute the default chain after the just-in-time chain {{a-b-c}}.
* How do I execute a chain other than the default chain? The same old mechanism applies: update.chain=x means that the chain {{x}} will be applied after {{a,b,c}}.
* How do I prevent the default processor chain from being executed? There will be an implicit URP called {{STOP}}. Send your request as processor=a,b,c,STOP.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
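The request-time rules in the updated description ({{processor=a,b,c}}, {{update.chain=x}}, and the implicit {{STOP}}) can be sketched roughly as follows. This is only an illustration of the proposal as worded in the ticket, not Solr code: the function name {{resolve_chain}}, the {{chains}} dictionary, and the processor names in it are hypothetical, and the choice to ignore anything listed after {{STOP}} is an assumption the ticket does not spell out.

```python
from typing import Dict, List, Optional

STOP = "STOP"  # implicit URP name taken from the ticket description


def resolve_chain(
    processor_param: Optional[str],
    chain_param: Optional[str],
    chains: Dict[str, List[str]],
    default_chain: str = "default",
) -> List[str]:
    """Return the ordered processor names for one update request.

    Hypothetical helper: `chains` maps a configured chain name to the
    processor names it contains.
    """
    # Parse processor=a,b,c into an ordered list of names.
    names = [n.strip() for n in (processor_param or "").split(",") if n.strip()]

    if STOP in names:
        # STOP suppresses the configured chain entirely.
        # Assumption: names listed after STOP are ignored.
        return names[: names.index(STOP)]

    # Otherwise the just-in-time chain runs first, followed by the chain
    # selected with update.chain (or the default chain if none is given).
    chain_name = chain_param or default_chain
    return names + list(chains[chain_name])
```

With a hypothetical configuration such as {{chains = {"default": ["log", "distrib", "run"], "x": ["dedupe", "run"]}}}, a request with {{processor=a,b,c}} would resolve to a, b, c followed by the default chain, while {{processor=a,b,c,STOP}} would resolve to a, b, c only.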
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259384#comment-14259384 ]

Erik Hatcher commented on SOLR-6892:
------------------------------------

While this does open up some potential custom power, I'm curious what use cases you see for letting the indexing client specify the processors?

It is good that processors become their own first-class component, so that they can be composed into update processor chains when (eventually) creating a chain with an API. But I can see using individual processors from the /update call being a possible problem, such as not using the log processor and then not being able to see exactly what happened.

Make update processors toplevel components
------------------------------------------

The current update processor chain is rather cumbersome, and we should be able to use update processors without a chain.

The scope of this ticket is:

* The updateProcessor tag becomes a toplevel tag and will be equivalent to the processor tag inside updateRequestProcessorChain. The only difference is that it requires a {{name}} attribute.
* Any update request will be able to pass a param {{processor=a,b,c}}, where a, b, c are names of update processors. A just-in-time chain will be created with those update processors.
* Some in built update processors (wherever possible) will be predefined with standard names and can be used directly in requests.
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259410#comment-14259410 ]

Alexandre Rafalovitch commented on SOLR-6892:
---------------------------------------------

This use case does not feel strong enough to be *Major*. Are there specific business use cases that really cannot be solved with pre-defined chains?

Also, a lot of URPs take parameters. The proposal above does not seem to allow that. And then what about DistributedUpdateProcessor, and the fact that chains allow items to be specified both before and after it?

Also consider troubleshooting. It needs to be very clear what was applied to the content as it came in. How would one find out if a chain was applied incorrectly?

Finally, what are "built" update processors? Built-in? So far, the vast majority of them are built-in, as in shipped with Solr, and have their own class names. Do you mean some standard *chains* could be pre-built and named? Do you have a good example?

I would say these arguments apply a lot more to the analyzer chains (I'd love to see those built-in), but I am not sure about URPs.
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259409#comment-14259409 ]

Noble Paul commented on SOLR-6892:
----------------------------------

The configuration is complex today. We need to make this less voodoo.

Let's look at the purpose of an update processor. It is just a transformer for incoming documents. Let's apply the transformers in the order they are specified, let the system take care of the rest, and avoid surprises.
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259413#comment-14259413 ]

Alexandre Rafalovitch commented on SOLR-6892:
---------------------------------------------

bq. Let's apply the transformers in the order they are specified and let the system take care of the rest and avoid surprises

Actually, having code hidden somewhere inside the system to do the non-trivial thing is what will create surprises. Right now, the user can look at the XML file and step through the cross-references to see what actually happened. Moving to on-the-fly, case-by-case composition will *increase* the surprises.

So, the proposal and the reasoning are not quite aligned here. Things like pre-defined names for standard components could decrease surprises. The rest of the proposal does not.
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259419#comment-14259419 ]

Noble Paul commented on SOLR-6892:
----------------------------------

The configuration is not going away. We will still have the individual URPs specified and configured. The point is, the chain does nothing extra. Specifying the URP list at request time is no more complex than deciding the chain name.

It is not taking power away, but adding the power to mix and match at request time.
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259422#comment-14259422 ]

Alexandre Rafalovitch commented on SOLR-6892:
---------------------------------------------

So, would a better explanation be that you have the option of pre-configuring and naming individual items on the stack, and then composing them either in a pre-existing stack (effectively with aliases) or dynamically on the fly? So, the addressable unit becomes an individual pre-configured URP (atom) as opposed to the full stack (molecule)?

That would make more sense, though you still need to be super-clear about what becomes hidden from the XML file. For example, there should be an easy way to query all the pre-configured components. One of the issues with ElasticSearch is that it is hard to tell what those symbolic (analyzer chain) names correspond to, as it is hardcoded somewhere deep within it.
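The "atom" model described in this comment, together with the request for an easy way to query all pre-configured components, could look something like the sketch below. It is purely illustrative: {{NamedProcessor}}, {{ProcessorRegistry}}, and the short names "log" and "trim" are hypothetical and do not correspond to any existing Solr API; only the factory class names refer to classes that ship with Solr.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NamedProcessor:
    """One individually addressable, pre-configured URP (the 'atom')."""
    name: str
    factory_class: str
    implicit: bool = False  # True if predefined rather than declared in XML
    params: Dict[str, str] = field(default_factory=dict)


class ProcessorRegistry:
    """Hypothetical registry of named processors, keyed by name."""

    def __init__(self) -> None:
        self._by_name: Dict[str, NamedProcessor] = {}

    def register(self, proc: NamedProcessor) -> None:
        # Names are the addressable unit, so they must be unique.
        if proc.name in self._by_name:
            raise ValueError(f"duplicate processor name: {proc.name}")
        self._by_name[proc.name] = proc

    def describe(self) -> List[dict]:
        """List every known processor, so nothing stays hidden from users."""
        return [
            {
                "name": p.name,
                "class": p.factory_class,
                "source": "implicit" if p.implicit else "solrconfig.xml",
                "params": dict(p.params),
            }
            for p in sorted(self._by_name.values(), key=lambda p: p.name)
        ]
```

Exposing the output of something like {{describe()}} through an admin endpoint would address the concern that symbolic names are otherwise hard to trace back to a class and its configuration.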
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259429#comment-14259429 ]

Noble Paul commented on SOLR-6892:
----------------------------------

If there are components that need little or no configuration, they can be made implicitly available under a well-known name. Other components, which require configuration, will have to be configured in XML.

But your explanation is correct. We are changing the atomic unit from a chain to a URP.
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259430#comment-14259430 ]

Yonik Seeley commented on SOLR-6892:
------------------------------------

We shouldn't let too much implementation leak into the interface. DistribUpdateProcessor, etc., are much more implementation than interface.

For example, should one need to know that DistribUpdateProcessor is needed for atomic updates? What if it's split into two processors in the future? Likewise for schemaless: it's currently implemented as a whole bunch of processors, but I could see it moving to a single processor in the future. It's implementation. People should not be specifying this stuff on requests.

bq. For example, there should be an easy way to query all the pre-configured components.

Perhaps that's all this feature should be... a way to add additional named processors to the chain. That should be relatively safe.
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259437#comment-14259437 ]

Noble Paul commented on SOLR-6892:
----------------------------------

Yes, Yonik. The default URP chain must be immutable. This is about adding URPs before that chain.
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259449#comment-14259449 ]

Alexandre Rafalovitch commented on SOLR-6892:
---------------------------------------------

bq. The default urp chain must be immutable

Careful with that one. There are sometimes valid reasons for putting a URP *after* DistributedUpdateProcessor. I believe it is usually connected with accessing stored content during an atomic update. We don't want to completely lose that flexibility.

Also, debugging URPs may need to be the last items in the chain too.
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259519#comment-14259519 ]

Yonik Seeley commented on SOLR-6892:
------------------------------------

bq. This is about adding URP s before that chain .

Dude, I'm not psychic ;-)
I didn't see that anywhere in this issue before now.
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259566#comment-14259566 ]

Jack Krupansky commented on SOLR-6892:
--------------------------------------

Issue type should be Improvement, not Bug, right?
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259567#comment-14259567 ]

Jack Krupansky commented on SOLR-6892:
--------------------------------------

It might be instructive to look at how the search handler deals with search components, and possibly consider rationalizing the two handlers so that there is a little more commonality in how lists of components/processors are specified.

For example, consider a first, last, and full processor list. IOW, be able to specify a list of processors to apply before the solrconfig-specified list, after it, or to completely replace the solrconfig-specified list of processors.
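A minimal sketch of the first/last/full idea from this comment is shown below. The function name {{compose_processors}} and its parameters are hypothetical, and the rule that a full list takes precedence over first and last lists is an assumption made for the example rather than something stated in the thread.

```python
from typing import List, Optional, Sequence


def compose_processors(
    configured: Sequence[str],
    first: Optional[Sequence[str]] = None,
    last: Optional[Sequence[str]] = None,
    full: Optional[Sequence[str]] = None,
) -> List[str]:
    """Combine request-supplied lists with the solrconfig-specified list.

    Hypothetical helper illustrating the three modes:
      first -> applied before the configured list
      last  -> applied after the configured list
      full  -> replaces the configured list completely
    """
    if full is not None:
        # Assumption: a full list overrides everything else.
        return list(full)
    return list(first or []) + list(configured) + list(last or [])
```

For example, {{compose_processors(["log", "run"], first=["trim"], last=["debug"])}} yields trim, log, run, debug, while passing {{full=["custom"]}} yields custom alone.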
[jira] [Commented] (SOLR-6892) Make update processors toplevel components
[ https://issues.apache.org/jira/browse/SOLR-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259573#comment-14259573 ]

Noble Paul commented on SOLR-6892:
----------------------------------

Thanks everyone. Currently the ticket is short on details. I hope to update this with finer details soon.