[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616923#action_12616923
 ] 

Yonik Seeley commented on SOLR-269:
---

While I like the syntax of the config (getting rid of explicit chained update 
processor), I'm not sure about the internal changes:
-  I think that removing the factories does not simplify things... most 
processors that do interesting things will need to parse some request arguments 
and keep some state. So they will end up with a separate object that is looked 
up in the Context (and created if it's not there and stuffed into the Context). 
 Same number of classes, but maybe even a little more complex.
- We lose power by removing the explicit calling of next by components.  I 
actually have a component that needs to buffer up some documents and pass them 
down the chain in batches later.  I think Ryan might have something like this 
too.


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-simple.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-25 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616927#action_12616927
 ] 

Ryan McKinley commented on SOLR-269:


I also like the simplified syntax, and I think the parent should always be a 
'chain' -- this can get rid of some of the ugliness.

But the power of the chain model is that each link can take over control 
without the others needing to know.  For example, I have a processor that 
validates everything in the request before passing it on to next processors.  
To do this, it reads them all in without passing them down the chain and only 
continues when finish() is called.

I also don't see a problem with the factory model.  creating a factory is no 
more/less difficult then creating a special 'state' object that gets put into 
the context.  But the the context option, the state is always a Map call away 
rather them being right there.  Now you have to worry about what key you used 
etc...

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-simple.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-25 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616932#action_12616932
 ] 

Ryan McKinley commented on SOLR-269:


bq. A request scope will create the chain or individual processor for each 
request so that you may maintain state without using request's context. 
Otherwise, it will be created once and re-used for all requests. Will that 
solve this problem?

To me, that makes it more confusing then having each processor call next() 
explicitly...  

bq. In Noble's patch, instead of calling super.processXXX method, you can 
return true/false to signal or avoid chaining.

but then how would a processor be able to continue the chain?  Consider the 
buffering example... how would I be able to call all buffered functions on 
finish()?  What if I want a processor to make sure only one document is sent at 
a time?


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-simple.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-25 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616949#action_12616949
 ] 

Shalin Shekhar Mangar commented on SOLR-269:


bq. This gets overly complex too... do we add a special init() function? would 
everything need a factory, but it may or may not be used?
No, why would we need special methods or a factory? Just the init/inform will 
be fine. Just that they would be called once in their scope. Am I missing 
something?

I don't really care about sharing objects across requests. My motivation is 
only to help make the API simpler.

bq. Consider the buffering example... how would I be able to call all buffered 
functions on finish()? What if I want a processor to make sure only one 
document is sent at a time?
I see your point here. The next UpdateProcessor or a Servlet FilterChain like 
design will be necessary in that case.

Let me think more on this since I've obviously under-estimated the use-cases 
for this API. I always thought that one should do heavy-duty processing like 
authentication etc. on the client side before sending documents to Solr or else 
one should extend/write an UpdateHandler.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-simple.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-25 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616966#action_12616966
 ] 

Noble Paul commented on SOLR-269:
-

The idea is to make the API simple. If a Processor wishes to create a
state object , it is easier to do it without a factory than with a
factory. The user has to care about very few interfaces. I can draw
parallels with Servlet Filter. Users write very complex  filters and I
have never seen people complaining about it not having a factory .
SolrDispatchFilter is a very good example. If it is simple enough
people will use it. If it is complex only the 'very smart people' use
it. Most of the users are not power users and they just want to get
things done.

On Fri, Jul 25, 2008 at 10:27 PM, Shalin Shekhar Mangar (JIRA)



-- 
--Noble Paul


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-simple.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-24 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616457#action_12616457
 ] 

Shalin Shekhar Mangar commented on SOLR-269:


I've read through most of the discussion here and the wiki page at 
http://wiki.apache.org/solr/UpdateRequestProcessor but I couldn't understand 
the reasons behind the current design.

Looking at the configuration we have:
{code:xml}
updateRequestProcessor   
   factory name=standard class=solr.ChainedUpdateProcessorFactory 
default=true 
 chain class=org.apache.solr.ConditionalCopyProcessorFactory /  
  
 chain class=solr.RunUpdateProcessorFactory /
 chain class=solr.LogUpdateProcessorFactory /   
   /factory   
 /updateRequestProcessor
{code}
Why can't it be written as:
{code:xml}
updateRequestProcessor name=standard default=true
  processor class=com.MyUpdateProcessor /
  processor class=solr.RunUpdateProcessor /
/updateRequestProcessor

!-- Another one --
updateRequestProcessor name=alternate
  processor class=org.apache.solr.ConditionalCopyProcessor /
  processor class=solr.RunUpdateProcessor /
  processor class=solr.LogUpdateProcessor /
/updateRequestProcessor
{code}
Why do we need factories here? It seems like there is no advantage being added 
by multiple factories. If the only advantage is with the factory being able to 
choose between instantiating on each request or using an already instantiated 
processor then one can argue on having factories for RequestHandlers or 
SearchComponents too. The Processors should be created once and re-used. Most 
of them are stateless and the others can use the init method and store state in 
instance variables. The same is done with RequestHandlers and SearchComponents 
at present.

Why should we have a explicit ChainedUpdateRequestProcessorFactory? Seems from 
the use-cases that processors will always be chained. Let us have the 
implementation do the chaining instead of asking users to add a factory in the 
configuration.

Not trying to be critical but seems like this is too complex for the use-cases 
it needs to support.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-24 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616483#action_12616483
 ] 

Ryan McKinley commented on SOLR-269:


bq.  Not trying to be critical but seems like this is too complex for the 
use-cases it needs to support.
Nonsense -- the more review / feedback / critique we get, the better -- 
especially *before* a release :)

Why do we need factories here?  --  the model came from how things work with 
Token/Filter factories.  Many processors need to maintain state within a 
request.  Check the 'log' processor.  I have one that checks if the user has 
permission on *everything* in the request before executing the commands.  We 
could have something that keeps track of what it did and backs out the changes 
if there is an error.  If each processor were shared across all requests, any 
state access would need to be synchronized and have some MapRequest,State 
that seems to get ugly pretty fast.

Why ChainedUpdateRequestProcessorFactory?  I see your point here.  I think we 
can force everything to be 'chained' -- The original implementation was not 
chained, but then the functional parts got split into their own components and 
chained together.  Removing the parent chained factory could simplify the whole 
thing.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-24 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616511#action_12616511
 ] 

Noble Paul commented on SOLR-269:
-

bq.Many processors need to maintain state within a request. Check the 'log' 
processor. I have one that checks if the user has permission on everything in 
the request before executing the commands. 
I do not think we need a factory where we need to maintain local state . 
Everything can be maintained in the method stack 

example
{code}
class LocalState{
class LogUpdateProcessor extends UpdateRequestProcessor {
  private final SolrQueryRequest req;
  private final SolrQueryResponse rsp;
  private final UpdateRequestProcessor next;
  private final NamedListObject toLog;
doSomething(){
//do your thing
}
}

 public class LogUpdateProcessor extends UpdateRequestProcessor
   
  
  
  @Override
  public void processAdd(AddUpdateCommand cmd) throws IOException {
 LocalState state = new LocalState ();//pass the params
 state.doSomeThing()
}

}

{code}


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-24 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616520#action_12616520
 ] 

Shalin Shekhar Mangar commented on SOLR-269:


bq. If each processor were shared across all requests, any state access would 
need to be synchronized and have some MapRequest,State that seems to get ugly 
pretty fast.
But we do have SolrQueryRequest#getContext to handle those cases, don't we? 
IMHO, we should not force users to write a factory class for each processor 
when the benefit is minimal and easy workarounds exist. Please correct me if 
I'm misunderstanding something.

bq. Nonsense - the more review / feedback / critique we get, the better - 
especially before a release :)
Glad to hear that, though I realize that I'm a year late and that we are very 
close to a release :)

It's just that I set out to use this API and had to jump around for quite a 
while to figure out how to use it and how it works. I was quite surprised to 
find the actual chaining happening in a class which is named 
NoOpUpdateProcessor -- though it made sense to me later. Also, it took me a 
while to find the wiki page for this feature because it is not linked off the 
main page (or the update xml/csv pages). I could find it because I knew that a 
class named UpdateRequestProcessor existed. We should link it off the main page 
so that it can be found more easily.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-24 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616544#action_12616544
 ] 

Ryan McKinley commented on SOLR-269:


I'm all for simplifying the API.  If you guys want to take a crack at it, I'll 
review it ASAP.  

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2008-07-24 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616649#action_12616649
 ] 

Yonik Seeley commented on SOLR-269:
---

bq. But we do have SolrQueryRequest#getContext to handle those cases, don't we? 
IMHO, we should not force users to write a factory class for each processor 
when the benefit is minimal and easy workarounds exist.

Right... the alternative to a per-request instance would be to use the request 
context.
In general, I think that would be more complex for a user though (if it's 
something they want to do per request-batch).
I think that can be made more efficient for bulk loading by using factories 
too... context lookups and decisions don't have to be made  for every document.


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-13 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512502
 ] 

Yonik Seeley commented on SOLR-269:
---

 How do you all feel about the basic structure?
It's a go!
It will get more complicated, I think, with document modification (SOLR-139)

 While it would be nice to keep the base stuff package protected,

I'm more concerned with the other parts of the API that this moves 
front-and-center... 
mainly UpdateCommand and friends... those were really quick hacks on my part 
since there were no custom update handlers at the time.

 One clever change is to have the LogUpdateProcessorFactory skip building a 
 LogUpdateProcessor if the log level is not INFO rather then keep a flag.

Nice!

I also need SOLR-139 btw, is it easy for you to commit this first to limit the 
size and scope of that patch?

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-12 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512356
 ] 

Ryan McKinley commented on SOLR-269:


 the RequestHandler has final say over what Processor gets used

absolutely!  The question is just what do in the default /update case.  I'm 
inclined to have the request say what processor to use.  With 'invariants' that 
can be fixed to a single implementation, and will let people configure 
processors without a custom handler.

How do you all feel about the basic structure?  I like the structure, but am 
not sure how 'public' to make the configuration and implementation.  While it 
would be nice to keep the base stuff package protected, then we can't have 
external configuration and external classes could not reuse the other bits of 
the chain (defeating the 'chain' advantages)

I have a pending deadline that depends on input processing and SOLR-139 
modifiable documents -- it would be great to work from a lightly patched trunk 
rather then a heavily patched one ;)

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-09 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511178
 ] 

Yonik Seeley commented on SOLR-269:
---

One issue: the current way of having a custom processor 
(CustomUpdateRequestHandler) seems less than ideal.
First is that CustomUpdateRequestHandler extends XMLUpdateRequestHandler 
but what if I want one for CSV, etc.
If update processors are to be a first-class part of Solr, it seems like one 
should be able to specify the processor to use
 for any update handler (CSV, XML, etc) without having to write extra classes 
for those.

Perhaps something like:
 requestHandler name=/customupdate class=solr.XmlUpdateRequestHandler 
   str name=update.processorstandard/str
 /requestHandler


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-09 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511195
 ] 

Ryan McKinley commented on SOLR-269:


Hymm.  the behavior on trunk is:
requestHandler name=/customupdate class=solr.XmlUpdateRequestHandler 
   str name=update.processor.factoryclass name for factory/str
 /requestHandler 
The latest patch has the argument lookup an XML configured factory.

Do you mean:

 requestHandler name=/customupdate class=solr.XmlUpdateRequestHandler 
lst name=invariants
  str name=update.processorstandard/str
/lst
 /requestHandler 

Given the direction we are heading, it seems nice to be able to change the 
update behavior from:
 /update?update.processor=do-fancy-document-cleanup
 /update?update.processor=go-quick-i-know-the-docs-are-clean

I made it a 1-1 relation (processor-handler) to avoid a hash lookup for each 
request, but from a pram would be ok.



 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-09 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511198
 ] 

Yonik Seeley commented on SOLR-269:
---

 I made it a 1-1 relation (processor-handler) to avoid a hash lookup for each 
 request,

That was my thinking too... I wasn't suggesting making it  an overrideable 
parameter, but I'm not really against it either.




 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-09 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511206
 ] 

Hoss Man commented on SOLR-269:
---

the idea of letting people override the processor on a per request basis seems 
very scary and depending on what kinds of stuff yo uwere expecting hte 
processor to do, could introduce some serous bugs ... but then again, if it's a 
param, it can be specified as an invariant if you want to ensure that doesn't 
happen.

i guess hte key thing is just that the RequestHandler has final say over what 
Processor gets used ... we can provide handy tools/conventions to get that info 
from the config or the request, but a very simplistic RequestHandler should be 
able to hardcode it for absolute control.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510749
 ] 

Yonik Seeley commented on SOLR-269:
---

 I'm not sure we need to have XML configuration for this

If we have those multiple update processor factories, I agree we don't need XML 
config for the transformers.

 I need a custom UpdateRequestProcessor that checks all the requests before 
 executing any of them. I plan to store the valid commands in a list and only 
 execute them in the finish() call. I'm not sure how to map that plan to an 
 chain. How would I pass the output from one processor to the next?

I had thought of that use-case too (bulk operations), which is why I added 
explicit flow contol (explicit calling of next.handleAdd() in the processor). 
You can buffer up all the requests (you want to clone the UpdateCommands as 
they might be reused though) and not call next.
Then in finish, you can delegate all of the buffered commands.


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510759
 ] 

Ryan McKinley commented on SOLR-269:


The only one I'm not sure about is:
- explicit flow control between processors for greatest flexibility 

I'm still trying to avoid the parent UpdateRequestProcessorFactory chain as a 
default behavior.  It seems fine as a super-duper custom controlller, but 
unurly in the default/slightly custom case.

Folding in:
- removal of NamedList return (as you say, chaining those makes less sense 
anyway)
- already extracted and optimized the complex (or rather bigger) logging logic 
from the simple index updating
- passed in SolrQueryResponse as well, enabling a processor to change the 
response 
is no problem.

If you like the general structure / flow of 
SOLR-269-UpdateRequestProcessorFactory.patch, I'll clean it up and work in this 
stuff.  Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel 
more palatable.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510771
 ] 

Yonik Seeley commented on SOLR-269:
---

 The only one I'm not sure about is:
 - explicit flow control between processors for greatest flexibility

It's a single call per hook:
  if (next != null) next.processAdd();

And it's exactly what you need for your buffering situation.
Chaining is the model that Lucene uses for it's analyzers too (only difference 
is that it's a pull instead of a push).

 I'm still trying to avoid the parent UpdateRequestProcessorFactory chain as a 
 default behavior. It seems fine as a super-duper custom controlller, but 
 unurly in the default/slightly custom case. 

I'm not clear on why... the configuration is more complex?

 If you like the general structure / flow of 
 SOLR-269-UpdateRequestProcessorFactory.patch

I'm not sure about the named processors... are they needed?
It seems like we need a standard one that is used by default everywhere,
and then *maybe* we need to be able to change them per-handler.  Do we need 
this up front, or could it be deferred?

It seems like there does need to be a method on SolrCore to get a 
RequestProcessor or Factory, since that becomes
the new interface to do an index change (otherwise you miss the doc 
transformations, etc).

 Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel more 
 palatable.

That could be wrapped in another UpdateRequestProcessorFactory if desired... it 
doesn't matter much if the impl is hidden by a class or a method IMO.


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510774
 ] 

Ryan McKinley commented on SOLR-269:



 It's a single call per hook:
   if (next != null) next.processAdd();
 

Ok.  I'm convinced.


 
 I'm not sure about the named processors... are they needed?
 It seems like we need a standard one that is used by default everywhere,
 and then *maybe* we need to be able to change them per-handler.  Do we need 
 this up front, or could it be deferred?

I'm not sure.  The only reason I think we *may* want to do it now is to keep 
the initialization standard and in a single place.  If we declare a default 
processor and have each handler optionally initialize their own, the config may 
look different.  RequestHandlers only have access to a NamedList while 
initialized, they can't (without serious changes) declare something like:
 requestHandler ...
   updateProcessor class= /
 /requestHandler

With that in mind, I think it best to build the updateProcessors using the 
standard PluginLoader framework and then have RequestHandlers access them by 
name.


 
 Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel more 
 palatable.
 
 That could be wrapped in another UpdateRequestProcessorFactory if desired... 
 it doesn't matter much if the impl is hidden by a class or a method IMO.

Ok, I'll start with UpdateProcessor.patch and fold in my changes.


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510425
 ] 

Yonik Seeley commented on SOLR-269:
---

FYI, I'm working up a prototype right now to handle multiple request processors.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: [jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-05 Thread Ryan McKinley


FYI, I'm working up a prototype right now to handle multiple request processors.



excellent.

check: https://issues.apache.org/jira/browse/SOLR-139

to see how I was thinking about supporting multiple processors. 
Essentially a single parent processor that may loop through 
'transformers' or whatever in the custom case.


(not that I think my design is *the* answer)

ryan




[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-05 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510484
 ] 

Ryan McKinley commented on SOLR-269:


I like the AddUpdateCommand changes

What do you see as the common use case for wanting to chain request processors? 
 Is the LogUpdateRequestProcessor just an example?  

The one compelling chained use case I can think of is for document 
transformation.  In SOLR-139, I toyed with SolrInputDocumentTransformer.  The 
default case does nothing, and a subclass may use something like:
  for( SolrInputDocumentTransformer t : transformers ) {
doc = t.transform( doc );
  }


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510487
 ] 

Yonik Seeley commented on SOLR-269:
---

 What do you see as the common use case for wanting to chain request 
 processors?
 - conditional copyField, field transformations (between multiple fields too... 
something Analyzer can't do), loading certain fields from a database if missing,
   updating a related document, etc.

 Is the LogUpdateRequestProcessor just an example?

IMO, it's a default since no logging is done by the 
ChangeUpdateRequestProcessor (anyone think of a better name for that?).
Then in a Benchmarking section of the Solr Wiki, we could advise to remove 
logging altogether.  Or you could remove the ChangeUpdateRequestProcessor  to 
skip index
changes to better benchmark hotspots in the parsing + doc creation phase, etc.

 The one compelling chained use case I can think of is for document 
 transformation

Ah, I briefly looked at SOLR-139 when you mentioned it before, but missed the 
transformer stuff.
In a way multiple update processors are more generic and wide open... you could 
actually insert two documents into the index for each doc added,
you could do transforms on the actual Lucene document (add Field options that 
Solr doesn't currently support, etc.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510504
 ] 

Yonik Seeley commented on SOLR-269:
---

Some other issues how to configure processors for multiple update handlers? 
Perhaps allow configuration of a global default for update handlers with no 
processors specified?  That would make it easy to make sure your custom 
processor was used everywhere.
We should probably have a base class for update handlers to implement 
initialization logic.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-05 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510506
 ] 

Ryan McKinley commented on SOLR-269:


  - conditional copyField, field transformations (between multiple fields 
 too... something Analyzer can't do), loading certain fields from a database 
 if missing,
updating a related document, etc.
 

This all works nicely with a simple 'transform' chain.


 Is the LogUpdateRequestProcessor just an example?
 
 IMO, it's a default since no logging is done by the 
 ChangeUpdateRequestProcessor (anyone think of a better name for that?).
 Then in a Benchmarking section of the Solr Wiki, we could advise to remove 
 logging altogether.  Or you could remove the ChangeUpdateRequestProcessor  to 
 skip index
 changes to better benchmark hotspots in the parsing + doc creation phase, etc.
 

Isn't logging best configured with standard java.util.logging settings? If 
necessary, the base processor could check if the logging level is high enough 
to keep track of somethings.

For benchmarking, don't we just want a single noop processor?


 The one compelling chained use case I can think of is for document 
 transformation
 
 Ah, I briefly looked at SOLR-139 when you mentioned it before, but missed the 
 transformer stuff.
 In a way multiple update processors are more generic and wide open... you 
 could actually insert two documents into the index for each doc added,
 you could do transforms on the actual Lucene document (add Field options that 
 Solr doesn't currently support, etc.
 

I see what you are getting at, but makes the basic cases more complicated then 
it needs to be.  I have been considering UpdateRequestProcessor as an 
'advanced' option where changing their behavior is writing custom code -- not 
text configuration.

In the advanced case where you want to build multiple documents or munge the 
actual Lucene document existing it may be more difficult to live in a chain 
rather then have explicit control.  If 

I think the cleanest design would be a single entry point and keeping the real 
functionality in easily subclassed functions or utility classes.  The latest 
SOLR-139 tries that (but it could still use some cleanup)

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-05 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510510
 ] 

Ryan McKinley commented on SOLR-269:



 Some other issues how to configure processors for multiple update 
 handlers? Perhaps allow configuration of a global default for update handlers 
 with no processors specified?  That would make it easy to make sure your 
 custom processor was used everywhere.

SolrCore could have a single UpdateRequestProcessorFactory that handlers could 
use as the default.  I'm reluctant to add another plugin layer, but this would 
make it easier to share with the CSV update handler and others.

Since its a factory, it will be thread safe across multiple handlers.

Again, I'm reluctant to think about configuring a processor chain in 
solrconfig.xml -- we should make the most sensible/extendible default 
implementation, but IMO tweeking RequestProcessor functionality should be done 
with custom code.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-05 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510563
 ] 

Ryan McKinley commented on SOLR-269:


 perhaps the extra functionality of transformations and updating should be 
 pushed into the UpdateHandler interface

That was the first SOLR-139 design!

Having thought about it for a while, i think there are nice advantages to 
keeping the updating/modifying outside of the UpdateHandler - the biggest one 
is that various RequestHandlers *could* transform the document differently.

I'm putting together a hybrid example that (I hope) answers questions about 
chains/configuration/transformation, etc.  I'll post it shortly.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-03 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510004
 ] 

Hoss Man commented on SOLR-269:
---

 Different Q on usage: is this where my document mutator stuff should go??? If 
 I want a transformation done on a field, regardless of 
 where the data is coming from (XML update handler, CSV update handler, future 
 REST update handler, etc), how should that be done? 
 Is there a single place I can register a plugin to do this, and is 
 UpdateRequestProcessor where you see it happening?

i believe that was acutally the initial intent of UpdateRequestProcesso, note 
the javadocs...

 * This is a good place for subclassed update handlers to process the document 
before it is 
 * indexed.  You may wish to add/remove fields or check if the requested user 
is allowed to 
 * update the given document...
 * 
 * Perhaps you continue adding an error message (without indexing the 
document)...
 * perhaps you throw an error and halt indexing (remove anything already 
indexed??)


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-03 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510009
 ] 

Ryan McKinley commented on SOLR-269:



 getCore(), getSchema(), getUpdateHandler() all just return instance 
 variables.  
 

ok, getCore() isn't a good canidate; It is annoying to start every function 
with a train wreck: req.getCore().getUpdateHandler()

Having a single class per request makes sense for a subclass I am working with 
-- it does some expensive initialization and stores the results.  I could put 
this in req.getContext()


 instantiating and initializing all those request processors will get 
 expensive.

Really?  the default initialize is trivial - stuff that would happen at the 
beginning of every function anyway.  I suppose GC could be an issue


 I do see your usecase though, in the case of multiple docs per add and you 
 have some expensive state you only want to calculate once.

In r552986, I changed the logging to match solr 1.2 -- this required 
accumulating the id's and spitting them out at the end.  In 1.2 with processing 
and parsing entwined, this was just a giant loop.  To get the same behavior we 
need to stash it somewhere...


 Different Q on usage: is this where my document mutator stuff should go??? 

Yes.  The intent is to have a simple place between document parsing and 
indexing where you can do whatever you need to do.  Any parsing strategy 
(XML,JSON,etc) could share the same processor.

Looking at SOLR-139, I now think the most flexible/useful way to support 
modifiable documents is to build utility functions for the UpdateProcessor that 
can manipulate SolrInputDocuments. 

- - -

I will take another crack at SOLR-139 implemented in the UpdateProcessor, then 
we should return to the question of singleton vs factory - trying to work with 
a more complex processor may make this choice more obvious.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-02 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509725
 ] 

Yonik Seeley commented on SOLR-269:
---

Looking at UpdateRequestProcessor further, it seems like these should be 
singletons (instance per entry in solrconfig, no factory needed), and any extra 
state that is needed
should be added to classes we already have (like AddCommand, etc), no?


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-02 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509732
 ] 

Yonik Seeley commented on SOLR-269:
---

I think the newly added incremental time should not be on by default, as well 
as logging per id for deletes and adds.
Mike added the id aggregation code specifically because logging each add was 
taking so much time.

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-02 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509733
 ] 

Ryan McKinley commented on SOLR-269:


maybe.  I'm not sure I totally understand your suggestion though.  

I need something that is easily subclassed and can cleanly holds state across 
an entire request cycle.  The alternative is to pass the 
SolrQueryRequest/Response into each action and maybe pull out the 
schema/updateHandler/logged in user/etc for each command (each document in the 
list of 100)

Is the factory a performance concern?  (to my tastes) it seems nicer to work 
with:
 
 processDelete( DeleteUpdateCommand cmd )
 {
   if( user.isAdmin() ) {
 updateHandler.delete( cmd );
   }  
   else {
...
   }
 }
 
 than:
 
 processDelete( DeleteUpdateCommand cmd, SolrQueryRequest req, 
SolrQueryResponse rsp )
 {
   User user = req.getContext().get( user );
   if( user.isAdmin() ) {
SolrCore core = req.getCore();
SolrSchema schema = core.getSchema();
UpdateHandler updateHandler = core.getUpdateHandler();
updateHandler.delete( cmd );
   }  
   else {
 ...
   }
 }
 
I'm fine either way, like the easy 1 per-request interface.  

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-02 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509734
 ] 

Ryan McKinley commented on SOLR-269:


 I think the newly added incremental time should not be on by default, as well 
 as logging per id for deletes and adds.
 Mike added the id aggregation code specifically because logging each add was 
 taking so much time.

sounds good.  the testing I did showed that lots of time is spent in the 
logging phase.

I will remove it from the default implementation.


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-02 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509737
 ] 

Yonik Seeley commented on SOLR-269:
---

 I need something that is easily subclassed and can cleanly holds state across 
 an entire request cycle.

Having a factory and separate object so that one can use core instead of 
req.getCore(), etc, seems like overkill for the normal case though since
getCore(), getSchema(), getUpdateHandler() all just return instance variables.  
I was thinking any state like that could be on the UpdateCommand.

I'd like to have potentially several request processors, but if people start 
doing single doc add requests, instantiating and initializing all those request 
processors will get expensive.

I do see your usecase though, in the case of multiple docs per add and you have 
some expensive state you only want to calculate once.
If it's a relatively rare case, one could put it in the request context.
The tradeoff would be an extra hash lookup per-document of a multi-document add 
vs an extra object creation for single-doc adds.

Different Q on usage: is this where my document mutator stuff should go???  If 
I want a transformation done on a field, regardless of where the data is coming 
from (XML update handler, CSV update handler, future REST update handler, etc), 
how should that be done?  Is there a single place I can register a plugin to do 
this, and is UpdateRequestProcessor where you see it happening?

 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X updated document Y?)
  * allow update, but not delete (by query?)
requestHandler name=/update class=solr.StaxUpdateRequestHandler 
  str 
 name=update.processor.classorg.apache.solr.handler.UpdateRequestProcessor/str
  lst name=update.processor.args
   ... (optionally pass in arguments to the factory init method) ...
  /lst 
/requestHandler
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.