Re: LICENSE.txt nitpick?

2007-07-06 Thread Chris Hostetter

: Is line 189 of LICENSE.txt supposed to say something other than:
:
: Copyright [] [name of copyright owner]

I don't think so. That's the section of the Apache license that explains
how to license your work; it's the boilerplate example of what needs to be
included in each file.



-Hoss



[jira] Commented: (SOLR-215) Multiple Solr Cores

2007-07-06 Thread Walter Ferrara (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510723
 ] 

Walter Ferrara commented on SOLR-215:
-

With the patch applied (assuming I'm using it correctly), Solr is no longer
able to load my handlers, which reside in a jar under the solr/lib
dir. The exception I get is (handler class name censored):

GRAVE: org.apache.solr.common.SolrException: Error loading class 
'com.**.**'
at org.apache.solr.core.Config.findClass(Config.java:295)
[..]
Caused by: java.lang.ClassNotFoundException: com.**.**
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
[..]
(full stack trace available if needed)

The problem arises in both patched trunks I've tested (550264 with the previous
patch, and 552910 with the latest patch). I compiled them using NetBeans 5.5
and Java 1.6 on Windows.
To resolve the issue, I modified Config.java slightly. Now it works fine and
loads all the jars, but the full implications of the change have yet to be
determined.

Here is the modification I made to the patched (org.apache.solr.core) Config.java
(working Config.java versus original solr-215 Config_solr215.java):

*** Config.java
--- Config_origSolr215.java
***************
*** 393,399 ****
SolrException.log(log,"Can't construct solr lib class loader", e);
  }
}
!   if (null == classLoader) classLoader = loader;
  }
  return classLoader;
}
--- 393,399 ----
SolrException.log(log,"Can't construct solr lib class loader", e);
  }
}
!   classLoader = loader;
  }
  return classLoader;
}
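
For readers who do not have the patched Config.java at hand, the general pattern at stake (build a class loader over the jars in a lib directory once, and fall back to the parent loader if that fails) can be sketched as follows. This is only an illustration under assumptions: the class name LibClassLoaders and its get method are hypothetical and are not taken from the SOLR-215 patch.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; NOT the Config.java from the SOLR-215 patch.
final class LibClassLoaders {
    // Shared loader, created on first use and then reused.
    private static ClassLoader classLoader;

    private LibClassLoaders() {}

    /** Returns a loader that sees every jar in libDir, or the parent loader as a fallback. */
    static synchronized ClassLoader get(File libDir) {
        if (classLoader == null) {
            ClassLoader parent = LibClassLoaders.class.getClassLoader();
            ClassLoader loader = parent;
            // listFiles returns null when libDir does not exist or is not a directory.
            File[] jars = libDir.listFiles((dir, name) -> name.endsWith(".jar"));
            if (jars != null && jars.length > 0) {
                try {
                    List<URL> urls = new ArrayList<>();
                    for (File jar : jars) {
                        urls.add(jar.toURI().toURL());
                    }
                    loader = new URLClassLoader(urls.toArray(new URL[0]), parent);
                } catch (MalformedURLException e) {
                    // Could not build the lib loader; keep using the parent.
                    loader = parent;
                }
            }
            // Assign only while the shared field is still unset.
            classLoader = loader;
        }
        return classLoader;
    }
}
```

The point of the sketch is that the shared field is assigned exactly once and is never left null, whichever branch is taken.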


 Multiple Solr Cores
 ---

 Key: SOLR-215
 URL: https://issues.apache.org/jira/browse/SOLR-215
 Project: Solr
  Issue Type: Improvement
Reporter: Henri Biestro
Priority: Minor
 Attachments: solr-215.patch, solr-215.patch, solr-215.patch, 
 solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, 
 solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, 
 solr-trunk-src.patch


 WHAT:
 As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
 This patch is intended to allow multiple cores in Solr, which also brings
 multiple-index capability.
 WHY:
 The current Solr practical wisdom is that one schema - thus one index - is
 most likely to accommodate your indexing needs, using a filter to segregate
 documents if needed. If you really need multiple indexes, deploy multiple web
 applications.
 There are some use cases, however, where having multiple indexes or multiple
 cores through Solr itself may make sense.
 Multiple cores:
 Deployment issues within some organizations where IT will resist deploying
 multiple web applications.
 Seamless schema update where you can create a new core and switch to it
 without starting/stopping servers.
 Embedding Solr in your own application (instead of 'raw' Lucene) when you
 functionally need to segregate schemas and collections.
 Multiple indexes:
 Multiple language collections where each document exists in different
 languages, analysis being language dependent.
 Having document types that have nothing (or very little) in common with
 respect to their schema, their lifetime/update frequencies or even collection
 sizes.
 HOW:
 The best analogy is to consider that instead of deploying multiple
 web applications, you can have one web application that hosts more than one
 Solr core. The patch does not change any of the core logic (nor the core
 code); each core is configured and behaves exactly as the one core in 1.2; the
 various caches are per-core and so is the info-bean-registry.
 What the patch does is replace the SolrCore singleton with a collection of
 cores; all the code modifications are driven by the removal of the different
 singletons (the config, the schema and the core).
 Each core is 'named', and a static map (keyed by name) makes them easy to
 manage.
 You declare one servlet filter mapping per core you want to expose in the
 web.xml; this allows easy access to each core through a different URL.
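
 The "static map keyed by name" idea described above can be sketched roughly as follows. This is an illustration only: NamedCore and CoreRegistry are hypothetical stand-ins, not classes from the patch, and the real SolrCore carries far more state than a name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a core; only the name matters for this sketch.
final class NamedCore {
    private final String name;

    NamedCore(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical registry replacing a singleton with a static map keyed by core name.
final class CoreRegistry {
    private static final Map<String, NamedCore> CORES = new ConcurrentHashMap<>();

    private CoreRegistry() {}

    /** Registers a core under its name; refuses to overwrite an existing one. */
    static NamedCore register(NamedCore core) {
        NamedCore previous = CORES.putIfAbsent(core.getName(), core);
        if (previous != null) {
            throw new IllegalStateException("core already registered: " + core.getName());
        }
        return core;
    }

    /** Returns the core registered under name, or null if there is none. */
    static NamedCore getCore(String name) {
        return CORES.get(name);
    }

    /** Unregisters and returns the core, or null if it was not registered. */
    static NamedCore remove(String name) {
        return CORES.remove(name);
    }
}
```

 With something like this, a servlet filter mapped to /core0 would only need to call CoreRegistry.getCore("core0") to find the core it serves.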
 USAGE (example web deployment, patch installed):
 Step0:
 java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml
 monitor.xml
 Will index the 2 documents in solr.xml and monitor.xml
 Step1:
 http://localhost:8983/solr/core0/admin/stats.jsp
 Will produce the statistics page from the admin servlet on core0 index; 2 
 documents
 Step2:
 http://localhost:8983/solr/core1/admin/stats.jsp
 Will produce the statistics page from the admin servlet on core1 index; no 
 documents
 Step3:
 java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
 java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
 Adds the ipod*.xml to 

[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510749
 ] 

Yonik Seeley commented on SOLR-269:
---

> I'm not sure we need to have XML configuration for this

If we have those multiple update processor factories, I agree we don't need XML 
config for the transformers.

> I need a custom UpdateRequestProcessor that checks all the requests before
> executing any of them. I plan to store the valid commands in a list and only
> execute them in the finish() call. I'm not sure how to map that plan to a
> chain. How would I pass the output from one processor to the next?

I had thought of that use case too (bulk operations), which is why I added
explicit flow control (explicit calling of next.handleAdd() in the processor).
You can buffer up all the requests (you'll want to clone the UpdateCommands,
though, as they might be reused) and not call next.
Then in finish, you can delegate all of the buffered commands.
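
A rough sketch of that buffering approach, under stated assumptions: the types below (AddCommand, Processor, BufferingProcessor, RecordingProcessor) are simplified, hypothetical stand-ins and not the classes from any of the attached patches, and the method names processAdd/finish are chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified command; a real update command holds a whole document.
final class AddCommand implements Cloneable {
    String docId;

    AddCommand(String docId) {
        this.docId = docId;
    }

    @Override
    public AddCommand clone() {
        try {
            return (AddCommand) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen, the class is Cloneable
        }
    }
}

// Hypothetical chain link: the default behaviour just passes everything on.
abstract class Processor {
    protected final Processor next;

    Processor(Processor next) {
        this.next = next;
    }

    void processAdd(AddCommand cmd) {
        if (next != null) next.processAdd(cmd);
    }

    void finish() {
        if (next != null) next.finish();
    }
}

// Validates and buffers every add, and only delegates them in finish().
final class BufferingProcessor extends Processor {
    private final List<AddCommand> buffered = new ArrayList<>();

    BufferingProcessor(Processor next) {
        super(next);
    }

    @Override
    void processAdd(AddCommand cmd) {
        if (cmd.docId == null || cmd.docId.isEmpty()) {
            throw new IllegalArgumentException("missing document id");
        }
        buffered.add(cmd.clone()); // clone: the caller may reuse the command object
        // Deliberately no call to next here.
    }

    @Override
    void finish() {
        if (next != null) {
            for (AddCommand cmd : buffered) {
                next.processAdd(cmd);
            }
        }
        buffered.clear();
        super.finish();
    }
}

// End of the chain for the example: records what it was asked to add.
final class RecordingProcessor extends Processor {
    final List<String> seen = new ArrayList<>();

    RecordingProcessor() {
        super(null);
    }

    @Override
    void processAdd(AddCommand cmd) {
        seen.add(cmd.docId);
    }
}
```

If any command fails validation, the exception is thrown before finish() is reached, so nothing has been passed down the chain at that point.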


 UpdateRequestProcessorFactory - process requests before submitting them
 ---

 Key: SOLR-269
 URL: https://issues.apache.org/jira/browse/SOLR-269
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 1.3

 Attachments: SOLR-269-UpdateRequestProcessorFactory.patch, 
 SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch


 A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit. 
 An UpdateRequestProcessor lets clients plug in logic after a document has 
 been parsed and before it has been 'updated' with the index.  This is a good 
 place to add custom logic for:
  * transforming the document fields
  * fine grained authorization (can user X update document Y?)
  * allow update, but not delete (by query?)
 <requestHandler name="/update" class="solr.StaxUpdateRequestHandler">
   <str name="update.processor.class">org.apache.solr.handler.UpdateRequestProcessor</str>
   <lst name="update.processor.args">
    ... (optionally pass in arguments to the factory init method) ...
   </lst>
 </requestHandler>
 http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510759
 ] 

Ryan McKinley commented on SOLR-269:


The only one I'm not sure about is:
- explicit flow control between processors for greatest flexibility 

I'm still trying to avoid the parent UpdateRequestProcessorFactory chain as a
default behavior.  It seems fine as a super-duper custom controller, but
unruly in the default/slightly custom case.

Folding in the following is no problem:
- removal of NamedList return (as you say, chaining those makes less sense
anyway)
- already extracted and optimized the complex (or rather bigger) logging logic
from the simple index updating
- passed in SolrQueryResponse as well, enabling a processor to change the
response

If you like the general structure / flow of 
SOLR-269-UpdateRequestProcessorFactory.patch, I'll clean it up and work in this 
stuff.  Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel 
more palatable.


[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510771
 ] 

Yonik Seeley commented on SOLR-269:
---

> The only one I'm not sure about is:
> - explicit flow control between processors for greatest flexibility

It's a single call per hook:
  if (next != null) next.processAdd();

And it's exactly what you need for your buffering situation.
Chaining is the model that Lucene uses for its analyzers too (the only
difference is that it's a pull instead of a push).
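
As an illustration of what explicit flow control buys, here is a small push-style chain in which one link transforms the document and another decides whether to pass it on at all. Everything in it is hypothetical (ChainLink, LowercaseTitle, RejectReadOnly, Collector, and the use of a plain Map as the document); it is a sketch of the chaining idea, not code from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical chain link; each link decides itself whether to call the next one.
abstract class ChainLink {
    private final ChainLink next;

    ChainLink(ChainLink next) {
        this.next = next;
    }

    abstract void processAdd(Map<String, String> doc);

    /** The single call per hook: pass the document on if there is a next link. */
    protected void forward(Map<String, String> doc) {
        if (next != null) next.processAdd(doc);
    }
}

// Transforms a field, then passes the document on.
final class LowercaseTitle extends ChainLink {
    LowercaseTitle(ChainLink next) {
        super(next);
    }

    @Override
    void processAdd(Map<String, String> doc) {
        doc.computeIfPresent("title", (key, value) -> value.toLowerCase(Locale.ROOT));
        forward(doc);
    }
}

// Stops the chain for documents flagged read-only by simply not forwarding them.
final class RejectReadOnly extends ChainLink {
    RejectReadOnly(ChainLink next) {
        super(next);
    }

    @Override
    void processAdd(Map<String, String> doc) {
        if (!"true".equals(doc.get("readonly"))) {
            forward(doc);
        }
    }
}

// Last link for the example: stands in for the step that would update the index.
final class Collector extends ChainLink {
    final List<Map<String, String>> added = new ArrayList<>();

    Collector() {
        super(null);
    }

    @Override
    void processAdd(Map<String, String> doc) {
        added.add(doc);
    }
}
```

A chain is then built inside out, for example new LowercaseTitle(new RejectReadOnly(new Collector())).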

> I'm still trying to avoid the parent UpdateRequestProcessorFactory chain as a
> default behavior. It seems fine as a super-duper custom controller, but
> unruly in the default/slightly custom case.

I'm not clear on why... the configuration is more complex?

> If you like the general structure / flow of
> SOLR-269-UpdateRequestProcessorFactory.patch

I'm not sure about the named processors... are they needed?
It seems like we need a standard one that is used by default everywhere,
and then *maybe* we need to be able to change them per-handler.  Do we need 
this up front, or could it be deferred?

It seems like there does need to be a method on SolrCore to get a 
RequestProcessor or Factory, since that becomes
the new interface to do an index change (otherwise you miss the doc 
transformations, etc).

> Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel more
> palatable.

That could be wrapped in another UpdateRequestProcessorFactory if desired... it 
doesn't matter much if the impl is hidden by a class or a method IMO.



[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510774
 ] 

Ryan McKinley commented on SOLR-269:



> It's a single call per hook:
>   if (next != null) next.processAdd();
 

Ok.  I'm convinced.


 
> I'm not sure about the named processors... are they needed?
> It seems like we need a standard one that is used by default everywhere,
> and then *maybe* we need to be able to change them per-handler.  Do we need
> this up front, or could it be deferred?

I'm not sure.  The only reason I think we *may* want to do it now is to keep 
the initialization standard and in a single place.  If we declare a default 
processor and have each handler optionally initialize their own, the config may 
look different.  RequestHandlers only have access to a NamedList during
initialization; they can't (without serious changes) declare something like:
 <requestHandler ...>
   <updateProcessor class="" />
 </requestHandler>

With that in mind, I think it best to build the updateProcessors using the 
standard PluginLoader framework and then have RequestHandlers access them by 
name.
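
To make the "access them by name" idea concrete, here is a minimal sketch. All names in it are hypothetical: StepFactory, FactoryRegistry and SimpleHandler are illustrative stand-ins, the init-arg key "update.processor" and the fallback name "default" are assumptions, and a plain Map stands in for the NamedList a handler receives during initialization.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical factory type; a real one would create processors per request.
interface StepFactory {
    String describe();
}

// Hypothetical registry, filled once from configuration, then read by handlers.
final class FactoryRegistry {
    private final Map<String, StepFactory> byName = new HashMap<>();

    void register(String name, StepFactory factory) {
        byName.put(name, factory);
    }

    /** Looks a factory up by name; a null name selects the one registered as "default". */
    StepFactory lookup(String name) {
        String key = (name == null) ? "default" : name;
        StepFactory factory = byName.get(key);
        if (factory == null) {
            throw new IllegalArgumentException("unknown update processor: " + key);
        }
        return factory;
    }
}

// Hypothetical handler that only sees flat init args, so it refers to a factory by name.
final class SimpleHandler {
    private StepFactory factory;

    void init(Map<String, String> initArgs, FactoryRegistry registry) {
        // "update.processor" is an assumed key, used here for illustration only.
        factory = registry.lookup(initArgs.get("update.processor"));
    }

    StepFactory factory() {
        return factory;
    }
}
```

Handlers that name nothing get the default, and a handler with special needs only has to add one string to its init args.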


 
> Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel more
> palatable.
>
> That could be wrapped in another UpdateRequestProcessorFactory if desired...
> it doesn't matter much if the impl is hidden by a class or a method IMO.

Ok, I'll start with UpdateProcessor.patch and fold in my changes.



[jira] Commented: (SOLR-215) Multiple Solr Cores

2007-07-06 Thread Henri Biestro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510853
 ] 

Henri Biestro commented on SOLR-215:


Thanks Walter.

I've been fighting a bit with this code in the same kind of environment
(NB5.5 / JVM 1.5).
The static classLoader was not assigned correctly and I already had to modify
the original code to work around it.
It looks like JVM 1.6 reintroduces the issue. I don't understand why this
happens - maybe class loading through NB...
The fix you propose seems totally harmless; I'll check it against a 1.5 JVM and
introduce it in the next upload.

Using the patch should be straightforward, apart from handler classes needing a
constructor that takes a SolrCore.
Let me know how it goes.
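
For anyone adapting their own handlers, the constructor requirement amounts to something like the sketch below. It is an illustration under assumptions: DemoCore, DemoHandler and HandlerLoader are hypothetical names, and the patch's actual loading code may well differ.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for the core a handler is bound to.
final class DemoCore {
    final String name;

    DemoCore(String name) {
        this.name = name;
    }
}

// A handler written for the multi-core case: it receives its core at construction.
class DemoHandler {
    final DemoCore core;

    public DemoHandler(DemoCore core) {
        this.core = core;
    }
}

// Hypothetical loader that instantiates a handler class through its core-taking constructor.
final class HandlerLoader {
    private HandlerLoader() {}

    static <T> T newInstance(Class<T> type, DemoCore core) {
        try {
            // Only a public constructor with exactly this parameter type is accepted.
            Constructor<T> ctor = type.getConstructor(DemoCore.class);
            return ctor.newInstance(core);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                type.getName() + " needs a public constructor taking a DemoCore", e);
        }
    }
}
```

A handler class that only has a no-argument constructor fails fast here with a message naming the missing constructor, which is easier to diagnose than a bare reflection error.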

 Multiple Solr Cores
 ---

 Key: SOLR-215
 URL: https://issues.apache.org/jira/browse/SOLR-215
 Project: Solr
  Issue Type: Improvement
Reporter: Henri Biestro
Priority: Minor
 Attachments: solr-215.patch, solr-215.patch, solr-215.patch, 
 solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, 
 solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, 
 solr-trunk-src.patch


 WHAT:
 As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
 This patch is intended to allow multiple cores in Solr, which also brings
 multiple-index capability.
 WHY:
 The current Solr practical wisdom is that one schema - thus one index - is
 most likely to accommodate your indexing needs, using a filter to segregate
 documents if needed. If you really need multiple indexes, deploy multiple web
 applications.
 There are some use cases, however, where having multiple indexes or multiple
 cores through Solr itself may make sense.
 Multiple cores:
 Deployment issues within some organizations where IT will resist deploying
 multiple web applications.
 Seamless schema update where you can create a new core and switch to it
 without starting/stopping servers.
 Embedding Solr in your own application (instead of 'raw' Lucene) when you
 functionally need to segregate schemas and collections.
 Multiple indexes:
 Multiple language collections where each document exists in different
 languages, analysis being language dependent.
 Having document types that have nothing (or very little) in common with
 respect to their schema, their lifetime/update frequencies or even collection
 sizes.
 HOW:
 The best analogy is to consider that instead of deploying multiple
 web applications, you can have one web application that hosts more than one
 Solr core. The patch does not change any of the core logic (nor the core
 code); each core is configured and behaves exactly as the one core in 1.2; the
 various caches are per-core and so is the info-bean-registry.
 What the patch does is replace the SolrCore singleton with a collection of
 cores; all the code modifications are driven by the removal of the different
 singletons (the config, the schema and the core).
 Each core is 'named', and a static map (keyed by name) makes them easy to
 manage.
 You declare one servlet filter mapping per core you want to expose in the
 web.xml; this allows easy access to each core through a different URL.
 USAGE (example web deployment, patch installed):
 Step0:
 java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml
 monitor.xml
 Will index the 2 documents in solr.xml and monitor.xml
 Step1:
 http://localhost:8983/solr/core0/admin/stats.jsp
 Will produce the statistics page from the admin servlet on core0 index; 2 
 documents
 Step2:
 http://localhost:8983/solr/core1/admin/stats.jsp
 Will produce the statistics page from the admin servlet on core1 index; no 
 documents
 Step3:
 java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
 java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
 Adds the ipod*.xml to the index of core0 and the mon*.xml to the index of core1;
 running queries from the admin interface, you can verify that the indexes have
 different content.
 USAGE (Java code):
 //create a configuration
 SolrConfig config = new SolrConfig("solrconfig.xml");
 //create a schema
 IndexSchema schema = new IndexSchema(config, "schema0.xml");
 //create a core from the 2 other.
 SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
 //Accessing a core:
 SolrCore core = SolrCore.getCore("core0");
 PATCH MODIFICATIONS DETAILS (per package):
 org.apache.solr.core:
 The heaviest modifications are in SolrCore and SolrConfig.
 SolrCore is the most obvious modification; instead of a singleton, there is a 
 static map of cores keyed by names and assorted methods. To retain some 
 compatibility, the 'null' named core replaces the singleton for the relevant 
 methods, for instance SolrCore.getCore(). One small constraint on the core 
 name is they can't contain '/' or '\' avoiding