date:20070706

Re: LICENSE.txt nitpick?

2007-07-06 Thread Chris Hostetter


: Is line 189 of LICENSE.txt supposed to say something other than:
:
: Copyright [] [name of copyright owner]

I don't think so. That's the section of the Apache license that explains
how to license your work; it's the boilerplate example of what needs to be
included in each file.



-Hoss

[jira] Commented: (SOLR-215) Multiple Solr Cores

2007-07-06 Thread Walter Ferrara (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510723
 ] 

Walter Ferrara commented on SOLR-215:
-

With the patch applied (assuming I'm using it correctly), Solr is no longer 
able to load my handlers, which reside in a jar under the solr/lib 
directory. The exception I get is (handler class name censored):

GRAVE: org.apache.solr.common.SolrException: Error loading class 
'com.**.**'
at org.apache.solr.core.Config.findClass(Config.java:295)
[..]
Caused by: java.lang.ClassNotFoundException: com.**.**
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
[..]
(full stack trace available if needed)

The problem arises in both patched trunks I've tested (550264 with the previous 
patch, and 552910 with the latest patch). I compiled with NetBeans 5.5 
and Java 1.6 on Windows.
To resolve the issue, I slightly modified Config.java. Now it works fine and 
loads all the jars, but the full implications of the change I made have yet to be 
determined.

Here is the modification I made to the patched (org.apache.solr.core) Config.java 
(working Config.java versus original solr-215 Config_solr215.java):

*** Config.java
--- Config_origSolr215.java
***************
*** 393,399 ****
          SolrException.log(log,"Can't construct solr lib class loader", e);
        }
      }
!     if (null == classLoader) classLoader = loader;
    }
    return classLoader;
  }
--- 393,399 ----
          SolrException.log(log,"Can't construct solr lib class loader", e);
        }
      }
!     classLoader = loader;
    }
    return classLoader;
  }
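
The diff above turns on when the cached static classLoader gets assigned. As an illustration only (this is not the patched Config.java), a lazily built loader over the jars in a lib directory might look like the sketch below. LibClassLoaderSketch, getClassLoader and the libDir parameter are hypothetical names, and falling back to the thread context class loader is an assumption.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a lazily built lib class loader; not the patched Config.java.
final class LibClassLoaderSketch {
    // Cached loader shared by all callers; null until the first build.
    private static ClassLoader classLoader;

    private LibClassLoaderSketch() {}

    static synchronized ClassLoader getClassLoader(File libDir) {
        if (classLoader == null) {
            // Assumed fallback, used when libDir is missing or holds no jars.
            ClassLoader loader = Thread.currentThread().getContextClassLoader();
            if (loader == null) {
                loader = LibClassLoaderSketch.class.getClassLoader();
            }
            // listFiles returns null when libDir does not exist or is not a directory.
            File[] jars = libDir.listFiles((dir, name) -> name.endsWith(".jar"));
            if (jars != null && jars.length > 0) {
                List<URL> urls = new ArrayList<>();
                for (File jar : jars) {
                    try {
                        urls.add(jar.toURI().toURL());
                    } catch (MalformedURLException e) {
                        System.err.println("Skipping " + jar + ": " + e.getMessage());
                    }
                }
                loader = new URLClassLoader(urls.toArray(new URL[0]), loader);
            }
            // Assign on every path so the cached field is never left null.
            classLoader = loader;
        }
        return classLoader;
    }
}
```

In this sketch the field is assigned exactly once, inside the null check, so repeated calls return the same loader whether or not any jars were found.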


 Multiple Solr Cores
 ---

 Key: SOLR-215
 URL: https://issues.apache.org/jira/browse/SOLR-215
 Project: Solr
  Issue Type: Improvement
Reporter: Henri Biestro
Priority: Minor
 Attachments: solr-215.patch, solr-215.patch, solr-215.patch, 
 solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, 
 solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, 
 solr-trunk-src.patch


 WHAT:
 As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
 This patch is intended to allow multiple cores in Solr which also brings 
 multiple indexes capability.
 WHY:
 The current Solr practical wisdom is that one schema - thus one index - is 
 most likely to accommodate your indexing needs, using a filter to segregate 
 documents if needed. If you really need multiple indexes, deploy multiple web 
 applications.
 There are some use cases however where having multiple indexes or multiple 
 cores through Solr itself may make sense.
 Multiple cores:
 Deployment issues within some organizations where IT will resist deploying 
 multiple web applications.
 Seamless schema update where you can create a new core and switch to it 
 without starting/stopping servers.
 Embedding Solr in your own application (instead of 'raw' Lucene) and 
 functionally needing to segregate schemas & collections.
 Multiple indexes:
 Multiple language collections where each document exists in different 
 languages, analysis being language dependent.
 Having document types that have nothing (or very little) in common with 
 respect to their schema, their lifetime/update frequencies or even collection 
 sizes.
 HOW:
 The best analogy is to consider that instead of deploying multiple 
 web applications, you can have one web application that hosts more than one 
 Solr core. The patch does not change any of the core logic (nor the core 
 code); each core is configured & behaves exactly as the one core in 1.2; the 
 various caches are per-core & so is the info-bean-registry.
 What the patch does is replace the SolrCore singleton by a collection of 
 cores; all the code modifications are driven by the removal of the different 
 singletons (the config, the schema & the core).
 Each core is 'named' and a static map (keyed by name) makes them easy to 
 manage.
 You declare one servlet filter mapping per core you want to expose in the 
 web.xml; this allows easy access to each core through a different url. 
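
 The "static map keyed by name" idea in the HOW section can be sketched roughly as follows. This is a simplified illustration, not the patch's SolrCore: NamedCoreSketch, register and the indexDir field are hypothetical names, and rejecting duplicate names is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a named core; not the real SolrCore.
final class NamedCoreSketch {
    // Static registry: one entry per core, keyed by core name.
    private static final Map<String, NamedCoreSketch> CORES = new ConcurrentHashMap<>();

    private final String name;
    private final String indexDir;

    private NamedCoreSketch(String name, String indexDir) {
        this.name = name;
        this.indexDir = indexDir;
    }

    // Creates a core and registers it; fails if the name is already taken (assumption).
    static NamedCoreSketch register(String name, String indexDir) {
        NamedCoreSketch core = new NamedCoreSketch(name, indexDir);
        if (CORES.putIfAbsent(name, core) != null) {
            throw new IllegalStateException("Core already registered: " + name);
        }
        return core;
    }

    // Returns the core registered under this name, or null if there is none.
    static NamedCoreSketch getCore(String name) {
        return CORES.get(name);
    }

    String name() { return name; }

    String indexDir() { return indexDir; }
}
```

 Each core keeps its own state (here only an index directory), which mirrors the point that caches and registries become per-core once the singleton is gone.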
 USAGE (example web deployment, patch installed):
 Step0:
 java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml 
 monitor.xml
 Will index the 2 documents in solr.xml & monitor.xml
 Step1:
 http://localhost:8983/solr/core0/admin/stats.jsp
 Will produce the statistics page from the admin servlet on core0 index; 2 
 documents
 Step2:
 http://localhost:8983/solr/core1/admin/stats.jsp
 Will produce the statistics page from the admin servlet on core1 index; no 
 documents
 Step3:
 java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
 java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
 Adds the ipod*.xml to

[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Yonik Seeley (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510749
]

Yonik Seeley commented on SOLR-269:
---

I'm not sure we need to have XML configuration for this

If we have those multiple update processor factories, I agree we don't need XML
config for the transformers.

I need a custom UpdateRequestProcessor that checks all the requests before
executing any of them. I plan to store the valid commands in a list and only
execute them in the finish() call. I'm not sure how to map that plan to a
chain. How would I pass the output from one processor to the next?

I had thought of that use-case too (bulk operations), which is why I added
explicit flow control (explicit calling of next.handleAdd() in the processor).
You can buffer up all the requests (you want to clone the UpdateCommands as
they might be reused though) and not call next.
Then in finish, you can delegate all of the buffered commands.
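
A minimal sketch of that buffering approach, assuming a much simplified processor interface: AddCommandSketch, ProcessorSketch, BufferingProcessorSketch and RecordingProcessorSketch are hypothetical names, not classes from the patch, and the "non-empty id" check merely stands in for whatever validation a real processor would perform.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical command type; the real UpdateCommand classes differ.
final class AddCommandSketch {
    final String docId;

    AddCommandSketch(String docId) { this.docId = docId; }

    // Copy the command, since the caller might reuse the instance.
    AddCommandSketch copy() { return new AddCommandSketch(docId); }
}

// Hypothetical processor base with explicit flow control through 'next'.
abstract class ProcessorSketch {
    protected final ProcessorSketch next;

    ProcessorSketch(ProcessorSketch next) { this.next = next; }

    void processAdd(AddCommandSketch cmd) { if (next != null) next.processAdd(cmd); }

    void finish() { if (next != null) next.finish(); }
}

// Validates and buffers every command; delegates nothing until finish().
final class BufferingProcessorSketch extends ProcessorSketch {
    private final List<AddCommandSketch> buffered = new ArrayList<>();

    BufferingProcessorSketch(ProcessorSketch next) { super(next); }

    @Override
    void processAdd(AddCommandSketch cmd) {
        if (cmd.docId == null || cmd.docId.isEmpty()) {
            throw new IllegalArgumentException("Rejected: command has no document id");
        }
        buffered.add(cmd.copy());
        // Deliberately no call to next here.
    }

    @Override
    void finish() {
        if (next != null) {
            for (AddCommandSketch cmd : buffered) next.processAdd(cmd);
        }
        buffered.clear();
        super.finish();
    }
}

// Terminal processor that records what reaches it (stands in for the index update).
final class RecordingProcessorSketch extends ProcessorSketch {
    final List<String> seen = new ArrayList<>();

    RecordingProcessorSketch() { super(null); }

    @Override
    void processAdd(AddCommandSketch cmd) { seen.add(cmd.docId); }
}
```

If any command fails validation, the exception propagates before finish() is ever called, so nothing buffered reaches the next processor.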

UpdateRequestProcessorFactory - process requests before submitting them
---

Key: SOLR-269
URL: https://issues.apache.org/jira/browse/SOLR-269
Project: Solr
Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Fix For: 1.3

Attachments: SOLR-269-UpdateRequestProcessorFactory.patch,
SOLR-269-UpdateRequestProcessorFactory.patch, UpdateProcessor.patch

A simple UpdateRequestProcessor was added to a bloated SOLR-133 commit.
An UpdateRequestProcessor lets clients plug in logic after a document has
been parsed and before it has been 'updated' with the index. This is a good
place to add custom logic for:
* transforming the document fields
* fine grained authorization (can user X update document Y?)
* allow update, but not delete (by query?)
<requestHandler name="/update" class="solr.StaxUpdateRequestHandler">
  <str name="update.processor.class">org.apache.solr.handler.UpdateRequestProcessor</str>
  <lst name="update.processor.args">
    ... (optionally pass in arguments to the factory init method) ...
  </lst>
</requestHandler>
http://www.nabble.com/Re%3A-svn-commit%3A-r547495---in--lucene-solr-trunk%3A-example-solr-conf-solrconfig.xml-src-java-org-apache-solr-handler-StaxUpdateRequestHandler.java-src-java-org-apache-solr-handler-UpdateRequestProcessor.jav-tf3950072.html#a11206583

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Ryan McKinley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510759
 ] 

Ryan McKinley commented on SOLR-269:


The only one I'm not sure about is:
- explicit flow control between processors for greatest flexibility 

I'm still trying to avoid the parent UpdateRequestProcessorFactory chain as a 
default behavior.  It seems fine as a super-duper custom controller, but 
unruly in the default/slightly custom case.

Folding in:
- removal of NamedList return (as you say, chaining those makes less sense 
anyway)
- already extracted and optimized the complex (or rather bigger) logging logic 
from the simple index updating
- passed in SolrQueryResponse as well, enabling a processor to change the 
response 
is no problem.

If you like the general structure / flow of 
SOLR-269-UpdateRequestProcessorFactory.patch, I'll clean it up and work in this 
stuff.  Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel 
more palatable.


[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Yonik Seeley (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510771
]

Yonik Seeley commented on SOLR-269:
---

> The only one I'm not sure about is:
> - explicit flow control between processors for greatest flexibility

It's a single call per hook:
if (next != null) next.processAdd();

And it's exactly what you need for your buffering situation.
Chaining is the model that Lucene uses for its analyzers too (only difference
is that it's a pull instead of a push).
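
To make the "single call per hook" concrete, here is a small push-style chain. It is a sketch under assumptions: ChainLink, TrimFieldsLink, RejectMissingIdLink and CollectingLink are hypothetical names, documents are modelled as plain string maps, and none of this is the API from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical push-style link; each hook decides whether to call the next link.
abstract class ChainLink {
    private final ChainLink next;

    ChainLink(ChainLink next) { this.next = next; }

    // The single call per hook: pass the document on if there is a next link.
    void processAdd(Map<String, String> doc) {
        if (next != null) next.processAdd(doc);
    }
}

// Transforms the document, then pushes it on.
final class TrimFieldsLink extends ChainLink {
    TrimFieldsLink(ChainLink next) { super(next); }

    @Override
    void processAdd(Map<String, String> doc) {
        doc.replaceAll((field, value) -> value.trim());
        super.processAdd(doc);
    }
}

// Stops the chain for documents without an id by simply not calling next.
final class RejectMissingIdLink extends ChainLink {
    RejectMissingIdLink(ChainLink next) { super(next); }

    @Override
    void processAdd(Map<String, String> doc) {
        if (!doc.containsKey("id")) return;
        super.processAdd(doc);
    }
}

// End of the chain; stands in for the actual index update.
final class CollectingLink extends ChainLink {
    final List<Map<String, String>> indexed = new ArrayList<>();

    CollectingLink() { super(null); }

    @Override
    void processAdd(Map<String, String> doc) { indexed.add(doc); }
}
```

A chain is assembled back to front, for example new TrimFieldsLink(new RejectMissingIdLink(new CollectingLink())), and because each link owns the call to next it can transform, drop, or defer a document.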

> I'm still trying to avoid the parent UpdateRequestProcessorFactory chain as a
> default behavior. It seems fine as a super-duper custom controller, but
> unruly in the default/slightly custom case.

I'm not clear on why... the configuration is more complex?

> If you like the general structure / flow of
> SOLR-269-UpdateRequestProcessorFactory.patch

I'm not sure about the named processors... are they needed?
It seems like we need a standard one that is used by default everywhere,
and then *maybe* we need to be able to change them per-handler. Do we need
this up front, or could it be deferred?

It seems like there does need to be a method on SolrCore to get a
RequestProcessor or Factory, since that becomes
the new interface to do an index change (otherwise you miss the doc
transformations, etc).

> Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel more
> palatable.

That could be wrapped in another UpdateRequestProcessorFactory if desired... it
doesn't matter much if the impl is hidden by a class or a method IMO.


[jira] Commented: (SOLR-269) UpdateRequestProcessorFactory - process requests before submitting them

2007-07-06 Thread Ryan McKinley (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510774
]

Ryan McKinley commented on SOLR-269:

> It's a single call per hook:
> if (next != null) next.processAdd();

Ok. I'm convinced.

I'm not sure. The only reason I think we *may* want to do it now is to keep
the initialization standard and in a single place. If we declare a default
processor and have each handler optionally initialize their own, the config may
look different. RequestHandlers only have access to a NamedList when they are
initialized; they can't (without serious changes) declare something like:
<requestHandler ...>
  <updateProcessor class="" />
</requestHandler>

With that in mind, I think it best to build the updateProcessors using the
standard PluginLoader framework and then have RequestHandlers access them by
name.

> > Otherwise I'll look at how to make UpdateRequestProcessorFactory[] feel more
> > palatable.
>
> That could be wrapped in another UpdateRequestProcessorFactory if desired...
> it doesn't matter much if the impl is hidden by a class or a method IMO.

Ok, I'll start with UpdateProcessor.patch and fold in my changes.


[jira] Commented: (SOLR-215) Multiple Solr Cores

2007-07-06 Thread Henri Biestro (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510853
]

Henri Biestro commented on SOLR-215:

Thanks Walter.

I've been fighting a bit with this code in the same kind of environment
(NB5.5 / JVM 1.5).
The static classLoader was not assigned correctly and I already had to modify
the original code to work around it.
Looks like JVM 1.6 reintroduces the issue. I don't understand why this
happens; maybe it is class loading through NB...
The fix you propose seems totally harmless; I'll check it against a 1.5 JVM
and introduce it in the next upload.

Using the patch should be straightforward besides handler classes needing a
constructor with a SolrCore.
Let me know how it goes.

Multiple Solr Cores
---

Key: SOLR-215
URL: https://issues.apache.org/jira/browse/SOLR-215
Project: Solr
Issue Type: Improvement
Reporter: Henri Biestro
Priority: Minor
Attachments: solr-215.patch, solr-215.patch, solr-215.patch,
solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch,
solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch,
solr-trunk-src.patch

WHAT:
As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
This patch is intended to allow multiple cores in Solr which also brings
multiple indexes capability.
WHY:
The current Solr practical wisdom is that one schema - thus one index - is
most likely to accommodate your indexing needs, using a filter to segregate
documents if needed. If you really need multiple indexes, deploy multiple web
applications.
There are some use cases however where having multiple indexes or multiple
cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying
multiple web applications.
Seamless schema update where you can create a new core and switch to it
without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) and
functionally needing to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different
languages, analysis being language dependent.
Having document types that have nothing (or very little) in common with
respect to their schema, their lifetime/update frequencies or even collection
sizes.
HOW:
The best analogy is to consider that instead of deploying multiple
web applications, you can have one web application that hosts more than one
Solr core. The patch does not change any of the core logic (nor the core
code); each core is configured & behaves exactly as the one core in 1.2; the
various caches are per-core & so is the info-bean-registry.
What the patch does is replace the SolrCore singleton by a collection of
cores; all the code modifications are driven by the removal of the different
singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) makes them easy to
manage.
You declare one servlet filter mapping per core you want to expose in the
web.xml; this allows easy access to each core through a different url.
USAGE (example web deployment, patch installed):
Step0:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml
monitor.xml
Will index the 2 documents in solr.xml & monitor.xml
Step1:
http://localhost:8983/solr/core0/admin/stats.jsp
Will produce the statistics page from the admin servlet on core0 index; 2
documents
Step2:
http://localhost:8983/solr/core1/admin/stats.jsp
Will produce the statistics page from the admin servlet on core1 index; no
documents
Step3:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
running queries from the admin interface, you can verify indexes have
different content.
USAGE (Java code):
//create a configuration
SolrConfig config = new SolrConfig("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the 2 other.
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//Accessing a core:
SolrCore core = SolrCore.getCore("core0");
PATCH MODIFICATIONS DETAILS (per package):
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a
static map of cores keyed by names and assorted methods. To retain some
compatibility, the 'null' named core replaces the singleton for the relevant
methods, for instance SolrCore.getCore(). One small constraint on the core
name is they can't contain '/' or '\' avoiding
