Solr nightly build failure

2008-12-16 Thread solr-dev

init-forrest-entities:
[mkdir] Created dir: /tmp/apache-solr-nightly/build
[mkdir] Created dir: /tmp/apache-solr-nightly/build/web

compile-solrj:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/solrj
[javac] Compiling 68 source files to /tmp/apache-solr-nightly/build/solrj
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

compile:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/solr
[javac] Compiling 350 source files to /tmp/apache-solr-nightly/build/solr
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

compileTests:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/tests
[javac] Compiling 134 source files to /tmp/apache-solr-nightly/build/tests
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

junit:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/test-results
[junit] Running org.apache.solr.BasicFunctionalityTest
[junit] Tests run: 19, Failures: 0, Errors: 0, Time elapsed: 17.131 sec
[junit] Running org.apache.solr.ConvertedLegacyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.647 sec
[junit] Running org.apache.solr.DisMaxRequestHandlerTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 5.534 sec
[junit] Running org.apache.solr.EchoParamsTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 2.565 sec
[junit] Running org.apache.solr.OutputWriterTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 2.901 sec
[junit] Running org.apache.solr.SampleTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.339 sec
[junit] Running org.apache.solr.SolrInfoMBeanTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.774 sec
[junit] Running org.apache.solr.TestDistributedSearch
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 26.641 sec
[junit] Running org.apache.solr.analysis.DoubleMetaphoneFilterFactoryTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.405 sec
[junit] Running org.apache.solr.analysis.DoubleMetaphoneFilterTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.386 sec
[junit] Running org.apache.solr.analysis.EnglishPorterFilterFactoryTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.236 sec
[junit] Running org.apache.solr.analysis.HTMLStripReaderTest
[junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 0.881 sec
[junit] Running org.apache.solr.analysis.LengthFilterTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.059 sec
[junit] Running org.apache.solr.analysis.TestBufferedTokenStream
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.425 sec
[junit] Running org.apache.solr.analysis.TestCapitalizationFilter
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.102 sec
[junit] Running org.apache.solr.analysis.TestCharFilter
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.377 sec
[junit] Running org.apache.solr.analysis.TestHyphenatedWordsFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.948 sec
[junit] Running org.apache.solr.analysis.TestKeepWordFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.026 sec
[junit] Running org.apache.solr.analysis.TestMappingCharFilter
[junit] Tests run: 11, Failures: 0, Errors: 0, Time elapsed: 0.425 sec
[junit] Running org.apache.solr.analysis.TestMappingCharFilterFactory
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.461 sec
[junit] Running org.apache.solr.analysis.TestPatternReplaceFilter
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 2.024 sec
[junit] Running org.apache.solr.analysis.TestPatternTokenizerFactory
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1 sec
[junit] Running org.apache.solr.analysis.TestPhoneticFilter
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.223 sec
[junit] Running org.apache.solr.analysis.TestRemoveDuplicatesTokenFilter
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.518 sec
[junit] Running org.apache.solr.analysis.TestSynonymFilter
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 2.08 sec
[junit] Running org.apache.solr.analysis.TestSynony

Build failed in Hudson: Solr-trunk #656

2008-12-16 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Solr-trunk/656/changes

Changes:

[ryan] ignoring *.jar in lib

[ryan] SOLR-868 removing libs

[ryan] SOLR-868 -- updating javascript example

[ehatcher] Convert templates to using new QueryResponse response object

[ehatcher] Add additional Velocity tools and layout capability

--
[...truncated 2175 lines...]

build:
  [jar] Building jar: 
http://hudson.zones.apache.org/hudson/job/Solr-trunk/ws/trunk/contrib/velocity/src/main/solr/lib/apache-solr-velocity-2008-12-16_08-06-42.jar
 

test-contrib:

init:

init-forrest-entities:

compile-solrj:

compile:

make-manifest:

compile:

compileTests:
[mkdir] Created dir: 
http://hudson.zones.apache.org/hudson/job/Solr-trunk/ws/trunk/contrib/dataimporthandler/target/test-classes
 
[javac] Compiling 21 source files to 
http://hudson.zones.apache.org/hudson/job/Solr-trunk/ws/trunk/contrib/dataimporthandler/target/test-classes
 
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

test:
[junit] Running 
org.apache.solr.handler.dataimport.TestCachedSqlEntityProcessor
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.455 sec
[junit] Running org.apache.solr.handler.dataimport.TestDataConfig
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.124 sec
[junit] Running org.apache.solr.handler.dataimport.TestDateFormatTransformer
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.423 sec
[junit] Running org.apache.solr.handler.dataimport.TestDocBuilder
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.554 sec
[junit] Running org.apache.solr.handler.dataimport.TestDocBuilder2
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 4.013 sec
[junit] Running org.apache.solr.handler.dataimport.TestEntityProcessorBase
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.45 sec
[junit] Running org.apache.solr.handler.dataimport.TestErrorHandling
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 3.871 sec
[junit] Running org.apache.solr.handler.dataimport.TestEvaluatorBag
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.395 sec
[junit] Running org.apache.solr.handler.dataimport.TestFieldReader
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.648 sec
[junit] Running 
org.apache.solr.handler.dataimport.TestFileListEntityProcessor
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.438 sec
[junit] Running org.apache.solr.handler.dataimport.TestJdbcDataSource
[junit] Tests run: 0, Failures: 0, Errors: 0, Time elapsed: 0.381 sec
[junit] Running 
org.apache.solr.handler.dataimport.TestNumberFormatTransformer
[junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 0.492 sec
[junit] Running org.apache.solr.handler.dataimport.TestRegexTransformer
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.42 sec
[junit] Running org.apache.solr.handler.dataimport.TestScriptTransformer
[junit] Tests run: 0, Failures: 0, Errors: 0, Time elapsed: 0.375 sec
[junit] Running org.apache.solr.handler.dataimport.TestSqlEntityProcessor
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.441 sec
[junit] Running org.apache.solr.handler.dataimport.TestSqlEntityProcessor2
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 3.622 sec
[junit] Running org.apache.solr.handler.dataimport.TestTemplateString
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.37 sec
[junit] Running org.apache.solr.handler.dataimport.TestTemplateTransformer
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.407 sec
[junit] Running org.apache.solr.handler.dataimport.TestVariableResolver
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.459 sec
[junit] Running org.apache.solr.handler.dataimport.TestXPathEntityProcessor
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.789 sec
[junit] Running org.apache.solr.handler.dataimport.TestXPathRecordReader
[junit] Tests run: 10, Failures: 0, Errors: 0, Time elapsed: 0.562 sec

init:

init-forrest-entities:

compile-solrj:

compile:

make-manifest:

compile:

compileTests:
[mkdir] Created dir: 
http://hudson.zones.apache.org/hudson/job/Solr-trunk/ws/trunk/contrib/extraction/build/test-classes
 
[javac] Compiling 1 source file to 
http://hudson.zones.apache.org/hudson/job/Solr-trunk/ws/trunk/contrib/extraction/build/test-classes
 

test:
[junit] Running org.apache.solr.handler.ExtractingRequestHandlerTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 7.909 sec

test:

init:

init-forrest-entities:

compile-solrj:

compile:

make-manifest:

compile:

compileTests:
[mkdir] Created dir: 
http://hudson.zones.apache.org/hudson/job/Solr-trunk/ws/trunk/con

[jira] Created: (SOLR-919) Cache and reuse the DOM object of solrconfig.xml

2008-12-16 Thread Noble Paul (JIRA)
Cache and reuse the DOM object of solrconfig.xml


 Key: SOLR-919
 URL: https://issues.apache.org/jira/browse/SOLR-919
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul


If there are thousands of cores, the number of times we need to load and parse 
solrconfig.xml becomes very expensive. In such a case we will mostly have the 
same solrconfig.xml with some core properties embedded.

The solution: if the location of the xml file on disk is the same, keep only 
one DOM object and reuse it for every other core, via a cache at the 
CoreContainer level.

We save file reading, XML parsing, and RAM in one go.

The only challenge is that we currently substitute core properties at read 
time. We will have to postpone this to consumption time, so that whenever we 
read values from the DOM they can be substituted and used.
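The proposed cache can be sketched as below. This is a hypothetical illustration, not the actual patch: the class name, the string-based XML input, and the cache API are all assumptions made for a self-contained example.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch: one parsed DOM per on-disk location, shared by every
// core whose solrconfig.xml resolves to that path. Core-specific property
// substitution would happen later, when values are read from the DOM.
public class ConfigDomCache {
    private final Map<String, Document> cache = new ConcurrentHashMap<>();

    // Returns the cached DOM for this path, parsing the XML only on first use.
    public Document get(String canonicalPath, String xml) throws Exception {
        Document doc = cache.get(canonicalPath);
        if (doc == null) {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Document prev = cache.putIfAbsent(canonicalPath, doc);
            if (prev != null) doc = prev;   // another thread won the race
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        ConfigDomCache c = new ConfigDomCache();
        Document a = c.get("/conf/solrconfig.xml", "<config><str name='a'>1</str></config>");
        Document b = c.get("/conf/solrconfig.xml", "<config><str name='a'>1</str></config>");
        System.out.println(a == b);  // the same DOM instance is reused
    }
}
```

The second lookup skips file reading and XML parsing entirely, which is the saving the issue describes.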



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-920) Cache and reuse IndexSchema

2008-12-16 Thread Noble Paul (JIRA)
Cache and reuse IndexSchema
---

 Key: SOLR-920
 URL: https://issues.apache.org/jira/browse/SOLR-920
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul


If there are thousands of cores, the cost of loading and unloading schema.xml 
can be prohibitive. Similar to SOLR-919, we can also cache the DOM object of 
schema.xml if the location on disk is the same. All the dynamic properties can 
be replaced lazily when they are read.

We can go one step further in this case: the IndexSchema object is immutable, 
so if there are no core properties then the same IndexSchema object can be 
used across all the cores.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-921) SolrResourceLoader must cache name vs class

2008-12-16 Thread Noble Paul (JIRA)
SolrResourceLoader must cache name vs class
---

 Key: SOLR-921
 URL: https://issues.apache.org/jira/browse/SOLR-921
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul


Every class that is loaded through SolrResourceLoader does a Class.forName(), 
and if it is not found a ClassNotFoundException is thrown.

Then it looks through the various packages and finds the right class if the 
name starts with solr. Considering that we usually use this solr-prefixed 
format, we pay too much of a price for this. After every lookup the result can 
be cached in a Map shared across all the cores, and this Map can be stored at 
the CoreContainer level.
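A minimal sketch of the lookup cache described above, under stated assumptions: the class name, the package list, and the method signature are illustrative, not Solr's actual SolrResourceLoader code or search order.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: resolve a short name once (falling back to a list of
// known packages), then reuse the Class object for every subsequent lookup.
public class ClassNameCache {
    // Illustrative package list, standing in for Solr's own search order.
    private static final String[] PACKAGES = { "java.util.", "java.lang." };
    private final Map<String, Class<?>> cache = new ConcurrentHashMap<>();

    public Class<?> find(String name) throws ClassNotFoundException {
        Class<?> cached = cache.get(name);
        if (cached != null) return cached;          // hit: no Class.forName at all
        Class<?> clazz = null;
        try {
            clazz = Class.forName(name);            // fully qualified name
        } catch (ClassNotFoundException e) {
            for (String pkg : PACKAGES) {           // short name: try each package
                try { clazz = Class.forName(pkg + name); break; }
                catch (ClassNotFoundException ignored) {}
            }
            if (clazz == null) throw e;
        }
        cache.put(name, clazz);
        return clazz;
    }

    public static void main(String[] args) throws Exception {
        ClassNameCache c = new ClassNameCache();
        System.out.println(c.find("ArrayList").getName());
        System.out.println(c.find("ArrayList") == c.find("ArrayList"));
    }
}
```

The cost being avoided is the thrown-and-caught ClassNotFoundException plus the repeated package scan on every request; after the first resolution, lookups are a single map read.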

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-821) replication must allow copying conf file in a different name to slave

2008-12-16 Thread Akshay K. Ukey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshay K. Ukey updated SOLR-821:


Attachment: SOLR-821.patch

# Patch with test case.
# Removed unused parts from solrconfig.xml and schema.xml in the test files.
# Added a few missing null pointer checks.

> replication must allow copying conf file in a different name to slave
> -
>
> Key: SOLR-821
> URL: https://issues.apache.org/jira/browse/SOLR-821
> Project: Solr
>  Issue Type: Improvement
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-821.patch, SOLR-821.patch, SOLR-821.patch, 
> SOLR-821.patch, SOLR-821.patch
>
>
> It is likely that a file is different on the master and the slave. For 
> instance, replicating solrconfig.xml is not possible with the current config 
> if master and slave have different solrconfig.xml files (which is always true).
> We can add an alias feature in the confFiles as
> {code}
> <str name="confFiles">slave_solrconfig.xml:solrconfig.xml,slave_schema.xml:schema.xml</str>
> {code}
> This means that the file slave_solrconfig.xml should be copied to the slave 
> as solrconfig.xml, and slave_schema.xml must be saved to the slave as 
> schema.xml.
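The alias syntax sketched in the issue can be parsed as below. This is an illustration of the proposed "nameOnMaster:nameOnSlave" format only; the class and method names are hypothetical, not the replication handler's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for the proposed confFiles syntax: each comma-separated
// entry is either "name" or "nameOnMaster:nameOnSlave".
public class ConfFilesParser {
    // Maps the file name on the master to the name it should get on the slave.
    public static Map<String, String> parse(String confFiles) {
        Map<String, String> aliases = new LinkedHashMap<>();
        for (String entry : confFiles.split(",")) {
            String[] parts = entry.trim().split(":");
            aliases.put(parts[0], parts.length > 1 ? parts[1] : parts[0]);
        }
        return aliases;
    }

    public static void main(String[] args) {
        Map<String, String> m =
            parse("slave_solrconfig.xml:slave_solrconfig_out.xml,schema.xml");
        System.out.println(m.get("slave_solrconfig.xml"));
        System.out.println(m.get("schema.xml"));   // no alias: name is kept
    }
}
```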

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-917) RequestHandlers#handlers is a synchronizedMap()

2008-12-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657011#action_12657011
 ] 

Yonik Seeley commented on SOLR-917:
---

Remember to test patches - this causes everything to fail since 
ConcurrentHashMap can't take null keys.

> RequestHandlers#handlers is a synchronizedMap()
> ---
>
> Key: SOLR-917
> URL: https://issues.apache.org/jira/browse/SOLR-917
> Project: Solr
>  Issue Type: Improvement
>Reporter: Noble Paul
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-917.patch
>
>
> {code}
>   private final Map<String,SolrRequestHandler> handlers = 
> Collections.synchronizedMap(
>   new HashMap<String,SolrRequestHandler>() );
> {code}
> this map is queried for every request, and it can easily be made a 
> ConcurrentHashMap

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-913) org/apache/solr/handler/SnapPuller.java - Expensive Pattern object made static

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657017#action_12657017
 ] 

Kay Kay commented on SOLR-913:
--

Shalin - the new patch looks good. Thanks for helping with it. 

Thanks everybody in the thread for the discussions regarding this. 

> org/apache/solr/handler/SnapPuller.java  - Expensive Pattern object made 
> static 
> 
>
> Key: SOLR-913
> URL: https://issues.apache.org/jira/browse/SOLR-913
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java)
> Environment: Tomcat 6, JRE 6 
>Reporter: Kay Kay
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-913.patch, SOLR-913.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> In the class org.apache.solr.handler.SnapPuller there seems to be an 
> expensive Pattern object created locally in the method 
>   static Integer readInterval(String interval);
> Pattern instances are better created as static objects and reused.
> The same is true for HttpClient instances. These are one per core right now; 
> we can make those static too.
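The static-Pattern fix can be sketched as below. This is a hedged illustration, not the actual patch: the interval format (hh:mm:ss) matches SnapPuller's pollInterval syntax, but the regex, class name, and parsing details are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the proposed change: compile the Pattern once as a static
// constant instead of on every call to readInterval().
public class IntervalParser {
    // Compiled once per class load; Pattern instances are thread-safe,
    // so every call and every core can share this one object.
    private static final Pattern INTERVAL =
        Pattern.compile("(\\d+?):(\\d+?):(\\d+)");

    static Integer readInterval(String interval) {
        if (interval == null) return null;
        Matcher m = INTERVAL.matcher(interval.trim());
        if (!m.find()) return null;
        int hours = Integer.parseInt(m.group(1));
        int minutes = Integer.parseInt(m.group(2));
        int seconds = Integer.parseInt(m.group(3));
        return (hours * 60 + minutes) * 60 + seconds;   // total seconds
    }

    public static void main(String[] args) {
        System.out.println(readInterval("00:00:10"));   // 10
        System.out.println(readInterval("01:00:00"));   // 3600
    }
}
```

Pattern.compile is the expensive step (regex parsing and NFA construction); Matcher creation per call is cheap, which is why hoisting only the Pattern is enough.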

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-914) Presence of finalize() in the codebase

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657023#action_12657023
 ] 

Kay Kay commented on SOLR-914:
--

Primarily SolrCore and SolrIndexWriter (specifically for my use-case).

Also, I just noticed that CoreContainer.finalize() (which calls shutdown()) has 
a synchronized block. While it is not a bottleneck for me per se (since I 
believe an instance of CoreContainer stays alive and reachable for the whole 
life of the web-app; correct me if I am wrong here), I believe we might need 
to revisit this if we were to extend this or provide orthogonal integration 
with other apps.




> Presence of finalize() in the codebase 
> ---
>
> Key: SOLR-914
> URL: https://issues.apache.org/jira/browse/SOLR-914
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Affects Versions: 1.3
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Priority: Minor
> Fix For: 1.4
>
>   Original Estimate: 480h
>  Remaining Estimate: 480h
>
> There seem to be a number of classes that implement the finalize() method. 
> Given that it is perfectly ok for a Java VM never to call it, maybe there 
> has to be some other way { try .. finally - when they are created, to 
> destroy them }, and the presence of a finalize() method, depending on the 
> implementation, might not serve what we want and in some cases can end up 
> delaying the gc process, depending on the algorithms. 
> $ find . -name *.java | xargs grep finalize
> ./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/JdbcDataSource.java:
>   protected void finalize() {
> ./src/java/org/apache/solr/update/SolrIndexWriter.java:  protected void 
> finalize() {
> ./src/java/org/apache/solr/core/CoreContainer.java:  protected void 
> finalize() {
> ./src/java/org/apache/solr/core/SolrCore.java:  protected void finalize() {
> ./src/common/org/apache/solr/common/util/ConcurrentLRUCache.java:  protected 
> void finalize() throws Throwable {
> Maybe we need to revisit these occurrences from a design perspective to see 
> if they are necessary / if there is an alternate way of managing guaranteed 
> destruction of resources.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-913) org/apache/solr/handler/SnapPuller.java - Expensive Pattern object made static (HttpClient object too )

2008-12-16 Thread Kay Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated SOLR-913:
-

Summary: org/apache/solr/handler/SnapPuller.java  - Expensive Pattern 
object made static (HttpClient object too )   (was: 
org/apache/solr/handler/SnapPuller.java  - Expensive Pattern object made static 
)

> org/apache/solr/handler/SnapPuller.java  - Expensive Pattern object made 
> static (HttpClient object too ) 
> -
>
> Key: SOLR-913
> URL: https://issues.apache.org/jira/browse/SOLR-913
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java)
> Environment: Tomcat 6, JRE 6 
>Reporter: Kay Kay
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-913.patch, SOLR-913.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> In the class org.apache.solr.handler.SnapPuller there seems to be an 
> expensive Pattern object created locally in the method 
>   static Integer readInterval(String interval);
> Pattern instances are better created as static objects and reused.
> The same is true for HttpClient instances. These are one per core right now; 
> we can make those static too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-914) Presence of finalize() in the codebase

2008-12-16 Thread Kay Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated SOLR-914:
-

Component/s: (was: Analysis)
 clients - java

> Presence of finalize() in the codebase 
> ---
>
> Key: SOLR-914
> URL: https://issues.apache.org/jira/browse/SOLR-914
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Affects Versions: 1.3
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Priority: Minor
> Fix For: 1.4
>
>   Original Estimate: 480h
>  Remaining Estimate: 480h
>
> There seem to be a number of classes that implement the finalize() method. 
> Given that it is perfectly ok for a Java VM never to call it, maybe there 
> has to be some other way { try .. finally - when they are created, to 
> destroy them }, and the presence of a finalize() method, depending on the 
> implementation, might not serve what we want and in some cases can end up 
> delaying the gc process, depending on the algorithms. 
> $ find . -name *.java | xargs grep finalize
> ./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/JdbcDataSource.java:
>   protected void finalize() {
> ./src/java/org/apache/solr/update/SolrIndexWriter.java:  protected void 
> finalize() {
> ./src/java/org/apache/solr/core/CoreContainer.java:  protected void 
> finalize() {
> ./src/java/org/apache/solr/core/SolrCore.java:  protected void finalize() {
> ./src/common/org/apache/solr/common/util/ConcurrentLRUCache.java:  protected 
> void finalize() throws Throwable {
> Maybe we need to revisit these occurrences from a design perspective to see 
> if they are necessary / if there is an alternate way of managing guaranteed 
> destruction of resources.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Reopened: (SOLR-886) DataImportHandler should rollback when an import fails or is aborted

2008-12-16 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reopened SOLR-886:



For rollback to work, a commit needs to be called. The committed code was not 
correct. I'll give another patch shortly.

> DataImportHandler should rollback when an import fails or is aborted
> 
>
> Key: SOLR-886
> URL: https://issues.apache.org/jira/browse/SOLR-886
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-886.patch
>
>
> DataImportHandler should call rollback when an import fails or is aborted. 
> This will make sure that uncommitted changes are not committed when the 
> IndexWriter is closed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (SOLR-906) Buffered / Streaming SolrServer implementation

2008-12-16 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-906:
--

Assignee: Shalin Shekhar Mangar

> Buffered / Streaming SolrServer implementation
> -
>
> Key: SOLR-906
> URL: https://issues.apache.org/jira/browse/SOLR-906
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Reporter: Ryan McKinley
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-906-StreamingHttpSolrServer.patch, 
> SOLR-906-StreamingHttpSolrServer.patch, StreamingHttpSolrServer.java
>
>
> While indexing lots of documents, CommonsHttpSolrServer's 
> add(SolrInputDocument) is less than optimal: it makes a new request for 
> each document.
> With a "StreamingHttpSolrServer", documents are buffered and then written to 
> a single open Http connection.
> For related discussion see:
> http://www.nabble.com/solr-performance-tt9055437.html#a20833680
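The buffering idea can be sketched in miniature as below. This is a toy illustration of the queue-and-drain structure only; the class name and API are hypothetical, and the real StreamingHttpSolrServer patch (attached to the issue) handles the HTTP streaming and threading.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: add() only enqueues; a single consumer drains the
// queue and would write all buffered documents over one open HTTP connection.
public class BufferedSender {
    private final BlockingQueue<String> buffer = new ArrayBlockingQueue<>(100);

    public void add(String doc) throws InterruptedException {
        buffer.put(doc);                       // cheap: no HTTP request per doc
    }

    // Stand-in for the thread that streams buffered docs over one connection.
    public List<String> drain() {
        List<String> batch = new ArrayList<>();
        buffer.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        BufferedSender s = new BufferedSender();
        s.add("doc1");
        s.add("doc2");
        System.out.println(s.drain().size());  // one batch instead of two requests
    }
}
```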

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-921) SolrResourceLoader must cache name vs class

2008-12-16 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-921:


Attachment: SOLR-921.patch

SolrResourceLoader is modified to load classes and cache them if they are not 
loaded by the ${solr.home}/lib classloader.

We need the caching where we use Solr-specific classes which are not found in 
${solr.home}/lib and which we usually access via the solr. prefix.

> SolrResourceLoader must cache name vs class
> ---
>
> Key: SOLR-921
> URL: https://issues.apache.org/jira/browse/SOLR-921
> Project: Solr
>  Issue Type: Improvement
>Reporter: Noble Paul
> Attachments: SOLR-921.patch
>
>
> Every class that is loaded through SolrResourceLoader does a Class.forName(), 
> and if it is not found a ClassNotFoundException is thrown.
> Then it looks through the various packages and finds the right class if the 
> name starts with solr. Considering that we usually use this solr-prefixed 
> format, we pay too much of a price for this. After every lookup the result 
> can be cached in a Map shared across all the cores, and this Map can be 
> stored at the CoreContainer level.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-917) RequestHandlers#handlers is a synchronizedMap()

2008-12-16 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657053#action_12657053
 ] 

Noble Paul commented on SOLR-917:
-

Sure, it will have to be tested.

If there are null keys, let us put them in as the empty string.

All get(null) calls can be done as get("").
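The null-key workaround being discussed can be sketched as below. This is a hedged illustration of the normalization trick only; the class and method names are hypothetical, not the actual RequestHandlers code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the workaround: ConcurrentHashMap rejects null keys, so the
// default handler (registered under null) is stored under "" instead.
public class HandlerRegistry {
    private final Map<String, Object> handlers = new ConcurrentHashMap<>();

    private static String key(String name) {
        return name == null ? "" : name;   // normalize null to ""
    }

    public void register(String name, Object handler) {
        handlers.put(key(name), handler);
    }

    public Object get(String name) {
        return handlers.get(key(name));    // lock-free read on every request
    }

    public static void main(String[] args) {
        HandlerRegistry r = new HandlerRegistry();
        Object defaultHandler = new Object();
        r.register(null, defaultHandler);  // would throw NPE on a raw CHM
        System.out.println(r.get(null) == defaultHandler);
    }
}
```

Keeping the normalization inside one private helper means no caller has to know the "" convention, which limits the blast radius Yonik's comment warns about.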





> RequestHandlers#handlers is a synchronizedMap()
> ---
>
> Key: SOLR-917
> URL: https://issues.apache.org/jira/browse/SOLR-917
> Project: Solr
>  Issue Type: Improvement
>Reporter: Noble Paul
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-917.patch
>
>
> {code}
>   private final Map<String,SolrRequestHandler> handlers = 
> Collections.synchronizedMap(
>   new HashMap<String,SolrRequestHandler>() );
> {code}
> this map is queried for every request, and it can easily be made a 
> ConcurrentHashMap

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-821) replication must allow copying conf file in a different name to slave

2008-12-16 Thread Akshay K. Ukey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshay K. Ukey updated SOLR-821:


Attachment: (was: SOLR-821.patch)

> replication must allow copying conf file in a different name to slave
> -
>
> Key: SOLR-821
> URL: https://issues.apache.org/jira/browse/SOLR-821
> Project: Solr
>  Issue Type: Improvement
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-821.patch, SOLR-821.patch, SOLR-821.patch, 
> SOLR-821.patch
>
>
> It is likely that a file is different on the master and the slave. For 
> instance, replicating solrconfig.xml is not possible with the current config 
> if master and slave have different solrconfig.xml files (which is always true).
> We can add an alias feature in the confFiles as
> {code}
> <str name="confFiles">slave_solrconfig.xml:solrconfig.xml,slave_schema.xml:schema.xml</str>
> {code}
> This means that the file slave_solrconfig.xml should be copied to the slave 
> as solrconfig.xml, and slave_schema.xml must be saved to the slave as 
> schema.xml.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-917) RequestHandlers#handlers is a synchronizedMap()

2008-12-16 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-917:


Attachment: SOLR-917.patch

Tests pass, please review.

> RequestHandlers#handlers is a synchronizedMap()
> ---
>
> Key: SOLR-917
> URL: https://issues.apache.org/jira/browse/SOLR-917
> Project: Solr
>  Issue Type: Improvement
>Reporter: Noble Paul
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-917.patch, SOLR-917.patch
>
>
> {code}
>   private final Map<String,SolrRequestHandler> handlers = 
> Collections.synchronizedMap(
>   new HashMap<String,SolrRequestHandler>() );
> {code}
> this map is queried for every request, and it can easily be made a 
> ConcurrentHashMap

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-821) replication must allow copying conf file in a different name to slave

2008-12-16 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656947#action_12656947
 ] 

Shalin Shekhar Mangar commented on SOLR-821:


This is looking good, thanks Akshay!

I'll commit this in a day or two.

> replication must allow copying conf file in a different name to slave
> -
>
> Key: SOLR-821
> URL: https://issues.apache.org/jira/browse/SOLR-821
> Project: Solr
>  Issue Type: Improvement
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-821.patch, SOLR-821.patch, SOLR-821.patch, 
> SOLR-821.patch, SOLR-821.patch
>
>
> It is likely that a file is different on the master and the slave. For 
> instance, replicating solrconfig.xml is not possible with the current config 
> if master and slave have different solrconfig.xml files (which is always true).
> We can add an alias feature in the confFiles as
> {code}
> <str name="confFiles">slave_solrconfig.xml:solrconfig.xml,slave_schema.xml:schema.xml</str>
> {code}
> This means that the file slave_solrconfig.xml should be copied to the slave 
> as solrconfig.xml, and slave_schema.xml must be saved to the slave as 
> schema.xml.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-914) Presence of finalize() in the codebase

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657067#action_12657067
 ] 

Kay Kay commented on SOLR-914:
--

Separately, it might be worth wrapping each custom finalizer's body in 
try .. finally { super.finalize(); } for better code correctness. Let me know 
what you think.
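The try/finally pattern suggested here can be sketched as below. This is an illustrative class, not code from Solr; note also that finalize() is deprecated in modern Java, in line with the issue's own argument for deterministic cleanup instead.

```java
// Sketch of the pattern suggested above: a finalizer that always chains to
// super.finalize(), even if the cleanup itself throws.
public class GuardedResource {
    private boolean closed = false;

    public void close() { closed = true; }   // normal, deterministic cleanup

    @Override
    protected void finalize() throws Throwable {
        try {
            if (!closed) close();            // last-chance safety net only
        } finally {
            super.finalize();                // always runs, per the suggestion
        }
    }

    public static void main(String[] args) throws Throwable {
        GuardedResource r = new GuardedResource();
        r.finalize();                        // called directly just to demonstrate
        System.out.println(r.closed);
    }
}
```

Without the finally, an exception thrown during cleanup would silently skip super.finalize(), breaking any finalization logic in the superclass chain.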

> Presence of finalize() in the codebase 
> ---
>
> Key: SOLR-914
> URL: https://issues.apache.org/jira/browse/SOLR-914
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Affects Versions: 1.3
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Priority: Minor
> Fix For: 1.4
>
>   Original Estimate: 480h
>  Remaining Estimate: 480h
>
> There seem to be a number of classes that implement the finalize() method. 
> Given that it is perfectly ok for a Java VM never to call it, maybe there 
> has to be some other way { try .. finally - when they are created, to 
> destroy them }, and the presence of a finalize() method, depending on the 
> implementation, might not serve what we want and in some cases can end up 
> delaying the gc process, depending on the algorithms. 
> $ find . -name *.java | xargs grep finalize
> ./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/JdbcDataSource.java:
>   protected void finalize() {
> ./src/java/org/apache/solr/update/SolrIndexWriter.java:  protected void 
> finalize() {
> ./src/java/org/apache/solr/core/CoreContainer.java:  protected void 
> finalize() {
> ./src/java/org/apache/solr/core/SolrCore.java:  protected void finalize() {
> ./src/common/org/apache/solr/common/util/ConcurrentLRUCache.java:  protected 
> void finalize() throws Throwable {
> Maybe we need to revisit these occurrences from a design perspective to see 
> if they are necessary / if there is an alternate way of managing guaranteed 
> destruction of resources.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-897) abo/abc/backupcleaner/snapcleaner, if removinmg by number of snapshots/backups, gets Argument list too long

2008-12-16 Thread Bill Au (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Au updated SOLR-897:
-

Summary: abo/abc/backupcleaner/snapcleaner, if removinmg by number of 
snapshots/backups, gets Argument list too long  (was: snapcleaner, if removinmg 
by number of snapshots, gets Argument list too long)

> abo/abc/backupcleaner/snapcleaner, if removinmg by number of 
> snapshots/backups, gets Argument list too long
> ---
>
> Key: SOLR-897
> URL: https://issues.apache.org/jira/browse/SOLR-897
> Project: Solr
>  Issue Type: Bug
>  Components: replication (scripts)
>Affects Versions: 1.3
>Reporter: Dan Rosher
>Assignee: Bill Au
> Fix For: 1.4
>
> Attachments: SOLR-897.patch
>
>
> ls -cd ${data_dir}/snapshot.* returns Argument list too long, use find instead

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-922) Solr WebApp wide Executor for better efficient management of threads , separating the logic in the thread from the launch of the same.

2008-12-16 Thread Kay Kay (JIRA)
Solr WebApp wide Executor for better efficient management of threads , 
separating the logic in the thread from the launch of the same. 
---

 Key: SOLR-922
 URL: https://issues.apache.org/jira/browse/SOLR-922
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
 Environment: Tomcat 6, JRE 6
Reporter: Kay Kay
Priority: Minor


While discussing bringing in parallelism through threads and Executors on a 
different JIRA issue, we encountered the case for a webapp-wide Executor that 
reuses thread pools for better use of thread resources, instead of calling 
thread.start() directly.

Pros: custom request handlers and other plugins to the Solr app server can use 
this Executor API to retrieve the executor and just submit Runnable / Callable 
impls to get the job done, while getting the benefits of a thread pool. This 
might become necessary as we continue to write plugins against the core 
architecture; centralizing the thread pools makes them easier to control, and 
prevents global Executor objects from being scattered across the codebase or 
recreated locally (as they might be expensive).


$ find . -name '*.java' | xargs grep -nr 'start()'  | grep "}"


./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/XPathEntityProcessor.java:377:}.start();
./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataImporter.java:368:}.start();
./src/java/org/apache/solr/handler/SnapPuller.java:382:}.start();
./src/java/org/apache/solr/handler/SnapShooter.java:52:}.start();
./src/java/org/apache/solr/handler/ReplicationHandler.java:134:  }.start();
./src/common/org/apache/solr/common/util/ConcurrentLRUCache.java:112:}.start();
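The call sites above could instead share one pool. A minimal sketch of such a 
webapp-wide executor (the class and method names are hypothetical, not an 
existing Solr API):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical holder for one container-wide thread pool that request
// handlers and other plugins could use instead of new Thread(...).start().
public final class SolrExecutorHolder {
    // Shared pool for the whole webapp; the sizing policy here is arbitrary.
    private static final ExecutorService EXECUTOR =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    private SolrExecutorHolder() {}

    // Plugins retrieve the shared executor and submit Runnable/Callable impls.
    public static ExecutorService getExecutor() {
        return EXECUTOR;
    }

    // Invoked once from the webapp's shutdown lifecycle.
    public static void shutdown() {
        EXECUTOR.shutdown();
    }
}
```

A call site like `}.start();` above would then become 
`SolrExecutorHolder.getExecutor().submit(runnable);`.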





-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-897) snapcleaner, if removinmg by number of snapshots, gets Argument list too long

2008-12-16 Thread Bill Au (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657071#action_12657071
 ] 

Bill Au commented on SOLR-897:
--

The following scripts also have the same problem: abc, abo, backupcleaner

> snapcleaner, if removinmg by number of snapshots, gets Argument list too long
> -
>
> Key: SOLR-897
> URL: https://issues.apache.org/jira/browse/SOLR-897
> Project: Solr
>  Issue Type: Bug
>  Components: replication (scripts)
>Affects Versions: 1.3
>Reporter: Dan Rosher
>Assignee: Bill Au
> Fix For: 1.4
>
> Attachments: SOLR-897.patch
>
>
> ls -cd ${data_dir}/snapshot.* returns Argument list too long, use find instead

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-897) abo/abc/backupcleaner/snapcleaner, if removinmg by number of snapshots/backups, gets Argument list too long

2008-12-16 Thread Bill Au (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Au updated SOLR-897:
-

Attachment: solr-897.patch2

expanded patch to include fix for all four scripts with the same problem.

> abo/abc/backupcleaner/snapcleaner, if removinmg by number of 
> snapshots/backups, gets Argument list too long
> ---
>
> Key: SOLR-897
> URL: https://issues.apache.org/jira/browse/SOLR-897
> Project: Solr
>  Issue Type: Bug
>  Components: replication (scripts)
>Affects Versions: 1.3
>Reporter: Dan Rosher
>Assignee: Bill Au
> Fix For: 1.4
>
> Attachments: SOLR-897.patch
>
>
> ls -cd ${data_dir}/snapshot.* returns Argument list too long, use find instead

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-897) abo/abc/backupcleaner/snapcleaner, if removinmg by number of snapshots/backups, gets Argument list too long

2008-12-16 Thread Bill Au (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Au updated SOLR-897:
-

Attachment: (was: solr-897.patch2)

> abo/abc/backupcleaner/snapcleaner, if removinmg by number of 
> snapshots/backups, gets Argument list too long
> ---
>
> Key: SOLR-897
> URL: https://issues.apache.org/jira/browse/SOLR-897
> Project: Solr
>  Issue Type: Bug
>  Components: replication (scripts)
>Affects Versions: 1.3
>Reporter: Dan Rosher
>Assignee: Bill Au
> Fix For: 1.4
>
> Attachments: SOLR-897.patch
>
>
> ls -cd ${data_dir}/snapshot.* returns Argument list too long, use find instead

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-897) abo/abc/backupcleaner/snapcleaner, if removinmg by number of snapshots/backups, gets Argument list too long

2008-12-16 Thread Bill Au (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Au updated SOLR-897:
-

Attachment: solr-897-2.patch

reattach expanded patch for fixes to all four scripts with the same problem.

> abo/abc/backupcleaner/snapcleaner, if removinmg by number of 
> snapshots/backups, gets Argument list too long
> ---
>
> Key: SOLR-897
> URL: https://issues.apache.org/jira/browse/SOLR-897
> Project: Solr
>  Issue Type: Bug
>  Components: replication (scripts)
>Affects Versions: 1.3
>Reporter: Dan Rosher
>Assignee: Bill Au
> Fix For: 1.4
>
> Attachments: SOLR-897.patch
>
>
> ls -cd ${data_dir}/snapshot.* returns Argument list too long, use find instead

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-897) abo/abc/backupcleaner/snapcleaner, if removinmg by number of snapshots/backups, gets Argument list too long

2008-12-16 Thread Bill Au (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Au updated SOLR-897:
-

Attachment: (was: solr-897-2.patch)

> abo/abc/backupcleaner/snapcleaner, if removinmg by number of 
> snapshots/backups, gets Argument list too long
> ---
>
> Key: SOLR-897
> URL: https://issues.apache.org/jira/browse/SOLR-897
> Project: Solr
>  Issue Type: Bug
>  Components: replication (scripts)
>Affects Versions: 1.3
>Reporter: Dan Rosher
>Assignee: Bill Au
> Fix For: 1.4
>
> Attachments: SOLR-897.patch
>
>
> ls -cd ${data_dir}/snapshot.* returns Argument list too long, use find instead

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-911) multi-select facets

2008-12-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657077#action_12657077
 ] 

Hoss Man commented on SOLR-911:
---

{quote}
This patch also adds the ability to specify a different output name/key... 
useful for complex facet queries:

facet.query=\{!key=foo\}rng:[1 TO 2] OR rng:[5 TO 9]
will cause the results to come back under a key of "foo" rather than "rng:[1 TO 
2] OR rng:[5 TO 9]"
{quote}

but then how will the client know what to filter on to actually constrain 
the query by "foo", since it won't actually know what query that was?

> multi-select facets
> ---
>
> Key: SOLR-911
> URL: https://issues.apache.org/jira/browse/SOLR-911
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Yonik Seeley
> Fix For: 1.4
>
> Attachments: SOLR-911.patch, SOLR-911.patch, SOLR-911.patch
>
>
> plumbing to support the selection of multiple constraints in a facet

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-897) abo/abc/backupcleaner/snapcleaner, if removinmg by number of snapshots/backups, gets Argument list too long

2008-12-16 Thread Bill Au (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Au updated SOLR-897:
-

Attachment: solr-897-2.patch

reattach expanded patch for all four scripts with the same problem.  This time 
with correct license.

> abo/abc/backupcleaner/snapcleaner, if removinmg by number of 
> snapshots/backups, gets Argument list too long
> ---
>
> Key: SOLR-897
> URL: https://issues.apache.org/jira/browse/SOLR-897
> Project: Solr
>  Issue Type: Bug
>  Components: replication (scripts)
>Affects Versions: 1.3
>Reporter: Dan Rosher
>Assignee: Bill Au
> Fix For: 1.4
>
> Attachments: solr-897-2.patch, SOLR-897.patch
>
>
> ls -cd ${data_dir}/snapshot.* returns Argument list too long, use find instead

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-915) SolrCore;close() - scope to exploit parallelism among the number of closeHooks

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657081#action_12657081
 ] 

Kay Kay commented on SOLR-915:
--

| Another issue with calling close hooks in parallel is thread safety and race 
conditions which could cause double closes or null pointer exceptions, etc. 

Ideally the close hooks should all be independent of each other, without any 
ordering among them. When we register multiple handlers (through 
<requestHandler> entries in solrconfig.xml) they might each register their own 
close hooks as necessary. In that case the order becomes totally irrelevant, 
so the member holding the CloseHooks is better defined as a Collection rather 
than a List, irrespective of whether we have parallelism or not. Maintaining 
the same order as the order of definition in solrconfig.xml is not very 
intuitive, since people can move content around in the .xml file. 

If they were to run totally in parallel, there is not much scope for race 
conditions, since the hooks do not share any state (closeHook.close() shares 
no objects) and the call merely serves as a notification to the hook. 

Double closes are not possible, since there is an AtomicInteger check right at 
the beginning of close() which lets the body execute only when the counter 
reaches 0. 

I did not understand the part about NPEs, though. 
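A rough sketch of the kind of parallel notification being proposed (the 
CloseHook interface is simplified here; the real one receives the SolrCore 
being closed):

```java
import java.util.Collection;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Simplified stand-in for the real hook interface; the SolrCore
// parameter is omitted to keep the sketch small.
interface CloseHook {
    void close();
}

final class ParallelCloser {
    // Notify every hook on its own pooled thread; return only when all
    // hooks have run, so close() keeps its "everything is closed" guarantee.
    static void runHooks(Collection<CloseHook> hooks) throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        CountDownLatch done = new CountDownLatch(hooks.size());
        for (CloseHook hook : hooks) {
            pool.submit(() -> {
                try {
                    hook.close(); // hooks are assumed independent of each other
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();   // wait for every notification to complete
        pool.shutdown();
    }
}
```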

> SolrCore;close()  - scope to exploit parallelism among the number of 
> closeHooks 
> 
>
> Key: SOLR-915
> URL: https://issues.apache.org/jira/browse/SOLR-915
> Project: Solr
>  Issue Type: Improvement
>  Components: search
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-915.patch, SOLR-915.patch, SOLR-915.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In SolrCore: close() - all the way towards the end of the function - there 
> seems to be a sequential list of close method invocation. 
> if( closeHooks != null ) {
>for( CloseHook hook : closeHooks ) {
>  hook.close( this );
>   }
> }
> I believe this has scope to be parallelized ( actually the entire sequence of 
> close operations , updateHandler,close() etc.) - by means of launching them 
> in separate threads from an ExecutorService , for a much faster shutdown as 
> the process definitely does not need to be sequential. 
> This becomes all the more important in the multi-core context when we might 
> want to shutdown and restart a SolrCore altogether. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-915) SolrCore;close() - scope to exploit parallelism among the number of closeHooks

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657083#action_12657083
 ] 

kaykay.unique edited comment on SOLR-915 at 12/16/08 9:42 AM:


| As for new Thread() vs using a ThreadPool - can we open that as a new 
issue/discussion. It seems to make sense, but I have not thought about the 
consequences too much. If we are adding a container wide ThreadPool it would be 
good to do that explicitly rather than as a side effect of an edge case (slow 
shutdown hooks)


I have opened [https://issues.apache.org/jira/browse/SOLR-922] for a web-app 
wide ThreadPool design, which I agree is independent of this issue. Started a 
thread in the mailing list (solr-user) about the problem: 
[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200812.mbox/%3c4947de46.6030...@gmail.com%3e]

  was (Author: kaykay.unique):
| As for new Thread() vs using a ThreadPool - can we open that as a new 
issue/discussion. It seems to make sense, but I have not thought about the 
consequences too much. If we are adding a container wide ThreadPool it would be 
good to do that explicitly rather then as a side effect of an edge case (slow 
shutdown hooks)


I have opened [https://issues.apache.org/jira/browse/SOLR-922] for a web-app 
wide ThreadPool design , which I agree - is independent of this issue. Started 
a 
[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200812.mbox/%3c4947de46.6030...@gmail.com%3e
 thread] in the mailing list ( solr-user ) about the problem. 
  
> SolrCore;close()  - scope to exploit parallelism among the number of 
> closeHooks 
> 
>
> Key: SOLR-915
> URL: https://issues.apache.org/jira/browse/SOLR-915
> Project: Solr
>  Issue Type: Improvement
>  Components: search
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-915.patch, SOLR-915.patch, SOLR-915.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In SolrCore: close() - all the way towards the end of the function - there 
> seems to be a sequential list of close method invocation. 
> if( closeHooks != null ) {
>for( CloseHook hook : closeHooks ) {
>  hook.close( this );
>   }
> }
> I believe this has scope to be parallelized ( actually the entire sequence of 
> close operations , updateHandler,close() etc.) - by means of launching them 
> in separate threads from an ExecutorService , for a much faster shutdown as 
> the process definitely does not need to be sequential. 
> This becomes all the more important in the multi-core context when we might 
> want to shutdown and restart a SolrCore altogether. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-915) SolrCore;close() - scope to exploit parallelism among the number of closeHooks

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657083#action_12657083
 ] 

Kay Kay commented on SOLR-915:
--

| As for new Thread() vs using a ThreadPool - can we open that as a new 
issue/discussion. It seems to make sense, but I have not thought about the 
consequences too much. If we are adding a container wide ThreadPool it would be 
good to do that explicitly rather than as a side effect of an edge case (slow 
shutdown hooks)


I have opened [https://issues.apache.org/jira/browse/SOLR-922] for a web-app 
wide ThreadPool design, which I agree is independent of this issue. Started a 
[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200812.mbox/%3c4947de46.6030...@gmail.com%3e
 thread] in the mailing list (solr-user) about the problem. 

> SolrCore;close()  - scope to exploit parallelism among the number of 
> closeHooks 
> 
>
> Key: SOLR-915
> URL: https://issues.apache.org/jira/browse/SOLR-915
> Project: Solr
>  Issue Type: Improvement
>  Components: search
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-915.patch, SOLR-915.patch, SOLR-915.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In SolrCore: close() - all the way towards the end of the function - there 
> seems to be a sequential list of close method invocation. 
> if( closeHooks != null ) {
>for( CloseHook hook : closeHooks ) {
>  hook.close( this );
>   }
> }
> I believe this has scope to be parallelized ( actually the entire sequence of 
> close operations , updateHandler,close() etc.) - by means of launching them 
> in separate threads from an ExecutorService , for a much faster shutdown as 
> the process definitely does not need to be sequential. 
> This becomes all the more important in the multi-core context when we might 
> want to shutdown and restart a SolrCore altogether. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-915) SolrCore;close() - scope to exploit parallelism among the number of closeHooks

2008-12-16 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657087#action_12657087
 ] 

Ryan McKinley commented on SOLR-915:


I don't see any issue with Collection vs List.

However I still don't see how making closeHook execution parallel helps your 
problem.  close() still returns only when the core and all its hooks have 
closed.

Perhaps you just want to add a single closeHook that starts up a Thread for 
your long running operations?  

> SolrCore;close()  - scope to exploit parallelism among the number of 
> closeHooks 
> 
>
> Key: SOLR-915
> URL: https://issues.apache.org/jira/browse/SOLR-915
> Project: Solr
>  Issue Type: Improvement
>  Components: search
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-915.patch, SOLR-915.patch, SOLR-915.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In SolrCore: close() - all the way towards the end of the function - there 
> seems to be a sequential list of close method invocation. 
> if( closeHooks != null ) {
>for( CloseHook hook : closeHooks ) {
>  hook.close( this );
>   }
> }
> I believe this has scope to be parallelized ( actually the entire sequence of 
> close operations , updateHandler,close() etc.) - by means of launching them 
> in separate threads from an ExecutorService , for a much faster shutdown as 
> the process definitely does not need to be sequential. 
> This becomes all the more important in the multi-core context when we might 
> want to shutdown and restart a SolrCore altogether. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: support for multi-select facets

2008-12-16 Thread Chris Hostetter

: Subject: support for multi-select facets

I'm confused by something ... is the issue here really "multi-select 
facets"? ... that can be dealt with via "OR" queries.  What it seems you are 
trying to tackle isn't so much UIs that want to allow a multi-select when 
faceting on a given field, but the UI wanting to display facet counts for one 
field that are not constrained by existing filters on another field.  Correct?

That seems orthogonal to doing "multi-select".

: Option #1: ability to specify the query/filters per-facet:
...
: Option #2: ability to specify as a "local param" (meta-data on a parameter)
...
: Option #3: tag parts of a request using "local params"

Wouldn't the simplest solution just be to have a new variant of the "fq" 
param that is utilized by the QueryComponent but ignored by the 
FacetComponent?

Assume "rfq" is a "result fq" - affects the main result set, but not 
faceting...  

  q=foo
  fq=date:[1 TO 2]
  fq=securityfilter:42
  rfq=type:(pdf OR html)
  facet.field:type
  facet.field:author

...facet constraint counts are only bound by the "foo", "date" and 
"security" filters.  The main result is also limited by "type".

Your Option #3 seems to be an extension of this idea (assuming I understand it 
correctly), using "fq={!tag=X}Y" instead of "rfq=Y" -- but it also requires 
every facet.field to know about the "X" tag name.  Wouldn't the common case be 
that you want all facet.fields to "exclude" all facet-related fqs? ... Should 
"facet.field={!ex}Y" be shorthand for "facet.field={!ex=X1,X2,...XN}Y" where 
X1-XN are the full list of all known tags in "fq" params?


-Hoss



[jira] Commented: (SOLR-915) SolrCore;close() - scope to exploit parallelism among the number of closeHooks

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657091#action_12657091
 ] 

Kay Kay commented on SOLR-915:
--

|  I don't see any issue with Collection vs List. 

List, by default, implies ordering and Collection does not. Making it a 
Collection is more intuitive, since there is really no specific order (at 
least intuitively) that would be obvious to the programmer: multiple plugins 
could register themselves as SolrCoreAware and add close hooks, so a 
Collection better reflects the underlying use case. 

| Perhaps you just want to add a single closeHook that starts up a Thread for 
your long running operations? 

Sure - I could. But I believe that, by nature of being a closeHook, it should 
not interfere with the actual close process, but act as a plugin that is 
guaranteed to be notified when the close happens. 

> SolrCore;close()  - scope to exploit parallelism among the number of 
> closeHooks 
> 
>
> Key: SOLR-915
> URL: https://issues.apache.org/jira/browse/SOLR-915
> Project: Solr
>  Issue Type: Improvement
>  Components: search
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-915.patch, SOLR-915.patch, SOLR-915.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In SolrCore: close() - all the way towards the end of the function - there 
> seems to be a sequential list of close method invocation. 
> if( closeHooks != null ) {
>for( CloseHook hook : closeHooks ) {
>  hook.close( this );
>   }
> }
> I believe this has scope to be parallelized ( actually the entire sequence of 
> close operations , updateHandler,close() etc.) - by means of launching them 
> in separate threads from an ExecutorService , for a much faster shutdown as 
> the process definitely does not need to be sequential. 
> This becomes all the more important in the multi-core context when we might 
> want to shutdown and restart a SolrCore altogether. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-915) SolrCore;close() - scope to exploit parallelism among the number of closeHooks

2008-12-16 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657092#action_12657092
 ] 

Ryan McKinley commented on SOLR-915:


sorry, I should have said, "I have no objection to changing List -> Collection"

So this issue is really about changing the semantics of closeHook -- you are 
suggesting changing the meaning so that it is just a notification, not code 
inserted in the shutdown cycle.

To me, the existing logic makes the most sense -- the close hook is called when 
the core shuts down; after core.close() returns you know that all the hooks 
have been called.  In the case where you want to fire a long running process, 
the close hook can start its own thread.   

The existing process makes *both* options possible, the way you are proposing 
forces everyone to run the hook asynchronously (even existing code that assumes 
it won't be).

I am -1 on a change like that...
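A sketch of the alternative described above: a single hook that spawns its own 
thread, so only that hook opts into asynchrony while core.close() stays 
synchronous (CloseHook is simplified here; the real interface receives the 
SolrCore being closed):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified stand-in for the real hook interface.
interface CloseHook {
    void close();
}

// A hook whose long-running cleanup runs on its own thread; close()
// returns immediately, and the work continues in the background.
class AsyncCleanupHook implements CloseHook {
    final AtomicBoolean workDone = new AtomicBoolean(false);
    Thread worker; // kept visible so a caller can join if it needs to

    @Override
    public void close() {
        worker = new Thread(() -> {
            // ... long-running cleanup would go here ...
            workDone.set(true);
        });
        worker.start(); // fire and forget; shutdown is not blocked
    }
}
```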


> SolrCore;close()  - scope to exploit parallelism among the number of 
> closeHooks 
> 
>
> Key: SOLR-915
> URL: https://issues.apache.org/jira/browse/SOLR-915
> Project: Solr
>  Issue Type: Improvement
>  Components: search
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-915.patch, SOLR-915.patch, SOLR-915.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In SolrCore: close() - all the way towards the end of the function - there 
> seems to be a sequential list of close method invocation. 
> if( closeHooks != null ) {
>for( CloseHook hook : closeHooks ) {
>  hook.close( this );
>   }
> }
> I believe this has scope to be parallelized ( actually the entire sequence of 
> close operations , updateHandler,close() etc.) - by means of launching them 
> in separate threads from an ExecutorService , for a much faster shutdown as 
> the process definitely does not need to be sequential. 
> This becomes all the more important in the multi-core context when we might 
> want to shutdown and restart a SolrCore altogether. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: support for multi-select facets

2008-12-16 Thread Yonik Seeley
On Tue, Dec 16, 2008 at 12:49 PM, Chris Hostetter
 wrote:
>
> : Subject: support for multi-select facets
>
> I'm confused by something ... is the issue here really "multi-select
> facets"? ... that can be dealt with "OR" queries.  What it seems you are
> trying to tackle isn't so much about UIs that want to allow a multi-select
> when faceting on a given field, but when the UI wants to display
> facet counts for one field which are not constrained by existing filters
> on another field.  correct?

multi-select in a single request requires getting facets that are
constrained differently per-facet (and may be constrained differently
than for the top N docs returned.)

> that seems orthogonal to doing "multi-select"
>
> : Option #1: ability to specify the query/filters per-facet:
>...
> : Option #2: ability to specify as a "local param" (meta-data on a parameter)
>...
> : Option #3: tag parts of a request using "local params"
>
> Wouldn't the simplest solution just be to have a new variant of the "fq"
> param that is utilized by the QueryComponent but ignored by the
> FacetComponent?

That would work for a single facet, but not for multiple facets.
Going back to the original example:

-- type --
 x pdf (32)
   word (17)
 x html(46)
   excel(11)

-- author --
 erik (31)
 grant (27)
 yonik (14)


The normal case would be that you only want to remove constraints
related to what you are faceting on.
So when faceting on type, disregard any type related filters.  When
faceting on author, disregard any author related filters.  If I click
on "word" docs above, I'd want to see all other constraint counts
change except for those under "type".

A simpler way to think about this multi-select scenario is at the GUI
level: you want faceting to work exactly as it did before, but you
don't want the multi-select facet to "disappear" when you click on one
of the items.

-Yonik


[jira] Commented: (SOLR-911) multi-select facets

2008-12-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657096#action_12657096
 ] 

Yonik Seeley commented on SOLR-911:
---

bq. but then how will the client know what to filter on to actually 
constrain the query by "foo" since it won't actually know what query that was?

Well, it's optional... it's only an issue for stateless clients without 
conventions or application specific knowledge - and they could always look at 
the original parameters via echoParams.


> multi-select facets
> ---
>
> Key: SOLR-911
> URL: https://issues.apache.org/jira/browse/SOLR-911
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Yonik Seeley
> Fix For: 1.4
>
> Attachments: SOLR-911.patch, SOLR-911.patch, SOLR-911.patch
>
>
> plumbing to support the selection of multiple constraints in a facet

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-915) SolrCore;close() - scope to exploit parallelism among the number of closeHooks

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657097#action_12657097
 ] 

Kay Kay commented on SOLR-915:
--

Ok - then let me submit a revised patch with the List changed to a Collection, 
and some javadoc added to CloseHook explaining why it needs to be short-lived. 

> SolrCore;close()  - scope to exploit parallelism among the number of 
> closeHooks 
> 
>
> Key: SOLR-915
> URL: https://issues.apache.org/jira/browse/SOLR-915
> Project: Solr
>  Issue Type: Improvement
>  Components: search
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-915.patch, SOLR-915.patch, SOLR-915.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In SolrCore: close() - all the way towards the end of the function - there 
> seems to be a sequential list of close method invocation. 
> if( closeHooks != null ) {
>for( CloseHook hook : closeHooks ) {
>  hook.close( this );
>   }
> }
> I believe this has scope to be parallelized ( actually the entire sequence of 
> close operations , updateHandler,close() etc.) - by means of launching them 
> in separate threads from an ExecutorService , for a much faster shutdown as 
> the process definitely does not need to be sequential. 
> This becomes all the more important in the multi-core context when we might 
> want to shutdown and restart a SolrCore altogether. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-915) SolrCore;close() - scope to exploit parallelism among the number of closeHooks

2008-12-16 Thread Kay Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated SOLR-915:
-

Attachment: SOLR-915.patch

1) Add comment to CloseHook.close() 

2) Change SolrCore.closeHooks from a List to the Collection interface.

> SolrCore;close()  - scope to exploit parallelism among the number of 
> closeHooks 
> 
>
> Key: SOLR-915
> URL: https://issues.apache.org/jira/browse/SOLR-915
> Project: Solr
>  Issue Type: Improvement
>  Components: search
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-915.patch, SOLR-915.patch, SOLR-915.patch, 
> SOLR-915.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In SolrCore.close() - all the way towards the end of the function - there 
> seems to be a sequential list of close method invocations. 
> if( closeHooks != null ) {
>    for( CloseHook hook : closeHooks ) {
>      hook.close( this );
>   }
> }
> I believe this has scope to be parallelized ( actually the entire sequence of 
> close operations , updateHandler.close() etc.) - by means of launching them 
> in separate threads from an ExecutorService , for a much faster shutdown as 
> the process definitely does not need to be sequential. 
> This becomes all the more important in the multi-core context when we might 
> want to shutdown and restart a SolrCore altogether. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-906) Buffered / Streaming SolrServer implementation

2008-12-16 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657102#action_12657102
 ] 

Shalin Shekhar Mangar commented on SOLR-906:


I've started taking a look at this. A couple of points:

* Instantiating the lock in blockUntilFinished and nulling it can cause a race 
condition. A thread in the 'add' method can find that the lock is not null, 
another thread can then null it, and the first thread proceeds to lock on it, 
leading to an NPE. In the same way, creation of multiple locks is possible in 
the blockUntilFinished method.
* The run method calling itself recursively looks suspicious. We may be in 
danger of overflowing the stack.
* The SolrExampleTest cannot be used directly because it depends on the order 
of the commands being executed. We must clearly document that clients should 
not depend on commands being executed in the same order as they are given.
* The if (req.getCommitWithin() < 0) should be > 0, right?
* The add calls do not come to the process method. Due to this some add calls 
may still get in before the commit acquires the lock (assuming multiple 
producers). Is this class strictly meant for a single document producer 
use-case?
* The wait loop in blockUntilFinished is very CPU intensive. It can probably be 
optimized.

I'm experimenting with a slightly different implementation. Still trying to tie 
the loose ends. I hope to have a patch soon.

> Buffered / Streaming SolrServer implementation
> -
>
> Key: SOLR-906
> URL: https://issues.apache.org/jira/browse/SOLR-906
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Reporter: Ryan McKinley
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-906-StreamingHttpSolrServer.patch, 
> SOLR-906-StreamingHttpSolrServer.patch, StreamingHttpSolrServer.java
>
>
> While indexing lots of documents, the CommonsHttpSolrServer add( 
> SolrInputDocument ) is less than optimal.  This makes a new request for each 
> document.
> With a "StreamingHttpSolrServer", documents are buffered and then written to 
> a single open Http connection.
> For related discussion see:
> http://www.nabble.com/solr-performance-tt9055437.html#a20833680
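A toy sketch of the buffering idea described above: producers queue documents cheaply, and a single writer drains the queue over one open connection. All names are invented; this is not the attached implementation.

```java
// Sketch: a BlockingQueue stands between document producers and one
// background writer, so there is no HTTP request per document.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class StreamingBuffer {
    private final BlockingQueue<String> docs = new LinkedBlockingQueue<>();
    final List<String> sent = new ArrayList<>(); // stands in for the open connection

    void add(String doc) throws InterruptedException {
        docs.put(doc); // cheap: just enqueue, no request made here
    }

    // a single background runner would call this in a loop,
    // writing each drained batch to the already-open stream
    void drain() {
        List<String> batch = new ArrayList<>();
        docs.drainTo(batch);
        sent.addAll(batch); // "write" the whole batch at once
    }
}
```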

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-906) Buffered / Streaming SolrServer implementation

2008-12-16 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657107#action_12657107
 ] 

Ryan McKinley commented on SOLR-906:


Thanks for looking at this!

| The if (req.getCommitWithin() < 0) should be > 0, right?

no -- if a commitWithin time is specified, we can not use the open request.  
It needs to start a new request so that a new <commit/> command can be sent.  
I experimented with sending everything over the open connection, but we would 
need to add a new parent tag to the xml format.   That might not be a bad idea. 
 Then we could send:

  <stream>
    <add>
      <doc>...</doc>
      <doc>...</doc>
      ...
    </add>
    <commit/>

and finally

  </stream>

> Buffered / Streaming SolrServer implementation
> -
>
> Key: SOLR-906
> URL: https://issues.apache.org/jira/browse/SOLR-906
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Reporter: Ryan McKinley
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-906-StreamingHttpSolrServer.patch, 
> SOLR-906-StreamingHttpSolrServer.patch, StreamingHttpSolrServer.java
>
>
> While indexing lots of documents, the CommonsHttpSolrServer add( 
> SolrInputDocument ) is less than optimal.  This makes a new request for each 
> document.
> With a "StreamingHttpSolrServer", documents are buffered and then written to 
> a single open Http connection.
> For related discussion see:
> http://www.nabble.com/solr-performance-tt9055437.html#a20833680

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-906) Buffered / Streaming SolrServer implementation

2008-12-16 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657108#action_12657108
 ] 

Ryan McKinley commented on SOLR-906:


| The SolrExampleTest cannot be used directly

why not?  All the tests pass for me... 

sending a commit() (with waitSearcher=true) should wait for all the docs to get 
added, then issue the commit, then return.



> Buffered / Streaming SolrServer implementation
> -
>
> Key: SOLR-906
> URL: https://issues.apache.org/jira/browse/SOLR-906
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Reporter: Ryan McKinley
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-906-StreamingHttpSolrServer.patch, 
> SOLR-906-StreamingHttpSolrServer.patch, StreamingHttpSolrServer.java
>
>
> While indexing lots of documents, the CommonsHttpSolrServer add( 
> SolrInputDocument ) is less than optimal.  This makes a new request for each 
> document.
> With a "StreamingHttpSolrServer", documents are buffered and then written to 
> a single open Http connection.
> For related discussion see:
> http://www.nabble.com/solr-performance-tt9055437.html#a20833680

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-906) Buffered / Streaming SolrServer implementation

2008-12-16 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657110#action_12657110
 ] 

Ryan McKinley commented on SOLR-906:


| The add calls do not come to the process method. Due to this some add calls 
may still get in before the commit acquires the lock (assuming multiple 
producers). Is this class strictly meant for a single document producer 
use-case?

I don't totally follow... but if possible, it would be good if multiple threads 
could fill the same queue.  This would let the StreamingHttpSolrServer manage 
all solr communication.

> Buffered / Streaming SolrServer implementation
> -
>
> Key: SOLR-906
> URL: https://issues.apache.org/jira/browse/SOLR-906
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Reporter: Ryan McKinley
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-906-StreamingHttpSolrServer.patch, 
> SOLR-906-StreamingHttpSolrServer.patch, StreamingHttpSolrServer.java
>
>
> While indexing lots of documents, the CommonsHttpSolrServer add( 
> SolrInputDocument ) is less than optimal.  This makes a new request for each 
> document.
> With a "StreamingHttpSolrServer", documents are buffered and then written to 
> a single open Http connection.
> For related discussion see:
> http://www.nabble.com/solr-performance-tt9055437.html#a20833680

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-906) Buffered / Streaming SolrServer implementation

2008-12-16 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657115#action_12657115
 ] 

Shalin Shekhar Mangar commented on SOLR-906:


bq. why not? All the tests pass for me... 
There are multiple places where SolrExampleTest calls commit without 
waitSearcher=true and proceeds to query and assert on results. The failure 
happens intermittently. Try varying the number of threads and you may be able 
to reproduce the failure.

bq. The add calls do not come to the process method.
I meant the request method. Sorry about that. The SolrServer.add() calls the 
request method but this implementation does not. If there are multiple threads 
using this class, new documents may get added to the queue before we acquire 
the lock inside blockUntilFinished due to the call to commit.

> Buffered / Streaming SolrServer implementation
> -
>
> Key: SOLR-906
> URL: https://issues.apache.org/jira/browse/SOLR-906
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Reporter: Ryan McKinley
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-906-StreamingHttpSolrServer.patch, 
> SOLR-906-StreamingHttpSolrServer.patch, StreamingHttpSolrServer.java
>
>
> While indexing lots of documents, the CommonsHttpSolrServer add( 
> SolrInputDocument ) is less than optimal.  This makes a new request for each 
> document.
> With a "StreamingHttpSolrServer", documents are buffered and then written to 
> a single open Http connection.
> For related discussion see:
> http://www.nabble.com/solr-performance-tt9055437.html#a20833680

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: support for multi-select facets

2008-12-16 Thread Chris Hostetter

: multi-select in a single request requires getting facets that are
: constrained differently per-facet (and may be constrained differently
: than for the top N docs returned.)

I still feel like we're having a huge disconnect in terminology, and that 
there are really two orthogonal issues here.

In all of the GUI lingo i've seen, the concept of "multi-select" can be 
related to a single GUI element (in the case of Solr: a single field).  
when talking about "multi-select faceting" (even for a single field) there 
are two radically different use cases:
  case#1: when the user selects a constraint, the main result set is 
filtered to only documents matching that constraint and the counts for all 
other constraints are reduced accordingly.  when the user selects 
additional constraints the main result set and the counts for other 
constraints are further refined and limited to the *intersection* of all 
selected constraints and the original document set.
  case#2: when the user selects a constraint, the main result set is
filtered to only documents matching that constraint but the counts for all
other constraints remain unaffected.  when the user selects 
additional constraints the main result set grows to include the 
intersection of the original document set with the *union* of all selected 
constraints.

case#1 is currently supported by Solr (for single value fields the use 
case is trivial, but Solr also handles the multivalued field use case as 
well).  case#2 is not currently possible without making two separate 
requests (one with fq's constraining the selected constraints to get the 
main results; one w/o those fq's to get the facet counts)

that's what i think of (and what i suspect most people think of) when 
discussing multi-value faceting.
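A toy illustration of the two cases in terms of document-id sets (all ids and constraint names here are invented):

```java
// case#1 vs case#2 from the description above, modeled with plain sets.
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class FacetCases {
    static Set<Integer> intersect(Set<Integer> a, Set<Integer> b) {
        Set<Integer> r = new HashSet<>(a); r.retainAll(b); return r;
    }

    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> r = new HashSet<>(a); r.addAll(b); return r;
    }

    // returns { size of case#1 result, size of case#2 result }
    static int[] demo() {
        // doc ids matching the base query
        Set<Integer> base = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        // docs matching two selected constraints of one facet field
        Set<Integer> pdf  = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> html = new HashSet<>(Arrays.asList(3, 4));

        // case#1: results narrow to the intersection of all selections
        Set<Integer> case1 = intersect(intersect(base, pdf), html);
        // case#2: results are base intersected with the *union* of selections
        Set<Integer> case2 = intersect(base, union(pdf, html));
        return new int[] { case1.size(), case2.size() };
    }
}
```

With these sets, case#1 narrows the six base docs down to the single doc matching both selections, while case#2 keeps every base doc matching either selection.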

: The normal case would be that you only want to remove constraints
: related to what you are faceting on.
: So when faceting on type, disregard any type related filters.  When
: faceting on author, disregard any author related filters.  If I click
: on "word" docs above, I'd want to see all other constraint counts
: change except for those under 'type".

I *think* i understand what you just said; it sounds like you're 
describing the same thing i did in case#2 above, but you are adding in 
another facet field (author) whose counts should be constrained in the 
same way as the main result set when a "type" constraint is selected.

My confusion is that what you just said doesn't seem to agree with your 
suggested syntax...

: Option #3: tag parts of a request using "local params"
: q=foo&fq=date:[1 TO 2]&fq=securityfilter:42&fq={!tag=type}type:(pdf OR
: html)&facet.field=type&facet.field={!exclude=type}author
: 
: So here, one fq is tagged with "type" {!tag=type}
: and then excluded when faceting on author.

...if the goal is that filtering on "type" causes the other facet 
counts (ie: "author") to be reduced to reflect the new constraint, but you 
don't want the type facet counts to change (so that the other type options 
don't vanish), then why is the "author" facet field excluding the tagged 
"fq" ... shouldn't your example be...

  q=foo
  fq=date:[1 TO 2]
  fq=securityfilter:42
  fq={!tag=type}type:(pdf OR html)
  facet.field={!exclude=type}type
  facet.field=author

?

Assuming i'm understanding you (and your original example was just a 
transcription mistake) then i go back to the basic point of my original 
question about "huffman encoding" the common case ...  
should we make "facet.field={!exclude}X" be shorthand for 
"facet.field={!exclude=X}X" ?

Alternately: it seems likely that people who want this type of 
multi-select behavior will want it for all/most of their facets ... so 
perhaps instead of the {!exclude=X} local param, we should add a new 
facet.multi=(true|false) param that can be specified on a per-field basis 
... so f.type.facet.multi=true means facet.field=type ignores any fq's 
where "tag=type".  This would probably even be possible as syntactic sugar 
in addition to the {!exclude=X} syntax, so in the common case people would 
just use facet.multi=true and forget about it, but in weird esoteric 
situations where they want filters on field X or field Y to be ignored by 
faceting on field Z they could still use "facet.field={!exclude=X,Y}Z"


-Hoss



[jira] Resolved: (SOLR-915) SolrCore;close() - scope to exploit parallelism among the number of closeHooks

2008-12-16 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-915.


Resolution: Fixed

check: rev 727122

thanks Kay

> SolrCore;close()  - scope to exploit parallelism among the number of 
> closeHooks 
> 
>
> Key: SOLR-915
> URL: https://issues.apache.org/jira/browse/SOLR-915
> Project: Solr
>  Issue Type: Improvement
>  Components: search
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-915.patch, SOLR-915.patch, SOLR-915.patch, 
> SOLR-915.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In SolrCore.close() - all the way towards the end of the function - there 
> seems to be a sequential list of close method invocations. 
> if( closeHooks != null ) {
>    for( CloseHook hook : closeHooks ) {
>      hook.close( this );
>   }
> }
> I believe this has scope to be parallelized ( actually the entire sequence of 
> close operations , updateHandler.close() etc.) - by means of launching them 
> in separate threads from an ExecutorService , for a much faster shutdown as 
> the process definitely does not need to be sequential. 
> This becomes all the more important in the multi-core context when we might 
> want to shutdown and restart a SolrCore altogether. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-911) multi-select facets

2008-12-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657125#action_12657125
 ] 

Hoss Man commented on SOLR-911:
---

bq. Well, it's optional... it's only an issue for stateless clients without 
conventions or application specific knowledge - and they could always look at 
the original parameters via echoParams.

that assumes it wasn't baked into the config file.

I guess there's no harm in adding it (except perhaps confusion: i'd hate to see 
people assume that they can add "fq=key" since they got "key" back in the 
facet_queries section of the response)

my point was really just that facet constraint "labels" are really only useful 
if each label is easily associated with a way of applying that constraint.  
replacing the query string with a "key" in the response is only useful if that 
key can be used for something.

It seems to me like it would be far more useful to just reserve "key" as a 
special local param that we guarantee will never get used, so people can 
include it in the facet.query and then parse it out of the response themselves. 
 

You can already do this today...

{noformat}
In Config...

  <str name="facet.query">{!ignoredparam='Low Price'}price:[* TO 500]</str>
  <str name="facet.query">{!ignoredparam='High Price'}price:[500 TO *]</str>
  ...
In Query Response...
  <lst name="facet_queries">
    <int name="{!ignoredparam='Low Price'}price:[* TO 500]">3</int>
    <int name="{!ignoredparam='High Price'}price:[500 TO *]">0</int>
    ...
{noformat}

...but it would be nice to have a param name reserved for this.

(admittedly it never occurred to me until this Jira post that that worked; i'm 
going to start encouraging everyone i know to start doing that)


> multi-select facets
> ---
>
> Key: SOLR-911
> URL: https://issues.apache.org/jira/browse/SOLR-911
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Yonik Seeley
> Fix For: 1.4
>
> Attachments: SOLR-911.patch, SOLR-911.patch, SOLR-911.patch
>
>
> plumbing to support the selection of multiple constraints in a facet

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-915) SolrCore;close() - scope to exploit parallelism among the number of closeHooks

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657130#action_12657130
 ] 

Kay Kay commented on SOLR-915:
--

Thanks Ryan for helping with the fix. 

> SolrCore;close()  - scope to exploit parallelism among the number of 
> closeHooks 
> 
>
> Key: SOLR-915
> URL: https://issues.apache.org/jira/browse/SOLR-915
> Project: Solr
>  Issue Type: Improvement
>  Components: search
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-915.patch, SOLR-915.patch, SOLR-915.patch, 
> SOLR-915.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In SolrCore.close() - all the way towards the end of the function - there 
> seems to be a sequential list of close method invocations. 
> if( closeHooks != null ) {
>    for( CloseHook hook : closeHooks ) {
>      hook.close( this );
>   }
> }
> I believe this has scope to be parallelized ( actually the entire sequence of 
> close operations , updateHandler.close() etc.) - by means of launching them 
> in separate threads from an ExecutorService , for a much faster shutdown as 
> the process definitely does not need to be sequential. 
> This becomes all the more important in the multi-core context when we might 
> want to shutdown and restart a SolrCore altogether. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-922) Solr WebApp wide Executor for better efficient management of threads , separating the logic in the thread from the launch of the same.

2008-12-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657141#action_12657141
 ] 

Hoss Man commented on SOLR-922:
---

I can understand why there might be value add in having each SolrCore maintain 
an Executor (or more likely an ExecutorService) which the SolrCore is 
responsible for starting up and shutting down, but it's not clear to me what 
value add Solr would provide by creating a *global* Executor for custom plugins 
to use -- it seems like a very generic (not Solr specific) ServletFilter could 
accomplish the same goal in webapps, or just a simple static singleton in non 
webapp code.
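For the per-SolrCore variant, the shape would presumably be something like the following sketch (names are invented, not taken from any patch on this issue):

```java
// Sketch: a per-core ExecutorService the core starts and shuts down,
// replacing ad-hoc `new Thread(...) {...}.start()` calls in plugins.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class CoreExecutor {
    private final ExecutorService pool = Executors.newCachedThreadPool();

    void submit(Runnable task) {
        pool.submit(task); // pooled thread instead of a one-off Thread.start()
    }

    // would be invoked from SolrCore.close()
    void close() throws InterruptedException {
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            pool.shutdownNow(); // give up on stragglers after the grace period
        }
    }
}
```

SolrCoreAware plugins could then borrow the core's pool instead of keeping their own static Executor objects.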

> Solr WebApp wide Executor for better efficient management of threads , 
> separating the logic in the thread from the launch of the same. 
> ---
>
> Key: SOLR-922
> URL: https://issues.apache.org/jira/browse/SOLR-922
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Priority: Minor
>
> For a different jira - when we were discussing bringing in parallelism 
> through threads and using Executors - encountered a case of using a webapp 
> wide Executor for reusing thread pools for better use of thread resources , 
> instead of thread.start() .  
> pros:  Custom Request Handlers and other plugins to the Solr App server can 
> use this Executor API to retrieve the executor and just submit the Runnable / 
> Callable impls to get the job done while getting the benefits of a thread 
> pool . This might be necessary as we continue to write plugins to the core 
> architecture and centralizing the threadpools might make it easy to control / 
> prevent global Executor objects across the codebase / recreating them locally 
> ( as they might be expensive ). 
> $ find . -name *.java | xargs grep -nr 'start()'  | grep "}"
> ./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/XPathEntityProcessor.java:377:
> }.start();
> ./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataImporter.java:368:
> }.start();
> ./src/java/org/apache/solr/handler/SnapPuller.java:382:}.start();
> ./src/java/org/apache/solr/handler/SnapShooter.java:52:}.start();
> ./src/java/org/apache/solr/handler/ReplicationHandler.java:134:  
> }.start();
> ./src/common/org/apache/solr/common/util/ConcurrentLRUCache.java:112:
> }.start();

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-923) SolrIndexWriter.getDirectory cleanup

2008-12-16 Thread Hoss Man (JIRA)
SolrIndexWriter.getDirectory cleanup


 Key: SOLR-923
 URL: https://issues.apache.org/jira/browse/SOLR-923
 Project: Solr
  Issue Type: Improvement
  Components: update
Reporter: Hoss Man
Priority: Minor
 Fix For: 1.4
 Attachments: SOLR-923.patch

bad @deprecated msg on legacy impl, plus redundant code and lack of NIO 
goodness in legacy impl (pretty sure that was an oversight resulting from the 
rogue-tile pattern, and not intentional)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-923) SolrIndexWriter.getDirectory cleanup

2008-12-16 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-923:
--

Attachment: SOLR-923.patch

patch to fix

> SolrIndexWriter.getDirectory cleanup
> 
>
> Key: SOLR-923
> URL: https://issues.apache.org/jira/browse/SOLR-923
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-923.patch
>
>
> bad @deprecated msg on legacy impl, plus redundant code and lack of NIO 
> goodness in legacy impl (pretty sure that was an oversight resulting from 
> the rogue-tile pattern, and not intentional)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: SolrIndexWriter :: getDirectory(String path, SolrIndexConfig config) - deprecated method - replacement method ?

2008-12-16 Thread Chris Hostetter

I believe you are correct ... fixed in SOLR-923

: There seems to be no method with that particular signature. But there does
: seem to be another method -
: getDirectory(String path, DirectoryFactory , SolrIndexConfig ) .
: 
: Just to confirm - is that what is meant here ?


-Hoss



[jira] Resolved: (SOLR-923) SolrIndexWriter.getDirectory cleanup

2008-12-16 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-923.
---

Resolution: Fixed
  Assignee: Hoss Man

hoss...@coaster:~/lucene/solr$ svn commit -m "SOLR-923: some 
SolrIndexWriter.getDirectory cleanup"
Sending        src/java/org/apache/solr/update/SolrIndexWriter.java
Transmitting file data .
Committed revision 727139.


> SolrIndexWriter.getDirectory cleanup
> 
>
> Key: SOLR-923
> URL: https://issues.apache.org/jira/browse/SOLR-923
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-923.patch
>
>
> bad @deprecated msg on legacy impl, plus redundant code and lack of NIO 
> goodness in legacy impl (pretty sure that was an oversight resulting from 
> the rogue-tile pattern, and not intentional)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: support for multi-select facets

2008-12-16 Thread Yonik Seeley
On Tue, Dec 16, 2008 at 2:10 PM, Chris Hostetter
 wrote:
> Assuming i'm understanding you (and your original example was just
> a transcription mistake)

Correct, it was a mistake.

> then i go back to the basic point of my original
> question about "huffman encoding the common case ...
> should we make "facet.field={!exclude}X" be shorthand for
> "facet.field={!exclude=X}X" ?

We need the latter for more complex scenarios (what is displayed as
one facet may not be), or even for faceting on the same field in
different ways.

As for shorthand, {!exclude} is already shorthand for {!type=exclude}
in localParams syntax.

> Alternately: it seems likely that people who want this type of
> multi-select behavior will want it for all/most of their facets ...

But the relationship between a single "GUI" facet and the
behind-the-scenes Solr facets may not be one-to-one.

> so
> perhaps instead of the {!exclude=X} local param, we should add a new
> facet.multi=(true|false) param that can be specified on a per-field basis
> ... so f.type.facet.multi=true means facet.field=type ignores any fq's
> where "tag=type".

That also wouldn't work for facet.query.

> This would probably even be possible as syntactic sugar
> in addition to the {!exclude=X} syntax, so in the common case people would
> just use facet.multi=true and forget about it,

It's not that simple though, since they have to correctly tag all the filters.
And setting multi=true isn't really descriptive (as you've noted in
the terminology discussions).  Excluding certain filters when faceting
is very descriptive, and avoids any mention of what the GUI layer is
trying to accomplish.

-Yonik

> but in weird esoteric
> situations where they want filters on field X or field Y to be ignored by
> faceting on field Z they could still use "facet.field={!exclude=X,Y}Z"
>
>
> -Hoss


[jira] Commented: (SOLR-922) Solr WebApp wide Executor for better efficient management of threads , separating the logic in the thread from the launch of the same.

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657162#action_12657162
 ] 

Kay Kay commented on SOLR-922:
--

Ok - I believe having a SolrCore-specific Executor / ExecutorService would work 
fine, as plugins that are SolrCoreAware can still make use of it. 

The motivation is to avoid multiple Executor policies (and especially static 
objects) created locally at each place where we need to launch threads - or, 
worse, not using executors at all. 

> Solr WebApp wide Executor for better efficient management of threads , 
> separating the logic in the thread from the launch of the same. 
> ---
>
> Key: SOLR-922
> URL: https://issues.apache.org/jira/browse/SOLR-922
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Priority: Minor
>
> For a different jira - when we were discussing bringing in parallelism 
> through threads and using Executors - encountered a case of using a webapp 
> wide Executor for reusing thread pools for better use of thread resources , 
> instead of thread.start() .  
> pros:  Custom Request Handlers and other plugins to the Solr App server can 
> use this Executor API to retrieve the executor and just submit the Runnable / 
> Callable impls to get the job done while getting the benefits of a thread 
> pool . This might be necessary as we continue to write plugins to the core 
> architecture and centralizing the threadpools might make it easy to control / 
> prevent global Executor objects across the codebase / recreating them locally 
> ( as they might be expensive ). 
> $ find . -name *.java | xargs grep -nr 'start()'  | grep "}"
> ./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/XPathEntityProcessor.java:377:
> }.start();
> ./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataImporter.java:368:
> }.start();
> ./src/java/org/apache/solr/handler/SnapPuller.java:382:}.start();
> ./src/java/org/apache/solr/handler/SnapShooter.java:52:}.start();
> ./src/java/org/apache/solr/handler/ReplicationHandler.java:134:  
> }.start();
> ./src/common/org/apache/solr/common/util/ConcurrentLRUCache.java:112:
> }.start();

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-236) Field collapsing

2008-12-16 Thread Karsten Sperling (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657167#action_12657167
 ] 

Karsten Sperling commented on SOLR-236:
---

I'm pretty sure the problem Stephen ran into is an off-by-one error in the 
bitset allocation inside the collapsing code; I ran into the same problem when 
I customized it for internal use about half a year ago -- and unfortunately 
forgot all about the problem until reading Stephen's comment just now. 
Basically the bitset gets allocated 1 bit too small, so there's about a 1/32 
chance that if the bit for the document with the highest ID gets set it will 
cause the AIOOB exception.
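The sizing mistake described above is easy to reproduce with a minimal fixed-size bitset sketch (illustrative only, not the actual collapsing code):

```java
// Sketch of the bit-to-word sizing bug: a fixed-size bitset backed by long[].
// When the bit count is an exact multiple of 64 the buggy size happens to
// work, which is why the failure only shows up intermittently.
class FixedBitSet {
    private final long[] words;

    FixedBitSet(int numBits, boolean buggy) {
        // buggy: numBits / 64 words -- one word short whenever 64 does not
        //        divide numBits evenly
        // fixed: enough words to hold bit index numBits - 1
        int numWords = buggy ? numBits / 64 : (numBits + 63) / 64;
        words = new long[numWords];
    }

    void set(int bit) {
        words[bit >> 6] |= 1L << (bit & 63); // AIOOBE if words[] is one short
    }
}
```

With numBits = 100, the buggy sizing allocates one long (bits 0-63), so set(99) indexes past the array; the corrected (numBits + 63) / 64 sizing allocates two.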

> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Emmanuel Keller
> Fix For: 1.4
>
> Attachments: collapsing-patch-to-1.3.0-ivan.patch, 
> collapsing-patch-to-1.3.0-ivan_2.patch, 
> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236-FieldCollapsing.patch, solr-236.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before 
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)




[jira] Created: (SOLR-924) Solr: Making finalizers call super.finalize() wrapped in try..finally block

2008-12-16 Thread Kay Kay (JIRA)
Solr: Making finalizers call super.finalize() wrapped in try..finally block 


 Key: SOLR-924
 URL: https://issues.apache.org/jira/browse/SOLR-924
 Project: Solr
  Issue Type: Improvement
 Environment: Tomcat 6, JRE 6
Reporter: Kay Kay


There are some occurrences of finalizers in the code base. While their 
presence is debatable and discussed in a separate JIRA, the ones that are 
retained are better off wrapping their cleanup in a try .. finally block that 
calls the superclass finalizer, for proper resource unwinding in case 
finalizers do get invoked. 
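The pattern the issue proposes looks roughly like this; the class name is illustrative, and (as noted in a later comment) the super call only matters when the superclass is not plain java.lang.Object:

```java
// Illustrative class: a finalizer that wraps its own cleanup in
// try .. finally so super.finalize() runs even if cleanup throws.
class PooledResource {
    private boolean closed = false;

    void close() { closed = true; }

    boolean isClosed() { return closed; }

    @Override
    protected void finalize() throws Throwable {
        try {
            close();            // this class's cleanup; may throw
        } finally {
            super.finalize();   // always unwind the superclass too
        }
    }

    // Test hook: invoke the finalizer directly (illustrative only;
    // the VM normally calls it, if at all).
    void runFinalizer() {
        try {
            finalize();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```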




[jira] Issue Comment Edited: (SOLR-914) Presence of finalize() in the codebase

2008-12-16 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657067#action_12657067
 ] 

kaykay.unique edited comment on SOLR-914 at 12/16/08 2:01 PM:


Separately - it might be worth wrapping the code in a try .. finally { 
super.finalize(); } in all the custom finalizers, for better code correctness. 

JIRA SOLR-924 has been logged for this; a patch has been submitted for the new 
JIRA as well. 

  was (Author: kaykay.unique):
Separately - it might be worth wrapping the code in a try .. 
finally { super.finalize(); } in all the custom finalizers, for better code 
correctness. Let me know what you think.  
  
> Presence of finalize() in the codebase 
> ---
>
> Key: SOLR-914
> URL: https://issues.apache.org/jira/browse/SOLR-914
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Affects Versions: 1.3
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Priority: Minor
> Fix For: 1.4
>
>   Original Estimate: 480h
>  Remaining Estimate: 480h
>
> There seem to be a number of classes that implement the finalize() method.  
> Given that it is perfectly OK for a Java VM never to call it, maybe there 
> has to be some other way { try .. finally, when they are created, to 
> destroy them } of guaranteeing destruction; the presence of a finalize() 
> method (depending on implementation) might not serve what we want, and in 
> some cases can end up delaying the gc process, depending on the algorithms. 
> $ find . -name '*.java' | xargs grep finalize
> ./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/JdbcDataSource.java:
>   protected void finalize() {
> ./src/java/org/apache/solr/update/SolrIndexWriter.java:  protected void 
> finalize() {
> ./src/java/org/apache/solr/core/CoreContainer.java:  protected void 
> finalize() {
> ./src/java/org/apache/solr/core/SolrCore.java:  protected void finalize() {
> ./src/common/org/apache/solr/common/util/ConcurrentLRUCache.java:  protected 
> void finalize() throws Throwable {
> Maybe we need to revisit these occurrences from a design perspective to see 
> if they are necessary / if there is an alternate way of managing guaranteed 
> destruction of resources. 
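A sketch of the alternative the issue hints at: deterministic try .. finally cleanup at the call site, instead of relying on the VM to invoke finalize(). All names here are illustrative, not from the Solr codebase:

```java
// A resource whose destruction is guaranteed by the caller, not the GC.
final class TrackedWriter {
    private boolean open = true;

    void write(String s) {
        if (!open) throw new IllegalStateException("writer is closed");
    }

    void close() { open = false; }

    boolean isOpen() { return open; }

    // Use the resource and guarantee cleanup without a finalizer.
    static boolean useAndClose() {
        TrackedWriter w = new TrackedWriter();
        try {
            w.write("doc");
        } finally {
            w.close();   // runs even if write() throws
        }
        return !w.isOpen();
    }
}
```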




[jira] Updated: (SOLR-924) Solr: Making finalizers call super.finalize() wrapped in try..finally block

2008-12-16 Thread Kay Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated SOLR-924:
-

Attachment: SOLR-924.patch

The following classes now wrap their finalizers in a try .. finally clause so 
that the superclass finalizer is always invoked. 

Index: 
contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/JdbcDataSource.java
Index: src/java/org/apache/solr/update/SolrIndexWriter.java
Index: src/java/org/apache/solr/core/SolrCore.java
Index: src/java/org/apache/solr/core/CoreContainer.java
Index: src/common/org/apache/solr/common/util/ConcurrentLRUCache.java


> Solr: Making finalizers call super.finalize() wrapped in try..finally block 
> 
>
> Key: SOLR-924
> URL: https://issues.apache.org/jira/browse/SOLR-924
> Project: Solr
>  Issue Type: Improvement
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
> Attachments: SOLR-924.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> There are some occurrences of finalizers in the code base. While their 
> presence is debatable and discussed in a separate JIRA, the ones that are 
> retained are better off wrapping their cleanup in a try .. finally block that 
> calls the superclass finalizer, for proper resource unwinding in case 
> finalizers do get invoked. 




[jira] Updated: (SOLR-924) Solr: Making finalizers call super.finalize() wrapped in try..finally block

2008-12-16 Thread Kay Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated SOLR-924:
-

Component/s: replication (java)
 contrib - DataImportHandler

> Solr: Making finalizers call super.finalize() wrapped in try..finally block 
> 
>
> Key: SOLR-924
> URL: https://issues.apache.org/jira/browse/SOLR-924
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler, replication (java)
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-924.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> There are some occurrences of finalizers in the code base. While their 
> presence is debatable and discussed in a separate JIRA, the ones that are 
> retained are better off wrapping their cleanup in a try .. finally block that 
> calls the superclass finalizer, for proper resource unwinding in case 
> finalizers do get invoked. 




[jira] Updated: (SOLR-924) Solr: Making finalizers call super.finalize() wrapped in try..finally block

2008-12-16 Thread Kay Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated SOLR-924:
-

Fix Version/s: 1.4
 Priority: Minor  (was: Major)

> Solr: Making finalizers call super.finalize() wrapped in try..finally block 
> 
>
> Key: SOLR-924
> URL: https://issues.apache.org/jira/browse/SOLR-924
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler, replication (java)
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-924.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> There are some occurrences of finalizers in the code base. While their 
> presence is debatable and discussed in a separate JIRA, the ones that are 
> retained are better off wrapping their cleanup in a try .. finally block that 
> calls the superclass finalizer, for proper resource unwinding in case 
> finalizers do get invoked. 




[jira] Updated: (SOLR-916) CoreContainer :: register(String, SolrCore, boolean) documentation clarification about returnPrev argument

2008-12-16 Thread Kay Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated SOLR-916:
-

Fix Version/s: (was: 1.3.1)
   1.4

> CoreContainer :: register(String, SolrCore, boolean)  documentation 
> clarification about returnPrev argument
> ---
>
> Key: SOLR-916
> URL: https://issues.apache.org/jira/browse/SOLR-916
> Project: Solr
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.3
> Environment: Tomcat 6, JRE 6 
>Reporter: Kay Kay
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-916.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> In CoreContainer.java :: register(name, core, returnPrev) - the documentation 
> says 
>   it would return a previous core having the same name if it existed *and 
> returnPrev = true*.
>   * @return a previous core having the same name if it existed and 
> returnPrev==true
>   */
>  public SolrCore register(String name, SolrCore core, boolean returnPrev) ..
> But as per the code towards the end, the previous core is returned anyway, 
> irrespective of the value of returnPrev. The difference, though, seems to be 
> that when returnPrev is false, the previous core (of the same name, if it 
> exists) is closed.
> Which of the two is correct? If the code is correct, the parameter would be 
> better renamed closePrevious, as opposed to returnPrevious.
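The behavior the comment describes can be modeled with a toy registry. This mimics the semantics as described above; it is not the actual CoreContainer code, and all names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the described register() behavior: the previous core is
// returned either way; returnPrev only controls whether it gets closed.
final class ToyCoreRegistry {
    static final class ToyCore {
        boolean closed = false;
        void close() { closed = true; }
    }

    private final Map<String, ToyCore> cores = new HashMap<>();

    ToyCore register(String name, ToyCore core, boolean returnPrev) {
        ToyCore prev = cores.put(name, core);
        if (prev != null && !returnPrev) {
            prev.close();   // when returnPrev == false the old core is closed
        }
        return prev;        // returned regardless of the flag
    }
}
```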




Re: ant example, tika

2008-12-16 Thread Chris Hostetter

: I think, eventually, and I really hate to say this b/c classloading is a
: nightmare, but we may want to look into isolated classloaders or OSGi or
: something for the Solr Home lib directory.  The benefits being that I already
: see library collisions in our future.

we already have classloaders isolated by SolrCore.  multiple plugins used 
by the same core with conflicting dependencies might cause library 
collisions, but i don't see how that would be much different using any 
other approach.  if we switch to a config management framework that 
already has "plugin management" as a feature then by all means let's use 
it, but i don't think we need to go looking for a change -- we've already 
done the hard work.


-Hoss



[jira] Commented: (SOLR-906) Buffered / Streaming SolrServer implementaion

2008-12-16 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657278#action_12657278
 ] 

Noble Paul commented on SOLR-906:
-

another observation :
why do we need a ScheduledExecutorService we only need a 
ThreadPoolExecutorService

> Buffered / Streaming SolrServer implementaion
> -
>
> Key: SOLR-906
> URL: https://issues.apache.org/jira/browse/SOLR-906
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Reporter: Ryan McKinley
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-906-StreamingHttpSolrServer.patch, 
> SOLR-906-StreamingHttpSolrServer.patch, StreamingHttpSolrServer.java
>
>
> While indexing lots of documents, the CommonsHttpSolrServer add( 
> SolrInputDocument ) is less than optimal.  This makes a new request for each 
> document.
> With a "StreamingHttpSolrServer", documents are buffered and then written to 
> a single open HTTP connection.
> For related discussion see:
> http://www.nabble.com/solr-performance-tt9055437.html#a20833680
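The buffering idea can be sketched like this; `BufferedAdder` is a hypothetical stand-in for illustration (the actual attachment is `StreamingHttpSolrServer`), and the drain step stands in for streaming the batch over one open connection:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: add() enqueues documents cheaply, and a drain step writes the
// whole batch at once instead of making one request per document.
final class BufferedAdder {
    private final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();

    void add(String doc) {
        buffer.offer(doc);   // non-blocking; no network round trip here
    }

    // In the real server this would stream to an open HTTP connection.
    List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        buffer.drainTo(batch);
        return batch;
    }
}
```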




[jira] Commented: (SOLR-922) Solr WebApp wide Executor for better efficient management of threads , separating the logic in the thread from the launch of the same.

2008-12-16 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657279#action_12657279
 ] 

Noble Paul commented on SOLR-922:
-

hi Kay 
I am in agreement with the general goal of having a central ExecutorService. 
But the risk is that we will never know which consumer is eating up the 
threads, and other consumers will be starved for threads. These things 
become hard to debug.


Leave aside DIH, because it almost entirely needs its own threads and uses 
them at 100%. And leave aside the classes in common.util.

Probably SnapShooter / SnapPuller could use it.

> Solr WebApp wide Executor for better efficient management of threads , 
> separating the logic in the thread from the launch of the same. 
> ---
>
> Key: SOLR-922
> URL: https://issues.apache.org/jira/browse/SOLR-922
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Priority: Minor
>
> For a different jira - when we were discussing bringing in parallelism 
> through threads and using Executors - encountered a case of using a webapp 
> wide Executor for reusing thread pools for better use of thread resources , 
> instead of thread.start() .  
> pros: Custom Request Handlers and other plugins to the Solr app server can 
> use this Executor API to retrieve the executor and just submit Runnable / 
> Callable impls to get the job done, while getting the benefits of a thread 
> pool. This might be necessary as we continue to write plugins against the core 
> architecture; centralizing the thread pools would make it easier to control / 
> prevent global Executor objects scattered across the codebase, or recreating 
> them locally (as they might be expensive). 
> $ find . -name '*.java' | xargs grep -n 'start()' | grep "}"
> ./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/XPathEntityProcessor.java:377:
> }.start();
> ./contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataImporter.java:368:
> }.start();
> ./src/java/org/apache/solr/handler/SnapPuller.java:382:}.start();
> ./src/java/org/apache/solr/handler/SnapShooter.java:52:}.start();
> ./src/java/org/apache/solr/handler/ReplicationHandler.java:134:  
> }.start();
> ./src/common/org/apache/solr/common/util/ConcurrentLRUCache.java:112:
> }.start();




[jira] Issue Comment Edited: (SOLR-906) Buffered / Streaming SolrServer implementaion

2008-12-16 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657278#action_12657278
 ] 

noble.paul edited comment on SOLR-906 at 12/16/08 8:16 PM:
---

another observation : 
why do we need a ScheduledExecutorService? We only need a plain 
ThreadPoolExecutor.
The name of the class is somewhat misleading. We should document that it is 
meant exclusively for updates.

How about renaming it to StreamingUpdateSolrServer?

  was (Author: noble.paul):
another observation : 
why do we need a ScheduledExecutorService? We only need a plain 
ThreadPoolExecutor.
  
> Buffered / Streaming SolrServer implementaion
> -
>
> Key: SOLR-906
> URL: https://issues.apache.org/jira/browse/SOLR-906
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Reporter: Ryan McKinley
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-906-StreamingHttpSolrServer.patch, 
> SOLR-906-StreamingHttpSolrServer.patch, StreamingHttpSolrServer.java
>
>
> While indexing lots of documents, the CommonsHttpSolrServer add( 
> SolrInputDocument ) is less than optimal.  This makes a new request for each 
> document.
> With a "StreamingHttpSolrServer", documents are buffered and then written to 
> a single open HTTP connection.
> For related discussion see:
> http://www.nabble.com/solr-performance-tt9055437.html#a20833680




Re: [VOTE] LOGO

2008-12-16 Thread Mike Klaas


On 13-Dec-08, at 2:52 PM, Ryan McKinley wrote:


Ok, all votes are cast (except Grant who is abstaining)


Thanks for tallying the votes, Ryan.  You're too damn quick for me!

-Mike



[jira] Commented: (SOLR-924) Solr: Making finalizers call super.finalize() wrapped in try..finally block

2008-12-16 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657293#action_12657293
 ] 

Noble Paul commented on SOLR-924:
-

Why do we need to call super.finalize() when the superclass is 
java.lang.Object?

It achieves no purpose.


> Solr: Making finalizers call super.finalize() wrapped in try..finally block 
> 
>
> Key: SOLR-924
> URL: https://issues.apache.org/jira/browse/SOLR-924
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler, replication (java)
> Environment: Tomcat 6, JRE 6
>Reporter: Kay Kay
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-924.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> There are some occurences of finalizers in the code base. While the presence 
> of them is debatable and discussed in a separate JIRA - the ones that are 
> retained are better off wrapped around a try .. finally block to recursively 
> call the finalizer of the super class for proper resource usage unwinding , 
> (in case finalizers get invoked ). 




[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

2008-12-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657299#action_12657299
 ] 

Hoss Man commented on SOLR-912:
---

If i'm understanding the discussion so far...

* ModernNamedList is being suggested as an alternate implementation of 
NamedList ... ideally the internals of NamedList would be replaced with the 
internals of ModernNamedList, but in this patch they are separate classes so 
they can be compared.
* INamedList is included in the patch as a way to demonstrate that 
ModernNamedList fulfills the same contract as NamedList (for the purposes of 
testing etc)

do i have those aspects correct?

with that in mind: i'm not sure i understand what "itch" changing the 
implementation "scratches" ... the initial issue description says it's because 
NamedList " is not necessarily type-safe" but it's not clear what that 
statement is referring to ... later comments suggest that the motivation is to 
improve the performance of "remove" ... which hardly seems like something worth 
optimizing for.

I agree that having the internals based on a "list of pairs" certainly seems 
like it might be more intuitive to developers looking at the internals (than 
the current approach is), but how is the current approach less type-safe for 
consumers using just the NamedList API?

If the "modern" approach is more performant than the existing impl and passes 
all of the tests then i suppose it would make sense to switch -- but i'm far 
more interested in how the performance compares for common cases 
(add/get/iterate) than for cases that hardly ever come up (remove).

My suggestion: provide two independent attachments.  One patch that just 
replaces the internals of NamedList with the approach you suggest, so people 
can apply the patch, test it out, and verify the API/behavior; a second 
attachment that provides some benchmarks against the NamedList class -- so 
people can read/run your benchmark with and without the patch to see how the 
performance changes.
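The "list of pairs" internal representation under discussion can be sketched as follows. This is illustrative only -- not the patch's ModernNamedList -- but it shows the properties being debated: ordered entries, duplicate and null keys allowed, and generically typed values:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of a NamedList-style structure backed by an ordered list of
// (name, value) pairs rather than interleaved name/value slots.
final class PairList<T> {
    private final List<Map.Entry<String, T>> entries = new ArrayList<>();

    void add(String name, T value) {
        entries.add(new SimpleEntry<>(name, value));
    }

    // First value whose key equals name; null keys are permitted.
    T get(String name) {
        for (Map.Entry<String, T> e : entries) {
            if (name == null ? e.getKey() == null : name.equals(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    int size() { return entries.size(); }
}
```

A benchmark comparing this layout against the existing interleaved layout for add/get/iterate would be the natural companion attachment.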


> org.apache.solr.common.util.NamedList - Typesafe efficient variant - 
> ModernNamedList introduced - implementing the same API as NamedList
> 
>
> Key: SOLR-912
> URL: https://issues.apache.org/jira/browse/SOLR-912
> Project: Solr
>  Issue Type: Improvement
>  Components: Analysis
>Affects Versions: 1.4
> Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>Reporter: Kay Kay
>Priority: Minor
> Fix For: 1.3.1
>
> Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily 
> type-safe. I have implemented an additional implementation of the same - 
> ModernNamedList (a type-safe variation providing the same interface as 
> NamedList) - while preserving the semantics in terms of ordering of elements 
> and allowing null elements for keys and values (keys are always Strings, 
> while values correspond to generics). 




Re: [Solr Wiki] Update of "SolrLogging" by HossMan

2008-12-16 Thread Chris Hostetter

: The following page has been changed by HossMan:
: http://wiki.apache.org/solr/SolrLogging

FWIW: It would be great if someone who actually knew what they were 
talking about (*cough* ryan *cough*) could take a pass at this and make 
it, you know, correct.  

I was basically just guessing


-Hoss