Re: [Why] some PyLucene tests fail on Windows
On Tue, 8 May 2012, Thomas Koch wrote:

> There's a known issue with some PyLucene tests that fail on Windows, reported and discussed before; see
> http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/201104.mbox/000d01cbfa7b$c60ca530$5225ef90$@de
>
> While some tests have been fixed, others still show errors or fail on Windows. Currently these are:
>
> ERROR: test_removeDocument (__main__.PythonDirectoryTests)
> FAIL: testTiming (lia.indexing.CompoundVersusMultiFileIndexTest.CompoundVersusMultiFileIndexTest)
>
> I had a look at this and tried to figure out why the PythonDirectoryTests fail. A possible cause is a Windows issue with os.unlink where in certain situations (timing issue!) a file that is going to be deleted is still locked by Windows. This may happen because some process that is notified about the file removal (something like an indexer or virus checker) still holds a lock on it. Another reason could simply be that some tests keep files open. Anyway:
>
> "On Windows, attempting to remove a file that is in use causes an exception to be raised."
> (from the Python doc on os.remove(path) and unlink(path))
>
> The traceback below shows that the test in PythonDirectoryTests fails in deleteFile (where the lock is on an index file like '_0.tis' or '_0.fdt').
>
> There's been a similar problem with Python unit tests (on Windows) as discussed here:
> http://bugs.python.org/issue7443
> (I can't see that they came to a solution though; it's still open.)
>
> I've added a special hack for Windows to test_PythonDirectory and added a try-delete-wait-retry loop there. This shows that it's probably not a timing issue (after 10 secs/retries, the file still cannot be deleted), so it rather looks like the index files are still open! (I couldn't find a bug in the Python code though.)
>
> Thus I tried to simply ignore the exception - the files are then *not* removed, but the test succeeds:
>
> ...
> indexing 98
> indexing 99
> failed to delete: testpyrepo\_0.tis
> failed to delete: testpyrepo\_0.nrm
> failed to delete: testpyrepo\_0.frq
> failed to delete: testpyrepo\_0.fdx
> failed to delete: testpyrepo\_0.prx
> failed to delete: testpyrepo\_0.fdt
> failed to delete: testpyrepo\_0.tis
> failed to delete: testpyrepo\_0.nrm
> failed to delete: testpyrepo\_0.frq
> failed to delete: testpyrepo\_0.fdx
> failed to delete: testpyrepo\_0.prx
> failed to delete: testpyrepo\_0.fdt
> ...
> Ran 10 tests in 3.659s
>
> (The "failed to delete:" line is a debug statement, to be removed.)
>
> I could provide a patch that allows test_PythonDirectory to pass on Windows, if that's an accepted solution. Still, I see a possibility that PythonDirectory has a bug (i.e. index files are not always closed); on the other hand, it could be a timing issue or something related to my environment (?)

I don't think that hiding failures under the rug is the right thing to do.

> I wonder if anyone has used PythonDirectory yet (in production) or if someone else came across these issues on Windows? (I'm not using PythonDirectory and haven't had similar lock issues except when running the tests.) If someone could confirm that test_PythonDirectory passes in his/her Windows environment, that would be good to know ,-)

PythonDirectory is meant as an example of a Python extension and tests that the extension is functional. It's not intended as a production-level implementation.

Andi..

> Haven't had a look at testTiming yet.
>
> Regards,
> Thomas
>
> --
> Details:
> ==
> ERROR: test_removeDocument (__main__.PythonDirectoryTests)
> --
> Traceback (most recent call last):
>   File F:\Devel\workspaces\workspace.pylucene\pylucene-3.6.0-2\test\test_PyLucene.py, line 178, in test_removeDocument
>     self.closeStore(store, searcher, reader)
>   File test_PythonDirectory.py, line 241, in closeStore
>     arg.close()
> JavaError: org.apache.jcc.PythonException: (32, 'The process cannot access the file because it is being used by another process', u'testpyrepo_0.tis')
> Traceback (most recent call last):
>   File test_PythonDirectory.py, line 181, in deleteFile
>     os.unlink(os.path.join(self.path, name))
> WindowsError: [Error 32] The process cannot access the file because it is being used by another process: u'testpyrepo\\_0.tis'
>
> Java stacktrace:
> ...
>   at org.apache.pylucene.store.PythonDirectory.deleteFile(Native Method)
>   at org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:578)
>   at org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:517)
>   at org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:504)
>   at org.apache.lucene.index.IndexFileDeleter.close(IndexFileDeleter.java:377)
>   at org.apache.lucene.index.DirectoryReader.doCommit(DirectoryReader.java:854)
>   at org.apache.lucene.index.IndexReader.commit(IndexReader.java:1520)
>   at
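
For readers who want to repeat the experiment, the try-delete-wait-retry loop mentioned in the message might look roughly like the sketch below. It is not the code that was added to test_PythonDirectory (that code is not shown in the thread); the function name unlink_with_retry and its parameters are made up for illustration.

```python
import os
import time


def unlink_with_retry(path, retries=10, delay=1.0):
    """Try to delete path, sleeping and retrying when the OS refuses.

    Returns True once the file has been removed and False if every
    attempt failed. The name and parameters are hypothetical.
    """
    for attempt in range(retries):
        try:
            os.unlink(path)
            return True
        except OSError:
            # On Windows this is typically error 32 (file in use).
            # WindowsError is a subclass of OSError.
            if attempt < retries - 1:
                time.sleep(delay)
    return False
```

If the file is still locked after all retries, as reported above, the function returns False, which points at an open file handle rather than a timing problem.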
Easy way to find JAVA_HOME
Hello,

I found a much easier way to detect the path to JAVA_HOME on Unix-like platforms where the java command is in the search path. java -verbose prints out the paths of all loaded JAR files.

Christian

---
import subprocess
import re
import os

PATH_RE = re.compile(r"Loaded\ .*\ from\ (.*)/jre/lib/rt\.jar")


def find_java():
    """Find java home by running java -verbose

    In verbose mode, java prints lines like

    [Loaded java.lang.Object from /usr/lib/jvm/java-6-openjdk-amd64/jre/lib/rt.jar]

    to stdout.
    """
    try:
        # universal_newlines=True yields text output on Python 2 and 3
        proc = subprocess.Popen(["java", "-verbose"], stdout=subprocess.PIPE,
                                universal_newlines=True)
    except OSError:
        # java is not on the search path
        return None
    out, err = proc.communicate()
    for line in out.split("\n"):
        mo = PATH_RE.search(line)
        if mo is None:
            continue
        javahome = mo.group(1)
        if os.path.isdir(javahome):
            return javahome
    return None


if __name__ == "__main__":
    print(find_java())
---
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #480: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/480/ No tests ran. Build Log (for compile errors): [...truncated 6847 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 13874 - Failure
I'll fix.

On Tue, May 8, 2012 at 5:36 PM, Apache Jenkins Server jenk...@builds.apache.org wrote:

> Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13874/
>
> All tests passed
>
> Build Log (for compile errors):
> [...truncated 24376 lines...]

--
Chris Male | Software Developer | DutchWorks | www.dutchworks.nl
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13875 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13875/

All tests passed

Build Log (for compile errors):
[...truncated 24129 lines...]
Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 13875 - Still Failing
Fixed in r1335354.

On Tue, May 8, 2012 at 6:32 PM, Apache Jenkins Server jenk...@builds.apache.org wrote:

> Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13875/
>
> All tests passed
>
> Build Log (for compile errors):
> [...truncated 24129 lines...]

--
Chris Male | Software Developer | DutchWorks | www.dutchworks.nl
Lucene Index Size and Performance
Hi,

I have an index of size 1 GB. Each of its documents consists of five fields, which are used for search. A single result takes 30 to 40 milliseconds. I want to reduce this time. How can I do this? Does search performance depend on the index size? What is the maximum number of documents that can be added to an index?

--
View this message in context: http://lucene.472066.n3.nabble.com/Lucene-Index-Size-and-Performance-tp3970551.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13876 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13876/

All tests passed

Build Log (for compile errors):
[...truncated 24092 lines...]
[jira] [Resolved] (LUCENE-4039) Add AddIndexesTask to Benchmark
[ https://issues.apache.org/jira/browse/LUCENE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-4039.

Resolution: Fixed

Committed revision 1335363.

Add AddIndexesTask to Benchmark
---
Key: LUCENE-4039
URL: https://issues.apache.org/jira/browse/LUCENE-4039
Project: Lucene - Java
Issue Type: New Feature
Components: modules/benchmark
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
Fix For: 4.0
Attachments: LUCENE-4039.patch

I was interested in measuring the performance of IndexWriter.addIndexes(Directory) vs. IndexWriter.addIndexes(IndexReader). I wrote an AddIndexesTask and a matching .alg. The task takes a parameter that selects whether to use the IndexReader or the Directory variant. I'll upload the patch and describe the perf results.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (LUCENE-4038) some testcases not executed by 'ant test'
[ https://issues.apache.org/jira/browse/LUCENE-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270307#comment-13270307 ]

Dawid Weiss commented on LUCENE-4038:

That's the way it's always been -- I didn't change it when switching to junit4 (I think).

some testcases not executed by 'ant test'
---
Key: LUCENE-4038
URL: https://issues.apache.org/jira/browse/LUCENE-4038
Project: Lucene - Java
Issue Type: Bug
Reporter: Robert Muir
Fix For: 4.0

Look under 'spatial': RecursivePrefixTreeStrategyTestCase and TwoDoublesStrategyTestCase don't get invoked. I suspect this is something in junit4 that doesn't like the fact that these classes extend a base class that takes a generic type? Because if I just click 'run tests' from this folder in my IDE, then they run.
Re: Lucene Index Size and Performance
What's the hardware configuration of your machines? If you have enough RAM, you could use RAMDirectory.

On Tue, May 8, 2012 at 2:52 PM, parkhekishor kishor.par...@highmark.in wrote:

> Hi, I have Index with size 1GB. Its each documents consist five Fields which are use for search.For single result it take 30 to 40 milliseconds.I want to reduce this time.How can I do this? is search performance depends on a Index size? What is maximum capacity to add Documents in Index?
Re: Multi-content-type /update handler
+1 !!

On May 7, 2012, at 20:28, Ryan McKinley wrote:

> I'd like to commit SOLR-2857 soon -- it would be great for 4.0 to assume XML/JSON/CSV/JAVABIN at the same endpoint rather than 4 configured RequestHandlers.
>
> The bulk of the patch is refactoring the tests to all point to the same handler.
>
> Any objections? If so, should we improve things on trunk or more patches?
>
> ryan
[jira] [Updated] (LUCENE-4022) Offline Sorter wrongly uses MIN_BUFFER_SIZE if there is more memory available
[ https://issues.apache.org/jira/browse/LUCENE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-4022:

Attachment: LUCENE-4022.patch

Here is a patch with a slightly changed algorithm. It still takes free/2 as the base buffer size but checks if it is reasonable to grow the heap if the total available mem is 10x larger than the free memory or if the free memory is smaller than MIN_BUFFER_SIZE_MB. If we run into small heaps like on mobile phones where you only have up to 3MB, this falls back to the 1/2 or the ABSOLUTE_MIN_SORT_BUFFER_SIZE. The actual buffer size is bounded by Integer.MAX_VALUE.

Offline Sorter wrongly uses MIN_BUFFER_SIZE if there is more memory available
---
Key: LUCENE-4022
URL: https://issues.apache.org/jira/browse/LUCENE-4022
Project: Lucene - Java
Issue Type: Bug
Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
Fix For: 4.0, 3.6.1
Attachments: LUCENE-4022.patch

The Sorter we use for offline sorting seems to use the MIN_BUFFER_SIZE as an upper bound even if there is more memory available. See this snippet:

{code}
long half = free/2;
if (half >= ABSOLUTE_MIN_SORT_BUFFER_SIZE) {
  return new BufferSize(Math.min(MIN_BUFFER_SIZE_MB * MB, half));
}
// by max mem (heap will grow)
half = (max - total) / 2;
return new BufferSize(Math.min(MIN_BUFFER_SIZE_MB * MB, half));
{code}

We should use Math.max instead of min here.
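
To see why Math.min turns MIN_BUFFER_SIZE_MB into an upper bound, the snippet from the report can be transliterated into Python and compared with the naive Math.max variant the report suggests. This is only a sketch: the constant values below are assumptions for illustration, the function names are made up, and the attached patch uses a more careful algorithm that is not reproduced here.

```python
MB = 1024 * 1024
MIN_BUFFER_SIZE_MB = 32                  # assumed value, illustration only
ABSOLUTE_MIN_SORT_BUFFER_SIZE = MB // 2  # assumed value, illustration only
INT_MAX = 2**31 - 1                      # Java's Integer.MAX_VALUE


def buffer_size_reported(free, total, max_mem):
    """Mirror of the reported snippet: min() caps the result."""
    half = free // 2
    if half >= ABSOLUTE_MIN_SORT_BUFFER_SIZE:
        return min(MIN_BUFFER_SIZE_MB * MB, half)
    # by max mem (heap will grow)
    half = (max_mem - total) // 2
    return min(MIN_BUFFER_SIZE_MB * MB, half)


def buffer_size_with_max(free, total, max_mem):
    """Same flow with max(), as suggested in the report, bounded by INT_MAX."""
    half = free // 2
    if half >= ABSOLUTE_MIN_SORT_BUFFER_SIZE:
        return min(INT_MAX, max(MIN_BUFFER_SIZE_MB * MB, half))
    half = (max_mem - total) // 2
    return min(INT_MAX, max(MIN_BUFFER_SIZE_MB * MB, half))
```

With 1 GB of free memory, the first function never returns more than MIN_BUFFER_SIZE_MB megabytes, while the second returns half of the free memory. Note that the max() variant would also hand out MIN_BUFFER_SIZE_MB on a tiny heap, which is presumably why the patch adds the small-heap fallback described above.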
[jira] [Assigned] (LUCENE-4022) Offline Sorter wrongly uses MIN_BUFFER_SIZE if there is more memory available
[ https://issues.apache.org/jira/browse/LUCENE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer reassigned LUCENE-4022:

Assignee: Simon Willnauer

Offline Sorter wrongly uses MIN_BUFFER_SIZE if there is more memory available
---
Key: LUCENE-4022
URL: https://issues.apache.org/jira/browse/LUCENE-4022
Project: Lucene - Java
Issue Type: Bug
Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Fix For: 4.0, 3.6.1
Attachments: LUCENE-4022.patch
[jira] [Created] (LUCENE-4042) New snowball stemmers (Irish gaelic and Czech)
Dawid Weiss created LUCENE-4042:

Summary: New snowball stemmers (Irish gaelic and Czech)
Key: LUCENE-4042
URL: https://issues.apache.org/jira/browse/LUCENE-4042
Project: Lucene - Java
Issue Type: New Feature
Reporter: Dawid Weiss
Priority: Trivial
Fix For: 4.0

New stemmers have been added to snowball (Irish gaelic and Czech).
[jira] [Updated] (SOLR-2834) AnalysisResponseBase.java doesn't handle org.apache.solr.analysis.HTMLStripCharFilter
[ https://issues.apache.org/jira/browse/SOLR-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shane updated SOLR-2834:

Affects Version/s: 3.6

AnalysisResponseBase.java doesn't handle org.apache.solr.analysis.HTMLStripCharFilter
---
Key: SOLR-2834
URL: https://issues.apache.org/jira/browse/SOLR-2834
Project: Solr
Issue Type: Bug
Components: clients - java, Schema and Analysis
Affects Versions: 3.4, 3.6
Reporter: Shane

When using FieldAnalysisRequest.java to analyze a field, a ClassCastException is thrown if the schema defines the filter org.apache.solr.analysis.HTMLStripCharFilter. The exception is:

java.lang.ClassCastException: java.lang.String cannot be cast to java.util.List
  at org.apache.solr.client.solrj.response.AnalysisResponseBase.buildPhases(AnalysisResponseBase.java:69)
  at org.apache.solr.client.solrj.response.FieldAnalysisResponse.setResponse(FieldAnalysisResponse.java:66)
  at org.apache.solr.client.solrj.request.FieldAnalysisRequest.process(FieldAnalysisRequest.java:107)

My schema definition is:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory" />
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.StandardFilterFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>

The relevant part of the response is:

<lst name="query">
  <str name="org.apache.solr.analysis.HTMLStripCharFilter">testing analysis</str>
  <arr name="org.apache.lucene.analysis.standard.StandardTokenizer">
    <lst>...

A simplistic fix would be to test if the Entry value is an instance of List.
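
The proposed check can be illustrated with a small Python analogue. This is not the SolrJ code or the attached patch (which is Java and not shown here); build_phases, the pair-list input, and the token dictionaries are all hypothetical stand-ins for the parsed response. What happens to a string-valued entry is also an assumption: here it is kept as a phase with no tokens.

```python
def build_phases(analysis):
    """Collect (phase name, token list) pairs from a parsed analysis response.

    analysis is a list of (name, value) pairs. A tokenizer or filter entry
    carries a list of tokens, while a char filter entry carries a plain
    string, which is what triggered the ClassCastException in the report.
    """
    phases = []
    for name, value in analysis:
        if isinstance(value, list):
            tokens = value
        else:
            # Char filter output is a string, not a token list; keep the
            # phase but with no tokens (an assumption of this sketch).
            tokens = []
        phases.append((name, tokens))
    return phases
```

The type test replaces the unconditional cast, so a response that mixes string and list values no longer fails.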
[jira] [Updated] (SOLR-2834) AnalysisResponseBase.java doesn't handle org.apache.solr.analysis.HTMLStripCharFilter
[ https://issues.apache.org/jira/browse/SOLR-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shane updated SOLR-2834:

Attachment: AnalysisResponseBase.patch

Patch file for the fix that checks if the Entry value is an instance of List.

AnalysisResponseBase.java doesn't handle org.apache.solr.analysis.HTMLStripCharFilter
---
Key: SOLR-2834
URL: https://issues.apache.org/jira/browse/SOLR-2834
Project: Solr
Issue Type: Bug
Components: clients - java, Schema and Analysis
Affects Versions: 3.4, 3.6
Reporter: Shane
Attachments: AnalysisResponseBase.patch
AW: [VOTE] Release PyLucene 3.6.0 rc2
I could build JCC and PyLucene on Win7-32 with Python27 and Java16. The ivy thing gets installed automatically.

All tests pass except for the PythonDirectoryTests and testTiming. However, there's a known issue about some tests that fail on Windows, thus this shouldn't be a release blocker. I've investigated a bit further on the test issue and will send a separate email to the list about it.

+1 for PyLucene 3.6.0 rc2

Regards,
Thomas

-----Original Message-----
From: Andi Vajda [mailto:va...@apache.org]
Sent: Tuesday, 8 May 2012 02:20
To: pylucene-...@lucene.apache.org
Cc: gene...@lucene.apache.org
Subject: [VOTE] Release PyLucene 3.6.0 rc2

> The ivy requirement for building Lucene Java is now handled by PyLucene's Makefile. A new release candidate is available for review.
>
> The PyLucene 3.6.0-2 release tracking the recent release of Apache Lucene 3.6.0 is ready.
>
> A release candidate is available from:
> http://people.apache.org/~vajda/staging_area/
>
> A list of changes in this release can be seen at:
> http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_6/CHANGES
>
> PyLucene 3.6.0 is built with JCC 2.13 included in these release artifacts:
> http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/CHANGES
>
> A list of Lucene Java changes can be seen at:
> http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_6_0/lucene/CHANGES.txt
>
> Please vote to release these artifacts as PyLucene 3.6.0-2.
>
> Thanks !
>
> Andi..
>
> ps: the KEYS file for PyLucene release signing is at:
> http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS
> http://people.apache.org/~vajda/staging_area/KEYS
>
> pps: here is my +1
[jira] [Commented] (SOLR-3221) Make Shard handler threadpool configurable
[ https://issues.apache.org/jira/browse/SOLR-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270498#comment-13270498 ]

Mark Miller commented on SOLR-3221:

bq. I am loathe to submit a patch for changing the CHANGES.txt

Usually the committer handles this - no rules about it though.

I'd go with something a bit shorter - no need to get into the gritty details - that's why the JIRA issue number is there. I'd stick to something closer to "Make shard handler threadpool configurable." or "Added the ability to directly configure aspects of the concurrency and thread-pooling used within distributed search in solr."

Make Shard handler threadpool configurable
---
Key: SOLR-3221
URL: https://issues.apache.org/jira/browse/SOLR-3221
Project: Solr
Issue Type: Improvement
Affects Versions: 3.6, 4.0
Reporter: Greg Bowyer
Assignee: Erick Erickson
Labels: distributed, http, shard
Fix For: 3.6, 4.0
Attachments: SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch

From profiling of monitor contention, as well as observations of the 95th and 99th percentile response times for nodes that perform distributed search (or "aggregator" nodes), it would appear that the HttpShardHandler code currently does a suboptimal job of managing outgoing shard-level requests.

Presently the code contained within lucene 3.5's SearchHandler and Lucene trunk / 3x's ShardHandlerFactory creates arbitrary threads in order to service distributed search requests. This is done presently to limit the size of the threadpool such that it does not consume resources in deployment configurations that do not use distributed search.

This unfortunately has two impacts on the response time if the node coordinating the distribution is under high load.

1. The usage of the MaxConnectionsPerHost configuration option results in aggressive activity on semaphores within HttpCommons; it has been observed that the aggregator can have a response time far greater than that of the searchers.

2. The above monitor contention would appear to suggest that in some cases it's possible for liveness issues to occur and for simple queries to be starved of resources simply due to a lack of attention from the viewpoint of context switching, with, as mentioned above, the http commons connection being hotly contended.

The fair, queue-based configuration eliminates this, at the cost of throughput.

This patch aims to make the threadpool largely configurable, allowing those using solr to choose the throughput vs. latency balance they desire.
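
As a rough illustration of the general idea (a pool whose size is a configuration knob, instead of threads created on demand), here is a minimal Python sketch. It is not Solr's HttpShardHandler, which is Java and is not shown in this message; query_shards, send_request and max_workers are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def query_shards(shards, send_request, max_workers=4):
    """Send one request per shard using at most max_workers threads.

    send_request is any callable taking a shard identifier. max_workers
    is the configurable bound: a small value queues requests (lower
    contention, possibly higher latency), a large value allows more
    requests in flight at once.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as shards
        return list(pool.map(send_request, shards))
```

Exposing the bound as a parameter is what lets a deployment pick its own throughput vs. latency trade-off.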
[jira] [Commented] (SOLR-3221) Make Shard handler threadpool configurable
[ https://issues.apache.org/jira/browse/SOLR-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270504#comment-13270504 ]

Greg Bowyer commented on SOLR-3221:

Sorry, the changes.txt change was done, so I don't think there is anything left for this JIRA ticket. I think Erick added the changes.txt entry for the 3.6 release.

Make Shard handler threadpool configurable
---
Key: SOLR-3221
URL: https://issues.apache.org/jira/browse/SOLR-3221
Project: Solr
Issue Type: Improvement
Affects Versions: 3.6, 4.0
Reporter: Greg Bowyer
Assignee: Erick Erickson
Labels: distributed, http, shard
Fix For: 3.6, 4.0
Attachments: SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch
[jira] [Commented] (LUCENE-4042) New snowball stemmers (Irish gaelic and Czech)
[ https://issues.apache.org/jira/browse/LUCENE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270511#comment-13270511 ]

Robert Muir commented on LUCENE-4042:

We have the Irish one already; Jim contributed that on LUCENE-3883. I verified the .sbl is the same and already removed our local copy.

We should add the Czech one imo. It differs from cz.CzechStemmer.java in that it implements the more aggressive variant, stemming derivational endings etc., so it gives users a choice.

New snowball stemmers (Irish gaelic and Czech)
---
Key: LUCENE-4042
URL: https://issues.apache.org/jira/browse/LUCENE-4042
Project: Lucene - Java
Issue Type: New Feature
Reporter: Dawid Weiss
Priority: Trivial
Fix For: 4.0

New stemmers have been added to snowball (Irish gaelic and Czech).
document search returning no results
I have a search that is coming up empty despite a document existing with the search text. Is the / an illegal character?

Here's the field when I'm creating the document:

[5] = {indexed,tokenizedAssignedAreasWithId:3-Genetics,404-AnnalsofFamilyMedicine-July/August2009,60-Obesity/WeightManagement}

Here's my lucene search query:

{+(AssignedAreasWithId:*404-annalsoffamilymedicine-july/august2009*)}

Thanks,

Ryan Langton
Engineer
Digital Evolution Group
913.951.3175 x155 (office)
913.498.9985 (fax)
langt...@digitalev.com
www.digitalev.com
[jira] [Commented] (SOLR-139) Support updateable/modifiable documents
[ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270612#comment-13270612 ]

Yonik Seeley commented on SOLR-139:

Committed (5 years after the issue was opened!)

I'll keep this issue open and we can add follow-on patches to implement increment and other set operations.

Support updateable/modifiable documents
---
Key: SOLR-139
URL: https://issues.apache.org/jira/browse/SOLR-139
Project: Solr
Issue Type: New Feature
Components: update
Reporter: Ryan McKinley
Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-139.patch, SOLR-139.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch

It would be nice to be able to update some fields on a document without having to insert the entire document.

Given the way lucene is structured, (for now) one can only modify stored fields.

While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.

For background, see:
http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293
[jira] [Created] (LUCENE-4043) Add scoring support for query time join
Martijn van Groningen created LUCENE-4043: - Summary: Add scoring support for query time join Key: LUCENE-4043 URL: https://issues.apache.org/jira/browse/LUCENE-4043 Project: Lucene - Java Issue Type: Improvement Components: modules/join Reporter: Martijn van Groningen Have similar scoring for query time joining just like the index time block join (with the score mode).
Re: [VOTE] Release PyLucene 3.6.0 rc2
+1 to release. I built/installed successfully on OS X 10.6.8, and ran my usual smoke test (index/search first 100 K docs from Wikipedia). Was the added 'print setup args = %s % args' intentional, in jcc/jcc/python.py? Just prints a lot of stuff out while building PyLucene... Mike McCandless http://blog.mikemccandless.com On Mon, May 7, 2012 at 8:20 PM, Andi Vajda va...@apache.org wrote: The ivy requirement for building Lucene Java is now handled by PyLucene's Makefile. A new release candidate is available for review. The PyLucene 3.6.0-2 release tracking the recent release of Apache Lucene 3.6.0 is ready. A release candidate is available from: http://people.apache.org/~vajda/staging_area/ A list of changes in this release can be seen at: http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_6/CHANGES PyLucene 3.6.0 is built with JCC 2.13 included in these release artifacts: http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/CHANGES A list of Lucene Java changes can be seen at: http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_6_0/lucene/CHANGES.txt Please vote to release these artifacts as PyLucene 3.6.0-2. Thanks ! Andi.. ps: the KEYS file for PyLucene release signing is at: http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS http://people.apache.org/~vajda/staging_area/KEYS pps: here is my +1
[jira] [Updated] (LUCENE-4043) Add scoring support for query time join
[ https://issues.apache.org/jira/browse/LUCENE-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-4043: -- Attachment: LUCENE-4043.patch Draft patch. Added ScoreMode as parameter to JoinUtil#createJoinQuery(...). Maybe ScoreMode should be a public enum inside the join package. Add scoring support for query time join --- Key: LUCENE-4043 URL: https://issues.apache.org/jira/browse/LUCENE-4043 Project: Lucene - Java Issue Type: Improvement Components: modules/join Reporter: Martijn van Groningen Attachments: LUCENE-4043.patch Have similar scoring for query time joining just like the index time block join (with the score mode).
Re: [VOTE] Release PyLucene 3.6.0 rc2
On May 8, 2012, at 10:18, Michael McCandless luc...@mikemccandless.com wrote: +1 to release. I built/installed successfully on OS X 10.6.8, and ran my usual smoke test (index/search first 100 K docs from Wikipedia). Was the added 'print setup args = %s % args' intentional, in jcc/jcc/python.py? Just prints a lot of stuff out while building PyLucene... Yes, that was submitted by a user to make more explicit what was fed to setup(). An aid for debugging. Andi.. Mike McCandless http://blog.mikemccandless.com On Mon, May 7, 2012 at 8:20 PM, Andi Vajda va...@apache.org wrote: The ivy requirement for building Lucene Java is now handled by PyLucene's Makefile. A new release candidate is available for review. The PyLucene 3.6.0-2 release tracking the recent release of Apache Lucene 3.6.0 is ready. A release candidate is available from: http://people.apache.org/~vajda/staging_area/ A list of changes in this release can be seen at: http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_6/CHANGES PyLucene 3.6.0 is built with JCC 2.13 included in these release artifacts: http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/CHANGES A list of Lucene Java changes can be seen at: http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_6_0/lucene/CHANGES.txt Please vote to release these artifacts as PyLucene 3.6.0-2. Thanks ! Andi.. ps: the KEYS file for PyLucene release signing is at: http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS http://people.apache.org/~vajda/staging_area/KEYS pps: here is my +1
[MAVEN] Heads up: build changes
If you use the Lucene/Solr Maven POMs to drive the build, I committed a major change last night (see https://issues.apache.org/jira/browse/LUCENE-3948 for more details):

* 'ant get-maven-poms' no longer places pom.xml files under the lucene/ and solr/ directories. Instead, they are placed in a new top-level directory: maven-build/.
* When you run 'mvn whatever' under maven-build/, build and test output now goes under the conventional Maven target/ directories associated with each module's POM under the top-level maven-build/ directory. Maven build and test outputs are now completely separate from those produced by the Ant build.

The above changes don't affect the 'ant generate-maven-artifacts' process - the top-level maven-build/ directory is not involved. (Instead, the 'generate-maven-artifacts' target calls a separate target - 'filter-pom-templates' - to copy the POMs to lucene/build/poms/ and interpolate their versions.)

Please let me know if you run into problems with the new setup.

Thanks, Steve
[jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270648#comment-13270648 ] Mike Bria commented on SOLR-1604: - Hi everyone, Sorry, but I'm green to this stuff. How do I apply/install a *.patch file? I downloaded and successfully built (well, packaged, via mvn) the ComplexPhrase.zip from Jul-2011. I then downloaded the SOLR-1604-alternative.patch from Feb-2012. I can open and view it via a text editor...but I have no idea what to do to apply it? I'm working on a RH Linux box at the moment. Can anyone guide me and/or point me in the right direction please? Thanks! Mike Wildcards, ORs etc inside Phrase Queries Key: SOLR-1604 URL: https://issues.apache.org/jira/browse/SOLR-1604 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Ahmet Arslan Priority: Minor Fix For: 4.0 Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports wildcards, ORs, ranges, fuzzies inside phrase queries.
Re: [VOTE] Release PyLucene 3.6.0 rc2
On Tue, May 8, 2012 at 1:24 PM, Andi Vajda va...@apache.org wrote: On May 8, 2012, at 10:18, Michael McCandless luc...@mikemccandless.com wrote: Was the added 'print setup args = %s % args' intentional, in jcc/jcc/python.py? Just prints a lot of stuff out while building PyLucene... Yes, that was submitted by a user to make more explicit what was fed to setup(). An aid for debugging. OK, makes sense! Mike McCandless http://blog.mikemccandless.com
[jira] [Issue Comment Edited] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270648#comment-13270648 ] Mike Bria edited comment on SOLR-1604 at 5/8/12 6:00 PM: - Hi everyone, Sorry, but I'm green to this stuff. How do I apply/install a *.patch file? I downloaded and successfully built (well, packaged, via mvn) the ComplexPhrase.zip from Jul-2011. I then downloaded the 'SOLR-1604-alternative.patch' from Feb-2012. I can open and view it via a text editor...but I have no idea what to do to apply it? I'm working on a RH Linux box at the moment. Can anyone guide me and/or point me in the right direction please? Thanks! Mike was (Author: mbria): Hi everyone, Sorry, but I'm green to this stuff. How do I apply/install a *.patch file? I downloaded and successfully built (well, packaged, via mvn) the ComplexPhrase.zip from Jul-2011. I then downloaded the SOLR-1604-alternative.patch from Feb-2012. I can open and view it via a text editor...but I have no idea what to do to apply it? I'm working on a RH Linux box at the moment. Can anyone guide me and/or point me in the right direction please? Thanks! Mike Wildcards, ORs etc inside Phrase Queries Key: SOLR-1604 URL: https://issues.apache.org/jira/browse/SOLR-1604 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Ahmet Arslan Priority: Minor Fix For: 4.0 Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports wildcards, ORs, ranges, fuzzies inside phrase queries.
[jira] [Created] (SOLR-3445) SOLR Stored field in ASCII
Bill Bell created SOLR-3445: --- Summary: SOLR Stored field in ASCII Key: SOLR-3445 URL: https://issues.apache.org/jira/browse/SOLR-3445 Project: Solr Issue Type: Improvement Reporter: Bill Bell In order to reduce the size of the stored fields and increase performance of SOLR by limiting the payload, we should consider adding a parameter for stored that will store the information in ASCII format instead of UTF-8.
[jira] [Commented] (SOLR-3445) SOLR Stored field in ASCII
[ https://issues.apache.org/jira/browse/SOLR-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270712#comment-13270712 ] Steven Rowe commented on SOLR-3445: --- For ASCII characters, UTF-8 has the same footprint as ASCII itself, so there is no space savings available here. But maybe you are thinking of a lossy conversion from UTF-8 to ASCII? SOLR Stored field in ASCII -- Key: SOLR-3445 URL: https://issues.apache.org/jira/browse/SOLR-3445 Project: Solr Issue Type: Improvement Reporter: Bill Bell In order to reduce the size of the stored fields and increase performance of SOLR by limiting the payload, we should consider adding a parameter for stored that will store the information in ASCII format instead of UTF-8.
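The point about identical footprints can be checked with a few lines of plain Java. This is only an illustrative sketch, not code from Solr or from any patch on this issue; the class name AsciiFootprint and the sample string are made up for the example. It encodes a pure-ASCII string as UTF-8 and as US-ASCII and compares the byte counts.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Solr or Lucene.
public class AsciiFootprint {

    // Number of bytes needed to encode s as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed to encode s as US-ASCII.
    // Characters outside ASCII are replaced by '?' here, which is the lossy case.
    static int asciiLength(String s) {
        return s.getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        String text = "stored field value"; // made-up, pure-ASCII sample
        System.out.println("UTF-8 bytes:    " + utf8Length(text));
        System.out.println("US-ASCII bytes: " + asciiLength(text));
    }
}
```

For text that contains only ASCII characters both counts are equal, since UTF-8 encodes each ASCII character as a single byte. A difference only appears once non-ASCII characters are present, and encoding those as ASCII loses information.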
[jira] [Updated] (SOLR-3445) SOLR Stored field in byte format
[ https://issues.apache.org/jira/browse/SOLR-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-3445: Description: In order to reduce the size of the stored fields and increase performance of SOLR by limiting the payload, we should consider adding a parameter for stored that will store the information in byte format instead of UTF-8. (was: In order to reduce the size of the stored fields and increase performance of SOLR by limiting the payload, we should consider adding a parameter for stored that will store the information in ASCII format instead of UTF-8.) Summary: SOLR Stored field in byte format (was: SOLR Stored field in ASCII) SOLR Stored field in byte format Key: SOLR-3445 URL: https://issues.apache.org/jira/browse/SOLR-3445 Project: Solr Issue Type: Improvement Reporter: Bill Bell In order to reduce the size of the stored fields and increase performance of SOLR by limiting the payload, we should consider adding a parameter for stored that will store the information in byte format instead of UTF-8.
[jira] [Commented] (SOLR-3445) SOLR Stored field in byte format
[ https://issues.apache.org/jira/browse/SOLR-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270726#comment-13270726 ] Steven Rowe commented on SOLR-3445: --- What is byte format? SOLR Stored field in byte format Key: SOLR-3445 URL: https://issues.apache.org/jira/browse/SOLR-3445 Project: Solr Issue Type: Improvement Reporter: Bill Bell In order to reduce the size of the stored fields and increase performance of SOLR by limiting the payload, we should consider adding a parameter for stored that will store the information in byte format instead of UTF-8.
[jira] [Commented] (SOLR-3445) SOLR Stored field in byte format
[ https://issues.apache.org/jira/browse/SOLR-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270727#comment-13270727 ] Bill Bell commented on SOLR-3445: - Well, for most of my use cases I am okay with the 256 characters and don't need the overhead of UTF-8. So instead of converting to UTF-8, just store as a normal String. I would also be good with lossy versions, but I am unaware of these algorithms. The goal: get the index smaller, since I don't need the data in there in UTF-8 format. String x = new String("Store this into a field in solr"); Instead of something like: String original = new String("A" + "\u00ea" + "\u00f1" + "\u00fc" + "C"); SOLR Stored field in byte format Key: SOLR-3445 URL: https://issues.apache.org/jira/browse/SOLR-3445 Project: Solr Issue Type: Improvement Reporter: Bill Bell In order to reduce the size of the stored fields and increase performance of SOLR by limiting the payload, we should consider adding a parameter for stored that will store the information in byte format instead of UTF-8.
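To make the size argument in this comment concrete, here is a small sketch that compares encoded sizes for the accented sample string. It assumes that "the 256 characters" refers to a single-byte charset such as ISO-8859-1; that reading is an assumption, and the class name Latin1VersusUtf8 is hypothetical. Nothing here reflects how Solr actually writes stored fields.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; ISO-8859-1 is an assumed stand-in for "the 256 characters".
public class Latin1VersusUtf8 {

    // Number of bytes the string occupies once encoded with the given charset.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Same characters as the sample in the comment: A, e-circumflex, n-tilde, u-umlaut, C.
        String original = "A" + "\u00ea" + "\u00f1" + "\u00fc" + "C";
        System.out.println("UTF-8 bytes:      " + encodedLength(original, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 bytes: " + encodedLength(original, StandardCharsets.ISO_8859_1));
    }
}
```

The three accented characters take two bytes each in UTF-8 and one byte each in ISO-8859-1, so the sample needs 8 bytes versus 5. For text that is mostly ASCII the difference shrinks toward zero. Note also that a Java String has no encoding of its own in this sense; the charset only matters at the point where the text is turned into bytes.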
[jira] [Commented] (SOLR-3445) SOLR Stored field in byte format
[ https://issues.apache.org/jira/browse/SOLR-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270728#comment-13270728 ] Bill Bell commented on SOLR-3445: - Non-Unicoded format? SOLR Stored field in byte format Key: SOLR-3445 URL: https://issues.apache.org/jira/browse/SOLR-3445 Project: Solr Issue Type: Improvement Reporter: Bill Bell In order to reduce the size of the stored fields and increase performance of SOLR by limiting the payload, we should consider adding a parameter for stored that will store the information in byte format instead of UTF-8.
[jira] [Updated] (SOLR-3445) SOLR Stored field in non UTF-8 (non-unicoded format)
[ https://issues.apache.org/jira/browse/SOLR-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-3445: Summary: SOLR Stored field in non UTF-8 (non-unicoded format) (was: SOLR Stored field in byte format) SOLR Stored field in non UTF-8 (non-unicoded format) Key: SOLR-3445 URL: https://issues.apache.org/jira/browse/SOLR-3445 Project: Solr Issue Type: Improvement Reporter: Bill Bell In order to reduce the size of the stored fields and increase performance of SOLR by limiting the payload, we should consider adding a parameter for stored that will store the information in byte format instead of UTF-8.
[jira] [Commented] (SOLR-3445) SOLR Stored field in non UTF-8 (non-unicoded format)
[ https://issues.apache.org/jira/browse/SOLR-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270752#comment-13270752 ] Bill Bell commented on SOLR-3445: - Does Codecs help with this? SOLR Stored field in non UTF-8 (non-unicoded format) Key: SOLR-3445 URL: https://issues.apache.org/jira/browse/SOLR-3445 Project: Solr Issue Type: Improvement Reporter: Bill Bell In order to reduce the size of the stored fields and increase performance of SOLR by limiting the payload, we should consider adding a parameter for stored that will store the information in byte format instead of UTF-8.
Re: [VOTE] Release PyLucene 3.6.0 rc2
On 08.05.2012 02:20, Andi Vajda wrote: Please vote to release these artifacts as PyLucene 3.6.0-2. All tests are passing on Ubuntu 12.04 AMD64. This time I'm unable to test PyLucene 3.6 with our application since bobo browse is incompatible with Lucene 3.6. Here is my +1 Christian
Re: document search returning no results
Even with “multi-term aware” (in 3.6 and trunk) analysis, you can’t have a single query term that analyzes (tokenizes) into multiple index terms AND has wildcards. In other words, if you want to use wildcard, the query term has to analyze (tokenize) into a single term.

Three strategies:
1. Split the query into multiple terms that are ANDed together and then use wildcards on the specific terms (words or tokens.)
2. Consider whether the field should be tokenized at all. Maybe it should be string or keyword and always wildcard to reference values.
3. Have two fields, one which is tokenized and lets you query by individual words embedded in the field values, and a second field which is a string or keyword and is not tokenized but use wildcards on the full field value, with a copyField to populate one field from the stored value of the other.

-- Jack Krupansky

From: Ryan Langton Sent: Tuesday, May 08, 2012 12:49 PM To: mailto:dev@lucene.apache.org Subject: document search returning no results

I have a search that is coming up empty despite a document existing with the search text. Is the / an illegal character?

Here’s the field when I’m creating the document:
[5] = {indexed,tokenizedAssignedAreasWithId:3-Genetics,404-AnnalsofFamilyMedicine-July/August2009,60-Obesity/WeightManagement}

Here’s my lucene search query:
{+(AssignedAreasWithId:*404-annalsoffamilymedicine-july/august2009*)}

Thanks, Ryan Langton Engineer Digital Evolution Group 913.951.3175 x155 (office) 913.498.9985 (fax) langt...@digitalev.com www.digitalev.com
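The first strategy (split the value, then AND wildcarded terms together) might look roughly like the sketch below. It is plain string handling and does not call any Lucene API. The class and method names are hypothetical, and the splitting rule (lowercase, break on anything that is not a letter or digit) is only an assumption about how the field's analyzer tokenizes; the real analyzer may behave differently.

```java
import java.util.Locale;

// Hypothetical helper for illustration; it only builds query syntax as text.
public class WildcardClauses {

    // Builds "+field:*tok1* +field:*tok2* ..." from a raw value.
    // Assumption: the index-time analyzer lowercases and splits on non-alphanumerics.
    static String build(String field, String raw) {
        StringBuilder query = new StringBuilder();
        for (String token : raw.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() yields an empty first element when raw starts with a separator
            }
            if (query.length() > 0) {
                query.append(' ');
            }
            query.append('+').append(field).append(":*").append(token).append('*');
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("AssignedAreasWithId",
                "404-AnnalsofFamilyMedicine-July/August2009"));
    }
}
```

Each resulting clause is a single term with wildcards, which is the constraint described in the reply. Leading wildcards are typically disabled by default in the query parser and would need to be allowed for clauses of this shape, and they can be slow on large indexes.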
[jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270809#comment-13270809 ] Ahmet Arslan commented on SOLR-1604: There are two separate ways to enable this functionality. * You can consume the *.zip attachments as a Solr plugin. This does not require source code modification, but this particular case requires re-creating solr.war. See http://wiki.apache.org/solr/SolrPlugins * *.patch files contain source code modifications. See http://wiki.apache.org/solr/HowToContribute#Working_With_Patches Wildcards, ORs etc inside Phrase Queries Key: SOLR-1604 URL: https://issues.apache.org/jira/browse/SOLR-1604 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Ahmet Arslan Priority: Minor Fix For: 4.0 Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports wildcards, ORs, ranges, fuzzies inside phrase queries.
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 13892 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13892/
1 tests failed.
FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest
Error Message: ERROR: SolrIndexSearcher opens=80 closes=78
Stack Trace:
java.lang.AssertionError: ERROR: SolrIndexSearcher opens=80 closes=78
at __randomizedtesting.SeedInfo.seed([F35879832CC30BE2]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:212)
at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:101)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1961)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:742)
at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38)
at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)
Build Log (for compile errors): [...truncated 11159 lines...]
[jira] [Resolved] (SOLR-2857) Multi-content-type /update handler
[ https://issues.apache.org/jira/browse/SOLR-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley resolved SOLR-2857. - Resolution: Fixed Assignee: Ryan McKinley I added this in #1335768, and have put rough docs on the wiki -- I'm sure there are more links/references that should be updated Multi-content-type /update handler -- Key: SOLR-2857 URL: https://issues.apache.org/jira/browse/SOLR-2857 Project: Solr Issue Type: Improvement Reporter: Erik Hatcher Assignee: Ryan McKinley Fix For: 4.0 Attachments: SOLR-2857-content-type-refactor.patch, SOLR-2857-content-type-refactor.patch, SOLR-2857-content-type-refactor.patch, SOLR-2857-update-content-type.patch, SOLR-2857-update-content-type.patch, SOLR-2857-update-content-type.patch Something I've been thinking about lately... it'd be great to get rid of all the specific update handlers like /update/csv, /update/extract, and /update/json and collapse them all into a single /update that underneath uses the content-type(s) to hand off to specific content handlers. This would make it much easier to toss content at Solr and provide a single entry point for updates.
[jira] [Created] (SOLR-3446) PatternSyntaxException Crash from Unvalidated Regular Expression Usage
Eric Spishak created SOLR-3446: -- Summary: PatternSyntaxException Crash from Unvalidated Regular Expression Usage Key: SOLR-3446 URL: https://issues.apache.org/jira/browse/SOLR-3446 Project: Solr Issue Type: Bug Affects Versions: 3.5 Reporter: Eric Spishak Attachments: SOLR-3446.patch

Solr sometimes crashes with an unhelpful stack trace. If the PatternTokenizerFactory's pattern attribute is set to an invalid regular expression, a PatternSyntaxException is thrown and Solr fails to start. The PatternSyntaxException is not useful to users in diagnosing the error. I think it would be better to report a detailed error message. The attached patch makes this change.

Note that the patch adds a small RegexUtil class with helper methods to determine whether a String is a valid regular expression and to generate error messages for invalid regular expressions. I feel that these helper methods are more readable than catching the PatternSyntaxException. Furthermore, they can be re-used if more bugs like this one are found.

Steps to reproduce:
# Patch in bug.patch
#* Note that this sets PatternTokenizerFactory's pattern attribute to an invalid regular expression.
# Run 'ant run-example' from the solr folder
# See exception in console output on startup:
{code}
Apr 3, 2012 2:07:29 PM org.apache.solr.common.SolrException log
SEVERE: java.util.regex.PatternSyntaxException: Unclosed group near index 1
(
 ^
at java.util.regex.Pattern.error(Pattern.java:1713)
at java.util.regex.Pattern.accept(Pattern.java:1571)
at java.util.regex.Pattern.group0(Pattern.java:2533)
at java.util.regex.Pattern.sequence(Pattern.java:1806)
at java.util.regex.Pattern.expr(Pattern.java:1752)
at java.util.regex.Pattern.compile(Pattern.java:1460)
at java.util.regex.Pattern.init(Pattern.java:1133)
at java.util.regex.Pattern.compile(Pattern.java:847)
at org.apache.solr.analysis.PatternTokenizerFactory.init(PatternTokenizerFactory.java:90)
at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:901)
at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:890)
at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:148)
at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:910)
at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:62)
at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:450)
at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:435)
at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140)
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:480)
at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:125)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:461)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
at org.mortbay.jetty.Server.doStart(Server.java:224)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
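For readers without the attachment at hand, a helper of the kind the issue describes could look something like the following. This is a guess at the general shape, not the contents of SOLR-3446.patch; the class name RegexUtilSketch, the method names and the message wording are all hypothetical. It relies only on java.util.regex from the JDK.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical sketch; the RegexUtil class in the actual patch may differ.
public final class RegexUtilSketch {

    private RegexUtilSketch() {
    }

    // Returns true if s compiles as a java.util.regex pattern.
    static boolean isValidRegex(String s) {
        try {
            Pattern.compile(s);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Returns a readable error message for an invalid pattern, or null if s is valid.
    static String regexError(String s) {
        try {
            Pattern.compile(s);
            return null;
        } catch (PatternSyntaxException e) {
            return "Invalid regular expression \"" + s + "\": "
                    + e.getDescription() + " near index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        String pattern = "("; // the unclosed group from the stack trace above
        if (!isValidRegex(pattern)) {
            System.out.println(regexError(pattern));
        }
    }
}
```

A factory could call such a check while reading its configuration and fail with a message that names the offending attribute, instead of letting the raw PatternSyntaxException propagate to startup. The exception is still caught internally; the helper just keeps that detail in one place.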
[jira] [Updated] (SOLR-3446) PatternSyntaxException Crash from Unvalidated Regular Expression Usage
[ https://issues.apache.org/jira/browse/SOLR-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Spishak updated SOLR-3446:
---
    Attachment: SOLR-3446.patch

PatternSyntaxException Crash from Unvalidated Regular Expression Usage
--
    Key: SOLR-3446
    URL: https://issues.apache.org/jira/browse/SOLR-3446
    Project: Solr
    Issue Type: Bug
    Affects Versions: 3.5
    Reporter: Eric Spishak
    Attachments: SOLR-3446.patch

Solr sometimes crashes with an unhelpful stack trace. If the PatternTokenizerFactory's pattern attribute is set to an invalid regular expression, a PatternSyntaxException is thrown and Solr fails to start. The PatternSyntaxException is not useful to users in diagnosing the error. I think it would be better to report a detailed error message. The attached patch makes this change.

Note that the patch adds a small RegexUtil class with helper methods to determine whether a String is a valid regular expression and to generate error messages for invalid regular expressions. I feel that these helper methods are more readable than catching the PatternSyntaxException. Furthermore, they can be re-used if more bugs like this one are found.

Steps to reproduce:
# Patch in bug.patch
#* Note that this sets PatternTokenizerFactory's pattern attribute to an invalid regular expression.
# Run 'ant run-example' from the solr folder
# See exception in console output on startup:
{code}
Apr 3, 2012 2:07:29 PM org.apache.solr.common.SolrException log
SEVERE: java.util.regex.PatternSyntaxException: Unclosed group near index 1
(
 ^
	at java.util.regex.Pattern.error(Pattern.java:1713)
	at java.util.regex.Pattern.accept(Pattern.java:1571)
	at java.util.regex.Pattern.group0(Pattern.java:2533)
	at java.util.regex.Pattern.sequence(Pattern.java:1806)
	at java.util.regex.Pattern.expr(Pattern.java:1752)
	at java.util.regex.Pattern.compile(Pattern.java:1460)
	at java.util.regex.Pattern.init(Pattern.java:1133)
	at java.util.regex.Pattern.compile(Pattern.java:847)
	at org.apache.solr.analysis.PatternTokenizerFactory.init(PatternTokenizerFactory.java:90)
	at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:901)
	at org.apache.solr.schema.IndexSchema$5.init(IndexSchema.java:890)
	at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:148)
	at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:910)
	at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:62)
	at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:450)
	at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:435)
	at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140)
	at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:480)
	at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:125)
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:461)
	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
	at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
	at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
	at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
	at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
	at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
	at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
	at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
	at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
	at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
	at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
	at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
	at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
	at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
	at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
	at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
	at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
	at org.mortbay.jetty.Server.doStart(Server.java:224)
	at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
	at
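
For readers without access to the attachment, the kind of helper the description mentions (one method to check whether a String is a valid regular expression, one to produce a readable error message) might look roughly like the sketch below. This is not the RegexUtil class from SOLR-3446.patch, whose contents are not shown in this thread; the class name RegexCheck and both method names are assumptions made for illustration. It relies only on java.util.regex.Pattern and PatternSyntaxException from the JDK.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper for illustration; NOT the RegexUtil from SOLR-3446.patch.
public class RegexCheck {

    /** Returns true if the string compiles as a java.util.regex pattern. */
    public static boolean isValidRegex(String s) {
        return regexError(s) == null;
    }

    /** Returns null for a valid pattern, otherwise a one-line description of the problem. */
    public static String regexError(String s) {
        if (s == null) {
            return "pattern is null";
        }
        try {
            Pattern.compile(s);
            return null;
        } catch (PatternSyntaxException e) {
            // getDescription() and getIndex() carry the same details as the raw
            // exception message, without the multi-line caret rendering.
            return "Invalid regular expression \"" + s + "\": "
                    + e.getDescription() + " near index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("[a-z]+"));
        System.out.println(regexError("("));
    }
}
```

A factory's init method could call such a check on the configured pattern and fail with a message that names the offending attribute, instead of letting the raw exception propagate to the log as in the trace above.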
[jira] [Updated] (SOLR-3446) PatternSyntaxException Crash from Unvalidated Regular Expression Usage
[ https://issues.apache.org/jira/browse/SOLR-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Spishak updated SOLR-3446:
---
    Attachment: SOLR-3446.patch

    Attachments: SOLR-3446.patch, bug.patch
[jira] [Updated] (SOLR-3446) PatternSyntaxException Crash from Unvalidated Regular Expression Usage
[ https://issues.apache.org/jira/browse/SOLR-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Spishak updated SOLR-3446:
---
    Attachment: bug.patch

    Attachments: SOLR-3446.patch, bug.patch
[jira] [Updated] (SOLR-3446) PatternSyntaxException Crash from Unvalidated Regular Expression Usage
[ https://issues.apache.org/jira/browse/SOLR-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Spishak updated SOLR-3446:
---
    Attachment: (was: SOLR-3446.patch)

    Attachments: SOLR-3446.patch, bug.patch
[jira] [Updated] (SOLR-3446) PatternSyntaxException Crash from Unvalidated Regular Expression Usage
[ https://issues.apache.org/jira/browse/SOLR-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Spishak updated SOLR-3446:
---
    Attachment: (was: bug.patch)

    Attachments: SOLR-3446.patch, bug.patch
[jira] [Updated] (SOLR-3446) PatternSyntaxException Crash from Unvalidated Regular Expression Usage
[ https://issues.apache.org/jira/browse/SOLR-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Spishak updated SOLR-3446:
---
    Attachment: bug.patch

    Attachments: SOLR-3446.patch, bug.patch
[jira] [Created] (LUCENE-4044) Add NamedSPILoader support to TokenizerFactory, TokenFilterFactory and CharFilterFactory
Chris Male created LUCENE-4044:
--
    Summary: Add NamedSPILoader support to TokenizerFactory, TokenFilterFactory and CharFilterFactory
    Key: LUCENE-4044
    URL: https://issues.apache.org/jira/browse/LUCENE-4044
    Project: Lucene - Java
    Issue Type: Sub-task
    Components: modules/analysis
    Reporter: Chris Male

In LUCENE-2510 I want to move all the analysis factories out of Solr and into the directories with what they create. This is going to hamper Solr's existing strategy for supporting {{solr.*}} package names, where it replaces {{solr}} with various pre-defined package names.

One way to tackle this is to use NamedSPILoader so we simply look up {{StandardTokenizerFactory}} for example, and find it wherever it is, as long as it is defined as a service. This is similar to how we support Codecs currently.

As noted by Robert in LUCENE-2510, this would also have the benefit of meaning configurations could be less verbose, would aid in fully decoupling the analysis module from Solr, and make the analysis factories easier to interact with.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4044) Add NamedSPILoader support to TokenizerFactory, TokenFilterFactory and CharFilterFactory
[ https://issues.apache.org/jira/browse/LUCENE-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271017#comment-13271017 ]

Yonik Seeley commented on LUCENE-4044:
--
bq. This is going to hamper Solr's existing strategy for supporting solr.* package names

Why is that? Why can't solr.WhitespaceTokenizerFactory also check the package that you're planning on moving WhitespaceTokenizerFactory to?

Add NamedSPILoader support to TokenizerFactory, TokenFilterFactory and CharFilterFactory
    Key: LUCENE-4044
    Fix For: 4.0
[jira] [Commented] (LUCENE-4044) Add NamedSPILoader support to TokenizerFactory, TokenFilterFactory and CharFilterFactory
[ https://issues.apache.org/jira/browse/LUCENE-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271020#comment-13271020 ]

Chris Male commented on LUCENE-4044:
--
There will be a lot of different packages; I assumed that cycling through them all would be undesirable.
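
The exchange above is about resolving a {{solr.*}} name by substituting candidate packages for the {{solr}} prefix. A minimal sketch of that idea, assuming a fixed, ordered list of candidate packages, is given below. It is not Solr's actual resource-loading code; the class name PrefixResolver and its methods are hypothetical, and JDK packages stand in for the analysis packages so the example runs on its own.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of prefix substitution; NOT Solr's actual loader.
public class PrefixResolver {
    private final List<String> packages;

    public PrefixResolver(List<String> packages) {
        this.packages = packages;
    }

    /** Loads a class by fully qualified name, or returns null if it does not exist. */
    private static Class<?> tryLoad(String fqcn) {
        try {
            return Class.forName(fqcn);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    /** Resolves "solr.Name" by trying each candidate package in order. */
    public Class<?> resolve(String name) {
        if (!name.startsWith("solr.")) {
            Class<?> direct = tryLoad(name);
            if (direct == null) {
                throw new IllegalArgumentException("Class not found: " + name);
            }
            return direct;
        }
        String simple = name.substring("solr.".length());
        for (String pkg : packages) {
            // One class-loading attempt per candidate package: this is the
            // "cycling through them all" cost raised in the comment above.
            Class<?> c = tryLoad(pkg + "." + simple);
            if (c != null) {
                return c;
            }
        }
        throw new IllegalArgumentException("No candidate package contains " + simple);
    }

    public static void main(String[] args) {
        // JDK packages stand in for the analysis packages.
        PrefixResolver r = new PrefixResolver(Arrays.asList("java.io", "java.util"));
        System.out.println(r.resolve("solr.ArrayList").getName()); // prints java.util.ArrayList
    }
}
```

Each unresolved name costs one failed load per package that precedes the right one, so the list of candidates grows with every directory the factories are moved into.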
[jira] [Commented] (LUCENE-4044) Add NamedSPILoader support to TokenizerFactory, TokenFilterFactory and CharFilterFactory
[ https://issues.apache.org/jira/browse/LUCENE-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271024#comment-13271024 ]

Chris Male commented on LUCENE-4044:
------------------------------------

With that said, I'm open to suggestions, since I don't think this is going to do what I want it to do.
[jira] [Commented] (LUCENE-4044) Add NamedSPILoader support to TokenizerFactory, TokenFilterFactory and CharFilterFactory
[ https://issues.apache.org/jira/browse/LUCENE-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271055#comment-13271055 ]

Chris Male commented on LUCENE-4044:
------------------------------------

Hmm, it seems that this process only supports singletons, which isn't much use to us.
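Assuming "this process" refers to the name-based service lookup proposed in the issue, the singleton concern can be illustrated with a small sketch of the alternative: a registry keyed by name that hands out a fresh instance on every lookup instead of one shared object. Everything here is hypothetical (NamedFactoryRegistry, register, newInstance); it is not the NamedSPILoader API and not necessarily how the issue was resolved. StringBuilder stands in for an analysis factory only so the example is self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: a name-keyed registry that creates a new instance per
 * lookup, in contrast to a loader that caches one instance per name.
 */
public class NamedFactoryRegistry<T> {
    private final Map<String, Supplier<? extends T>> suppliers = new HashMap<>();

    /** Associates a short name with a way of creating new instances. */
    public void register(String name, Supplier<? extends T> supplier) {
        suppliers.put(name, supplier);
    }

    /** Creates a new instance for the given name on every call. */
    public T newInstance(String name) {
        Supplier<? extends T> supplier = suppliers.get(name);
        if (supplier == null) {
            throw new IllegalArgumentException("Nothing registered under name '" + name + "'");
        }
        return supplier.get();
    }

    public static void main(String[] args) {
        NamedFactoryRegistry<StringBuilder> registry = new NamedFactoryRegistry<>();
        registry.register("builder", StringBuilder::new);

        StringBuilder first = registry.newInstance("builder");
        StringBuilder second = registry.newInstance("builder");
        // Each lookup yields its own object, so per-use configuration is safe.
        System.out.println(first != second);
    }
}
```

Because each caller receives its own object, instances can be configured independently, which a loader that caches a single instance per name does not allow.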