[Lucene.Net] [jira] [Resolved] (LUCENENET-432) Concurrency issues in SegmentInfo.Files() (LUCENE-2584)
[ https://issues.apache.org/jira/browse/LUCENENET-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy resolved LUCENENET-432. Resolution: Fixed Fix Version/s: Lucene.Net 2.9.4 Lucene.Net 2.9.2 Assignee: Digy Patch committed to trunk and the 2.9.4g branch Concurrency issues in SegmentInfo.Files() (LUCENE-2584) --- Key: LUCENENET-432 URL: https://issues.apache.org/jira/browse/LUCENENET-432 Project: Lucene.Net Issue Type: Bug Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Reporter: Digy Assignee: Digy Fix For: Lucene.Net 2.9.2, Lucene.Net 2.9.4 Attachments: SegmentInfo.patch Calling files() in SegmentInfo from multiple threads can lead to a ConcurrentModificationException if one thread has not yet finished adding entries to the ArrayList (files) while another thread has already obtained it as the cached value. https://issues.apache.org/jira/browse/LUCENE-2584 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
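The underlying bug pattern here is generic: a lazily built cached collection that another thread can observe half-filled. A minimal Java sketch of the usual fix (the class and file names are illustrative, not the actual Lucene.Net SegmentInfo code): build the list completely, publish it through a volatile field, and hand callers an immutable view.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative stand-in for SegmentInfo's lazily cached files() list.
class CachedFileList {
    private volatile List<String> cache; // published only once fully built

    List<String> files() {
        List<String> result = cache;
        if (result == null) {
            synchronized (this) {
                if (cache == null) {
                    // Build the complete list BEFORE publishing it, so no
                    // concurrent reader can ever see a half-filled collection.
                    List<String> building = new ArrayList<>();
                    building.add("_0.fdt"); // hypothetical segment files
                    building.add("_0.fdx");
                    cache = Collections.unmodifiableList(building);
                }
                result = cache;
            }
        }
        return result;
    }
}
```

The unmodifiable view also prevents callers from mutating the shared cache, which is a second way the original race could surface.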
[Lucene.Net] [jira] [Resolved] (LUCENENET-430) Contrib.ChainedFilter
[ https://issues.apache.org/jira/browse/LUCENENET-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy resolved LUCENENET-430. Resolution: Fixed Instead of creating a small project, I put it into Contrib.Analyzers. Contrib.ChainedFilter - Key: LUCENENET-430 URL: https://issues.apache.org/jira/browse/LUCENENET-430 Project: Lucene.Net Issue Type: New Feature Affects Versions: Lucene.Net 2.9.4g Reporter: Digy Priority: Minor Fix For: Lucene.Net 2.9.4g Attachments: ChainedFilter.cs, ChainedFilterTest.cs Port of Lucene.Java 3.0.3's ChainedFilter and its test cases. See the StackOverflow question "How to combine multiple filters within one search?": http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Commented] (LUCENENET-418) LuceneTestCase should not have a static method that could throw exceptions.
[ https://issues.apache.org/jira/browse/LUCENENET-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061125#comment-13061125 ] Digy commented on LUCENENET-418: It works! Thanks. DIGY LuceneTestCase should not have a static method that could throw exceptions. Key: LUCENENET-418 URL: https://issues.apache.org/jira/browse/LUCENENET-418 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Test Affects Versions: Lucene.Net 3.x Environment: Linux, OSX, etc Reporter: michael herndon Assignee: michael herndon Labels: test Fix For: Lucene.Net 2.9.4g Original Estimate: 2m Remaining Estimate: 2m Throwing an exception from a static method in a base class used by 90% of the tests makes it hard to debug the issue in NUnit. The test results came back saying that TestFixtureSetup was causing an issue even though it was the static constructor causing problems, and this then propagates to all the tests that stem from LuceneTestCase. The TEMP_DIR needs to be moved to a static util class as a property or even a mixin method. This cost me hours of debugging to figure out the real issue, as the underlying exception never bubbled up. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Lucene.Net] Lucene Steroids
I have built something similar using NTFS hard links and re-using existing local snapshot files, etc. It has been running in production for 3+ years now with more than 100 million docs, and distributes new snapshots from master servers every minute. It does not use any rsync, but only leverages unique file names in Lucene - it only copies files not already existing on slaves, and uses NTFS hard links to copy existing local files into the new snapshot directory. Also, on the masters, it just uses NTFS hard links to create a new snapshot of the master index, and then slaves just look for new snapshot directories on the master servers. When a new directory shows up, a slave looks at its existing local snapshot to see which files are new on the master (or have been deleted by the master), and then only copies the new files. It does not need to send any explicit commit operations, and there is no explicit communication between masters and slaves (slaves just look in some remote directory for new snapshot sub-directories). This has worked great with no problems at all. All this was built prior to SOLR being available on Windows. Going forward we are transitioning to Java and SOLR on Linux (it is just too hard to keep up with improvements otherwise, IMO). On Jul 6, 2011, at 8:22 PM, Guilherme Balena Versiani wrote: Hi, I am working on a derived work of Solr for .NET. The purpose is to obtain a solution similar to the Lucene replication available in Solr, but without the need to port all of the Solr code. There is a SnapShooter, a SnapPuller and a SnapInstaller. The SnapShooter does similar work to the script in Solr. The SnapPuller uses cwRsync to replicate the database between machines, but without storing the snapshot.current.MACHINENAME files on the master, as cwRsync does not support sync with the server. 
The SnapInstaller tries to substitute the Lucene database files in-place -- the Lucene application should use a SteroidsFSDirectory that creates a special SteroidsFSIndexInput that permits renaming files that are in use; after that, SnapInstaller sends a commit operation through a Windows named pipe to the application to reset its current IndexSearcher instance. This solution has the suggestive name of Lucene Steroids, and is hosted on BitBucket.org. What is the best way to continue to distribute it? Should I continue to maintain it on BitBucket.org, or should I apply to the Lucene.NET project (I don't know how) to include it in the Contrib modules? The current code is available at http://bitbucket.org/guibv/lucene.steroids. The work is incomplete; the first stable version should be available in the next few days. Best regards, Guilherme Balena Versiani.
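The hard-link trick described in the reply above can be sketched with java.nio.file. This is a hypothetical illustration (invented names, not the poster's actual code): every file already present in the previous local snapshot is hard-linked into the new snapshot directory, so no data is copied and only genuinely new files would need to be transferred.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of hard-link based snapshotting: link, don't copy.
class SnapshotLinker {

    // Link every file from the previous snapshot into the next one.
    static void linkSnapshot(Path previous, Path next) throws IOException {
        Files.createDirectories(next);
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(previous)) {
            for (Path f : entries) {
                // Lucene file names are unique, so a name already present
                // implies identical content; a hard link (no data copy) is
                // sufficient to carry it into the new snapshot.
                Files.createLink(next.resolve(f.getFileName()), f);
            }
        }
    }

    // Small self-test: build a fake snapshot, link it, verify the content.
    static boolean demo() {
        try {
            Path prev = Files.createTempDirectory("snap1");
            Files.write(prev.resolve("_0.cfs"), "segment data".getBytes());
            Path next = Files.createTempDirectory("snap2").resolve("copy");
            linkSnapshot(prev, next);
            byte[] linked = Files.readAllBytes(next.resolve("_0.cfs"));
            return new String(linked).equals("segment data");
        } catch (IOException e) {
            return false; // e.g. a filesystem without hard-link support
        }
    }
}
```

Hard links require that source and target live on the same volume, which matches the described setup (snapshots are directories on the same local disk).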
[Lucene.Net] [jira] [Created] (LUCENENET-433) AttributeSource can have an invalid computed state (LUCENE-3042)
AttributeSource can have an invalid computed state (LUCENE-3042) Key: LUCENENET-433 URL: https://issues.apache.org/jira/browse/LUCENENET-433 Project: Lucene.Net Issue Type: Bug Reporter: Digy Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g If you work with a tokenstream, consume it, then reuse it and add an attribute to it, the computed state is wrong. Thus, for example, clearAttributes() will not actually clear the added attribute. So in some situations, addAttribute() is not actually clearing the computed state when it should. https://issues.apache.org/jira/browse/LUCENE-3042 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Commented] (LUCENENET-433) AttributeSource can have an invalid computed state (LUCENE-3042)
[ https://issues.apache.org/jira/browse/LUCENENET-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061214#comment-13061214 ] Digy commented on LUCENENET-433: Here is the test case
{code}
[Test]
public void Test_LUCENE_3042_LUCENENET_433()
{
    String testString = "t";
    Analyzer analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer();
    TokenStream stream = analyzer.ReusableTokenStream("dummy", new System.IO.StringReader(testString));
    stream.Reset();
    while (stream.IncrementToken())
    {
        // consume
    }
    stream.End();
    stream.Close();

    AssertAnalyzesToReuse(analyzer, testString, new String[] { "t" });
}
{code}
AttributeSource can have an invalid computed state (LUCENE-3042) Key: LUCENENET-433 URL: https://issues.apache.org/jira/browse/LUCENENET-433 Project: Lucene.Net Issue Type: Bug Reporter: Digy Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g If you work with a tokenstream, consume it, then reuse it and add an attribute to it, the computed state is wrong. Thus, for example, clearAttributes() will not actually clear the added attribute. So in some situations, addAttribute() is not actually clearing the computed state when it should. https://issues.apache.org/jira/browse/LUCENE-3042 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Resolved] (LUCENENET-172) This patch fixes the unexceptional exceptions encountered in FastCharStream and SupportClass
[ https://issues.apache.org/jira/browse/LUCENENET-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy resolved LUCENENET-172. Resolution: Fixed Assignee: Digy (was: Scott Lombard) Fixed in 2.9.4g; no fix for 2.9.4 This patch fixes the unexceptional exceptions encountered in FastCharStream and SupportClass --- Key: LUCENENET-172 URL: https://issues.apache.org/jira/browse/LUCENENET-172 Project: Lucene.Net Issue Type: Improvement Components: Lucene.Net Core Affects Versions: Lucene.Net 2.3.1, Lucene.Net 2.3.2 Reporter: Ben Martz Assignee: Digy Priority: Minor Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Attachments: lucene_2.3.1_exceptions_fix.patch, lucene_2.9.4g_exceptions_fix The Java version of Lucene handles end-of-file in FastCharStream by throwing an exception. This behavior has been ported to .NET, but it carries an unacceptable cost in the .NET environment. This patch is based on the prior work in LUCENENET-8 and LUCENENET-11, which I gratefully acknowledge for the solution. While I understand that this patch is outside of the current project specification in that it deviates from the pure nature of the port, I believe that it is very important to make the patch available to any developer looking to leverage Lucene.Net in their project. Thanks for your consideration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (LUCENE-3279) Allow CFS to be empty
[ https://issues.apache.org/jira/browse/LUCENE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061063#comment-13061063 ] Simon Willnauer commented on LUCENE-3279: - I plan to commit this soon if nobody objects. Allow CFS to be empty -- Key: LUCENE-3279 URL: https://issues.apache.org/jira/browse/LUCENE-3279 Project: Lucene - Java Issue Type: Improvement Components: core/store Affects Versions: 3.4, 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 3.4, 4.0 Attachments: LUCENE-3279.patch Since we changed CFS semantics slightly, closing a CFS directory on an error can lead to an exception. Yet, an empty CFS is still a valid CFS, so for consistency we should allow a CFS to be empty. Here is an example:
{noformat}
1 tests failed.
REGRESSION: org.apache.lucene.index.TestIndexWriterOnDiskFull.testAddDocumentOnDiskFull
Error Message: CFS has no entries
Stack Trace:
java.lang.IllegalStateException: CFS has no entries
    at org.apache.lucene.store.CompoundFileWriter.close(CompoundFileWriter.java:139)
    at org.apache.lucene.store.CompoundFileDirectory.close(CompoundFileDirectory.java:181)
    at org.apache.lucene.store.DefaultCompoundFileDirectory.close(DefaultCompoundFileDirectory.java:58)
    at org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java:139)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4252)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3863)
    at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:37)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2715)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2710)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2706)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3513)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2064)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2031)
    at org.apache.lucene.index.TestIndexWriterOnDiskFull.addDoc(TestIndexWriterOnDiskFull.java:539)
    at org.apache.lucene.index.TestIndexWriterOnDiskFull.testAddDocumentOnDiskFull(TestIndexWriterOnDiskFull.java:74)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1277)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1195)
{noformat}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061066#comment-13061066 ] Simon Willnauer commented on LUCENE-3216: - I plan to commit this soon if nobody objects. Store DocValues per segment instead of per field Key: LUCENE-3216 URL: https://issues.apache.org/jira/browse/LUCENE-3216 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch Currently we are storing docvalues per field, which results in at least one file per field that uses docvalues (or at most two per field per segment, depending on the impl.). Yet, we should try by default to pack docvalues into a single file if possible. To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3262) Facet benchmarking
[ https://issues.apache.org/jira/browse/LUCENE-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Toke Eskildsen updated LUCENE-3262: --- Attachment: TestPerformanceHack.java CorpusGenerator.java I've attached a second shot at faceting performance testing. It separates the taxonomy generation into a CorpusGenerator (maybe similar to the RandomTaxonomyWriter that Robert calls for in LUCENE-3264?). Proper setup of faceting tweaks for the new faceting module is not done at all, and is not something I find myself qualified for. Facet benchmarking -- Key: LUCENE-3262 URL: https://issues.apache.org/jira/browse/LUCENE-3262 Project: Lucene - Java Issue Type: New Feature Components: modules/benchmark, modules/facet Reporter: Shai Erera Attachments: CorpusGenerator.java, TestPerformanceHack.java A spin-off from LUCENE-3079. We should define a few benchmarks for faceting scenarios, so we can evaluate the new faceting module as well as any improvements we'd like to consider in the future (such as cutting over to docvalues, implementing FST-based caches, etc.). Toke attached a preliminary test case to LUCENE-3079, so I'll attach it here as a starting point. We've also done some preliminary work on extending Benchmark for faceting, so I'll attach it here as well. We should perhaps create a Wiki page where we clearly describe the benchmark scenarios, then include results of 'default settings' and 'optimized settings', or something like that. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3287) Allow ability to set maxDocCharsToAnalyze in WeightedSpanTermExtractor
Allow ability to set maxDocCharsToAnalyze in WeightedSpanTermExtractor -- Key: LUCENE-3287 URL: https://issues.apache.org/jira/browse/LUCENE-3287 Project: Lucene - Java Issue Type: Improvement Components: modules/highlighter Affects Versions: 3.3 Reporter: Jahangir Anwari Priority: Trivial In WeightedSpanTermExtractor the default maxDocCharsToAnalyze value is 0. This inhibits us from getting the weighted span terms in any custom code (e.g. the attached CustomHighlighter.java) that uses WeightedSpanTermExtractor. Currently the setMaxDocCharsToAnalyze() method is protected, which prevents us from setting maxDocCharsToAnalyze to a value greater than 0. Changing the method to public would give us the ability to set the maxDocCharsToAnalyze. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3287) Allow ability to set maxDocCharsToAnalyze in WeightedSpanTermExtractor
[ https://issues.apache.org/jira/browse/LUCENE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jahangir Anwari updated LUCENE-3287: Attachment: WeightedSpanTermExtractor.patch CustomHighlighter.java Allow ability to set maxDocCharsToAnalyze in WeightedSpanTermExtractor -- Key: LUCENE-3287 URL: https://issues.apache.org/jira/browse/LUCENE-3287 Project: Lucene - Java Issue Type: Improvement Components: modules/highlighter Affects Versions: 3.3 Reporter: Jahangir Anwari Priority: Trivial Attachments: CustomHighlighter.java, WeightedSpanTermExtractor.patch In WeightedSpanTermExtractor the default maxDocCharsToAnalyze value is 0. This inhibits us from getting the weighted span terms in any custom code (e.g. the attached CustomHighlighter.java) that uses WeightedSpanTermExtractor. Currently the setMaxDocCharsToAnalyze() method is protected, which prevents us from setting maxDocCharsToAnalyze to a value greater than 0. Changing the method to public would give us the ability to set the maxDocCharsToAnalyze. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
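Until such a visibility change is released, the usual workaround for a protected setter is a one-line subclass that widens it to public (Java permits widening visibility in an override). A stdlib-only sketch of the pattern - the Extractor class below is a stand-in, not the real WeightedSpanTermExtractor:

```java
// Stand-in for a library class whose setter is protected.
class Extractor {
    private int maxDocCharsToAnalyze = 0;

    protected void setMaxDocCharsToAnalyze(int max) {
        this.maxDocCharsToAnalyze = max;
    }

    public int getMaxDocCharsToAnalyze() {
        return maxDocCharsToAnalyze;
    }
}

// Application-side subclass that widens the setter from protected to public.
class ConfigurableExtractor extends Extractor {
    @Override
    public void setMaxDocCharsToAnalyze(int max) {
        super.setMaxDocCharsToAnalyze(max);
    }
}
```

The subclass adds no behavior; it only makes the existing setter reachable from calling code, which is exactly what the requested API change would do directly.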
[jira] [Commented] (SOLR-1945) Allow @Field annotations in nested classes using DocumentObjectBinder
[ https://issues.apache.org/jira/browse/SOLR-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061129#comment-13061129 ] Monica Storfjord commented on SOLR-1945: What is the status of this proposal? It would be a great feature and very beneficial to my current project! :) Do you have a full solution for this, and which version do you think this feature will be released in? - Monica Allow @Field annotations in nested classes using DocumentObjectBinder - Key: SOLR-1945 URL: https://issues.apache.org/jira/browse/SOLR-1945 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 3.4, 4.0 Attachments: SOLR-1945.patch see http://search.lucidimagination.com/search/document/d909d909420aeb4e/does_solrj_support_nested_annotated_beans Would be nice to be able to pass an object graph to solrj with @Field annotations rather than just a top-level class -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2452) rewrite solr build system
[ https://issues.apache.org/jira/browse/SOLR-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated SOLR-2452: -- Attachment: SOLR-2452-post-reshuffling.patch This patch restores all of Solr's build targets from trunk; the build system rewrite is feature-complete at this point. (The reshuffling scripts require no further changes.) I moved {{lucene-lib/}} directories to under {{build/}}, and eliminated per-contrib {{clean}} target actions - instead, {{ant clean}} just deletes {{solr/build/}}, {{solr/dist/}}, {{solr/package/}}, and {{solr/example/solr/lib/}}. Before I commit this patch to the branch, I want to put the build through its paces and examine differences between the outputs from trunk and from branches/solr2452 with this patch applied. One difference I found so far: on trunk, the Solr create-package target includes duplicate javadocs for some non-contrib modules (core and solrj, I think): once in the uber-javadocs, and again in the javadocs produced for Maven. The per-contrib javadocs, by contrast, are excluded. This makes the compressed binary package about 1.8 MB larger than it needs to be, IIRC. rewrite solr build system - Key: SOLR-2452 URL: https://issues.apache.org/jira/browse/SOLR-2452 Project: Solr Issue Type: Task Components: Build Reporter: Robert Muir Assignee: Steven Rowe Fix For: 3.4, 4.0 Attachments: SOLR-2452-post-reshuffling.patch, SOLR-2452-post-reshuffling.patch, SOLR-2452-post-reshuffling.patch, SOLR-2452.dir.reshuffle.sh, SOLR-2452.dir.reshuffle.sh As discussed some in SOLR-2002 (but that issue is long and hard to follow), I think we should rewrite the Solr build system. It's slow, cumbersome, and messy, and makes it hard for us to improve things. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3288) 'Thus terms are represented ...' should be 'Thus fields are represented ...'
'Thus terms are represented ...' should be 'Thus fields are represented ...' - Key: LUCENE-3288 URL: https://issues.apache.org/jira/browse/LUCENE-3288 Project: Lucene - Java Issue Type: Bug Components: general/website Affects Versions: 3.1 Environment: n/a Reporter: Paul Foster Priority: Trivial Fix For: 3.1.1 In the last paragraph of http://lucene.apache.org/java/3_1_0/fileformats.html#Definitions, second sentence, it says: 'Thus terms are represented as a pair of strings, the first naming the field, and the second naming text within the field.' Shouldn't it start 'Thus fields are ...'? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1945) Allow @Field annotations in nested classes using DocumentObjectBinder
[ https://issues.apache.org/jira/browse/SOLR-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061183#comment-13061183 ] Mark Miller commented on SOLR-1945: --- hmmm...I don't remember. I'll take a look again. Allow @Field annotations in nested classes using DocumentObjectBinder - Key: SOLR-1945 URL: https://issues.apache.org/jira/browse/SOLR-1945 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 3.4, 4.0 Attachments: SOLR-1945.patch see http://search.lucidimagination.com/search/document/d909d909420aeb4e/does_solrj_support_nested_annotated_beans Would be nice to be able to pass an object graph to solrj with @field annotations rather than just a top level class -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2623) Solr JMX MBeans do not survive core reloads
[ https://issues.apache.org/jira/browse/SOLR-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061190#comment-13061190 ] Shalin Shekhar Mangar commented on SOLR-2623: - There's another bug with core reload that I found while running Alexey's test. Suppose there's only one core with name X and you reload X, it then becomes registered with as the core name. So all your JMX monitoring is now useless because the key names have changed. Solr JMX MBeans do not survive core reloads --- Key: SOLR-2623 URL: https://issues.apache.org/jira/browse/SOLR-2623 Project: Solr Issue Type: Bug Components: multicore Affects Versions: 1.4, 1.4.1, 3.1, 3.2 Reporter: Alexey Serba Assignee: Shalin Shekhar Mangar Priority: Minor Attachments: SOLR-2623.patch, SOLR-2623.patch, SOLR-2623.patch Solr JMX MBeans do not survive core reloads
{noformat:title=Steps to reproduce}
sh> cd example
sh> vi multicore/core0/conf/solrconfig.xml   # enable jmx
sh> java -Dcom.sun.management.jmxremote -Dsolr.solr.home=multicore -jar start.jar
sh> echo 'open 8842   # 8842 is the java pid
domain solr/core0
beans
' | java -jar jmxterm-1.0-alpha-4-uber.jar
solr/core0:id=core0,type=core
solr/core0:id=org.apache.solr.handler.StandardRequestHandler,type=org.apache.solr.handler.StandardRequestHandler
solr/core0:id=org.apache.solr.handler.StandardRequestHandler,type=standard
solr/core0:id=org.apache.solr.handler.XmlUpdateRequestHandler,type=/update
solr/core0:id=org.apache.solr.handler.XmlUpdateRequestHandler,type=org.apache.solr.handler.XmlUpdateRequestHandler
...
solr/core0:id=org.apache.solr.search.SolrIndexSearcher,type=searcher
solr/core0:id=org.apache.solr.update.DirectUpdateHandler2,type=updateHandler
sh> curl 'http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0'
sh> echo 'open 8842   # 8842 is the java pid
domain solr/core0
beans
' | java -jar jmxterm-1.0-alpha-4-uber.jar
# there's only one bean left after the Solr core reload
solr/core0:id=org.apache.solr.search.SolrIndexSearcher,type=Searcher@2e831a91 main
{noformat}
The root cause of this is the Solr core reload behavior:
# create the new core (which overwrites the existing registered MBeans)
# register the new core and close the old one (we remove/un-register MBeans on oldCore.close)
The correct sequence is:
# unregister MBeans from the old core
# create and register the new core
# close the old core without touching MBeans
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
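The corrected ordering can be illustrated with the JDK's own javax.management API. This is a minimal, hypothetical sketch (the Dummy MBean stands in for Solr's real MBeans, and "reload" is simulated in a single method): unregister the old core's beans first, then register the new core, then close the old core without touching MBeans.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class CoreReloadSketch {
    // Standard MBean pattern: interface name = implementation name + "MBean".
    public interface DummyMBean { String getName(); }

    public static class Dummy implements DummyMBean {
        private final String name;
        public Dummy(String name) { this.name = name; }
        public String getName() { return name; }
    }

    static String reload(ObjectName key) throws JMException {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.registerMBean(new Dummy("oldCore"), key);  // state before reload
        // 1. unregister MBeans from the old core
        server.unregisterMBean(key);
        // 2. create and register the new core
        server.registerMBean(new Dummy("newCore"), key);
        // 3. close the old core without touching MBeans (a no-op in this sketch)
        return (String) server.getAttribute(key, "Name");
    }

    static String demo() {
        try {
            return reload(new ObjectName("example:id=core0,type=core"));
        } catch (JMException e) {
            return "error: " + e;
        }
    }
}
```

The buggy order (register new, then close old) fails precisely because oldCore.close() runs unregisterMBean last, wiping the freshly registered beans; putting the unregister first makes the final registration the surviving one.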
[jira] [Created] (LUCENE-3289) FST should allow controlling how hard builder tries to share suffixes
FST should allow controlling how hard builder tries to share suffixes - Key: LUCENE-3289 URL: https://issues.apache.org/jira/browse/LUCENE-3289 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.4, 4.0 Today we have a boolean option to the FST builder telling it whether it should share suffixes. If you turn this off, building is much faster, uses much less RAM, and the resulting FST is a prefix trie. But, the FST is larger than it needs to be. When it's on, the builder maintains a node hash holding every node seen so far in the FST -- this uses up RAM and slows things down. On a dataset that Elmer (see the java-user thread "Autocompletion on large index" on Jul 6 2011) provided (thank you!), which is 1.32 M titles, avg 67.3 chars per title, building with suffix sharing on took 22.5 seconds, required 1.25 GB heap, and produced a 91.6 MB FST. With suffix sharing off, it was 8.2 seconds, 450 MB heap and a 129 MB FST. I think we should allow this boolean to be shade-of-gray instead: usually, how well suffixes can share is a function of how far they are from the end of the string, so, by adding a tunable N to only share when suffix length <= N, we can let the caller make reasonable tradeoffs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
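The proposed tradeoff can be modeled with a toy sketch (purely illustrative - this is not the real FST builder or its node hash): nodes for suffixes of length <= maxShare are deduplicated, longer suffixes always get fresh nodes, so raising the cap shrinks the "FST" while growing the hash that must be kept in RAM.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of tunable suffix sharing: count the nodes a builder would
// keep if it only deduplicates suffixes up to a length cap.
class SuffixShareModel {
    static int nodeCount(String[] words, int maxShare) {
        Set<String> shared = new HashSet<>(); // the "node hash" (costs RAM)
        int unique = 0;
        for (String w : words) {
            for (int i = 0; i < w.length(); i++) {
                String suffix = w.substring(i);
                if (suffix.length() <= maxShare) {
                    shared.add(suffix);  // deduplicated: one shared node
                } else {
                    unique++;            // beyond the cap: always a fresh node
                }
            }
        }
        return unique + shared.size();
    }
}
```

With words like "walking", "talking", "baking", maxShare=0 degenerates to the no-sharing prefix trie (one node per character), while a large cap reproduces full suffix sharing; intermediate caps land in between, which is exactly the knob the issue asks for.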
[jira] [Updated] (LUCENE-3289) FST should allow controlling how hard builder tries to share suffixes
[ https://issues.apache.org/jira/browse/LUCENE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3289: --- Attachment: LUCENE-3289.patch Initial rough patch showing the idea. FST should allow controlling how hard builder tries to share suffixes - Key: LUCENE-3289 URL: https://issues.apache.org/jira/browse/LUCENE-3289 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.4, 4.0 Attachments: LUCENE-3289.patch Today we have a boolean option to the FST builder telling it whether it should share suffixes. If you turn this off, building is much faster, uses much less RAM, and the resulting FST is a prefix trie. But, the FST is larger than it needs to be. When it's on, the builder maintains a node hash holding every node seen so far in the FST -- this uses up RAM and slows things down. On a dataset that Elmer (see the java-user thread "Autocompletion on large index" on Jul 6 2011) provided (thank you!), which is 1.32 M titles, avg 67.3 chars per title, building with suffix sharing on took 22.5 seconds, required 1.25 GB heap, and produced a 91.6 MB FST. With suffix sharing off, it was 8.2 seconds, 450 MB heap and a 129 MB FST. I think we should allow this boolean to be shade-of-gray instead: usually, how well suffixes can share is a function of how far they are from the end of the string, so, by adding a tunable N to only share when suffix length <= N, we can let the caller make reasonable tradeoffs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3289) FST should allow controlling how hard builder tries to share suffixes
[ https://issues.apache.org/jira/browse/LUCENE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061195#comment-13061195 ] Michael McCandless commented on LUCENE-3289: NOTE: patch applies to 3.x. I ran the patch on the titles, varying the max prefix sharing length:
||Len||FST Size||Seconds||
|1|135446807|8.2|
|2|137632702|8.5|
|3|135177994|8.3|
|4|132782016|8.3|
|5|130415331|8.4|
|6|128086200|8.0|
|7|125797396|8.2|
|8|123552157|8.5|
|9|121358375|8.4|
|10|119228942|8.1|
|11|117181180|8.8|
|12|115229788|8.7|
|13|113388260|9.5|
|14|111664442|9.0|
|15|110059167|9.2|
|16|108572519|9.7|
|17|107201905|9.8|
|18|105942576|10.3|
|19|104791497|10.1|
|20|103745678|11.1|
|21|102801693|10.8|
|22|101957797|11.4|
|23|101206564|11.1|
|24|100541849|11.0|
|25|99956443|11.1|
|26|99443232|12.9|
|27|98995194|13.2|
|28|98604680|13.9|
|29|98264184|13.5|
|30|97969241|13.6|
|31|97714049|13.8|
|32|97494104|14.3|
|33|97304045|14.0|
|34|97140033|14.3|
|35|96998942|14.6|
|36|96877590|16.5|
|37|96773039|16.9|
|38|96682961|16.6|
|39|96605160|17.8|
|40|96537687|18.3|
|41|96479286|17.8|
|42|96428710|17.5|
|43|96384659|18.9|
|44|96346174|17.0|
|45|96312826|19.3|
|46|96283545|17.8|
|47|96257708|19.4|
|48|96235159|19.0|
|49|96215220|18.7|
|50|96197450|19.6|
|51|96181539|17.3|
|52|96167235|16.9|
|53|96154490|17.7|
|54|96143081|18.8|
|55|96132905|17.4|
|56|96123776|17.5|
|57|96115462|20.7|
|58|96108051|19.2|
|59|96101249|19.1|
|60|96095107|18.7|
|ALL|96020343|22.5|
Very, very odd that the FST size first goes up at N=2... not yet sure why. But from this curve it looks like there is a sweet spot around maybe N=24. I didn't measure required heap here, but it will also go down as N goes down. 
FST should allow controlling how hard builder tries to share suffixes - Key: LUCENE-3289 URL: https://issues.apache.org/jira/browse/LUCENE-3289 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.4, 4.0 Attachments: LUCENE-3289.patch Today we have a boolean option to the FST builder telling it whether it should share suffixes. If you turn this off, building is much faster, uses much less RAM, and the resulting FST is a prefix trie. But, the FST is larger than it needs to be. When it's on, the builder maintains a node hash holding every node seen so far in the FST -- this uses up RAM and slows things down. On a dataset that Elmer (see the java-user thread "Autocompletion on large index" on Jul 6 2011) provided (thank you!), which is 1.32 M titles, avg 67.3 chars per title, building with suffix sharing on took 22.5 seconds, required 1.25 GB heap, and produced a 91.6 MB FST. With suffix sharing off, it was 8.2 seconds, 450 MB heap and a 129 MB FST. I think we should allow this boolean to be shade-of-gray instead: usually, how well suffixes can share is a function of how far they are from the end of the string, so, by adding a tunable N to only share when suffix length <= N, we can let the caller make reasonable tradeoffs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2793: Attachment: LUCENE-2793_final.patch I committed the latest patch, merged the branch with trunk and created a final diff for review. I think this is ready and I would like to reintegrate rather sooner than later. reviews welcome Directory createOutput and openInput should take an IOContext - Key: LUCENE-2793 URL: https://issues.apache.org/jira/browse/LUCENE-2793 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Michael McCandless Assignee: Varun Thacker Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2793-nrt.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793_final.patch Today for merging we pass down a larger readBufferSize than for searching because we get better performance. I think we should generalize this to a class (IOContext), which would hold the buffer size, but then could hold other flags like DIRECT (bypass OS's buffer cache), SEQUENTIAL, etc. Then, we can make the DirectIOLinuxDirectory fully usable because we would only use DIRECT/SEQUENTIAL during merging. This will require fixing how IW pools readers, so that a reader opened for merging is not then used for searching, and vice/versa. Really, it's only all the open file handles that need to be different -- we could in theory share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
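The IOContext idea described above can be sketched as a small value object handed to every open call, so a Directory can choose buffer sizes or flags per use case. This is a hedged Python sketch of the concept only; Lucene's actual IOContext is a Java class and its fields, names, and buffer sizes here are assumptions.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Usage(Enum):
    READ = auto()    # ordinary search-time access
    MERGE = auto()   # large sequential reads/writes
    FLUSH = auto()

@dataclass(frozen=True)
class IOContext:
    usage: Usage
    sequential: bool = False
    direct: bool = False   # e.g. bypass the OS buffer cache (O_DIRECT)

def read_buffer_size(ctx: IOContext) -> int:
    # Merging benefits from a larger read buffer than searching,
    # which is exactly the special case the issue wants to generalize.
    return 64 * 1024 if ctx.usage is Usage.MERGE else 1024

merge_ctx = IOContext(Usage.MERGE, sequential=True, direct=True)
search_ctx = IOContext(Usage.READ)
print(read_buffer_size(merge_ctx), read_buffer_size(search_ctx))
```

Because the context travels with the open call, a reader pooled for merging can be kept distinct from one opened for searching, as the issue requires.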
[jira] [Commented] (SOLR-2635) FieldAnalysisRequestHandler; Expose Filter- & Tokenizer-Settings
[ https://issues.apache.org/jira/browse/SOLR-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061223#comment-13061223 ] Uwe Schindler commented on SOLR-2635: - How would you expose the args map? The problem of the current namedList is that it's not easy to insert that in a backwards-compatible way. I am currently looking into it; hopefully I will find a solution. FieldAnalysisRequestHandler; Expose Filter- & Tokenizer-Settings Key: SOLR-2635 URL: https://issues.apache.org/jira/browse/SOLR-2635 Project: Solr Issue Type: Improvement Components: Schema and Analysis, web gui Reporter: Stefan Matheis (steffkes) Priority: Minor The [current/old Analysis Page|http://files.mathe.is/solr-admin/04_analysis_verbose_cur.png] exposes the Filter- & Tokenizer-Settings -- the FieldAnalysisRequestHandler not :/ This Information is already available on the [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png] (through LukeRequestHandler) - so we could load this in parallel and grab the required informations .. but it would be easier if we could add this Information, so that we have all relevant Information at one Place. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-2635) FieldAnalysisRequestHandler; Expose Filter- & Tokenizer-Settings
[ https://issues.apache.org/jira/browse/SOLR-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned SOLR-2635: --- Assignee: Uwe Schindler FieldAnalysisRequestHandler; Expose Filter- Tokenizer-Settings Key: SOLR-2635 URL: https://issues.apache.org/jira/browse/SOLR-2635 Project: Solr Issue Type: Improvement Components: Schema and Analysis, web gui Reporter: Stefan Matheis (steffkes) Assignee: Uwe Schindler Priority: Minor The [current/old Analysis Page|http://files.mathe.is/solr-admin/04_analysis_verbose_cur.png] exposes the Filter- Tokenizer-Settings -- the FieldAnalysisRequestHandler not :/ This Information is already available on the [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png] (through LukeRequestHandler) - so we could load this in parallel and grab the required informations .. but it would be easier if we could add this Information, so that we have all relevant Information at one Place. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-3279) Allow CFS to be empty
[ https://issues.apache.org/jira/browse/LUCENE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-3279. - Resolution: Fixed Committed to trunk in revision 1143766, backported to 3.x in revision 1143775 Allow CFS to be empty -- Key: LUCENE-3279 URL: https://issues.apache.org/jira/browse/LUCENE-3279 Project: Lucene - Java Issue Type: Improvement Components: core/store Affects Versions: 3.4, 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 3.4, 4.0 Attachments: LUCENE-3279.patch Since we changed CFS semantics slightly, closing a CFS directory on an error can lead to an exception. Yet, an empty CFS is still a valid CFS, so for consistency we should allow CFS to be empty. Here is an example:
{noformat}
1 tests failed.
REGRESSION: org.apache.lucene.index.TestIndexWriterOnDiskFull.testAddDocumentOnDiskFull
Error Message: CFS has no entries
Stack Trace:
java.lang.IllegalStateException: CFS has no entries
        at org.apache.lucene.store.CompoundFileWriter.close(CompoundFileWriter.java:139)
        at org.apache.lucene.store.CompoundFileDirectory.close(CompoundFileDirectory.java:181)
        at org.apache.lucene.store.DefaultCompoundFileDirectory.close(DefaultCompoundFileDirectory.java:58)
        at org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java:139)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4252)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3863)
        at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:37)
        at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2715)
        at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2710)
        at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2706)
        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3513)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2064)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2031)
        at org.apache.lucene.index.TestIndexWriterOnDiskFull.addDoc(TestIndexWriterOnDiskFull.java:539)
        at org.apache.lucene.index.TestIndexWriterOnDiskFull.testAddDocumentOnDiskFull(TestIndexWriterOnDiskFull.java:74)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1277)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1195)
{noformat}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
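The fix described above amounts to letting the compound-file writer close cleanly with zero entries instead of throwing. A minimal Python sketch of that behavior change (names are invented; the real CompoundFileWriter is Java and writes the actual compound-file format):

```python
class CompoundFileWriter:
    """Toy model: close() used to reject an empty file; now it's valid."""

    def __init__(self, fail_on_empty=False):
        self.entries = []
        self.fail_on_empty = fail_on_empty   # pre-fix behavior, for contrast
        self.closed = False

    def add_entry(self, name, length):
        self.entries.append((name, length))

    def close(self):
        if self.closed:
            return None
        if self.fail_on_empty and not self.entries:
            raise RuntimeError("CFS has no entries")   # the old exception
        # New behavior: write a valid, possibly empty, table of contents so
        # closing on an error path cannot itself fail.
        self.closed = True
        return {"entry_count": len(self.entries)}

print(CompoundFileWriter().close())   # an empty CFS is still a valid CFS
```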
[jira] [Resolved] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-3216. - Resolution: Fixed Lucene Fields: [New, Patch Available] (was: [New]) Committed in revision 1143776. Store DocValues per segment instead of per field Key: LUCENE-3216 URL: https://issues.apache.org/jira/browse/LUCENE-3216 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch currently we are storing docvalues per field which results in at least one file per field that uses docvalues (or at most two per field per segment depending on the impl.). Yet, we should try to by default pack docvalues into a single file if possible. To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061230#comment-13061230 ] Uwe Schindler commented on SOLR-2399: - One additional question on the new analysis page: - Does your code also support CharFilters? I just ask because I had no time to try it out, it just came into my mind when i worked on FieldAnalysisReqHandler. The problem is that CharFilters return a different set of attributes and only one token. Solr Admin Interface, reworked -- Key: SOLR-2399 URL: https://issues.apache.org/jira/browse/SOLR-2399 Project: Solr Issue Type: Improvement Components: web gui Reporter: Stefan Matheis (steffkes) Assignee: Ryan McKinley Priority: Minor Fix For: 4.0 Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, SOLR-2399-110606.patch, SOLR-2399-110622.patch, SOLR-2399-110702.patch, SOLR-2399-admin-interface.patch, SOLR-2399-analysis-stopwords.patch, SOLR-2399-fluid-width.patch, SOLR-2399-sorting-fields.patch, SOLR-2399-wip-notice.patch, SOLR-2399.patch *The idea was to create a new, fresh (and hopefully clean) Solr Admin Interface.* [Based on this [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]] *Features:* * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png] * [Query-Form|http://files.mathe.is/solr-admin/02_query.png] * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png] * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, SOLR-2400) * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png] * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482) * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png] * [Replication|http://files.mathe.is/solr-admin/10_replication.png] * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png] * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459) ** Stub (using static 
data) Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI I've quickly created a Github-Repository (Just for me, to keep track of the changes) » https://github.com/steffkes/solr-admin -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3233) HuperDuperSynonymsFilter™
[ https://issues.apache.org/jira/browse/LUCENE-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061232#comment-13061232 ] Robert Muir commented on LUCENE-3233: - so i don't forget, lets not waste an arc bitflag marking an arc as 'first'... I hear the secret is instead arc.target == startNode HuperDuperSynonymsFilter™ - Key: LUCENE-3233 URL: https://issues.apache.org/jira/browse/LUCENE-3233 Project: Lucene - Java Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-3223.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, synonyms.zip The current synonymsfilter uses a lot of ram and cpu, especially at build time. I think yesterday I heard about huge synonyms files three times. So, I think we should use an FST-based structure, sharing the inputs and outputs. And we should be more efficient with the tokenStream api, e.g. using save/restoreState instead of cloneAttributes() -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2623) Solr JMX MBeans do not survive core reloads
[ https://issues.apache.org/jira/browse/SOLR-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-2623: Attachment: SOLR-2623.patch Here's a patch which fixes the issue. I've reused Alexey's tests with the solution I proposed earlier. The problem with the core name changing across reloads is something we can address in another issue. Solr JMX MBeans do not survive core reloads --- Key: SOLR-2623 URL: https://issues.apache.org/jira/browse/SOLR-2623 Project: Solr Issue Type: Bug Components: multicore Affects Versions: 1.4, 1.4.1, 3.1, 3.2 Reporter: Alexey Serba Assignee: Shalin Shekhar Mangar Priority: Minor Attachments: SOLR-2623.patch, SOLR-2623.patch, SOLR-2623.patch, SOLR-2623.patch Solr JMX MBeans do not survive core reloads
{noformat:title=Steps to reproduce}
sh> cd example
sh> vi multicore/core0/conf/solrconfig.xml  # enable jmx
sh> java -Dcom.sun.management.jmxremote -Dsolr.solr.home=multicore -jar start.jar
sh> echo 'open 8842  # 8842 is java pid
domain solr/core0
beans
' | java -jar jmxterm-1.0-alpha-4-uber.jar
solr/core0:id=core0,type=core
solr/core0:id=org.apache.solr.handler.StandardRequestHandler,type=org.apache.solr.handler.StandardRequestHandler
solr/core0:id=org.apache.solr.handler.StandardRequestHandler,type=standard
solr/core0:id=org.apache.solr.handler.XmlUpdateRequestHandler,type=/update
solr/core0:id=org.apache.solr.handler.XmlUpdateRequestHandler,type=org.apache.solr.handler.XmlUpdateRequestHandler
...
solr/core0:id=org.apache.solr.search.SolrIndexSearcher,type=searcher
solr/core0:id=org.apache.solr.update.DirectUpdateHandler2,type=updateHandler
sh> curl 'http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0'
sh> echo 'open 8842  # 8842 is java pid
domain solr/core0
beans
' | java -jar jmxterm-1.0-alpha-4-uber.jar
# there's only one bean left after Solr core reload
solr/core0:id=org.apache.solr.search.SolrIndexSearcher,type=Searcher@2e831a91 main
{noformat}
The root cause of this is Solr core reload behavior:
# create new core (which overwrites existing registered MBeans)
# register new core and close old one (we remove/un-register MBeans on oldCore.close)
The correct sequence is:
# unregister MBeans from old core
# create and register new core
# close old core without touching MBeans
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
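The two reload orders can be contrasted with a toy model, where a plain dict stands in for the JMX MBean server (registration overwrites by name, close removes by name). The names and functions here are invented for illustration, not Solr's actual classes.

```python
# Toy model of the reload-order bug: registry stands in for the MBean server.
registry = {}
names = ["searcher", "updateHandler"]

def register(core, names):
    for n in names:
        registry[n] = core        # JMX registration overwrites by name

def unregister(names):
    for n in names:
        registry.pop(n, None)     # a core's close() removes beans by name

def reload_buggy(names):
    register("newCore", names)    # new core overwrites the old MBeans...
    unregister(names)             # ...then closing the old core wipes them too

def reload_fixed(names):
    unregister(names)             # 1. unregister MBeans from old core
    register("newCore", names)    # 2. create and register new core
                                  # 3. close old core without touching MBeans

register("oldCore", names)
reload_buggy(names)
print(registry)                   # empty: all MBeans lost after the reload
```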
[jira] [Commented] (LUCENE-3284) Move contribs/modules away from QueryParser dependency
[ https://issues.apache.org/jira/browse/LUCENE-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061238#comment-13061238 ] Chris Male commented on LUCENE-3284: After looking at the analysis-common dependencies, I think they can be refactored out. There isn't any need to actually form Querys, the same testing can be done by asserting the tokenstream contents. I'll work on those and upload a new patch. Move contribs/modules away from QueryParser dependency -- Key: LUCENE-3284 URL: https://issues.apache.org/jira/browse/LUCENE-3284 Project: Lucene - Java Issue Type: Sub-task Components: core/queryparser, modules/queryparser Reporter: Chris Male Attachments: LUCENE-3284.patch Some contribs and modules depend on the core QueryParser just for simplicity in their tests. We should apply the same process as I did to the core tests, and move them away from using the QueryParser where possible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 9390 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9390/ All tests passed Build Log (for compile errors): [...truncated 10091 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2638) A CoreContainer Plugin interface to create Container level Services
[ https://issues.apache.org/jira/browse/SOLR-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061240#comment-13061240 ] Noble Paul commented on SOLR-2638: -- I'm preparing a mega patch which abstracts out Zookeeper as a complete plugin. It also simplifies the configuration. A CoreContainer Plugin interface to create Container level Services --- Key: SOLR-2638 URL: https://issues.apache.org/jira/browse/SOLR-2638 Project: Solr Issue Type: New Feature Components: multicore Reporter: Noble Paul Assignee: Noble Paul Attachments: SOLR-2638.patch It can help register services such as Zookeeper. Interface:
{code:java}
public abstract class ContainerPlugin {
  /** Called before initializing any core.
   * @param container
   * @param attrs */
  public abstract void init(CoreContainer container, Map<String,String> attrs);

  /** Callback after all cores are initialized */
  public void postInit() {}

  /** Callback after each core is created, but before registration
   * @param core */
  public void onCoreCreate(SolrCore core) {}

  /** Callback for server shutdown */
  public void shutdown() {}
}
{code}
It may be specified in solr.xml as
{code:xml}
<solr>
  <plugin name="zk" class="solr.ZookeeperService" param1="val1" param2="val2" zkClientTimeout="8000"/>
  <cores adminPath="/admin/cores" defaultCoreName="collection1" host="127.0.0.1" hostPort="${hostPort:8983}" hostContext="solr">
    <core name="collection1" shard="${shard:}" collection="${collection:collection1}" config="${solrconfig:solrconfig.xml}" instanceDir="./"/>
  </cores>
</solr>
{code}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
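The proposed callback order (init before any core, onCoreCreate per core, postInit after all cores, shutdown last) can be traced with a hypothetical driver. This is a Python sketch, not Solr code; ZookeeperService here is only a stand-in for a plugin implementation.

```python
# Trace the lifecycle callbacks of a hypothetical ContainerPlugin.
calls = []

class ZookeeperService:
    def init(self, container, attrs):      # before initializing any core
        calls.append("init")
    def on_core_create(self, core):        # after each core, before registration
        calls.append(f"onCoreCreate:{core}")
    def post_init(self):                   # after all cores are initialized
        calls.append("postInit")
    def shutdown(self):                    # server shutdown
        calls.append("shutdown")

plugin = ZookeeperService()
plugin.init(container=None, attrs={"zkClientTimeout": "8000"})
for core in ["collection1"]:
    plugin.on_core_create(core)
plugin.post_init()
plugin.shutdown()
print(calls)
```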
[jira] [Updated] (LUCENE-3233) HuperDuperSynonymsFilter™
[ https://issues.apache.org/jira/browse/LUCENE-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3233: --- Attachment: LUCENE-3233.patch New patch, moving the root arcs cache into FST, not using up our last precious arc bit. HuperDuperSynonymsFilter™ - Key: LUCENE-3233 URL: https://issues.apache.org/jira/browse/LUCENE-3233 Project: Lucene - Java Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-3223.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, synonyms.zip The current synonymsfilter uses a lot of ram and cpu, especially at build time. I think yesterday I heard about huge synonyms files three times. So, I think we should use an FST-based structure, sharing the inputs and outputs. And we should be more efficient with the tokenStream api, e.g. using save/restoreState instead of cloneAttributes() -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 9392 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9392/
1 tests failed.
REGRESSION: org.apache.lucene.index.TestIndexWriterCommit.testCommitThreadSafety
Error Message: null
Stack Trace:
junit.framework.AssertionFailedError:
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1435)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1353)
        at org.apache.lucene.index.TestIndexWriterCommit.testCommitThreadSafety(TestIndexWriterCommit.java:366)
Build Log (for compile errors): [...truncated 1250 lines...]
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 9392 - Failure
my bad, I committed a fix

On Thu, Jul 7, 2011 at 3:27 PM, Apache Jenkins Server jenk...@builds.apache.org wrote:
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9392/
1 tests failed.
REGRESSION: org.apache.lucene.index.TestIndexWriterCommit.testCommitThreadSafety
Error Message: null
Stack Trace:
junit.framework.AssertionFailedError:
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1435)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1353)
        at org.apache.lucene.index.TestIndexWriterCommit.testCommitThreadSafety(TestIndexWriterCommit.java:366)
Build Log (for compile errors): [...truncated 1250 lines...]
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[Lucene.Net] [jira] [Commented] (LUCENENET-433) AttributeSource can have an invalid computed state (LUCENE-3042)
[ https://issues.apache.org/jira/browse/LUCENENET-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061304#comment-13061304 ] Digy commented on LUCENENET-433: Committed to 2.9.4g branch AttributeSource can have an invalid computed state (LUCENE-3042) Key: LUCENENET-433 URL: https://issues.apache.org/jira/browse/LUCENENET-433 Project: Lucene.Net Issue Type: Bug Reporter: Digy Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Attachments: LUCENENET-433.patch If you work with a tokenstream, consume it, then reuse it and add an attribute to it, the computed state is wrong; thus, for example, clearAttributes() will not actually clear the attribute added. So in some situations, addAttribute is not actually clearing the computed state when it should. https://issues.apache.org/jira/browse/LUCENE-3042 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061310#comment-13061310 ] Thomas Fischer commented on SOLR-1604: -- While the complexphrase search works fine with e.g. GOK:PXB 80?, it will throw an exception if there is no space present, e.g. GOK:PXB80?. The exception is: Unknown query type org.apache.lucene.search.WildcardQuery found in phrase query string PXB80? Wildcards, ORs etc inside Phrase Queries Key: SOLR-1604 URL: https://issues.apache.org/jira/browse/SOLR-1604 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Ahmet Arslan Priority: Minor Fix For: 3.4, 4.0 Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports wildcards, ORs, ranges, fuzzies inside phrase queries. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2635) FieldAnalysisRequestHandler; Expose Filter- & Tokenizer-Settings
[ https://issues.apache.org/jira/browse/SOLR-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061313#comment-13061313 ] Stefan Matheis (steffkes) commented on SOLR-2635: - Maybe we can append this List to the existing output .. like it's actually done for highlighting on the select handler? Just a suggestion:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">37</int>
  </lst>
  <lst name="analysis">
    <!-- .. -->
  </lst>
  <lst name="settings">
    <lst name="field_types">
      <lst name="text_general_rev">
        <lst name="index">
          <arr name="org.apache.lucene.analysis.standard.StandardTokenizer">
            <lst>
              <!-- settings -->
            </lst>
          </arr>
        </lst>
      </lst>
    </lst>
  </lst>
</response>
{code}
That will work w/o problems, as long as the list of used Filters and Tokenizers is unique. If there is at least one which is used more than once -- the relation is only defined through the order of the list, but we could maybe add a counter to the existing output, then it's also no problem: FieldAnalysisRequestHandler; Expose Filter- & Tokenizer-Settings Key: SOLR-2635 URL: https://issues.apache.org/jira/browse/SOLR-2635 Project: Solr Issue Type: Improvement Components: Schema and Analysis, web gui Reporter: Stefan Matheis (steffkes) Assignee: Uwe Schindler Priority: Minor The [current/old Analysis Page|http://files.mathe.is/solr-admin/04_analysis_verbose_cur.png] exposes the Filter- & Tokenizer-Settings -- the FieldAnalysisRequestHandler not :/ This Information is already available on the [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png] (through LukeRequestHandler) - so we could load this in parallel and grab the required informations .. but it would be easier if we could add this Information, so that we have all relevant Information at one Place. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2634) Publish nightly snapshots, please
[ https://issues.apache.org/jira/browse/SOLR-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benson Margulies updated SOLR-2634: --- Attachment: SOLR-2634.patch Very small patch that allows nexus deployment. Publish nightly snapshots, please - Key: SOLR-2634 URL: https://issues.apache.org/jira/browse/SOLR-2634 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.4 Reporter: Benson Margulies Assignee: Steven Rowe Attachments: SOLR-2634.patch, SOLR-2634.patch If you added 'mvn deploy' to the jenkins job, the nightly snapshots would push to repository.apache.org as snapshots, where maven could get them without having to manually download and deploy them. Please? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2634) Publish nightly snapshots, please
[ https://issues.apache.org/jira/browse/SOLR-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benson Margulies updated SOLR-2634: --- Attachment: SOLR-2634.patch Very simple patch that enables deployment, optionally, to nexus. Publish nightly snapshots, please - Key: SOLR-2634 URL: https://issues.apache.org/jira/browse/SOLR-2634 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.4 Reporter: Benson Margulies Assignee: Steven Rowe Attachments: SOLR-2634.patch, SOLR-2634.patch If you added 'mvn deploy' to the jenkins job, the nightly snapshots would push to repository.apache.org as snapshots, where maven could get them without having to manually download and deploy them. Please? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2635) FieldAnalysisRequestHandler; Expose Filter- & Tokenizer-Settings
[ https://issues.apache.org/jira/browse/SOLR-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061323#comment-13061323 ] Uwe Schindler commented on SOLR-2635: - This solution might work, I just don't like it, because it decouples the settings from the output and makes correlation harder. But that's of course the same for highlighting. The list of tokenizers and filters is not necessarily unique, but order would be, so access via index (like for highlighting) is fine. It's possible to add the same TokenFilter at several places in the analysis chain, so a lookup by class name is impossible. FieldAnalysisRequestHandler; Expose Filter- & Tokenizer-Settings Key: SOLR-2635 URL: https://issues.apache.org/jira/browse/SOLR-2635 Project: Solr Issue Type: Improvement Components: Schema and Analysis, web gui Reporter: Stefan Matheis (steffkes) Assignee: Uwe Schindler Priority: Minor The [current/old Analysis Page|http://files.mathe.is/solr-admin/04_analysis_verbose_cur.png] exposes the Filter- & Tokenizer-Settings -- the FieldAnalysisRequestHandler not :/ This Information is already available on the [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png] (through LukeRequestHandler) - so we could load this in parallel and grab the required informations .. but it would be easier if we could add this Information, so that we have all relevant Information at one Place. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
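The point about class names not being unique can be shown with a toy chain: a settings lookup keyed by class name silently drops one occurrence, while a position-based (index) lookup keeps both. The factory names and synonym files below are hypothetical, chosen only to illustrate the collision.

```python
# Toy illustration: the same filter class twice in one analysis chain.
chain = [
    ("solr.SynonymFilterFactory",   {"synonyms": "syn_a.txt"}),
    ("solr.LowerCaseFilterFactory", {}),
    ("solr.SynonymFilterFactory",   {"synonyms": "syn_b.txt"}),
]

by_name = {name: settings for name, settings in chain}   # lossy: keys collide
by_index = list(enumerate(chain))                        # lossless: order kept

print(len(by_name), len(by_index))
```

This is why correlating settings to the analysis output by position, as done for highlighting, works where a name-keyed map cannot.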
[jira] [Commented] (SOLR-2634) Publish nightly snapshots, please
[ https://issues.apache.org/jira/browse/SOLR-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061325#comment-13061325 ] Benson Margulies commented on SOLR-2634: So, if you apply the patch, folks like me can trivially deliver snapshots to repo managers that use password authentication. You can deliver to the Apache snapshot repo by changing your jenkins job to look like: ant generate-maven-artifacts -Dm2.repository.url=https://repository.apache.org/content/repositories/snapshots/ -Dm2.repository.username=whoever -Dm2.repository.password=whatever There is some scheme on the jenkins instance for these credentials, I can research it for you. My elders and betters at d...@maven.apache.org tell me that the thing that you have is really not a good idea from either a Jenkins or a Maven standpoint. Following my recipe here will change nothing about the publicity/policy issues, it will retain some old snapshots which might be useful, and it will generally work better. Publish nightly snapshots, please - Key: SOLR-2634 URL: https://issues.apache.org/jira/browse/SOLR-2634 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.4 Reporter: Benson Margulies Assignee: Steven Rowe Attachments: SOLR-2634.patch, SOLR-2634.patch If you added 'mvn deploy' to the jenkins job, the nightly snapshots would push to repository.apache.org as snapshots, where maven could get them without having to manually download and deploy them. Please? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2635) FieldAnalysisRequestHandler; Expose Filter- Tokenizer-Settings
[ https://issues.apache.org/jira/browse/SOLR-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061330#comment-13061330 ] Stefan Matheis (steffkes) commented on SOLR-2635: - Hm yes, correct :/ Then, what about an additional {{settings=true}} parameter for this handler, which adds a second lst element with the settings used?
{code:xml}
<arr name="org.apache.lucene.analysis.standard.StandardTokenizer">
  <lst>
    <!-- .. existing output .. -->
  </lst>
  <lst name="settings">
    <!-- settings -->
  </lst>
</arr>
{code}
The JSON output for this handler is already not the best, but it should still be usable.
[jira] [Commented] (LUCENE-2392) Enable flexible scoring
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061332#comment-13061332 ] Robert Muir commented on LUCENE-2392: - I think we need to commit the refactoring portions (separating TF-IDF out) to trunk very soon, because it's really difficult to keep this branch in sync with trunk, e.g. lots of activity and refactoring going on. I'd like to get this merged in as quickly as possible. I don't think the svn history is interesting, especially given all the frustrations I am having with merging... The easiest way will be to commit a patch; I'll get everything in shape and upload one soon, like, today. Enable flexible scoring --- Key: LUCENE-2392 URL: https://issues.apache.org/jira/browse/LUCENE-2392 Project: Lucene - Java Issue Type: Improvement Components: core/search Reporter: Michael McCandless Assignee: Michael McCandless Fix For: flexscoring branch Attachments: LUCENE-2392.patch, LUCENE-2392.patch, LUCENE-2392.patch, LUCENE-2392_take2.patch This is a first step (nowhere near committable!), implementing the design iterated to in the recent Baby steps towards making Lucene's scoring more flexible java-dev thread. The idea is (if you turn it on for your Field; it's off by default) to store full stats in the index, into a new _X.sts file, per doc (X field) in the index. And then have FieldSimilarityProvider impls that compute a doc's boost bytes (norms) from these stats. The patch is able to index the stats, merge them when segments are merged, and provides an iterator-only API. It also has a starting point for per-field Sims that use the stats iterator API to compute boost bytes. But it's not at all tied into actual searching! There's still tons left to do, e.g. how does one configure via Field/FieldType which stats one wants indexed. All tests pass, and I added one new TestStats unit test. 
The stats I record now are:
- field's boost
- field's unique term count (a b c a a b -> 3)
- field's total term count (a b c a a b -> 6)
- total term count per-term (sum of total term count for all docs that have this term)
Still need at least the total term count for each field.
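The unique/total term counts in the list above can be illustrated with a small, self-contained sketch (the class and method names here are illustrative only, not the patch's actual API):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class FieldStatsSketch {
    // Unique term count: number of distinct terms in the field ("a b c a a b" -> 3).
    static int uniqueTermCount(List<String> terms) {
        return new HashSet<>(terms).size();
    }

    // Total term count: number of term occurrences in the field ("a b c a a b" -> 6).
    static int totalTermCount(List<String> terms) {
        return terms.size();
    }

    public static void main(String[] args) {
        List<String> field = Arrays.asList("a", "b", "c", "a", "a", "b");
        System.out.println(uniqueTermCount(field)); // prints 3
        System.out.println(totalTermCount(field));  // prints 6
    }
}
```

The per-term stat (total term count per-term) would then be the sum of totalTermCount over all docs containing that term.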
[jira] [Updated] (LUCENE-3167) Make lucene/solr a OSGI bundle through Ant
[ https://issues.apache.org/jira/browse/LUCENE-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luca Stancapiano updated LUCENE-3167: - Attachment: lucene_trunk.patch This patch rewrites the build-manifest macrodef, because bndlib cannot append to the information in MANIFEST.MF:

<macrodef name="build-manifest" description="Builds a manifest file">
  <attribute name="title" default="Lucene Search Engine: ${ant.project.name}"/>
  <attribute name="bndtempDir" default="${build.dir}/temp"/>
  <sequential>
    <xmlproperty file="${ant.file}" collapseAttributes="true" prefix="bnd"/>
    <property name="bndclasspath" refid="classpath"/>
    <taskdef resource="aQute/bnd/ant/taskdef.properties"/>
    <mkdir dir="@{bndtempDir}"/>
    <bnd classpath="${bndclasspath}" eclipse="false" failok="false" exceptions="true"
         files="${common.dir}/lucene.bnd" output="@{bndtempDir}/${final.name}-temp.jar"/>
    <copy todir="${common.dir}/build" flatten="true">
      <resources>
        <url url="jar:file://@{bndtempDir}/${final.name}-temp.jar!/META-INF/MANIFEST.MF"/>
      </resources>
    </copy>
  </sequential>
</macrodef>

I moved the manifest information into the lucene.bnd file, appending the new OSGi info:

Export-Package: *;-split-package:=merge-first
Specification-Title: Lucene Search Engine: ${ant.project.name}
Specification-Version: ${spec.version}
Specification-Vendor: The Apache Software Foundation
Implementation-Title: org.apache.lucene
Implementation-Version: ${version} ${svnversion} - ${DSTAMP} ${TSTAMP}
Implementation-Vendor: The Apache Software Foundation
X-Compile-Source-JDK: ${javac.source}
X-Compile-Target-JDK: ${javac.target}
Bundle-License: http://www.apache.org/licenses/LICENSE-2.0.txt
Bundle-SymbolicName: org.apache.lucene.${name}
Bundle-Name: Lucene Search Engine: ${ant.project.name}
Bundle-Vendor: The Apache Software Foundation
Bundle-Version: ${version}
Bundle-Description: ${bnd.project.description}
Bundle-DocURL: http://www.apache.org/

I tested the lucene and solr modules, and all jars are created with the correct OSGi info in MANIFEST.MF. 
Unfortunately bndlib is not flexible, so if you use it you are forced to:
- precompile the classes
- create a temp directory with a temporary jar
- extract the new manifest from the jar and put it in the shared directory
Make lucene/solr a OSGI bundle through Ant -- Key: LUCENE-3167 URL: https://issues.apache.org/jira/browse/LUCENE-3167 Project: Lucene - Java Issue Type: New Feature Environment: bndtools Reporter: Luca Stancapiano Attachments: lucene_trunk.patch We need to build the bundle through Ant, so the binary can be published without requiring a download of the sources. Currently, to get an OSGi bundle, we need to use Maven tools and build the sources. Here is the reference for the creation of the OSGi bundle through Maven: https://issues.apache.org/jira/browse/LUCENE-1344 Bndtools could be used inside Ant
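The temp-jar/manifest-extraction workaround in the steps above can be sketched with plain java.util.jar (the header value and file names are illustrative): write a temporary jar whose manifest carries an OSGi header, then read the header back out of META-INF/MANIFEST.MF, which is the same round trip the macrodef performs with its temporary jar and the jar-URL copy.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ManifestRoundTrip {
    // Build a temp jar whose manifest carries an OSGi header, then pull the
    // header back out of META-INF/MANIFEST.MF.
    static String roundTrip() throws Exception {
        Manifest mf = new Manifest(new ByteArrayInputStream(
            ("Manifest-Version: 1.0\n"
           + "Bundle-SymbolicName: org.apache.lucene.core\n").getBytes("UTF-8")));
        File tmp = File.createTempFile("lucene-temp", ".jar");
        try {
            try (JarOutputStream out = new JarOutputStream(new FileOutputStream(tmp), mf)) {
                // no class entries needed; only the manifest matters for this sketch
            }
            try (JarFile jar = new JarFile(tmp)) {
                return jar.getManifest().getMainAttributes().getValue("Bundle-SymbolicName");
            }
        } finally {
            tmp.delete();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip()); // prints org.apache.lucene.core
    }
}
```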
[jira] [Commented] (SOLR-2635) FieldAnalysisRequestHandler; Expose Filter- Tokenizer-Settings
[ https://issues.apache.org/jira/browse/SOLR-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061341#comment-13061341 ] Uwe Schindler commented on SOLR-2635: - I was already thinking about an extra param to enable the settings. But like for highlighting, we should add them as a separate list, with the relation via lst index. Is this fine? To fix the output perfectly, each list inside the analysis component array should have a key like tokens, settings, but that would make it incompatible. Also the CharFilter output needs some improvements (I would prefer to return the CharFilter output like a single token in the other components; currently it's one level higher - it has no lst). But that's out of scope for this issue.
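The index-based correlation Uwe proposes (the same approach highlighting uses) can be sketched as two parallel lists: since the same filter class may occur twice in an analysis chain, position, not class name, is the key. All names and settings below are illustrative:

```java
import java.util.Arrays;
import java.util.List;

public class ChainSettingsSketch {
    // Two parallel lists: one entry per analysis-chain component.
    // Index i in one list corresponds to index i in the other, which stays
    // unambiguous even when the same filter class occurs twice in the chain.
    static final List<String> chain = Arrays.asList(
        "StandardTokenizer", "LowerCaseFilter", "SynonymFilter", "LowerCaseFilter");
    static final List<String> settings = Arrays.asList(
        "maxTokenLength=255", "", "synonyms=syn.txt", "");

    static String settingsFor(int componentIndex) {
        return settings.get(componentIndex);
    }

    public static void main(String[] args) {
        // A class-name lookup would be ambiguous for LowerCaseFilter
        // (indices 1 and 3); the positional lookup is not.
        System.out.println(chain.get(2) + " -> " + settingsFor(2));
    }
}
```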
[jira] [Commented] (SOLR-2634) Publish nightly snapshots, please
[ https://issues.apache.org/jira/browse/SOLR-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061342#comment-13061342 ] Steven Rowe commented on SOLR-2634: --- bq. There is some scheme on the jenkins instance for these credentials, I can research it for you. Please do. bq. My elders and betters at d...@maven.apache.org tell me that the thing that you have is really not a good idea from either a Jenkins or a Maven standpoint. The thing that you have? Have you rigged a device that can spy my goiter through the tubes of the interweb? Quiet acceptance of my elders' and betters' judgments is a virtue that I, sadly, lack; you have my admiration. A pointer to the mailing list discussion(s) to which you appear to be referring would be helpful.
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061348#comment-13061348 ] Stefan Matheis (steffkes) commented on SOLR-2399: - bq. Does your code also support CharFilters? Thanks Uwe -- actually it does not. I've just checked the default-enabled field types from the example package. I'll try to fix that and update my last patch.
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <lst name="analysis">
    <lst name="field_types">
      <lst name="text_char_norm">
        <lst name="index">
          <str name="org.apache.lucene.analysis.charfilter.MappingCharFilter">Foo</str>
          <arr name="org.apache.lucene.analysis.core.WhitespaceTokenizer">
            <lst>
              <str name="text">Foo</str>
              <str name="raw_bytes">[46 6f 6f]</str>
              <int name="start">0</int>
              <int name="end">3</int>
              <int name="position">1</int>
              <arr name="positionHistory">
                <int>1</int>
              </arr>
              <str name="type">word</str>
            </lst>
          </arr>
        </lst>
      </lst>
    </lst>
    <lst name="field_names"/>
  </lst>
</response>
{code}
I will create a _virtual_ object for CharFilters so that they have one property, {{text}} - should that be okay? Especially in combination with other filters and tokenizers, which have more than that. 
Solr Admin Interface, reworked -- Key: SOLR-2399 URL: https://issues.apache.org/jira/browse/SOLR-2399 Project: Solr Issue Type: Improvement Components: web gui Reporter: Stefan Matheis (steffkes) Assignee: Ryan McKinley Priority: Minor Fix For: 4.0 Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, SOLR-2399-110606.patch, SOLR-2399-110622.patch, SOLR-2399-110702.patch, SOLR-2399-admin-interface.patch, SOLR-2399-analysis-stopwords.patch, SOLR-2399-fluid-width.patch, SOLR-2399-sorting-fields.patch, SOLR-2399-wip-notice.patch, SOLR-2399.patch *The idea was to create a new, fresh (and hopefully clean) Solr Admin Interface.* [Based on this [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]] *Features:* * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png] * [Query-Form|http://files.mathe.is/solr-admin/02_query.png] * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png] * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, SOLR-2400) * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png] * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482) * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png] * [Replication|http://files.mathe.is/solr-admin/10_replication.png] * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png] * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459) ** Stub (using static data) Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI I've quickly created a Github-Repository (Just for me, to keep track of the changes) » https://github.com/steffkes/solr-admin -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2634) Publish nightly snapshots, please
[ https://issues.apache.org/jira/browse/SOLR-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated SOLR-2634: -- Affects Version/s: 4.0 Fix Version/s: 4.0 3.4
[jira] [Updated] (SOLR-2634) Publish nightly snapshots, please
[ https://issues.apache.org/jira/browse/SOLR-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated SOLR-2634: -- Attachment: SOLR-2634.patch bq. So, if you apply the patch, folks like me can trivially deliver snapshots to repo managers that use password authentication. I agree, this is a good addition. This version of your patch adds password auth in two more places where it's required. I tested that the additions do no harm to the local-repo use case for {{ant generate-maven-artifacts}}. I'll commit shortly.
[jira] [Commented] (SOLR-2634) Publish nightly snapshots, please
[ https://issues.apache.org/jira/browse/SOLR-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061389#comment-13061389 ] Steven Rowe commented on SOLR-2634: --- Benson, I committed your patch to trunk in r1143878 and branch_3x in r1143882.
[jira] [Updated] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Matheis (steffkes) updated SOLR-2399: Attachment: SOLR-2399-110702.patch Patch based on SVN-Rev {{1143882}}, now also works with CharFilter output. Screenshot: [Normal|http://files.mathe.is/solr-admin/04_analysis-cf.png], [Verbose|http://files.mathe.is/solr-admin/04_analysis_verbose-cf.png]
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061391#comment-13061391 ] Simon Willnauer commented on LUCENE-2878: - Hey Mike, I applied all your patches and walked through them; this looks great. This entire thing is far from committable, but I think we should take it further and open a branch for it. I want to commit both your latest patch and the highlighter prototype and work from there. {quote}So after working with this a bit more (and reading the paper), I see now that it's really not necessary to cache positions in the iterators. So never mind all that! In the end, for some uses like highlighting I think somebody needs to cache positions (I put it in a ScorePosDoc created by the PosCollector), but I agree that doesn't belong in the lower level iterator.{quote} After looking into your patch I think I understand now what is needed to enable low-level stuff like highlighting. What is missing here is a positions collector interface that you can pass in and that collects positions at the lowest levels, e.g. for phrases or simple terms. The PositionIterator itself (btw. I think we should call it Positions or something along those lines - try not to introduce spans in the name :) ) should accept this collector and simply call back each low-level position if needed. For highlighting I think we should also go with a two-stage approach. The first stage does the matching (with or without positions) and the second stage takes the first stage's results and does the highlighting. That way we don't slow down the query, and the second stage can even choose a different rewrite method (for MTQ this is needed since we don't have positions on filters). {quote} As I'm learning more, I am beginning to see this is going to require sweeping updates. 
Basically everywhere we currently create a DocsEnum, we might now want to create a DocsAndPositionsEnum, and then the options (needs positions/payloads) have to be threaded through all the surrounding APIs. I wonder if it wouldn't make sense to encapsulate those options (needsPositions/needsPayloads) in some kind of EnumConfig object. Just in case, down the line, there is some other information that gets stored in the index, and wants to be made available during scoring, then the required change would be much less painful to implement. {quote} What do you mean by sweeping updates? For the enum config I think we only have two or three places where we need to make the decision: 1. TermScorer 2. PhraseScorer (and maybe 2. goes away anyway), so this is not needed for now, I think? {quote} I'm thinking for example (Robert M's idea), that it might be nice to have a positions-offsets map in the index (this would be better for highlighting than term vectors). Maybe this would just be part of payload, but maybe not? And it seems possible there could be other things like that we don't know about yet? {quote} Yeah, this would be awesome... next step :) Allow Scorer to expose positions and payloads aka. nuke spans -- Key: LUCENE-2878 URL: https://issues.apache.org/jira/browse/LUCENE-2878 Project: Lucene - Java Issue Type: Improvement Components: core/search Affects Versions: Bulk Postings branch Reporter: Simon Willnauer Assignee: Simon Willnauer Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch Currently we have two somewhat separate types of queries: the ones which can make use of positions (mainly spans) and payloads (spans). Yet Span*Query doesn't really do scoring comparable to what other queries do, and at the end of the day they duplicate a lot of code all over Lucene. 
Span*Queries are also limited to other Span*Query instances, such that you can not use a TermQuery or a BooleanQuery with SpanNear or anything like that. Beside the Span*Query limitation, other queries lack a quite interesting feature: they can not score based on term proximity, since scorers don't expose any positional information. All those problems bugged me for a while now, so I started working on this using the bulkpostings API. I would have done that first cut on trunk, but TermScorer there works on a BlockReader that does not expose positions, while the one in this branch does. I started adding a new Positions class which users can pull from a scorer; to prevent unnecessary positions enums I added ScorerContext#needsPositions and eventually Scorer#needsPayloads to create the corresponding enum on demand. Yet, currently only TermQuery / TermScorer
[jira] [Commented] (SOLR-2635) FieldAnalysisRequestHandler; Expose Filter- Tokenizer-Settings
[ https://issues.apache.org/jira/browse/SOLR-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061392#comment-13061392 ] Stefan Matheis (steffkes) commented on SOLR-2635: - bq. Is this fine? Yes, that should be good to work with :)
[jira] [Commented] (LUCENE-3167) Make lucene/solr a OSGI bundle through Ant
[ https://issues.apache.org/jira/browse/LUCENE-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061393#comment-13061393 ] Luca Stancapiano commented on LUCENE-3167: -- I created an issue at https://github.com/bnd/bnd/issues/70 to parametrize the bnd ant task.
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061395#comment-13061395 ] Robert Muir commented on LUCENE-2878: - {quote} For highlighting I think we should also go with a two-stage approach. The first stage does the matching (with or without positions) and the second stage takes the first stage's results and does the highlighting. That way we don't slow down the query, and the second stage can even choose a different rewrite method (for MTQ this is needed since we don't have positions on filters) {quote} I think this would be a good approach; it's really the same algorithm that you generally want for positional scoring: score all the docs the 'fast' way, then reorder only the top N (e.g. the first two pages of results), which requires using the position iterator and doing some calculation that you typically add to the score. So if we can generalize this in a way where you can do this in your collector, I think it would be reusable for this as well. Allow Scorer to expose positions and payloads aka. nuke spans -- Key: LUCENE-2878 URL: https://issues.apache.org/jira/browse/LUCENE-2878 Project: Lucene - Java Issue Type: Improvement Components: core/search Affects Versions: Bulk Postings branch Reporter: Simon Willnauer Assignee: Simon Willnauer Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch Currently we have two somewhat separate types of queries: the ones which can make use of positions (mainly spans) and payloads (spans). Yet Span*Query doesn't really do scoring comparable to what other queries do, and at the end of the day they duplicate a lot of code all over Lucene. 
Span*Queries are also limited to other Span*Query instances, such that you can not use a TermQuery or a BooleanQuery with SpanNear or anything like that. Beside the Span*Query limitation, other queries lack a quite interesting feature: they can not score based on term proximity, since scorers don't expose any positional information. All those problems bugged me for a while now, so I started working on this using the bulkpostings API. I would have done that first cut on trunk, but TermScorer there works on a BlockReader that does not expose positions, while the one in this branch does. I started adding a new Positions class which users can pull from a scorer; to prevent unnecessary positions enums I added ScorerContext#needsPositions and eventually Scorer#needsPayloads to create the corresponding enum on demand. Yet, currently only TermQuery / TermScorer implements this API, and the others simply return null instead. To show that the API really works, and that our BulkPostings work fine with positions too, I cut over TermSpanQuery to use a TermScorer under the hood and nuked TermSpans entirely. A nice side effect of this was that the Position BulkReading implementation got some exercise, and it now all works with positions :) while payloads for bulk reading are kind of experimental in the patch and only work with the Standard codec. So all spans now work on top of TermScorer (I truly hate spans since today), including the ones that need payloads (StandardCodec ONLY)!! I didn't bother to implement the other codecs yet, since I want to get feedback on the API and on this first cut before I go on with it. I will upload the corresponding patch in a minute. I also had to cut over SpanQuery.getSpans(IR) to SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk first, but after that pain today I need a break first :). 
The patch passes all core tests (org.apache.lucene.search.highlight.HighlighterTest still fails, but I didn't look into the MemoryIndex BulkPostings API yet)
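The two-stage approach Robert describes - score everything the cheap way, then apply the expensive positional calculation to only the top N - can be sketched as follows. All names are hypothetical; the proximity boost stands in for the position-iterator pass:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntToDoubleFunction;

public class TwoStageRerank {
    static class Hit {
        final int doc;
        double score;
        Hit(int doc, double score) { this.doc = doc; this.score = score; }
    }

    // Stage 1: cheap scores for all hits. Stage 2: run the expensive
    // positional pass (proximityBoost) over the top N only, then re-sort them.
    static List<Hit> rerankTopN(List<Hit> hits, int n, IntToDoubleFunction proximityBoost) {
        hits.sort(Comparator.comparingDouble((Hit h) -> -h.score));
        List<Hit> top = new ArrayList<>(hits.subList(0, Math.min(n, hits.size())));
        for (Hit h : top) {
            h.score += proximityBoost.applyAsDouble(h.doc); // positions pass, top N only
        }
        top.sort(Comparator.comparingDouble((Hit h) -> -h.score));
        return top;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit(1, 2.0));
        hits.add(new Hit(2, 1.5));
        hits.add(new Hit(3, 1.0));
        // Doc 2 has the best term proximity; it overtakes doc 1 after reranking.
        List<Hit> top = rerankTopN(hits, 2, doc -> doc == 2 ? 1.0 : 0.0);
        System.out.println(top.get(0).doc); // prints 2
    }
}
```

Doc 3 never pays the positional cost because it falls outside the top N, which is what keeps the query itself fast.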
[jira] [Commented] (SOLR-2634) Publish nightly snapshots, please
[ https://issues.apache.org/jira/browse/SOLR-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061408#comment-13061408 ] Benson Margulies commented on SOLR-2634: Thank you.
[jira] [Commented] (LUCENE-2392) Enable flexible scoring
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061420#comment-13061420 ] Simon Willnauer commented on LUCENE-2392: - {quote} I'd like to get this merged in as quickly as possible. I don't think the svn history is interesting, especially given all the frustrations I am having with merging... The easiest way will be to commit a patch, I'll get everything in shape and upload one soon, like, today. {quote} +1 even if this is not entirely in shape we can still iterate on trunk. Enable flexible scoring --- Key: LUCENE-2392 URL: https://issues.apache.org/jira/browse/LUCENE-2392 Project: Lucene - Java Issue Type: Improvement Components: core/search Reporter: Michael McCandless Assignee: Michael McCandless Fix For: flexscoring branch Attachments: LUCENE-2392.patch, LUCENE-2392.patch, LUCENE-2392.patch, LUCENE-2392_take2.patch This is a first step (nowhere near committable!), implementing the design iterated to in the recent Baby steps towards making Lucene's scoring more flexible java-dev thread. The idea is (if you turn it on for your Field; it's off by default) to store full stats in the index, into a new _X.sts file, per doc (X field) in the index. And then have FieldSimilarityProvider impls that compute doc's boost bytes (norms) from these stats. The patch is able to index the stats, merge them when segments are merged, and provides an iterator-only API. It also has starting point for per-field Sims that use the stats iterator API to compute boost bytes. But it's not at all tied into actual searching! There's still tons left to do, eg, how does one configure via Field/FieldType which stats one wants indexed. All tests pass, and I added one new TestStats unit test. 
The stats I record now are:
- field's boost
- field's unique term count (a b c a a b -- 3)
- field's total term count (a b c a a b -- 6)
- total term count per-term (sum of total term count for all docs that have this term)

Still need at least the total term count for each field.
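The two per-document field stats above are straightforward to compute; this self-contained sketch reproduces the worked example from the issue description (field value "a b c a a b"):

```java
import java.util.HashSet;
import java.util.Set;

public class FieldStats {

    /** Unique term count: number of distinct terms in the field. */
    static int uniqueTermCount(String field) {
        Set<String> terms = new HashSet<>();
        for (String t : field.split("\\s+")) {
            terms.add(t);
        }
        return terms.size();
    }

    /** Total term count: total number of term occurrences in the field. */
    static int totalTermCount(String field) {
        return field.split("\\s+").length;
    }

    public static void main(String[] args) {
        System.out.println(uniqueTermCount("a b c a a b")); // 3
        System.out.println(totalTermCount("a b c a a b"));  // 6
    }
}
```

In the patch these values would be written per doc into the new _X.sts file and consumed by a FieldSimilarityProvider to derive boost bytes.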
Re: Getting patches (with tests!) committed
On Jul 6, 2011, at 6:44 PM, Erick Erickson wrote: In the past I've had to ping the dev list with an "include patch XYZ please" message Yeah... doesn't that strike you as a problem though? Maybe I should dig up all my issues and start bugging the dev list with "Commit this, pretty please?" messages. Or not, and they will stay in the JIRA graveyard. But I've just assigned it to myself, I'll see if I can get it committed, I'm new enough at the process that I need the practice I noticed, thanks. Best Erick On Wed, Jul 6, 2011 at 1:51 PM, Smiley, David W. dsmi...@mitre.org wrote: How do committers recommend that patch contributors (like me) get their patches committed? At the moment I'm thinking of this one: https://issues.apache.org/jira/browse/SOLR-2535 This is a regression bug. I found the bug, I added a patch which fixes the bug and tested that it was fixed. The tests are actually new tests that tested code that wasn't tested before. I put the fix version in JIRA as 3.3 at the time I did this, because it was ready to go. Well 3.3 came and went, and the version got bumped to 3.4. There are no processes in place for committers to recognize completed patches. I think that's a problem. It's very discouraging, as the contributor. I think prior to a release and ideally at other occasions, issues assigned to the next release number should actually be examined. Granted there are ~250 of them on the Solr side: https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SOLR+AND+resolution+%3D+Unresolved+AND+fixVersion+%3D+12316683+ORDER+BY+priority+DESC And some initial triage could separate the wheat from the chaff. 
~ David Smiley
[jira] [Updated] (LUCENE-2392) Enable flexible scoring
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2392: Attachment: LUCENE-2392.patch Attached is a patch, with this CHANGES entry: {noformat} * LUCENE-2392: Decoupled vector space scoring from Query/Weight/Scorer. If you extended Similarity directly before, you should extend TFIDFSimilarity instead. Similarity is now a lower-level API to implement other scoring algorithms. See MIGRATE.txt for more details. {noformat} I would like to commit this, and then proceed onward with issues such as LUCENE-3220 and LUCENE-3221 Enable flexible scoring --- Key: LUCENE-2392 URL: https://issues.apache.org/jira/browse/LUCENE-2392 Project: Lucene - Java Issue Type: Improvement Components: core/search Reporter: Michael McCandless Assignee: Michael McCandless Fix For: flexscoring branch Attachments: LUCENE-2392.patch, LUCENE-2392.patch, LUCENE-2392.patch, LUCENE-2392.patch, LUCENE-2392_take2.patch This is a first step (nowhere near committable!), implementing the design iterated to in the recent Baby steps towards making Lucene's scoring more flexible java-dev thread. The idea is (if you turn it on for your Field; it's off by default) to store full stats in the index, into a new _X.sts file, per doc (X field) in the index. And then have FieldSimilarityProvider impls that compute doc's boost bytes (norms) from these stats. The patch is able to index the stats, merge them when segments are merged, and provides an iterator-only API. It also has starting point for per-field Sims that use the stats iterator API to compute boost bytes. But it's not at all tied into actual searching! There's still tons left to do, eg, how does one configure via Field/FieldType which stats one wants indexed. All tests pass, and I added one new TestStats unit test. 
The stats I record now are:
- field's boost
- field's unique term count (a b c a a b -- 3)
- field's total term count (a b c a a b -- 6)
- total term count per-term (sum of total term count for all docs that have this term)

Still need at least the total term count for each field.
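The CHANGES entry above moves vector-space scoring into TFIDFSimilarity. As a reminder of the kind of formulas such a class encapsulates, here is a self-contained sketch using the classic Lucene-style defaults (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1))); treat the exact formulas as illustrative of the refactoring, not as the patch's code:

```java
public class TfIdfSketch {

    /** Term-frequency component: damped by square root. */
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    /** Inverse document frequency: rarer terms score higher. */
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    /** Single-term score contribution; idf appears squared because it
     *  enters once via the query weight and once via the term weight. */
    static double weight(int freq, int docFreq, int numDocs) {
        double i = idf(docFreq, numDocs);
        return tf(freq) * i * i;
    }

    public static void main(String[] args) {
        System.out.println(tf(4));        // 2.0
        System.out.println(idf(9, 10));   // 1.0 (ln(10/10) == 0)
        System.out.println(weight(4, 9, 10)); // 2.0
    }
}
```

After the decoupling, a custom scoring model would override Similarity directly instead of fighting these baked-in formulas.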
[jira] [Commented] (SOLR-949) Add QueryResponse and SolrQuery support for TermVectorComponent
[ https://issues.apache.org/jira/browse/SOLR-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061437#comment-13061437 ] David Smiley commented on SOLR-949: --- It would be easier for a committer to digest this patch if you didn't do any reformatting of existing code. Add QueryResponse and SolrQuery support for TermVectorComponent --- Key: SOLR-949 URL: https://issues.apache.org/jira/browse/SOLR-949 Project: Solr Issue Type: New Feature Components: clients - java Reporter: Aleksander M. Stensby Priority: Minor Attachments: SOLR-949.patch In a similar fashion to Facet information, it would be nice to have support for easily setting TermVector related parameters through SolrQuery, and it would be nice to have methods in QueryResponse to easily retrieve TermVector information -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Getting patches (with tests!) committed
Yeah, this is kind of a grey area. I think we should do what we can to encourage contributions, and being better about applying patches when someone has gone through the effort of making one in the first place certainly goes in the right direction... It *may* help that there have been several more committers added in the recent past (myself included), so perhaps there's some more bandwidth available now. Best Erick On Thu, Jul 7, 2011 at 12:44 PM, Smiley, David W. dsmi...@mitre.org wrote: On Jul 6, 2011, at 6:44 PM, Erick Erickson wrote: In the past I've had to ping the dev list with an "include patch XYZ please" message Yeah... doesn't that strike you as a problem though? Maybe I should dig up all my issues and start bugging the dev list with "Commit this, pretty please?" messages. Or not, and they will stay in the JIRA graveyard. But I've just assigned it to myself, I'll see if I can get it committed, I'm new enough at the process that I need the practice I noticed, thanks. Best Erick On Wed, Jul 6, 2011 at 1:51 PM, Smiley, David W. dsmi...@mitre.org wrote: How do committers recommend that patch contributors (like me) get their patches committed? At the moment I'm thinking of this one: https://issues.apache.org/jira/browse/SOLR-2535 This is a regression bug. I found the bug, I added a patch which fixes the bug and tested that it was fixed. The tests are actually new tests that tested code that wasn't tested before. I put the fix version in JIRA as 3.3 at the time I did this, because it was ready to go. Well 3.3 came and went, and the version got bumped to 3.4. There are no processes in place for committers to recognize completed patches. I think that's a problem. It's very discouraging, as the contributor. I think prior to a release and ideally at other occasions, issues assigned to the next release number should actually be examined. 
Granted there are ~250 of them on the Solr side: https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SOLR+AND+resolution+%3D+Unresolved+AND+fixVersion+%3D+12316683+ORDER+BY+priority+DESC And some initial triage could separate the wheat from the chaff. ~ David Smiley
Re: Getting patches (with tests!) committed
On Thu, Jul 7, 2011 at 6:52 PM, Erick Erickson erickerick...@gmail.com wrote: Yeah, this is kind of a grey area, I think we should do what we can to encourage contributions and being better about applying patches when someone has gone through the effort of making one in the first place certainly goes in the right direction... It *may* help that there have been several more committers added in the recent past (myself included), so perhaps there's some more bandwidth available now. Hopefully!!! This is why we added and keep on adding committers: we have more work than we can handle. simon Best Erick On Thu, Jul 7, 2011 at 12:44 PM, Smiley, David W. dsmi...@mitre.org wrote: On Jul 6, 2011, at 6:44 PM, Erick Erickson wrote: In the past I've had to ping the dev list with an "include patch XYZ please" message Yeah... doesn't that strike you as a problem though? Maybe I should dig up all my issues and start bugging the dev list with "Commit this, pretty please?" messages. Or not, and they will stay in the JIRA graveyard. But I've just assigned it to myself, I'll see if I can get it committed, I'm new enough at the process that I need the practice I noticed, thanks. Best Erick On Wed, Jul 6, 2011 at 1:51 PM, Smiley, David W. dsmi...@mitre.org wrote: How do committers recommend that patch contributors (like me) get their patches committed? At the moment I'm thinking of this one: https://issues.apache.org/jira/browse/SOLR-2535 This is a regression bug. I found the bug, I added a patch which fixes the bug and tested that it was fixed. The tests are actually new tests that tested code that wasn't tested before. I put the fix version in JIRA as 3.3 at the time I did this, because it was ready to go. Well 3.3 came and went, and the version got bumped to 3.4. There are no processes in place for committers to recognize completed patches. I think that's a problem. It's very discouraging, as the contributor. 
I think prior to a release and ideally at other occasions, issues assigned to the next release number should actually be examined. Granted there are ~250 of them on the Solr side: https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SOLR+AND+resolution+%3D+Unresolved+AND+fixVersion+%3D+12316683+ORDER+BY+priority+DESC And some initial triage could separate the wheat from the chaff. ~ David Smiley
[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061439#comment-13061439 ] Michael McCandless commented on LUCENE-2793: Looks good! +1 to land it! Just a few things: * Shouldn't WindowsDirectory also call BII.bufferSize(context) and do the same Math.max it used to do? * Should VarGapTermsIndexReader should pass READONCE context down when it opens/reads the FST? Hmm, though, it should just replace the ctx passed in, ie if we are merging vs reading we want to differentiate. Let's open separate issue for this and address post merge? * Can you open an issue for this one: // TODO: context should be part of the key used to cache that reader in the pool.? This is pretty important, else you can get NRT readers with too-large buffer sizes because the readers had been opened for merging first. * Extra space in SegmentInfo.java: IOContext.READONCE ); Directory createOutput and openInput should take an IOContext - Key: LUCENE-2793 URL: https://issues.apache.org/jira/browse/LUCENE-2793 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Michael McCandless Assignee: Varun Thacker Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2793-nrt.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793_final.patch Today for merging we pass down a larger readBufferSize than for searching because we get better performance. 
I think we should generalize this to a class (IOContext), which would hold the buffer size, but then could hold other flags like DIRECT (bypass OS's buffer cache), SEQUENTIAL, etc. Then, we can make the DirectIOLinuxDirectory fully usable because we would only use DIRECT/SEQUENTIAL during merging. This will require fixing how IW pools readers, so that a reader opened for merging is not then used for searching, and vice versa. Really, it's only all the open file handles that need to be different -- we could in theory share del docs, norms, etc, if that were somehow possible.
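The core idea above: openInput/createOutput receive a context describing the access pattern, and the directory derives buffer sizes and flags from it. A minimal self-contained sketch (the constants and names are illustrative, not Lucene's actual values):

```java
public class IOContextSketch {

    /** Why the file is being opened; stands in for the proposed IOContext. */
    enum Context { READ, READONCE, MERGE }

    /** A directory would consult the context instead of taking an
     *  explicit readBufferSize parameter. */
    static int bufferSize(Context ctx) {
        switch (ctx) {
            case MERGE:
                return 4096; // larger buffer pays off for sequential merge I/O
            default:
                return 1024; // smaller buffer for search-time random access
        }
    }

    public static void main(String[] args) {
        System.out.println(bufferSize(Context.MERGE)); // 4096
        System.out.println(bufferSize(Context.READ));  // 1024
    }
}
```

This also motivates the reader-pool caveat in the comment: if the context is not part of the pool's cache key, an NRT reader can inherit merge-sized buffers.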
[jira] [Created] (SOLR-2640) Error message typo for missing field
Error message typo for missing field Key: SOLR-2640 URL: https://issues.apache.org/jira/browse/SOLR-2640 Project: Solr Issue Type: Bug Components: search Reporter: Benson Margulies 2011-07-07 13:03:16,630 [http-bio-9167-exec-6] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: Specify at least on field, function or query to group by. at org.apache.solr.search.Grouping.execute(Grouping.java:264) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2640) Error message typo for missing field
[ https://issues.apache.org/jira/browse/SOLR-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benson Margulies updated SOLR-2640: --- Attachment: SOLR-2640.patch Error message typo for missing field Key: SOLR-2640 URL: https://issues.apache.org/jira/browse/SOLR-2640 Project: Solr Issue Type: Bug Components: search Reporter: Benson Margulies Attachments: SOLR-2640.patch 2011-07-07 13:03:16,630 [http-bio-9167-exec-6] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: Specify at least on field, function or query to group by. at org.apache.solr.search.Grouping.execute(Grouping.java:264) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Getting patches (with tests!) committed
On Thu, Jul 7, 2011 at 12:44 PM, Smiley, David W. dsmi...@mitre.org wrote: On Jul 6, 2011, at 6:44 PM, Erick Erickson wrote: In the past I've had to ping the dev list with an "include patch XYZ please" message Yeah... doesn't that strike you as a problem though? Actually I think gentle nagging is an incredibly important part of open source (and life in general). The process here is not perfect -- we all have our ways of tracking TODOs, but, inevitably, often, things fall past the event horizon on anyone's TODO list, and a gentle nag / bump is very much appreciated to bring attention back, but unfortunately not done nearly often enough. That said, we of course will also forever need more committers... Mike McCandless http://blog.mikemccandless.com
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 9398 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9398/ 1 tests failed. REGRESSION: org.apache.lucene.search.TestSpanQueryFilter.testFilterWorks Error Message: docIdSet doesn't contain docId 10 Stack Trace: junit.framework.AssertionFailedError: docIdSet doesn't contain docId 10 at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1435) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1353) at org.apache.lucene.search.TestSpanQueryFilter.assertContainsDocId(TestSpanQueryFilter.java:84) at org.apache.lucene.search.TestSpanQueryFilter.testFilterWorks(TestSpanQueryFilter.java:56) Build Log (for compile errors): [...truncated 1226 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3289) FST should allow controlling how hard builder tries to share suffixes
[ https://issues.apache.org/jira/browse/LUCENE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061453#comment-13061453 ] Robert Muir commented on LUCENE-3289: - I think that's probably good for most cases? In the example you gave, it seems that FST might not be the best algorithm? The strings are extremely long (more like short documents) and probably need to be compressed in some different data structure, e.g. a word-based one? FST should allow controlling how hard builder tries to share suffixes - Key: LUCENE-3289 URL: https://issues.apache.org/jira/browse/LUCENE-3289 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.4, 4.0 Attachments: LUCENE-3289.patch, LUCENE-3289.patch Today we have a boolean option to the FST builder telling it whether it should share suffixes. If you turn this off, building is much faster, uses much less RAM, and the resulting FST is a prefix trie. But, the FST is larger than it needs to be. When it's on, the builder maintains a node hash holding every node seen so far in the FST -- this uses up RAM and slows things down. On a dataset that Elmer (see java-user thread "Autocompletion on large index" on Jul 6 2011) provided (thank you!), which is 1.32 M titles avg 67.3 chars per title, building with suffix sharing on took 22.5 seconds, required 1.25 GB heap, and produced 91.6 MB FST. With suffix sharing off, it was 8.2 seconds, 450 MB heap and 129 MB FST. I think we should allow this boolean to be shade-of-gray instead: usually, how well suffixes can share is a function of how far they are from the end of the string, so, by adding a tunable N to only share when suffix length <= N, we can let the caller make reasonable tradeoffs. 
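The tunable can be modeled like this: only node-share suffixes within N characters of the end of the string, so the node hash stays small while most of the size win (suffixes share best near the end) is kept. A self-contained sketch of that tradeoff; the parameter name shareMaxTailLength mirrors the idea in the issue but is illustrative here, not the Builder's API:

```java
public class SuffixSharing {

    /** Length of the tail that two strings could share, limited to
     *  shareMaxTailLength: beyond that depth we stop registering nodes
     *  for sharing, trading FST size for build speed and heap. */
    static int sharedTailLength(String a, String b, int shareMaxTailLength) {
        int n = 0;
        while (n < a.length() && n < b.length() && n < shareMaxTailLength
                && a.charAt(a.length() - 1 - n) == b.charAt(b.length() - 1 - n)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Common suffix of "station"/"nation" is "ation" (5 chars).
        System.out.println(sharedTailLength("station", "nation", 8)); // 5
        // With N = 3, only the last 3 chars are eligible for sharing.
        System.out.println(sharedTailLength("station", "nation", 3)); // 3
    }
}
```

Setting N to 0 degenerates to the prefix-trie case (fast, big), and N = infinity to today's full minimization (slow, small).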
[jira] [Commented] (LUCENE-2308) Separately specify a field's type
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061451#comment-13061451 ] Michael McCandless commented on LUCENE-2308: I'm seeing compilation errors with the last patch, eg: {noformat} [javac] /lucene/fieldtype/lucene/src/test/org/apache/lucene/index/TestSegmentMerger.java:53: setupDoc(org.apache.lucene.document2.Document) in org.apache.lucene.index.DocHelper cannot be applied to (org.apache.lucene.document.Document) [javac] DocHelper.setupDoc(doc1); [javac] ^ {noformat} Otherwise patch looks good! Separately specify a field's type - Key: LUCENE-2308 URL: https://issues.apache.org/jira/browse/LUCENE-2308 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Labels: gsoc2011, lucene-gsoc-11, mentor Fix For: 4.0 Attachments: LUCENE-2308-2.patch, LUCENE-2308-3.patch, LUCENE-2308-4.patch, LUCENE-2308-4.patch, LUCENE-2308-5.patch, LUCENE-2308-6.patch, LUCENE-2308.patch, LUCENE-2308.patch This came up from discussions on IRC. I'm summarizing here... Today when you make a Field to add to a document you can set things like indexed or not, stored or not, analyzed or not, details like omitTfAP, omitNorms, index term vectors (separately controlling offsets/positions), etc. I think we should factor these out into a new class (FieldType?). Then you could re-use this FieldType instance across multiple fields. The Field instance would still hold the actual value. We could then do per-field analyzers by adding a setAnalyzer on the FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise for per-field codecs (with flex), where we now have PerFieldCodecWrapper). This would NOT be a schema! It's just refactoring what we already specify today. EG it's not serialized into the index. This has been discussed before, and I know Michael Busch opened a more ambitious (I think?) issue. I think this is a good first baby step. 
We could consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold off on that for starters...
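The refactoring described above separates a field's value from its index-time options. A minimal self-contained sketch of the shape (these FieldType/Field classes are illustrative, not the eventual Lucene API):

```java
public class FieldTypeSketch {

    /** Reusable bundle of index-time options, factored out of Field. */
    static class FieldType {
        final boolean indexed;
        final boolean stored;
        FieldType(boolean indexed, boolean stored) {
            this.indexed = indexed;
            this.stored = stored;
        }
    }

    /** The Field keeps only its name and value, plus a shared type. */
    static class Field {
        final String name;
        final String value;
        final FieldType type;
        Field(String name, String value, FieldType type) {
            this.name = name;
            this.value = value;
            this.type = type;
        }
    }

    public static void main(String[] args) {
        FieldType storedIndexed = new FieldType(true, true);
        // One FieldType instance reused across many fields:
        Field title = new Field("title", "Lucene in Action", storedIndexed);
        Field body  = new Field("body", "...", storedIndexed);
        System.out.println(title.type == body.type); // true
    }
}
```

Hanging a per-field analyzer or codec off the shared FieldType (rather than a wrapper class) then falls out naturally, which is exactly the point of the issue.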
Re: Putting search-lucene.com back on l.a.o/solr
Hi Otis, I think it's most likely the case I broke this when releasing! Sorry! Not to defer the blame, but I think the confusing aspect of the solr website wrt releasing is that unlike lucene, solr doesn't have a separate versioned and unversioned site. So this causes some difficulties like having to guess release dates, commit release announcements before the RC, as well as merging difficulties across branches... I think we just need to make sure the latest (3.3) updates are merged into trunk/branch_3x and then republish the site. I'll take a look at this. On Thu, Jul 7, 2011 at 1:36 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Hi, I just noticed that over on http://lucene.apache.org/solr/ we are back to Lucid Find being the only search provider. 5 months ago we added search-lucene.com there, but now it's gone. Google Analytics shows that search-lucene.com was removed from there on June 4. This is when Lucene 3.2 was released, so I suspect the site was somehow rebuilt and published without it. Aha, I see, it looks like https://issues.apache.org/jira/browse/LUCENE-2660 was applied to trunk only and not branch_3x, and the site was built from the 3x branch. As I'm about to go on vacation, I don't want to mess up the site by reforresting it (did it locally and it looks good, but it's past 1 AM here) and publishing it, so I'll just commit stuff in Solr's src/site after applying the patch from LUCENE-2660: branch_3x/solr/src/site$ svn st ? LUCENE-2660-solr.patch M src/documentation/skins/lucene/css/screen.css M src/documentation/skins/lucene/xslt/html/site-to-xhtml.xsl It would be great if somebody could publish this. Thanks, Otis
[jira] [Commented] (LUCENE-3289) FST should allow controlling how hard builder tries to share suffixes
[ https://issues.apache.org/jira/browse/LUCENE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061456#comment-13061456 ] Michael McCandless commented on LUCENE-3289: Yeah I think costly but perfect minimization is the right default. FST should allow controlling how hard builder tries to share suffixes - Key: LUCENE-3289 URL: https://issues.apache.org/jira/browse/LUCENE-3289 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.4, 4.0 Attachments: LUCENE-3289.patch, LUCENE-3289.patch Today we have a boolean option to the FST builder telling it whether it should share suffixes. If you turn this off, building is much faster, uses much less RAM, and the resulting FST is a prefix trie. But, the FST is larger than it needs to be. When it's on, the builder maintains a node hash holding every node seen so far in the FST -- this uses up RAM and slows things down. On a dataset that Elmer (see java-user thread "Autocompletion on large index" on Jul 6 2011) provided (thank you!), which is 1.32 M titles avg 67.3 chars per title, building with suffix sharing on took 22.5 seconds, required 1.25 GB heap, and produced 91.6 MB FST. With suffix sharing off, it was 8.2 seconds, 450 MB heap and 129 MB FST. I think we should allow this boolean to be shade-of-gray instead: usually, how well suffixes can share is a function of how far they are from the end of the string, so, by adding a tunable N to only share when suffix length <= N, we can let the caller make reasonable tradeoffs.
[jira] [Updated] (SOLR-2640) Error message typo for missing field
[ https://issues.apache.org/jira/browse/SOLR-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated SOLR-2640: -- Priority: Trivial (was: Major) Affects Version/s: 4.0 3.4 3.3 Fix Version/s: 4.0 3.3 Issue Type: Test (was: Bug) Error message typo for missing field Key: SOLR-2640 URL: https://issues.apache.org/jira/browse/SOLR-2640 Project: Solr Issue Type: Test Components: search Affects Versions: 3.3, 3.4, 4.0 Reporter: Benson Margulies Assignee: Simon Willnauer Priority: Trivial Fix For: 3.3, 4.0 Attachments: SOLR-2640.patch 2011-07-07 13:03:16,630 [http-bio-9167-exec-6] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: Specify at least on field, function or query to group by. at org.apache.solr.search.Grouping.execute(Grouping.java:264) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-2640) Error message typo for missing field
[ https://issues.apache.org/jira/browse/SOLR-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer reassigned SOLR-2640: - Assignee: Simon Willnauer Error message typo for missing field Key: SOLR-2640 URL: https://issues.apache.org/jira/browse/SOLR-2640 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.3, 3.4, 4.0 Reporter: Benson Margulies Assignee: Simon Willnauer Fix For: 3.3, 4.0 Attachments: SOLR-2640.patch 2011-07-07 13:03:16,630 [http-bio-9167-exec-6] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: Specify at least on field, function or query to group by. at org.apache.solr.search.Grouping.execute(Grouping.java:264) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2640) Error message typo for missing field
[ https://issues.apache.org/jira/browse/SOLR-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved SOLR-2640. --- Resolution: Fixed committed, thanks! Error message typo for missing field Key: SOLR-2640 URL: https://issues.apache.org/jira/browse/SOLR-2640 Project: Solr Issue Type: Task Components: search Affects Versions: 3.3, 3.4, 4.0 Reporter: Benson Margulies Assignee: Simon Willnauer Priority: Trivial Fix For: 3.3, 4.0 Attachments: SOLR-2640.patch 2011-07-07 13:03:16,630 [http-bio-9167-exec-6] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: Specify at least on field, function or query to group by. at org.apache.solr.search.Grouping.execute(Grouping.java:264) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2615) Have LogUpdateProcessor log each command (add, delete, ...) at debug/FINE level
[ https://issues.apache.org/jira/browse/SOLR-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061467#comment-13061467 ] David Smiley commented on SOLR-2615: Yonik, if I instead use a doDebug boolean flag initialized in the constructor, would that sufficiently satisfy you to commit this? Have LogUpdateProcessor log each command (add, delete, ...) at debug/FINE level --- Key: SOLR-2615 URL: https://issues.apache.org/jira/browse/SOLR-2615 Project: Solr Issue Type: Improvement Components: update Reporter: David Smiley Priority: Minor Fix For: 3.4, 4.0 Attachments: SOLR-2615_LogUpdateProcessor_debug_logging.patch It would be great if the LogUpdateProcessor logged each command (add, delete, ...) at debug (Fine) level. Presently it only logs a summary of 8 commands and it does so at the very end. The attached patch implements this. * I moved the LogUpdateProcessor ahead of RunUpdateProcessor so that the debug level log happens before Solr does anything with it. It should not affect the ordering of the existing summary log which happens at finish(). * I changed UpdateRequestProcessor's static log variable to be an instance variable that uses the current class name. I think this makes much more sense since I want to be able to alter logging levels for a specific processor without doing it for all of them. This change did require me to tweak the factory's detection of the log level which avoids creating the LogUpdateProcessor. * There was an NPE bug in AddUpdateCommand.getPrintableId() in the event there is no schema unique field. I fixed that. You may notice I use SLF4J's nifty log.debug(message blah {} blah, var) syntax, which is both performant and concise as there's no point in guarding the debug message with an isDebugEnabled() since debug() will internally check this any way and there is no string concatenation if debug isn't enabled. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
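The parameterized-logging point above can be sketched without pulling in SLF4J itself. The class below is a hypothetical stand-in (not SLF4J's actual implementation) that mimics its internal level check, showing why the call site needs neither an isDebugEnabled() guard nor eager string concatenation:

```java
// Minimal stand-in for SLF4J-style parameterized logging (illustrative only).
// The level check lives inside debug(), so a disabled call costs no formatting.
public class ParamLog {
    boolean debugEnabled;
    int formats;   // counts how often a message string was actually built
    String last;   // last formatted message

    void debug(String format, Object arg) {
        if (!debugEnabled) return;   // internal check, like SLF4J's
        formats++;
        last = format.replace("{}", String.valueOf(arg));
    }

    public static void main(String[] args) {
        ParamLog log = new ParamLog();
        log.debug("add {}", "doc1");            // disabled: no string built
        if (log.formats != 0) throw new AssertionError();
        log.debugEnabled = true;
        log.debug("delete {}", 42);             // enabled: formatted lazily
        if (!"delete 42".equals(log.last)) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The caller passes the format and arguments unconditionally; the cost of building the message is only paid when the level is enabled.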
Please commit SOLR-2616 Include jdk14 logging configuration file
Please review/commit: https://issues.apache.org/jira/browse/SOLR-2616 - Include jdk14 logging configuration file ~ David
[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir - UnixDir
[ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Thacker updated LUCENE-2795: -- Attachment: LUCENE-2795.patch The open_direct and posix_fadvise functions in onlylinux.h remain the same. In onlybsd.h, open_direct and posix_fadvise are the same too, except that the O_NOATIME flag is not present. In onlyosx, open_direct is implemented in a different way. Also, I have added an open_normal function to all of the headers, which will be used in case the IOContext is not a MERGE. Genericize DirectIOLinuxDir - UnixDir -- Key: LUCENE-2795 URL: https://issues.apache.org/jira/browse/LUCENE-2795 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Michael McCandless Assignee: Varun Thacker Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2795.patch, LUCENE-2795.patch Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for IndexWriter and not IndexReader (searching). It's a trap. But, once we do LUCENE-2793, we can make it fully general purpose, because then a single native Dir impl can be used. I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory. -- This message is automatically generated by JIRA.
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061480#comment-13061480 ] Mike Sokolov commented on LUCENE-2878: -- bq. what do you mean by sweeping updates? I meant adding positions to filters would be a sweeping update. But it sounds as if the idea of rewriting differently is a better approach (certainly much less change). bq. For highlighting I think we should also go a two stage approach. I think I agree. The only possible trade-off that goes the other way is in the case where you have the positions available already during initial search/scoring, and there is not too much turnover in the TopDocs priority queue during hit collection. Then a Highlighter might save some time by not re-scoring and re-iterating the positions if it accumulated them up front (even for docs that were eventually dropped off the queue). I think it should be possible to test out both approaches given the right API here though? The callback idea sounds appealing, but I still think we should also consider enabling the top-down approach: especially if this is going to run in two passes, why not let the highlighter drive the iteration? Keep in mind that positions consumers (like highlighters) may possibly be interested in more than just the lowest-level positions (they may want to see phrases, eg, and near-clauses - trying to avoid the s-word). Another consideration is ordering. I think (?) that positions are retrieved from the index in document order. This could be a natural order for many cases, but score order will also be useful. I'm not sure whose responsibility the sorting should be. Highlighters will want to be able to optimize their work (esp for very large documents) by terminating after considering only the first N matches, where the ordering could either be score or document-order. I'm glad you will create a branch - this patch is getting a bit unwieldy. 
I think the PosHighlighter code should probably (?) end up as test code only - I guess we'll see. It seems like we could get further faster using the existing Highlighter, with a positions-based TokenStream; I'll post a patch once the branch is in place. Allow Scorer to expose positions and payloads aka. nuke spans -- Key: LUCENE-2878 URL: https://issues.apache.org/jira/browse/LUCENE-2878 Project: Lucene - Java Issue Type: Improvement Components: core/search Affects Versions: Bulk Postings branch Reporter: Simon Willnauer Assignee: Simon Willnauer Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch Currently we have two somewhat separate types of queries: the ones which can make use of positions (mainly spans) and payloads (spans). Yet Span*Query doesn't really do scoring comparable to what other queries do, and at the end of the day they are duplicating a lot of code all over Lucene. Span*Queries are also limited to other Span*Query instances, such that you can not use a TermQuery or a BooleanQuery with SpanNear or anything like that. Besides the Span*Query limitation, other queries lack a quite interesting feature: they can not score based on term proximity, since scorers don't expose any positional information. All those problems bugged me for a while now, so I started working on that using the bulkpostings API. I would have done that first cut on trunk, but TermScorer there works on a BlockReader that does not expose positions, while the one in this branch does. I started adding a new Positions class which users can pull from a scorer; to prevent unnecessary positions enums I added ScorerContext#needsPositions and eventually Scorer#needsPayloads to create the corresponding enum on demand.
Yet, currently only TermQuery / TermScorer implements this API, and others simply return null instead. To show that the API really works and our BulkPostings work fine too with positions, I cut over TermSpanQuery to use a TermScorer under the hood and nuked TermSpans entirely. A nice side effect of this was that the Position BulkReading implementation got some exercise, and it now all works with positions :), while Payloads for bulk reading are kind of experimental in the patch and only work with the Standard codec. So all spans now work on top of TermScorer ( I truly hate spans since today ) including the ones that need Payloads (StandardCodec ONLY)!! I didn't bother to implement the other codecs yet, since I want to get feedback on the API and on this first cut before I go on with it. I
[jira] [Updated] (LUCENE-3233) HuperDuperSynonymsFilter™
[ https://issues.apache.org/jira/browse/LUCENE-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3233: --- Attachment: LUCENE-3233.patch Another rev of the patch: I did a hard bump of the FST version (so existing trunk indices must be rebuilt), added a NOTE in suggest's FST impl that the file format is experimental, removed maxVerticalContext, and fixed a false test failure. HuperDuperSynonymsFilter™ - Key: LUCENE-3233 URL: https://issues.apache.org/jira/browse/LUCENE-3233 Project: Lucene - Java Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-3223.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, synonyms.zip The current synonyms filter uses a lot of RAM and CPU, especially at build time. I think yesterday I heard about huge synonyms files three times. So, I think we should use an FST-based structure, sharing the inputs and outputs. And we should be more efficient with the TokenStream API, e.g. using save/restoreState instead of cloneAttributes(). -- This message is automatically generated by JIRA.
Re: Putting search-lucene.com back on l.a.o/solr
: Not to defer the blame, but I think the confusing aspect of the solr : website wrt releasing is that unlike lucene, solr doesnt have a : separate versioned and unversioned site. So this causes some yeah .. at one point we started looking into making this change to be consistent with ./java, but then there was the push to merge development, and reducing the sub-projects in general, which led to a discussion about moving all of the unversioned parts of the site into the existing directory for the TLP pages (so there would only be one set of forrest docs for the entire website), and then the new Apache CMS came out and grant started looking into that instead of wasting effort merging the forrest docs. It's kind of a cluster fuck now. -Hoss
[jira] [Commented] (LUCENE-3289) FST should allow controlling how hard builder tries to share suffixes
[ https://issues.apache.org/jira/browse/LUCENE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061512#comment-13061512 ] Dawid Weiss commented on LUCENE-3289: - Exactly. This is a very specific use case (long suggestions). FST should allow controlling how hard builder tries to share suffixes - Key: LUCENE-3289 URL: https://issues.apache.org/jira/browse/LUCENE-3289 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.4, 4.0 Attachments: LUCENE-3289.patch, LUCENE-3289.patch Today we have a boolean option to the FST builder telling it whether it should share suffixes. If you turn this off, building is much faster, uses much less RAM, and the resulting FST is a prefix trie. But, the FST is larger than it needs to be. When it's on, the builder maintains a node hash holding every node seen so far in the FST -- this uses up RAM and slows things down. On a dataset that Elmer (see java-user thread Autocompletion on large index on Jul 6 2011) provided (thank you!), which is 1.32 M titles avg 67.3 chars per title, building with suffix sharing on took 22.5 seconds, required 1.25 GB heap, and produced a 91.6 MB FST. With suffix sharing off, it was 8.2 seconds, 450 MB heap and a 129 MB FST. I think we should allow this boolean to be a shade of gray instead: usually, how well suffixes can share is a function of how far they are from the end of the string, so, by adding a tunable N to only share when suffix length < N, we can let the caller make reasonable tradeoffs. -- This message is automatically generated by JIRA.
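The tunable-N idea can be sketched outside Lucene. The toy builder below is hypothetical (not the real FST Builder): it dedups suffixes only up to a length limit, making the tradeoff visible - full sharing yields fewer total nodes but a larger dedup hash, limited sharing yields a smaller hash (less build RAM) at the cost of duplicated nodes (a larger result).

```java
import java.util.*;

// Illustrative sketch of limited suffix sharing (hypothetical, not Lucene's
// FST Builder): suffixes are shared (deduped) only when shorter than a
// tunable limit. Returns {total nodes created, dedup-hash size}.
public class SuffixShareSketch {
    static int[] build(List<String> words, int maxSharedLen) {
        Set<String> shared = new HashSet<>(); // stands in for the node hash
        int nodes = 0;
        for (String w : words) {
            for (int i = 0; i < w.length(); i++) {
                String suffix = w.substring(i);
                if (suffix.length() <= maxSharedLen) {
                    if (shared.add(suffix)) nodes++; // first sighting: new node
                    // already in the hash: reuse, no new node
                } else {
                    nodes++; // too long to share: always a fresh node
                }
            }
        }
        return new int[] { nodes, shared.size() };
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("faceting", "hinting", "painting");
        int[] full = build(words, Integer.MAX_VALUE); // share everything
        int[] limited = build(words, 3);              // share only short suffixes
        // Full sharing: no more nodes than limited; limited: no bigger hash.
        if (full[0] > limited[0]) throw new AssertionError("node count");
        if (full[1] < limited[1]) throw new AssertionError("hash size");
        System.out.println("ok");
    }
}
```

Real FST construction shares frozen automaton states rather than raw strings, but the RAM-vs-size tradeoff the issue describes has the same shape.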
[jira] [Commented] (SOLR-2500) TestSolrProperties sometimes fails with no such core: core0
[ https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061540#comment-13061540 ] Steven Rowe commented on SOLR-2500: --- On Windows 7 using Oracle JDK 1.6.0_21, {{TestSolrProperties#testProperties()}} is consistently failing for me, both individually and with all Solr tests, in both Ant and IntelliJ:
{quote}
java.lang.AssertionError: Failed to delete C:\svn\lucene\dev\trunk\solr\build\tests\solr\shared\solr-persist.xml
at org.junit.Assert.fail(Assert.java:91)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.apache.solr.client.solrj.embedded.TestSolrProperties.tearDown(TestSolrProperties.java:107)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:37)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1430)
at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1348)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
{quote}
The failure is in TestSolrProperties.tearDown():
{code:java}
107: assertTrue("Failed to delete " + persistedFile, persistedFile.delete());
{code}
TestSolrProperties sometimes fails with no such core: core0 - Key: SOLR-2500 URL: https://issues.apache.org/jira/browse/SOLR-2500 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir Assignee: Doron Cohen Fix For: 3.2, 4.0 Attachments: SOLR-2500.patch, SOLR-2500.patch, SOLR-2500.patch, solr-after-1st-run.xml, solr-clean.xml
[junit] Testsuite: org.apache.solr.client.solrj.embedded.TestSolrProperties
[junit] Testcase: testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): Caused an ERROR
[junit] No such core: core0
[junit] org.apache.solr.common.SolrException: No such core: core0
[junit] at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
[junit] at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
[junit] at org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
[junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
[junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (SOLR-2500) TestSolrProperties sometimes fails with no such core: core0
[ https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061549#comment-13061549 ] Steven Rowe commented on SOLR-2500: --- I can get this test to succeed consistently by calling System.gc() prior to the attempt to delete the file. Any objections to adding this?
[jira] [Reopened] (SOLR-2500) TestSolrProperties sometimes fails with no such core: core0
[ https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe reopened SOLR-2500: --- Assignee: Steven Rowe (was: Doron Cohen) Reopening to address the Windows test failure.
[jira] [Commented] (SOLR-2500) TestSolrProperties sometimes fails with no such core: core0
[ https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061557#comment-13061557 ] Robert Muir commented on SOLR-2500: --- seems like calling gc() is just masking the problem? we should hunt down which finalizer is closing the file and explicitly close instead / fix the leak?
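The direction Robert suggests, closing the stream explicitly instead of relying on a finalizer, might look like the minimal sketch below. The persist() method is a hypothetical stand-in, not Solr's actual code; the point is that with try-with-resources the handle is released deterministically, so a subsequent delete no longer depends on GC timing (on Windows an open handle blocks File.delete(), which is why System.gc() "fixes" the leak).

```java
import java.io.*;

// Sketch of the fix direction for SOLR-2500 (hypothetical persist() helper):
// close the writer deterministically so the file can be deleted immediately,
// with no finalizer or System.gc() involved.
public class ExplicitClose {
    static void persist(File f, String xml) throws IOException {
        try (Writer w = new FileWriter(f)) { // closed on scope exit, always
            w.write(xml);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("solr-persist", ".xml");
        persist(f, "<solr/>");
        // No GC needed: the handle was already released by close().
        if (!f.delete()) throw new AssertionError("delete failed");
        System.out.println("ok");
    }
}
```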
[jira] [Resolved] (SOLR-2538) Math overflow in LongRangeEndpointCalculator and DoubleRangeEndpointCalculator
[ https://issues.apache.org/jira/browse/SOLR-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-2538. Resolution: Fixed Fix Version/s: 4.0 3.4 Assignee: Hoss Man Erbi: thanks for catching this. looks like a cut/paste error, but i went ahead and added a test to reduce the risk of future regression. Committed revision 1144014. - trunk Committed revision 1144016. - 3x Math overflow in LongRangeEndpointCalculator and DoubleRangeEndpointCalculator --- Key: SOLR-2538 URL: https://issues.apache.org/jira/browse/SOLR-2538 Project: Solr Issue Type: Bug Affects Versions: 3.1 Environment: AMD64+Ubuntu 10.10 Reporter: Erbi Hanka Assignee: Hoss Man Fix For: 3.4, 4.0 In the classes LongRangeEndpointCalculator and DoubleRangeEndpointCalculator, in the method parseAndAddGap, there is a loss of precision:
private static class DoubleRangeEndpointCalculator
    extends RangeEndpointCalculator<Double> {

  public DoubleRangeEndpointCalculator(final SchemaField f) { super(f); }
  @Override
  protected Double parseVal(String rawval) {
    return Double.valueOf(rawval);
  }
  @Override
  public Double parseAndAddGap(Double value, String gap) {
    return new Double(value.floatValue() + Double.valueOf(gap).floatValue()); // <-- narrows to float
  }
  [..]
private static class LongRangeEndpointCalculator
    extends RangeEndpointCalculator<Long> {

  public LongRangeEndpointCalculator(final SchemaField f) { super(f); }
  @Override
  protected Long parseVal(String rawval) {
    return Long.valueOf(rawval);
  }
  @Override
  public Long parseAndAddGap(Long value, String gap) {
    return new Long(value.intValue() + Long.valueOf(gap).intValue()); // <-- narrows to int
  }
}
As a result, the following code detects a data overflow, because the long number is being treated as an integer:
while (low.compareTo(end) < 0) {
  T high = calc.addGap(low, gap);
  if (end.compareTo(high) < 0) {
    if (params.getFieldBool(f, FacetParams.FACET_RANGE_HARD_END, false)) {
      high = end;
    } else {
      end = high;
    }
  }
  if (high.compareTo(low) < 0) {
    throw new SolrException
      (SolrException.ErrorCode.BAD_REQUEST,
       "range facet infinite loop (is gap negative? did the math overflow?)");
  }
Replacing 'intValue()' with 'longValue()' and 'floatValue()' with 'doubleValue()' should work. We detected this bug when faceting with very large start and end values. We have tested edge values (the transition from 32 to 64 bits): any value below the threshold works fine, and any value greater than 2^32 doesn't. We have not tested the 'double' version, but it seems it can suffer from the same problem.
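The overflow described above is easy to reproduce in isolation. The sketch below uses plain values and stand-in methods (not Solr's actual calculator classes) to show how the intValue() narrowing makes the computed "high" endpoint wrap around below "low", which is exactly what the infinite-loop check catches:

```java
// Demonstrates the SOLR-2538 overflow: narrowing a Long to int before adding
// makes large range-facet endpoints wrap around, so "high" can land below
// "low" (triggering the "infinite loop" SolrException in the facet code).
public class RangeOverflowDemo {
    // buggy shape, as in LongRangeEndpointCalculator.parseAndAddGap
    static Long addGapBuggy(Long value, String gap) {
        return new Long(value.intValue() + Long.valueOf(gap).intValue());
    }

    // fixed shape: stay in 64 bits the whole way
    static Long addGapFixed(Long value, String gap) {
        return Long.valueOf(value.longValue() + Long.parseLong(gap));
    }

    public static void main(String[] args) {
        Long low = 5_000_000_000L;  // > 2^32, so intValue() truncates it
        Long buggy = addGapBuggy(low, "1000");
        Long fixed = addGapFixed(low, "1000");
        // The buggy endpoint wrapped around below the start of the range.
        if (buggy.compareTo(low) >= 0) throw new AssertionError("expected wraparound");
        // The fixed endpoint is start + gap, as intended.
        if (fixed != 5_000_001_000L) throw new AssertionError();
        System.out.println("ok");
    }
}
```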
[jira] [Commented] (SOLR-2500) TestSolrProperties sometimes fails with no such core: core0
[ https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061580#comment-13061580 ] Steven Rowe commented on SOLR-2500: --- bq. seems like calling gc() is just masking the problem? we should hunt down which finalizer is closing the file and explicitly close instead / fix the leak? I agree. I tracked down the actual file activity to SolrXMLSerializer.persistFile() - this class was created as part of SOLR-2331, which Mark M. committed 2 days ago; the timing makes it the likely culprit.
[jira] [Created] (SOLR-2641) Auto Facet Selection component
Auto Facet Selection component -- Key: SOLR-2641 URL: https://issues.apache.org/jira/browse/SOLR-2641 Project: Solr Issue Type: Improvement Components: SearchComponents - other Reporter: Erik Hatcher Assignee: Erik Hatcher Priority: Minor It sure would be nice if you could have Solr automatically select field(s) for faceting based dynamically off the profile of the results. For example, you're indexing disparate types of products, all with varying attributes (color, size - like for apparel, memory_size - for electronics, subject - for books, etc), and a user searches for ipod where most products match products with color and memory_size attributes... let's automatically facet on those fields. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2641) Auto Facet Selection component
[ https://issues.apache.org/jira/browse/SOLR-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-2641: --- Attachment: SOLR_2641.patch Basic implementation of a search component, to be placed after the query component and before the facet component, that keys off a "fields used" field (see SOLR-1280 for how this can be created automatically too), selects the top N fields, and sets those as facet.field's automatically.
[jira] [Commented] (SOLR-2641) Auto Facet Selection component
[ https://issues.apache.org/jira/browse/SOLR-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061584#comment-13061584 ] Erik Hatcher commented on SOLR-2641: There's loads of room for improvement here, and likely there are better ways to go about even the simple stuff I've done in this initial patch. Some ideas for improvement: pluggable implementations to determine the best facets to auto-select given the current request and results; the ability to tailor the parameters for each field selected for faceting (should facets be sorted by index or count order? mincount? limit? how to determine these for each field?).
[jira] [Commented] (SOLR-2641) Auto Facet Selection component
[ https://issues.apache.org/jira/browse/SOLR-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061588#comment-13061588 ] Erik Hatcher commented on SOLR-2641: What's needed for this type of thing to do the right thing with distributed search? The delegating server will need to cull together the counts (in this current implementation) to determine the best field(s) to facet on before distributing those requests, to ensure each shard is faceting on the same field(s). Not sure, yet, how to go about that.
[jira] [Resolved] (SOLR-2230) solrj: submitting more than one stream/file via CommonsHttpSolrServer fails
[ https://issues.apache.org/jira/browse/SOLR-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-2230. Resolution: Fixed Fix Version/s: 4.0 3.4 Assignee: Hoss Man Although CommonsHttpSolrServer's code for dealing with multiple streams had changed significantly since Stephan posted his patch, a simple test verified that multiple addFile calls did not work. I've committed some improved tests, along with a massaged version of Stephan's test. Committed revision 1144038. - trunk Committed revision 1144041. - 3x solrj: submitting more than one stream/file via CommonsHttpSolrServer fails --- Key: SOLR-2230 URL: https://issues.apache.org/jira/browse/SOLR-2230 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4.1 Reporter: Stephan GĂĽnther Assignee: Hoss Man Fix For: 3.4, 4.0 Attachments: 0001-solrj-fix-submitting-more-that-one-stream-via-multip.patch If you are using an HTTP client (CommonsHttpSolrServer) to connect to Solr, you are unable to push more than one File/Stream over the wire.
For example, if you call ContentStreamUpdateRequest.addContentStream()/.addFile() twice to index both files via Tika, you get the following exception at your Solr server:

15:48:59 [ERROR] http-8983-1 [org.apache.solr.core.SolrCore] - org.apache.solr.common.SolrException: missing content stream
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:49)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:619)

Seems that the POST body sent by CommonsHttpSolrServer is not correct. If you push only one file, everything works as expected.
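For context, a well-formed multipart/form-data request body frames each file as its own part, delimited by the boundary and carrying its own Content-Disposition header; the bug above amounts to the client producing a body where the second stream is not framed this way, so the server sees a "missing content stream". A stdlib-only sketch of the expected framing — boundary and part names are illustrative, and this is not SolrJ's actual code:

```java
// Illustrative sketch: frame multiple file parts in one multipart/form-data
// body. Each part gets its own boundary line and Content-Disposition header,
// and the body ends with a closing boundary ("--<boundary>--").
public class MultipartSketch {
    public static String buildBody(String boundary, String[] names, String[] contents) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < names.length; i++) {
            sb.append("--").append(boundary).append("\r\n");
            sb.append("Content-Disposition: form-data; name=\"").append(names[i])
              .append("\"; filename=\"").append(names[i]).append("\"\r\n\r\n");
            sb.append(contents[i]).append("\r\n");
        }
        sb.append("--").append(boundary).append("--\r\n"); // closing boundary
        return sb.toString();
    }
}
```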
[Lucene.Net] [jira] [Commented] (LUCENENET-172) This patch fixes the unexceptional exceptions encountered in FastCharStream and SupportClass
[ https://issues.apache.org/jira/browse/LUCENENET-172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061595#comment-13061595 ]

Digy commented on LUCENENET-172:

Already fixed for 2.9.4g

This patch fixes the unexceptional exceptions encountered in FastCharStream and SupportClass
---
Key: LUCENENET-172
URL: https://issues.apache.org/jira/browse/LUCENENET-172
Project: Lucene.Net
Issue Type: Improvement
Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.3.1, Lucene.Net 2.3.2
Reporter: Ben Martz
Assignee: Scott Lombard
Priority: Minor
Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Attachments: lucene_2.3.1_exceptions_fix.patch, lucene_2.9.4g_exceptions_fix

The Java version of Lucene handles end-of-file in FastCharStream by throwing an exception. This behavior has been ported to .NET, but it carries an unacceptable cost in the .NET environment. This patch is based on the prior work in LUCENENET-8 and LUCENENET-11, which I gratefully acknowledge for the solution. While I understand that this patch is outside the current project specification, in that it deviates from the pure nature of the port, I believe it is very important to make the patch available to any developer looking to leverage Lucene.Net in their project. Thanks for your consideration.
[Lucene.Net] [jira] [Updated] (LUCENENET-172) This patch fixes the unexceptional exceptions encountered in FastCharStream and SupportClass
[ https://issues.apache.org/jira/browse/LUCENENET-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-172:
---
Fix Version/s: Lucene.Net 2.9.4g

This patch fixes the unexceptional exceptions encountered in FastCharStream and SupportClass
---
Key: LUCENENET-172
URL: https://issues.apache.org/jira/browse/LUCENENET-172
Project: Lucene.Net
Issue Type: Improvement
Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.3.1, Lucene.Net 2.3.2
Reporter: Ben Martz
Assignee: Scott Lombard
Priority: Minor
Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Attachments: lucene_2.3.1_exceptions_fix.patch, lucene_2.9.4g_exceptions_fix

The Java version of Lucene handles end-of-file in FastCharStream by throwing an exception. This behavior has been ported to .NET, but it carries an unacceptable cost in the .NET environment. This patch is based on the prior work in LUCENENET-8 and LUCENENET-11, which I gratefully acknowledge for the solution. While I understand that this patch is outside the current project specification, in that it deviates from the pure nature of the port, I believe it is very important to make the patch available to any developer looking to leverage Lucene.Net in their project. Thanks for your consideration.
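To make the cost argument concrete, here is a hedged Java sketch of the two end-of-file conventions involved: the FastCharStream style, which signals EOF by throwing an exception, versus the sentinel style the patch moves toward. The method names are invented for illustration and are not Lucene or Lucene.Net APIs; the point is that in .NET, constructing an exception on every stream exhaustion is what carries the cost the reporter describes:

```java
import java.io.*;

// Two ways to signal end-of-file. Both count the characters in a reader;
// only the EOF convention differs. Method names are illustrative.
public class EofStyles {
    // FastCharStream style: throw when the stream is exhausted. Cheap enough
    // on the JVM, but exception construction is expensive in .NET.
    public static int countByException(Reader r) throws IOException {
        int n = 0;
        try {
            while (true) {
                if (r.read() == -1) throw new IOException("read past eof");
                n++;
            }
        } catch (IOException eof) {
            return n; // EOF reached via the exception path
        }
    }

    // Patched style: return a sentinel (-1) and let the caller test for it.
    public static int countBySentinel(Reader r) throws IOException {
        int n = 0;
        while (r.read() != -1) n++;
        return n;
    }
}
```

Both methods produce the same answer; only the control flow at EOF differs.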
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9415 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9415/

No tests ran.

Build Log (for compile errors):
[...truncated 2658 lines...]
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/response/DocumentAnalysisResponse.java:53: warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: org.apache.solr.common.util.NamedList<java.lang.Object>
[javac]     NamedList<Object> query = (NamedList<Object>) field.get("query");
[javac]     ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/response/DocumentAnalysisResponse.java:59: warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: org.apache.solr.common.util.NamedList<java.lang.Object>
[javac]     NamedList<Object> index = (NamedList<Object>) field.get("index");
[javac]     ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/response/DocumentAnalysisResponse.java:62: warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: org.apache.solr.common.util.NamedList<java.lang.Object>
[javac]     NamedList<Object> valueNL = (NamedList<Object>) valueEntry.getValue();
[javac]     ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/response/FieldAnalysisResponse.java:47: warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: org.apache.solr.common.util.NamedList<java.lang.Object>
[javac]     NamedList<Object> fieldTypesNL = (NamedList<Object>) analysisNL.get("field_types");
[javac]     ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/response/FieldAnalysisResponse.java:51: warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: org.apache.solr.common.util.NamedList<java.lang.Object>
[javac]     NamedList<Object> queryNL = (NamedList<Object>) fieldTypeNL.get("query");
[javac]     ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/response/FieldAnalysisResponse.java:54: warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: org.apache.solr.common.util.NamedList<java.lang.Object>
[javac]     NamedList<Object> indexNL = (NamedList<Object>) fieldTypeNL.get("index");
[javac]     ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/response/FieldAnalysisResponse.java:61: warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: org.apache.solr.common.util.NamedList<java.lang.Object>
[javac]     NamedList<Object> fieldNamesNL = (NamedList<Object>) analysisNL.get("field_names");
[javac]     ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/response/FieldAnalysisResponse.java:65: warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: org.apache.solr.common.util.NamedList<java.lang.Object>
[javac]     NamedList<Object> queryNL = (NamedList<Object>) fieldNameNL.get("query");
[javac]     ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/response/FieldAnalysisResponse.java:68: warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: org.apache.solr.common.util.NamedList<java.lang.Object>
[javac]     NamedList<Object> indexNL = (NamedList<Object>) fieldNameNL.get("index");
[javac]     ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/request/JavaBinUpdateRequestCodec.java:54: warning: [unchecked] unchecked call to add(java.lang.String,T) as a member of the raw type org.apache.solr.common.util.NamedList
[javac]     params.add(commitWithin,
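The [unchecked] warnings in the log stem from Java's type erasure: a cast from Object to a parameterized type like NamedList&lt;Object&gt; cannot be verified at runtime, so javac can only warn about it. A minimal stdlib illustration of the same pattern, using Map in place of Solr's NamedList:

```java
import java.util.*;

// Illustration of the warning in the build log above: the cast from Object
// to a parameterized type is unverifiable at runtime because the type
// argument is erased, so javac emits "[unchecked] unchecked cast".
public class UncheckedCastDemo {
    @SuppressWarnings("unchecked") // erasure makes this cast unverifiable
    public static Map<String, Object> asNamedList(Object o) {
        return (Map<String, Object>) o;
    }
}
```

Suppressing (or parameterizing the source types so no raw cast is needed) is how such warnings are typically silenced; they are warnings, not errors, and did not cause this build failure by themselves.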
Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 9415 - Failure
Hmmm, not sure what I fucked up here...

[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/impl/CommonsHttpSolrServer.java:328: method does not override a method from its superclass
[javac]     @Override
[javac]     ^
[javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/src/solrj/org/apache/solr/client/solrj/impl/CommonsHttpSolrServer.java:337: method does not override a method from its superclass
[javac]     @Override
[javac]     ^

...I'm not seeing this locally ... investigating.

-Hoss
[jira] [Updated] (SOLR-2331) Refactor CoreContainer's SolrXML serialization code and improve testing
[ https://issues.apache.org/jira/browse/SOLR-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Rowe updated SOLR-2331:
---
Attachment: SOLR-2331-fix-windows-file-deletion-failure.patch

I reopened SOLR-2500 because TestSolrProperties is failing consistently on Windows 7/Oracle JDK 1.6.0_21 for me, but it appears that this is the issue that introduced the problem. I've tracked the issue down to the anonymous {{FileInputStream}} created in order to print out the contents of the persisted core configuration to STDOUT -- the following line was uncommented when Mark committed the patch on this issue:

{code:java}
206: System.out.println(IOUtils.toString(new FileInputStream(new File(solrXml.getParent(), "solr-persist.xml"))));
{code}

This patch de-anonymizes the {{FileInputStream}} and closes it after the file contents are printed out. I plan to commit this later tonight.

Refactor CoreContainer's SolrXML serialization code and improve testing
---
Key: SOLR-2331
URL: https://issues.apache.org/jira/browse/SOLR-2331
Project: Solr
Issue Type: Improvement
Components: multicore
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2331-fix-windows-file-deletion-failure.patch, SOLR-2331.patch

CoreContainer has enough code in it - I'd like to factor out the solr.xml serialization code into SolrXMLSerializer or something - which should make testing it much easier and lightweight.
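The fix described — naming the stream so it can be closed once the contents are read — can be sketched as follows. This is an illustrative reimplementation, not the committed patch; the point is that closing the stream releases the file handle, without which Windows refuses to delete the file (the failure observed in SOLR-2500):

```java
import java.io.*;

// Hedged sketch of the fix: de-anonymize the FileInputStream so it can be
// closed after the file contents are read. Names here are illustrative.
public class PrintAndClose {
    public static String readAll(File f) throws IOException {
        FileInputStream in = new FileInputStream(f); // named, not anonymous
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toString("UTF-8");
        } finally {
            in.close(); // releases the handle; Windows can now delete the file
        }
    }
}
```

The anonymous form, `IOUtils.toString(new FileInputStream(...))`, leaves the stream for the garbage collector to close at some unspecified later time, which is exactly why a subsequent delete of the file fails on Windows.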
[jira] [Resolved] (SOLR-2500) TestSolrProperties sometimes fails with no such core: core0
[ https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Rowe resolved SOLR-2500.
---
Resolution: Fixed
Assignee: Doron Cohen (was: Steven Rowe)

I attached a patch with a fix to SOLR-2331, which introduced the problem.

TestSolrProperties sometimes fails with no such core: core0
-
Key: SOLR-2500
URL: https://issues.apache.org/jira/browse/SOLR-2500
Project: Solr
Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Doron Cohen
Fix For: 4.0, 3.2
Attachments: SOLR-2500.patch, SOLR-2500.patch, SOLR-2500.patch, solr-after-1st-run.xml, solr-clean.xml

[junit] Testsuite: org.apache.solr.client.solrj.embedded.TestSolrProperties
[junit] Testcase: testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): Caused an ERROR
[junit] No such core: core0
[junit] org.apache.solr.common.SolrException: No such core: core0
[junit] at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
[junit] at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
[junit] at org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
[junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
[junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)
[jira] [Updated] (SOLR-2331) Refactor CoreContainer's SolrXML serialization code and improve testing
[ https://issues.apache.org/jira/browse/SOLR-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Rowe updated SOLR-2331:
---
Attachment: SOLR-2331-fix-windows-file-deletion-failure.patch

This version of the patch wraps the persisted core config printing to STDOUT in an {{if (VERBOSE)}} block. Committing shortly.

Refactor CoreContainer's SolrXML serialization code and improve testing
---
Key: SOLR-2331
URL: https://issues.apache.org/jira/browse/SOLR-2331
Project: Solr
Issue Type: Improvement
Components: multicore
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2331-fix-windows-file-deletion-failure.patch, SOLR-2331-fix-windows-file-deletion-failure.patch, SOLR-2331.patch

CoreContainer has enough code in it - I'd like to factor out the solr.xml serialization code into SolrXMLSerializer or something - which should make testing it much easier and lightweight.
[jira] [Commented] (SOLR-2331) Refactor CoreContainer's SolrXML serialization code and improve testing
[ https://issues.apache.org/jira/browse/SOLR-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061649#comment-13061649 ]

Steven Rowe commented on SOLR-2331:
---
bq. This patch de-anonymizes the FileInputStream and closes it after the file contents are printed out [and] wraps the persisted core config printing to STDOUT in an {{if (VERBOSE)}} block.

Committed in r1144088.

Refactor CoreContainer's SolrXML serialization code and improve testing
---
Key: SOLR-2331
URL: https://issues.apache.org/jira/browse/SOLR-2331
Project: Solr
Issue Type: Improvement
Components: multicore
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2331-fix-windows-file-deletion-failure.patch, SOLR-2331-fix-windows-file-deletion-failure.patch, SOLR-2331.patch

CoreContainer has enough code in it - I'd like to factor out the solr.xml serialization code into SolrXMLSerializer or something - which should make testing it much easier and lightweight.