[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061785#comment-13061785 ]

Bill Bell commented on SOLR-2242:

Are we ready to commit?

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Simon Willnauer
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch

When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.

The feature is called namedistinct. Here is an example:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price

This currently only works on facet.field.

{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int>
    <int name="11.5">1</int>
    <int name="19.95">1</int>
    <int name="74.99">1</int>
    <int name="92.0">1</int>
    <int name="179.99">1</int>
    <int name="185.0">1</int>
    <int name="279.95">1</int>
    <int name="329.95">1</int>
    <int name="350.0">1</int>
    <int name="399.0">1</int>
    <int name="479.95">1</int>
    <int name="649.99">1</int>
    <int name="2199.0">1</int>
  </lst>
</lst>
{code}

Several people use this to get the group.field count (the # of groups).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
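The distinct-count idea in SOLR-2242 can be illustrated outside Solr. The following is only a sketch under stated assumptions, not code from the patch: the class name, the method, and the sample counts are hypothetical. It counts how many facet values survive a mincount filter, which is the number the issue wants returned alongside the facet list.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; this is not the SOLR-2242 patch. */
final class NumFacetTermsSketch {

    /** Counts the distinct facet values whose document count is at least minCount. */
    static int countDistinctTerms(Map<String, Integer> facetCounts, int minCount) {
        int distinct = 0;
        for (int count : facetCounts.values()) {
            if (count >= minCount) {
                distinct++;
            }
        }
        return distinct;
    }

    /** Made-up facet counts for a "price" field (value -> number of matching docs). */
    static Map<String, Integer> sampleFacet() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("0.0", 3);
        counts.put("11.5", 1);
        counts.put("19.95", 1);
        counts.put("74.99", 0); // no matching docs; dropped when minCount is 1
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countDistinctTerms(sampleFacet(), 1));
    }
}
```

With mincount=1 the zero-count value is excluded, which is why the issue recommends combining the feature with limit=-1 and mincount=1.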
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061788#comment-13061788 ]

Jonathan Rochkind commented on SOLR-2242:

I am out of the office on vacation, I will return Monday July 11. I will not be checking email. For urgent Systems Department business, please contact Mercy Anaba, man...@jhu.edu,(410) 516-5306.
[jira] [Commented] (SOLR-1825) SolrQuery.addFacetQuery should call setFacet(true)
[ https://issues.apache.org/jira/browse/SOLR-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061787#comment-13061787 ]

Simon Willnauer commented on SOLR-1825:

+1 - shoot for it

SolrQuery.addFacetQuery should call setFacet(true)

Key: SOLR-1825
URL: https://issues.apache.org/jira/browse/SOLR-1825
Project: Solr
Issue Type: Bug
Components: clients - java
Affects Versions: 1.5
Reporter: David Smiley
Assignee: Chris Male
Priority: Trivial
Attachments: SOLR-1825.patch, solr1825.patch

Note that solrQuery.addFacetField(name) does enable faceting automatically but addFacetQuery does not. This is inconsistent.
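The inconsistency reported in SOLR-1825 is easy to show with a stand-in class. The sketch below is a hypothetical miniature, not SolrJ's SolrQuery; only the method names addFacetField, addFacetQuery and the idea of setFacet(true) come from the issue. It shows the requested behaviour, where both methods switch faceting on.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a query builder; this is not SolrJ's SolrQuery. */
final class FacetQuerySketch {
    private boolean facet;
    private final List<String> facetFields = new ArrayList<>();
    private final List<String> facetQueries = new ArrayList<>();

    FacetQuerySketch addFacetField(String field) {
        facetFields.add(field);
        facet = true; // already the case according to the issue
        return this;
    }

    FacetQuerySketch addFacetQuery(String query) {
        facetQueries.add(query);
        facet = true; // the behaviour SOLR-1825 asks for
        return this;
    }

    boolean isFacetEnabled() {
        return facet;
    }

    public static void main(String[] args) {
        System.out.println(new FacetQuerySketch().addFacetQuery("price:[0 TO 100]").isFacetEnabled());
    }
}
```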
[jira] [Commented] (SOLR-2616) Include jdk14 logging configuration file
[ https://issues.apache.org/jira/browse/SOLR-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061791#comment-13061791 ]

Bill Bell commented on SOLR-2616:

+1 please!!

Include jdk14 logging configuration file

Key: SOLR-2616
URL: https://issues.apache.org/jira/browse/SOLR-2616
Project: Solr
Issue Type: Improvement
Reporter: David Smiley
Assignee: Mark Miller
Priority: Minor
Fix For: 3.4, 4.0
Attachments: SOLR-2616_jdk14logging_setup.patch

The /example/ Jetty Solr configuration should include a basic logging configuration file. Looking at this wiki page: http://wiki.apache.org/solr/LoggingInDefaultJettySetup I am creating this patch.
[jira] [Created] (LUCENE-3291) bugs in memorycodec with lots of docs
bugs in memorycodec with lots of docs

Key: LUCENE-3291
URL: https://issues.apache.org/jira/browse/LUCENE-3291
Project: Lucene - Java
Issue Type: Bug
Components: core/codecs
Affects Versions: 4.0
Reporter: Robert Muir

While working on LUCENE-3290, I noticed a readVInt that I thought should be a readVLong, so I wrote a test (Test2BPostings) to try to catch things like this... it takes about 5 minutes to run with MemoryCodec.

The problem is, it dies on some other bug in FSTs first!
[jira] [Updated] (LUCENE-3291) bugs in memorycodec with lots of docs
[ https://issues.apache.org/jira/browse/LUCENE-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3291:

Attachment: LUCENE-3291_test.patch

here's the test, it indexes 26 terms (a..z) per doc about 80M times to create just over Integer.MAX_VALUE t-d pairs.

with memorycodec (ant test-core -Dtestcase=Test2BPostings -Dtests.codec=Memory) it fails like this:

{noformat}
[junit] Caused by: java.lang.ArrayIndexOutOfBoundsException
[junit]     at java.lang.System.arraycopy(Native Method)
[junit]     at org.apache.lucene.util.fst.FST$BytesWriter.writeBytes(FST.java:855)
[junit]     at org.apache.lucene.util.fst.ByteSequenceOutputs.write(ByteSequenceOutputs.java:113)
[junit]     at org.apache.lucene.util.fst.ByteSequenceOutputs.write(ByteSequenceOutputs.java:32)
[junit]     at org.apache.lucene.util.fst.FST.addNode(FST.java:401)
[junit]     at org.apache.lucene.util.fst.NodeHash.add(NodeHash.java:120)
[junit]     at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:153)
[junit]     at org.apache.lucene.util.fst.Builder.finish(Builder.java:440)
[junit]     at org.apache.lucene.index.codecs.memory.MemoryCodec$TermsWriter.finish(MemoryCodec.java:228)
{noformat}
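The arithmetic behind Test2BPostings can be checked with a few lines. This is only an illustration of why a 32-bit count breaks at this scale, not code from MemoryCodec or from the test; the class and method names are hypothetical, and the exact document count used by the test is not stated in the message (it only says "about 80M").

```java
/** Hypothetical arithmetic sketch; not taken from Lucene. */
final class PostingsOverflowSketch {

    /** Smallest number of docs d such that d * termsPerDoc exceeds Integer.MAX_VALUE. */
    static long minDocsToExceedIntMax(int termsPerDoc) {
        return Integer.MAX_VALUE / termsPerDoc + 1;
    }

    /** Total term-document pairs, computed in 64-bit arithmetic. */
    static long totalPostings(long docs, int termsPerDoc) {
        return docs * termsPerDoc;
    }

    /** What a reader sees if the same total is squeezed into a 32-bit int. */
    static int truncateToInt(long value) {
        return (int) value;
    }

    public static void main(String[] args) {
        long docs = minDocsToExceedIntMax(26);
        long total = totalPostings(docs, 26);
        System.out.println(docs + " docs -> " + total + " postings, as int: " + truncateToInt(total));
    }
}
```

With 26 terms per document, the total stops fitting in an int a little above 82.5 million documents, which matches the "about 80M" in the message; any count of that size has to be written and read as a long.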
[jira] [Commented] (LUCENE-3289) FST should allow controlling how hard builder tries to share suffixes
[ https://issues.apache.org/jira/browse/LUCENE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061804#comment-13061804 ]

Eks Dev commented on LUCENE-3289:

bq. The strings are extremely long (more like short documents) and probably need to be compressed in some different datastructure, e.g. a word-based one?

That would be indeed cool, e.g. FST with words (ngrams?) as symbols. Ages ago we used one trie, for all unique terms to get prefix/edit distance on words and one word-trie (symbols were words via symbol table) for documents. I am sure this would cut memory requirements significantly for multiword cases when compared to char level FST. e.g. TermDictionary that supports ord() could be used as a symbol table.

FST should allow controlling how hard builder tries to share suffixes

Key: LUCENE-3289
URL: https://issues.apache.org/jira/browse/LUCENE-3289
Project: Lucene - Java
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 3.4, 4.0
Attachments: LUCENE-3289.patch, LUCENE-3289.patch

Today we have a boolean option to the FST builder telling it whether it should share suffixes.

If you turn this off, building is much faster, uses much less RAM, and the resulting FST is a prefix trie. But, the FST is larger than it needs to be. When it's on, the builder maintains a node hash holding every node seen so far in the FST -- this uses up RAM and slows things down.

On a dataset that Elmer (see java-user thread "Autocompletion on large index" on Jul 6 2011) provided (thank you!), which is 1.32 M titles avg 67.3 chars per title, building with suffix sharing on took 22.5 seconds, required 1.25 GB heap, and produced 91.6 MB FST. With suffix sharing off, it was 8.2 seconds, 450 MB heap and 129 MB FST.

I think we should allow this boolean to be shade-of-gray instead: usually, how well suffixes can share is a function of how far they are from the end of the string, so, by adding a tunable N to only share when suffix length < N, we can let caller make reasonable tradeoffs.
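The trade-off proposed in LUCENE-3289 can be tried on a toy structure. The sketch below is not Lucene's FST Builder: it builds a plain character trie and then merges identical subtrees through a node hash, but only for nodes whose longest remaining suffix is shorter than a tunable N. All names are hypothetical, and the reading "share only when suffix length is below N" is an assumption about the proposal.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy trie with length-limited suffix sharing; not Lucene's FST builder. */
final class SuffixSharingSketch {

    static final class Node {
        final TreeMap<Character, Node> edges = new TreeMap<>();
        boolean isFinal;
        int tail;    // length of the longest suffix below this node
        int id = -1; // assigned when the node is frozen
    }

    static Node buildTrie(List<String> words) {
        Node root = new Node();
        for (String word : words) {
            Node cur = root;
            for (char c : word.toCharArray()) {
                cur = cur.edges.computeIfAbsent(c, k -> new Node());
            }
            cur.isFinal = true;
        }
        return root;
    }

    /** Freezes nodes bottom-up; nodes with tail < shareBelow are deduplicated via the registry. */
    static Node freeze(Node node, int shareBelow, Map<String, Node> registry, int[] nextId) {
        StringBuilder signature = new StringBuilder(node.isFinal ? "F" : "N");
        int tail = 0;
        for (Map.Entry<Character, Node> edge : node.edges.entrySet()) {
            Node child = freeze(edge.getValue(), shareBelow, registry, nextId);
            edge.setValue(child); // point at the canonical (possibly shared) child
            tail = Math.max(tail, child.tail + 1);
            signature.append(edge.getKey()).append(child.id).append(';');
        }
        node.tail = tail;
        if (tail < shareBelow) {
            Node existing = registry.get(signature.toString());
            if (existing != null) {
                return existing; // identical suffix already seen: share it
            }
            registry.put(signature.toString(), node);
        }
        node.id = nextId[0]++;
        return node;
    }

    /** Number of distinct nodes reachable after building with the given sharing limit. */
    static int nodeCount(List<String> words, int shareBelow) {
        Node root = freeze(buildTrie(words), shareBelow, new HashMap<>(), new int[] {0});
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            Node n = todo.pop();
            if (seen.add(n)) {
                todo.addAll(n.edges.values());
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("walking", "talking", "walked", "talked");
        for (int n : new int[] {0, 1, Integer.MAX_VALUE}) {
            System.out.println("N=" + n + " -> " + nodeCount(words, n) + " nodes");
        }
    }
}
```

The registry plays the role of the node hash mentioned in the issue: a small N keeps it tiny (only short tails are ever stored) at the cost of a larger structure, while a large N stores every node and yields the smallest result.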
[jira] [Commented] (SOLR-2641) Auto Facet Selection component
[ https://issues.apache.org/jira/browse/SOLR-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061810#comment-13061810 ]

Upayavira commented on SOLR-2641:

Same issue with pivot facets (SOLR-792). I'm going to try to work it out (as a slow, background task).

Auto Facet Selection component

Key: SOLR-2641
URL: https://issues.apache.org/jira/browse/SOLR-2641
Project: Solr
Issue Type: Improvement
Components: SearchComponents - other
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Minor
Attachments: SOLR_2641.patch

It sure would be nice if you could have Solr automatically select field(s) for faceting based dynamically off the profile of the results. For example, you're indexing disparate types of products, all with varying attributes (color, size - like for apparel, memory_size - for electronics, subject - for books, etc), and a user searches for ipod where most products match products with color and memory_size attributes... let's automatically facet on those fields.
[jira] [Commented] (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061823#comment-13061823 ]

Bill Bell commented on SOLR-1725:

Is there a reason why this is not committed? It seems pretty awesome!!

Script based UpdateRequestProcessorFactory

Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch

A script based UpdateRequestProcessorFactory (Uses JDK6 script engine support). The main goal of this plugin is to be able to configure/write update processors without the need to write and package Java code.

The update request processor factory enables writing update processors in scripts located in {{solr.solr.home}} directory. The factory accepts one (mandatory) configuration parameter named {{scripts}} which accepts a comma-separated list of file names. It will look for these files under the {{conf}} directory in solr home. When multiple scripts are defined, their execution order is defined by the lexicographical order of the script file name (so {{scriptA.js}} will be executed before {{scriptB.js}}).

The script language is resolved based on the script file extension (that is, a *.js file will be treated as a JavaScript script), therefore an extension is mandatory.

Each script file is expected to have one or more methods with the same signature as the methods in the {{UpdateRequestProcessor}} interface. It is *not* required to define all methods, only those that are required by the processing logic.

The following variables are defined as global variables for each script:
* {{req}} - The SolrQueryRequest
* {{rsp}} - The SolrQueryResponse
* {{logger}} - A logger that can be used for logging purposes in the script
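The two rules in the SOLR-1725 description, lexicographical execution order and language resolution by file extension, can be sketched as follows. This is an illustration with hypothetical names, not the code in the attached patches; it only shows how a comma-separated {{scripts}} value could be ordered and how the mandatory extension could be extracted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; not the SOLR-1725 implementation. */
final class ScriptOrderSketch {

    /** Splits a comma-separated "scripts" value and sorts the names lexicographically. */
    static List<String> executionOrder(String scriptsParam) {
        List<String> names = new ArrayList<>();
        for (String part : scriptsParam.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Returns the file extension used to pick the script language; an extension is mandatory. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("script file needs an extension: " + fileName);
        }
        return fileName.substring(dot + 1);
    }

    public static void main(String[] args) {
        for (String name : executionOrder("scriptB.js, scriptA.js")) {
            System.out.println(name + " -> " + extensionOf(name));
        }
    }
}
```

The extracted extension is the kind of value that the JDK's javax.script.ScriptEngineManager.getEngineByExtension accepts; whether a JavaScript engine is actually available depends on the JDK in use.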
[jira] [Updated] (SOLR-2452) rewrite solr build system
[ https://issues.apache.org/jira/browse/SOLR-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Rowe updated SOLR-2452:

Attachment: SOLR-2452.diffSource.py.patch.zip

The solr2452 branch is now up-to-date with trunk, and I've committed to the branch the work that I was keeping as a script/patch pair. I think this is ready to commit to trunk.

For review purposes, I'm attaching the zipped output from {{python -u diffSource.py trunk branches/solr2452}}, but the patch is huge, so I don't know how useful it will be. (I had to compress it because it exceeds JIRA's 10MB threshold.)

I plan on merging the solr2452 branch back to trunk in about 24 hours, and then work on backporting the changes to branch_3x.

rewrite solr build system

Key: SOLR-2452
URL: https://issues.apache.org/jira/browse/SOLR-2452
Project: Solr
Issue Type: Task
Components: Build
Reporter: Robert Muir
Assignee: Steven Rowe
Fix For: 3.4, 4.0
Attachments: SOLR-2452-post-reshuffling.patch, SOLR-2452-post-reshuffling.patch, SOLR-2452-post-reshuffling.patch, SOLR-2452.diffSource.py.patch.zip, SOLR-2452.dir.reshuffle.sh, SOLR-2452.dir.reshuffle.sh

As discussed some in SOLR-2002 (but that issue is long and hard to follow), I think we should rewrite the solr build system. It's slow, cumbersome, and messy, and makes it hard for us to improve things.
[jira] [Created] (LUCENE-3292) IOContext should be part of the SegmentReader cache key
IOContext should be part of the SegmentReader cache key

Key: LUCENE-3292
URL: https://issues.apache.org/jira/browse/LUCENE-3292
Project: Lucene - Java
Issue Type: Task
Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Varun Thacker
Priority: Minor
Fix For: 4.0

Once IOContext (LUCENE-2793) has landed, the IOContext should be part of the key used to cache that reader in the pool.
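What LUCENE-3292 asks for amounts to a composite cache key. The sketch below is a hypothetical miniature of a reader pool, not IndexWriter's actual pool: the class names, the Context enum and the use of plain Object as the "reader" are all assumptions made for illustration. The point is that the same segment opened under two contexts must map to two separate pooled entries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical miniature reader pool; not Lucene's implementation. */
final class ReaderPoolSketch {

    enum Context { READ, MERGE }

    /** Cache key combining the segment name with the context it was opened for. */
    static final class Key {
        final String segmentName;
        final Context context;

        Key(String segmentName, Context context) {
            this.segmentName = segmentName;
            this.context = context;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key that = (Key) other;
            return segmentName.equals(that.segmentName) && context == that.context;
        }

        @Override
        public int hashCode() {
            return Objects.hash(segmentName, context);
        }
    }

    private final Map<Key, Object> pool = new HashMap<>();

    /** Returns the pooled "reader" for this segment and context, creating it on first use. */
    Object get(String segmentName, Context context) {
        return pool.computeIfAbsent(new Key(segmentName, context), k -> new Object());
    }

    public static void main(String[] args) {
        ReaderPoolSketch pool = new ReaderPoolSketch();
        System.out.println(pool.get("_0", Context.READ) == pool.get("_0", Context.MERGE));
    }
}
```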
[jira] [Created] (LUCENE-3293) Use IOContext.READONCE in VarGapTermsIndexReader to load FST
Use IOContext.READONCE in VarGapTermsIndexReader to load FST

Key: LUCENE-3293
URL: https://issues.apache.org/jira/browse/LUCENE-3293
Project: Lucene - Java
Issue Type: Task
Components: core/codecs
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Varun Thacker
Priority: Minor
Fix For: 4.0

VarGapTermsIndexReader should pass READONCE context down when it opens/reads the FST. Yet, it should just replace the ctx passed in, ie if we are merging vs reading we want to differentiate.
[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061836#comment-13061836 ] Simon Willnauer commented on LUCENE-2793: - I fixed the two minor things from above, created two followup issues (LUCENE-3292 LUCENE-3293) for the remaining TODOs and will go ahead reintegrating the branch now. Directory createOutput and openInput should take an IOContext - Key: LUCENE-2793 URL: https://issues.apache.org/jira/browse/LUCENE-2793 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Michael McCandless Assignee: Varun Thacker Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2793-nrt.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793_final.patch Today for merging we pass down a larger readBufferSize than for searching because we get better performance. I think we should generalize this to a class (IOContext), which would hold the buffer size, but then could hold other flags like DIRECT (bypass OS's buffer cache), SEQUENTIAL, etc. Then, we can make the DirectIOLinuxDirectory fully usable because we would only use DIRECT/SEQUENTIAL during merging. This will require fixing how IW pools readers, so that a reader opened for merging is not then used for searching, and vice/versa. Really, it's only all the open file handles that need to be different -- we could in theory share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
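The generalisation described in LUCENE-2793 (one object carrying the buffer size plus flags such as DIRECT and SEQUENTIAL) might look roughly like the sketch below. It is not Lucene's IOContext class: the class name, the factory methods and the buffer sizes are hypothetical and were chosen only so that the merge context gets the larger buffer mentioned in the description.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical context holder; not Lucene's IOContext. */
final class IoContextSketch {

    enum Flag { DIRECT, SEQUENTIAL }

    final int readBufferSize;
    final Set<Flag> flags;

    private IoContextSketch(int readBufferSize, EnumSet<Flag> flags) {
        this.readBufferSize = readBufferSize;
        this.flags = Collections.unmodifiableSet(flags);
    }

    /** Context for searching: small buffer, no special flags (sizes are made up). */
    static IoContextSketch forSearch() {
        return new IoContextSketch(1024, EnumSet.noneOf(Flag.class));
    }

    /** Context for merging: larger buffer, bypass the OS cache and read sequentially. */
    static IoContextSketch forMerge() {
        return new IoContextSketch(4096, EnumSet.of(Flag.DIRECT, Flag.SEQUENTIAL));
    }

    public static void main(String[] args) {
        IoContextSketch merge = forMerge();
        System.out.println(merge.readBufferSize + " " + merge.flags);
    }
}
```

A directory implementation could then inspect the flags when opening a file, so that only merge-time inputs take the direct, sequential path.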
[jira] [Commented] (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061838#comment-13061838 ]

Simon Willnauer commented on SOLR-1725:

bq. Is there a reason why this is not committed. It seems pretty awesome!!

indeed this looks good... somebody should bring it up to date I guess :)
[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated SOLR-2242:

Comment: was deleted

(was: I am out of the office on vacation, I will return Monday July 11. I will not be checking email. For urgent Systems Department business, please contact Mercy Anaba, man...@jhu.edu,(410) 516-5306. )
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061839#comment-13061839 ]

Simon Willnauer commented on SOLR-2242:

bq. Are we ready to commit?

bill, isn't there a test failure still on this issue related to FC? Yonik mentioned BW compat issues here and promised to comment. I will ping him again.

thanks for the patience

simon
[jira] [Resolved] (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer resolved LUCENE-2793.

Resolution: Fixed
Fix Version/s: IOContext branch, 4.0
Lucene Fields: [New, Patch Available] (was: [New])

I reintegrated the branch and committed to trunk in revision 1144196. I will now go ahead and delete the branch. All further developments should happen on trunk.

@Varun make sure you move your current work in progress to trunk and be careful with svn update on the branch since some of your changes might get lost.

Thanks Varun... good job!
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 9414 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9414/

All tests passed

Build Log (for compile errors):
[...truncated 10240 lines...]
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061872#comment-13061872 ] Simon Willnauer commented on LUCENE-2878: - {quote} I think I agree. The only possible trade-off that goes the other way is in the case where you have the positions available already during initial search/scoring, and there is not too much turnover in the TopDocs priority queue during hit collection. Then a Highlighter might save some time by not re-scoring and re-iterating the positions if it accumulated them up front (even for docs that were eventually dropped off the queue). I think it should be possible to test out both approaches given the right API here though? {quote} Yes, I think we should go and provide both possibilities here. {quote} The callback idea sounds appealing, but I still think we should also consider enabling the top-down approach: especially if this is going to run in two passes, why not let the highlighter drive the iteration? Keep in mind that positions consumers (like highlighters) may possibly be interested in more than just the lowest-level positions (they may want to see phrases, eg, and near-clauses - trying to avoid the s-word). {quote} I am not sure if I understand this correctly. I think the collector should be some kind of a visitor that walks down the query/scorer tree and each scorer can ask if it should pass the current positions to the collector something like this: {code} class PositionCollector { public boolean register(Scorer scorer) { if(interestedInScorere(scorere)) { // store infor about the scorer return true; } return false; } /* * Called by a registered scorer for each position change */ public void nexPosition(Scorer scorer) { // collect positions for the current scorer } } {code} that way the iteration process is still driven by the top-level consumer but if you need information about intermediate positions you can collect them. 
{quote}
Another consideration is ordering. I think that positions are retrieved from the index in document order. This could be a natural order for many cases, but score order will also be useful. I'm not sure whose responsibility the sorting should be. Highlighters will want to be able to optimize their work (esp for very large documents) by terminating after considering only the first N matches, where the ordering could either be score or document-order.
{quote}

So the order here depends on the first collector, I figure. The usual case is that you do your search and retrieve the top N documents (those are also the top N you want to highlight, right?), then you pass in your top N and do the highlighting collection based on those top N. In that collection you are not interested in all matches but only in the top N from the previous collection. The simplest, though maybe not the best, way to do this is to use a simple filter that is built from the top N docs. I will go ahead and create the branch now.

Allow Scorer to expose positions and payloads aka. nuke spans
Key: LUCENE-2878
URL: https://issues.apache.org/jira/browse/LUCENE-2878
Project: Lucene - Java
Issue Type: Improvement
Components: core/search
Affects Versions: Bulk Postings branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Labels: gsoc2011, lucene-gsoc-11, mentor
Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch

Currently we have two somewhat separate types of queries, the one which can make use of positions (mainly spans) and payloads (spans). Yet Span*Query doesn't really do scoring comparable to what other queries do, and at the end of the day they are duplicating a lot of code all over Lucene. Span*Queries are also limited to other Span*Query instances, such that you cannot use a TermQuery or a BooleanQuery with SpanNear or anything like that.
Besides the Span*Query limitation, other queries lack a quite interesting feature: they cannot score based on term proximity, since scorers don't expose any positional information. All those problems bugged me for a while now, so I started working on that using the bulkpostings API. I would have done that first cut on trunk, but TermScorer is working on BlockReader, which does not expose positions, while the one in this branch does. I started adding a new Positions class which users can pull from a scorer; to prevent unnecessary positions enums I added ScorerContext#needsPositions and eventually Scorer#needsPayloads to create the corresponding enum on demand. Yet,
Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 9414 - Failure
my bad - just committed a fix simon On Fri, Jul 8, 2011 at 11:41 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9414/ All tests passed Build Log (for compile errors): [...truncated 10240 lines...]
[jira] [Commented] (SOLR-2616) Include jdk14 logging configuration file
[ https://issues.apache.org/jira/browse/SOLR-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061873#comment-13061873 ]

Simon Willnauer commented on SOLR-2616:

+1

Include jdk14 logging configuration file
Key: SOLR-2616
URL: https://issues.apache.org/jira/browse/SOLR-2616
Project: Solr
Issue Type: Improvement
Reporter: David Smiley
Assignee: Mark Miller
Priority: Minor
Fix For: 3.4, 4.0
Attachments: SOLR-2616_jdk14logging_setup.patch

The /example/ Jetty Solr configuration should include a basic logging configuration file. Looking at this wiki page: http://wiki.apache.org/solr/LoggingInDefaultJettySetup I am creating this patch.
[jira] [Commented] (LUCENE-3290) add FieldInvertState.numUniqueTerms, Terms.sumDocFreq
[ https://issues.apache.org/jira/browse/LUCENE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061876#comment-13061876 ]

Michael McCandless commented on LUCENE-3290:

You are right -- nice catch! Can you change the sumTotalTF to be a readVLong? Thanks.

add FieldInvertState.numUniqueTerms, Terms.sumDocFreq
Key: LUCENE-3290
URL: https://issues.apache.org/jira/browse/LUCENE-3290
Project: Lucene - Java
Issue Type: Improvement
Components: core/index
Reporter: Robert Muir
Assignee: Robert Muir
Fix For: 4.0
Attachments: LUCENE-3290.patch

For scoring systems like lnu.ltc (http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf), we need to supply 3 stats:
* average tf within d
* # of unique terms within d
* average number of unique terms across field

If we add FieldInvertState.numUniqueTerms, you can incorporate the first two into your norms/docvalues (once we cut over), the average tf within d being length / numUniqueTerms. To compute the average across the field, we can just write the sum of all terms' docfreqs into the terms dictionary header, and you can then divide this by maxdoc to get the average.
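The arithmetic in the description can be sketched in a few lines. This is only an illustration under simple assumptions: documents are plain lists of terms, there are no deletions, and FieldStatsSketch with its method names is hypothetical, not part of the patch. It relies on the observation that each document adds exactly one to the docFreq of every unique term it contains, so the sum of all docFreqs equals the total count of unique terms over all documents.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FieldStatsSketch {

    /** Average tf within a document: field length / number of unique terms. */
    static double averageTermFrequency(int length, int numUniqueTerms) {
        return numUniqueTerms == 0 ? 0.0 : (double) length / numUniqueTerms;
    }

    /** Sum over all terms of the number of documents containing that term. */
    static long sumDocFreq(List<List<String>> docs) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (List<String> doc : docs) {
            Set<String> unique = new HashSet<>(doc); // count each term once per doc
            for (String term : unique) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        long sum = 0;
        for (int df : docFreq.values()) {
            sum += df;
        }
        return sum;
    }

    /** Average number of unique terms per document: sumDocFreq / maxDoc. */
    static double averageUniqueTerms(long sumDocFreq, int maxDoc) {
        return maxDoc == 0 ? 0.0 : (double) sumDocFreq / maxDoc;
    }
}
```

For three toy documents [a, b, a], [b, c] and [a], the docFreqs are a=2, b=2, c=1, so sumDocFreq is 5, which matches the per-document unique term counts 2 + 2 + 1.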
Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 9398 - Failure
I committed a fix -- just a test bug.

Mike McCandless
http://blog.mikemccandless.com

On Thu, Jul 7, 2011 at 1:27 PM, Apache Jenkins Server jenk...@builds.apache.org wrote:

Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9398/

1 tests failed.
REGRESSION: org.apache.lucene.search.TestSpanQueryFilter.testFilterWorks

Error Message:
docIdSet doesn't contain docId 10

Stack Trace:
junit.framework.AssertionFailedError: docIdSet doesn't contain docId 10
  at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1435)
  at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1353)
  at org.apache.lucene.search.TestSpanQueryFilter.assertContainsDocId(TestSpanQueryFilter.java:84)
  at org.apache.lucene.search.TestSpanQueryFilter.testFilterWorks(TestSpanQueryFilter.java:56)

Build Log (for compile errors):
[...truncated 1226 lines...]
[jira] [Commented] (LUCENE-3290) add FieldInvertState.numUniqueTerms, Terms.sumDocFreq
[ https://issues.apache.org/jira/browse/LUCENE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061887#comment-13061887 ] Michael McCandless commented on LUCENE-3290: Patch looks awesome! Nice to add these additional stats.
[jira] [Updated] (LUCENE-3290) add FieldInvertState.numUniqueTerms, Terms.sumDocFreq
[ https://issues.apache.org/jira/browse/LUCENE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3290: Attachment: LUCENE-3290.patch I committed the fix to memorycodec, synced the patch up to trunk, and renamed the confusing 'sumDF' variable in termsconsumer, which actually is not a sumDF at all :) I think this is ready to go.
[jira] [Updated] (LUCENE-3294) Some code still compares string equality instead using equals
[ https://issues.apache.org/jira/browse/LUCENE-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3294: Attachment: LUCENE-3294.patch here is a patch
[jira] [Created] (LUCENE-3294) Some code still compares string equality instead using equals
Some code still compares string equality instead using equals
Key: LUCENE-3294
URL: https://issues.apache.org/jira/browse/LUCENE-3294
Project: Lucene - Java
Issue Type: Bug
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Minor
Fix For: 4.0
Attachments: LUCENE-3294.patch

I found a couple of places where we still use string == otherstring, which doesn't look correct. I will attach a patch soon.
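For readers less familiar with the pitfall, the following small sketch shows why == on strings is fragile. The class and method names are made up for this illustration and do not come from the patch.

```java
import java.util.Objects;

final class StringEqualityDemo {

    /** Identity comparison: true only if both references are the same object. */
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    /** Content comparison, null-safe. */
    static boolean sameContent(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        String literal = "title";
        String copy = new String("title"); // distinct object, same characters

        System.out.println(sameReference(literal, copy));          // identity check
        System.out.println(sameContent(literal, copy));            // content check
        System.out.println(sameReference(literal, copy.intern())); // interned copy
    }
}
```

The identity check only agrees with equals when both strings are known to be interned, which is an easy invariant to break by accident.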
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061904#comment-13061904 ]

Mike Sokolov commented on LUCENE-2878:

bq. I am not sure if I understand this correctly. I think the collector should be some kind of a visitor that walks down the query/scorer tree and each scorer can ask if it should pass the current positions to the collector something like this:

Yes, that sounds right.

Re: ordering; I was concerned about the order in which the positions are iterated within each document, not so much the order in which the documents are returned. I think this is an issue for the highlighter mostly, which can score position-ranges in the document so as to return the best snippet. This kind of score may be built up from tfidf scores for each term, proximity, length of the position-ranges and so on.

Allow Scorer to expose positions and payloads aka. nuke spans
Key: LUCENE-2878
URL: https://issues.apache.org/jira/browse/LUCENE-2878
Project: Lucene - Java
Issue Type: Improvement
Components: core/search
Affects Versions: Bulk Postings branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Labels: gsoc2011, lucene-gsoc-11, mentor
Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch

Currently we have two somewhat separate types of queries, the one which can make use of positions (mainly spans) and payloads (spans). Yet Span*Query doesn't really do scoring comparable to what other queries do, and at the end of the day they are duplicating a lot of code all over Lucene. Span*Queries are also limited to other Span*Query instances, such that you cannot use a TermQuery or a BooleanQuery with SpanNear or anything like that.
Besides the Span*Query limitation, other queries lack a quite interesting feature: they cannot score based on term proximity, since scorers don't expose any positional information. All those problems bugged me for a while now, so I started working on that using the bulkpostings API. I would have done that first cut on trunk, but TermScorer is working on BlockReader, which does not expose positions, while the one in this branch does. I started adding a new Positions class which users can pull from a scorer; to prevent unnecessary positions enums I added ScorerContext#needsPositions and eventually Scorer#needsPayloads to create the corresponding enum on demand. Yet, currently only TermQuery / TermScorer implements this API and others simply return null instead.

To show that the API really works and our BulkPostings work fine too with positions, I cut over TermSpanQuery to use a TermScorer under the hood and nuked TermSpans entirely. A nice side effect of this was that the Position BulkReading implementation got some exercise, which now :) all work with positions, while Payloads for bulkreading are kind of experimental in the patch and those only work with Standard codec.

So all spans now work on top of TermScorer ( I truly hate spans since today ) including the ones that need Payloads (StandardCodec ONLY)!! I didn't bother to implement the other codecs yet since I want to get feedback on the API and on this first cut before I go on with it. I will upload the corresponding patch in a minute.

I also had to cut over SpanQuery.getSpans(IR) to SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk first, but after that pain today I need a break first :). The patch passes all core tests (org.apache.lucene.search.highlight.HighlighterTest still fails but I didn't look into the MemoryIndex BulkPostings API yet).
[jira] [Updated] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edoardo Tosca updated SOLR-2633:

Attachment: SOLR-2633-only-tests.patch

This second cut contains more tests, which cover about 80% of the code of the class under test.

Make SolrDispatchFilter testable and add tests
Key: SOLR-2633
URL: https://issues.apache.org/jira/browse/SOLR-2633
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 3.1, 3.2, 3.3
Reporter: Edoardo Tosca
Assignee: Mark Miller
Priority: Minor
Fix For: 3.4, 4.0
Attachments: SOLR-2633-tests-only.patch

I have ideas for possible extensions/enhancements to the SolrDispatchFilter. However, as it doesn't have any tests, making safe enhancements is difficult. Given its monolithic nature, it is hard to test. Therefore, I am proposing to refactor it to make it testable, and to provide tests for it.
[jira] [Updated] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edoardo Tosca updated SOLR-2633: Attachment: (was: SOLR-2633-only-tests.patch)
[jira] [Updated] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edoardo Tosca updated SOLR-2633: Attachment: SOLR-2633-tests-only.patch
[jira] [Commented] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061915#comment-13061915 ]

Edoardo Tosca commented on SOLR-2633:

I'm still struggling to understand some bits of code in the doFilter method. Does anyone have an example of real usage of the management path? I'd like to cover that before refactoring. The piece of code in question is in SolrDispatchFilter, lines 164-168 (pasted below):

{code}
// check for management path
String alternate = cores.getManagementPath();
if (alternate != null && path.startsWith(alternate)) {
  path = path.substring(0, alternate.length());
}
{code}

Thanks
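One way to pin down what the pasted lines do before refactoring is a small characterization check. The sketch below copies the logic into a standalone helper, assuming the condition is a logical AND of the null check and the prefix check; ManagementPathDemo, applyManagementPath and the example paths are hypothetical and chosen only for illustration.

```java
final class ManagementPathDemo {

    /**
     * Mirrors the quoted snippet: if the request path starts with the
     * management path, the path is cut down to exactly that prefix.
     */
    static String applyManagementPath(String path, String alternate) {
        if (alternate != null && path.startsWith(alternate)) {
            path = path.substring(0, alternate.length());
        }
        return path;
    }

    public static void main(String[] args) {
        // Hypothetical values, not taken from a real Solr configuration.
        System.out.println(applyManagementPath("/admin/cores/status", "/admin/cores"));
        System.out.println(applyManagementPath("/select", "/admin/cores"));
        System.out.println(applyManagementPath("/select", null));
    }
}
```

In other words, anything after the management path prefix is dropped, and paths that do not start with it pass through unchanged.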
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061918#comment-13061918 ] Simon Willnauer commented on LUCENE-2878: mike I created a branch here: https://svn.apache.org/repos/asf/lucene/dev/branches/positions
[jira] [Commented] (LUCENE-3294) Some code still compares string equality instead using equals
[ https://issues.apache.org/jira/browse/LUCENE-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061925#comment-13061925 ] Michael McCandless commented on LUCENE-3294: Nice catch Simon -- looks good!
[jira] [Resolved] (LUCENE-3294) Some code still compares string equality instead using equals
[ https://issues.apache.org/jira/browse/LUCENE-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-3294. Resolution: Fixed Committed in revision 1144280
omitNorms and omitTermFreqAndPosition
Hi,

I have a problem with omitTermFreqAndPosition and omitNorms. In my schema I have some fields with these properties set to true, for example the field category. Then I make a query like:

select?q=category:(x OR y or Z)

It returns all docs that have x, y or z as category. I add debugQuery=on to see the scores, and I see that the docs have different scores. Why? The tf is still calculated, and so is the normalization. Why? The docs should all have the same score, because this is not a full-text search: I am only looking for docs that are inside a group.

Thank you very much

--
Gastone Penzo
[jira] [Created] (LUCENE-3295) BitVector never skips fully populated bytes when writing ClearedDgaps
BitVector never skips fully populated bytes when writing ClearedDgaps
Key: LUCENE-3295
URL: https://issues.apache.org/jira/browse/LUCENE-3295
Project: Lucene - Java
Issue Type: Bug
Components: core/other
Affects Versions: 3.4, 4.0
Reporter: Simon Willnauer
Priority: Minor
Fix For: 3.4, 4.0

When writing cleared DGaps in BitVector we compare a byte against 0xFF (255), yet the byte is cast to an int (-1) and the comparison will never succeed. We should mask the byte with 0xFF before comparing, or compare against -1.
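The sign-extension problem described here is easy to reproduce in isolation. The following sketch is not the BitVector code; ByteCompareDemo and its method names are invented for this example and only show the comparison itself alongside the two fixes the issue proposes.

```java
final class ByteCompareDemo {

    /** Buggy form: the byte is widened to int -1, which never equals 255. */
    static boolean naiveIsFull(byte b) {
        return b == 0xFF;
    }

    /** Fix 1: mask to the low 8 bits before comparing. */
    static boolean maskedIsFull(byte b) {
        return (b & 0xFF) == 0xFF;
    }

    /** Fix 2: compare against the signed value of an all-ones byte. */
    static boolean signedIsFull(byte b) {
        return b == -1;
    }

    public static void main(String[] args) {
        byte full = (byte) 0xFF; // all eight bits set
        System.out.println(naiveIsFull(full));
        System.out.println(maskedIsFull(full));
        System.out.println(signedIsFull(full));
    }
}
```

Both fixes are equivalent for a byte; masking has the small advantage of reading the same way as the hex constant it is compared with.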
[jira] [Assigned] (LUCENE-3295) BitVector never skips fully populated bytes when writing ClearedDgaps
[ https://issues.apache.org/jira/browse/LUCENE-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer reassigned LUCENE-3295: Assignee: Simon Willnauer
[jira] [Updated] (LUCENE-3295) BitVector never skips fully populated bytes when writing ClearedDgaps
[ https://issues.apache.org/jira/browse/LUCENE-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3295: Attachment: LUCENE-3295.patch Here is a simple patch and a test that at least exercises the code.
FindBugs & PMD ?
Developers,

Any thoughts on using FindBugs & PMD to catch more bugs in Lucene/Solr? Jenkins could be configured to run FindBugs & PMD analysis nightly. It would have helped find this: (LUCENE-3294) Some code still compares string equality instead using equals

I am aware there is a high degree of false positives, but there are ways of dealing with them, such as with @SuppressWarnings("PMD") and with //NOPMD, and for FindBugs there is @edu.umd.cs.findbugs.annotations.SuppressWarnings() and there's a fairly detailed configuration file for FindBugs to really control it and to make exceptions. I'd also really like to see use of FindBugs concurrency annotations @GuardedBy, @Immutable, @NotThreadSafe, @ThreadSafe.

~ David Smiley
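As a rough illustration of the two PMD suppression styles mentioned above, here is a small sketch. The class, constant and method names are hypothetical, and which rule (if any) would actually fire depends on the PMD version and ruleset in use, so treat this as a shape rather than a recipe. The FindBugs annotation is left out because it needs the FindBugs annotations jar on the classpath.

```java
final class SuppressionSketch {

    static final String DEFAULT_FIELD = "body";

    /** Annotation style: suppresses PMD findings for the whole method. */
    @SuppressWarnings("PMD")
    static boolean isSameInstance(String a, String b) {
        return a == b; // deliberate identity comparison
    }

    /** Comment style: suppresses PMD findings on a single line. */
    static boolean isDefaultField(String field) {
        return field == DEFAULT_FIELD; // NOPMD - intentional identity check
    }
}
```

javac ignores warning names it does not recognize, so the annotation costs nothing at compile time; the trade-off is that a method-wide suppression can hide unrelated findings, which is why the single-line marker is often the narrower choice.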
Re: FindBugs & PMD ?
On Fri, Jul 8, 2011 at 10:08 AM, Smiley, David W. dsmi...@mitre.org wrote:

Developers,

Any thoughts on using FindBugs & PMD to catch more bugs in Lucene/Solr? Jenkins could be configured to run FindBugs & PMD analysis nightly. It would have helped find this: (LUCENE-3294) Some code still compares string equality instead using equals

I am aware there is a high degree of false positives, but there are ways of dealing with them, such as with @SuppressWarnings("PMD") and with //NOPMD, and for FindBugs there is @edu.umd.cs.findbugs.annotations.SuppressWarnings() and there's a fairly detailed configuration file for FindBugs to really control it and to make exceptions. I'd also really like to see use of FindBugs concurrency annotations @GuardedBy, @Immutable, @NotThreadSafe, @ThreadSafe.

I think it's a good idea for nightly, but I am strongly against linking to an LGPL library for these annotations. I would prefer PMD instead, because of the license.
[jira] [Resolved] (SOLR-2331) Refactor CoreContainer's SolrXML serialization code and improve testing
[ https://issues.apache.org/jira/browse/SOLR-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller resolved SOLR-2331.

Resolution: Fixed

Thanks Steve - calling this one done - SolrCore needs more refactoring, but it can come in further issues. Noble has a great one going to factor out zookeeper parts as well.

Refactor CoreContainer's SolrXML serialization code and improve testing

Key: SOLR-2331
URL: https://issues.apache.org/jira/browse/SOLR-2331
Project: Solr
Issue Type: Improvement
Components: multicore
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2331-fix-windows-file-deletion-failure.patch, SOLR-2331-fix-windows-file-deletion-failure.patch, SOLR-2331.patch

CoreContainer has enough code in it - I'd like to factor out the solr.xml serialization code into SolrXMLSerializer or something - which should make testing it much easier and lightweight.
[jira] [Commented] (SOLR-2616) Include jdk14 logging configuration file
[ https://issues.apache.org/jira/browse/SOLR-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061976#comment-13061976 ]

Mark Miller commented on SOLR-2616:

So while I'm pro putting the logging config file in for easy use, I'm not so sure about wiring it up out of the box. Perhaps I'm just overly used to things going to the console while starting/deving with Solr - but it has become something I've gotten used to :)

I was thinking we just put the file there, and modify any doc to alert that you can also start Solr with a -D command to use the example logging config file.

I could see going either way though. Thoughts?

Include jdk14 logging configuration file

Key: SOLR-2616
URL: https://issues.apache.org/jira/browse/SOLR-2616
Project: Solr
Issue Type: Improvement
Reporter: David Smiley
Assignee: Mark Miller
Priority: Minor
Fix For: 3.4, 4.0
Attachments: SOLR-2616_jdk14logging_setup.patch

The /example/ Jetty Solr configuration should include a basic logging configuration file. Looking at this wiki page: http://wiki.apache.org/solr/LoggingInDefaultJettySetup I am creating this patch.
[jira] [Issue Comment Edited] (SOLR-2331) Refactor CoreContainer's SolrXML serialization code and improve testing
[ https://issues.apache.org/jira/browse/SOLR-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061973#comment-13061973 ]

Mark Miller edited comment on SOLR-2331 at 7/8/11 2:23 PM:

Thanks Steve - calling this one done - CoreContainer needs more refactoring, but it can come in further issues. Noble has a great one going to factor out zookeeper parts as well.

was (Author: markrmil...@gmail.com):
Thanks Steve - calling this one done - SolrCore needs more refactoring, but it can come in further issues. Noble has a great one going to factor out zookeeper parts as well.
[jira] [Commented] (SOLR-2616) Include jdk14 logging configuration file
[ https://issues.apache.org/jira/browse/SOLR-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061978#comment-13061978 ]

Robert Muir commented on SOLR-2616:

What will wiring it up out of the box do to tests (e.g. example tests)? Will running the tests now cause Jetty to create files outside of the build/ folder?
Re: FindBugs & PMD ?
Rob, there is an ASL 2.0 licensed implementation here: https://github.com/stephenc/findbugs-annotations

~ David

On Jul 8, 2011, at 10:12 AM, Robert Muir wrote:

> On Fri, Jul 8, 2011 at 10:08 AM, Smiley, David W. <dsmi...@mitre.org> wrote:
>> Developers,
>>
>> Any thoughts on using FindBugs & PMD to catch more bugs in Lucene/Solr? Jenkins could be configured to run FindBugs & PMD analysis nightly. It would have helped find this:
>>
>> (LUCENE-3294) Some code still compares string equality instead using equals
>>
>> I am aware there are a high degree of false-positives but there are ways of dealing with them, such as with @SuppressWarnings("PMD") and with //NOPMD and for Findbugs, there is @edu.umd.cs.findbugs.annotations.SuppressWarnings() and there's a fairly detailed configuration file for FindBugs to really control it and to make exceptions. I'd also really like to see use of FindBugs concurrency annotations @GuardedBy, @Immutable, @NotThreadSafe, @ThreadSafe.
>
> I think it's a good idea for nightly, but I am strongly against linking to an LGPL library for these annotations. I would prefer PMD instead, because of the license.
[jira] [Commented] (LUCENE-3295) BitVector never skips fully populated bytes when writing ClearedDgaps
[ https://issues.apache.org/jira/browse/LUCENE-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061981#comment-13061981 ]

Robert Muir commented on LUCENE-3295:

Good catch, just some thoughts looking at the test:
* we should create a helper no-arg LTC.newIOContext() that uses LTC's random, or
* should we need to actually pass IOContext like this in tests explicitly? Or, should MDW randomize the IOContexts that it passes down to its wrapped Dir?
[jira] [Commented] (LUCENE-3292) IOContext should be part of the SegmentReader cache key
[ https://issues.apache.org/jira/browse/LUCENE-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061982#comment-13061982 ]

Varun Thacker commented on LUCENE-3292:

I am not quite sure how to start with this. In SegmentReader#get something like this is required:

{noformat}
if (readOnly) {
  assert context != IOContext.DEFAULT;
  //assert context.context == IOContext.Context.READ; // Using the second assert checks for both READ and READONCE
}
{noformat}

And what do I need to do in IndexWriter.ReaderPool#get so that context is part of the key used to cache that reader in the pool?

IOContext should be part of the SegmentReader cache key

Key: LUCENE-3292
URL: https://issues.apache.org/jira/browse/LUCENE-3292
Project: Lucene - Java
Issue Type: Task
Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Varun Thacker
Priority: Minor
Fix For: 4.0

Once IOContext (LUCENE-2793) has landed, the IOContext should be part of the key used to cache that reader in the pool.
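One common way to make an extra attribute "part of the cache key" is a small composite key class with equals() and hashCode(). The sketch below illustrates only that general technique. Every type in it (Context, Key, Pool) is a hypothetical stand-in, and it makes no claim about how IndexWriter.ReaderPool is or should be implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Sketch under stated assumptions: none of these types are the Lucene classes.
final class ReaderPoolKeySketch {

    // Hypothetical stand-in for an IOContext; only what the sketch needs.
    enum Context { DEFAULT, READ, MERGE }

    // Composite key: a cached reader is identified by segment AND context.
    static final class Key {
        private final String segmentName;
        private final Context context;

        Key(String segmentName, Context context) {
            this.segmentName = Objects.requireNonNull(segmentName);
            this.context = Objects.requireNonNull(context);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return segmentName.equals(other.segmentName) && context == other.context;
        }

        @Override
        public int hashCode() {
            return Objects.hash(segmentName, context);
        }
    }

    // Hypothetical pool; Object stands in for a pooled reader.
    static final class Pool {
        private final Map<Key, Object> readers = new HashMap<>();

        Object get(String segmentName, Context context) {
            // A new reader is created only when this (segment, context) pair is unseen.
            return readers.computeIfAbsent(new Key(segmentName, context), k -> new Object());
        }

        int size() {
            return readers.size();
        }
    }

    public static void main(String[] args) {
        Pool pool = new Pool();
        Object read1 = pool.get("_0", Context.READ);
        Object read2 = pool.get("_0", Context.READ);
        Object merge = pool.get("_0", Context.MERGE);
        System.out.println("same context shares reader: " + (read1 == read2));
        System.out.println("different context shares reader: " + (read1 == merge));
    }
}
```

With a key like this, two requests for the same segment under different contexts no longer collide on one cached instance, which is the behavior the issue title asks for.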
[jira] [Commented] (SOLR-2616) Include jdk14 logging configuration file
[ https://issues.apache.org/jira/browse/SOLR-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061984#comment-13061984 ]

Mark Miller commented on SOLR-2616:

It will be an issue with tests as is I believe, but nothing we couldn't work around.
[jira] [Commented] (LUCENE-3295) BitVector never skips fully populated bytes when writing ClearedDgaps
[ https://issues.apache.org/jira/browse/LUCENE-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061986#comment-13061986 ]

Simon Willnauer commented on LUCENE-3295:

While those comments are really unrelated, how would you pass a randomized IOContext in the MDW? Ignore the given one? I agree we should have a zero-arg newIOContext().
[jira] [Commented] (SOLR-2616) Include jdk14 logging configuration file
[ https://issues.apache.org/jira/browse/SOLR-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061991#comment-13061991 ]

David Smiley commented on SOLR-2616:

The logging configuration file I provided does not log to a file nor does it suppress logging to the console. There is some commented configuration to make it easier to log to a file. The net perceived effect of applying this patch should be no change.
[jira] [Commented] (LUCENE-3295) BitVector never skips fully populated bytes when writing ClearedDgaps
[ https://issues.apache.org/jira/browse/LUCENE-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061992#comment-13061992 ]

Robert Muir commented on LUCENE-3295:

We can have a true/false setter on MDW (randomizeIOContexts), so we control whether it respects the given one (e.g. for tests that actually want to test that IOContext works) or not.
[jira] [Commented] (SOLR-2452) rewrite solr build system
[ https://issues.apache.org/jira/browse/SOLR-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061999#comment-13061999 ]

Robert Muir commented on SOLR-2452:

Playing around with the branch, the whole situation looks so much better to me.

In my opinion we can then go and make other little improvements, make things faster, add new targets, in separate issues... so I think you should just commit before the patch goes out of date.

Maybe we even encounter some serious grief, but I think we should just work thru this in svn. Great work!

rewrite solr build system

Key: SOLR-2452
URL: https://issues.apache.org/jira/browse/SOLR-2452
Project: Solr
Issue Type: Task
Components: Build
Reporter: Robert Muir
Assignee: Steven Rowe
Fix For: 3.4, 4.0
Attachments: SOLR-2452-post-reshuffling.patch, SOLR-2452-post-reshuffling.patch, SOLR-2452-post-reshuffling.patch, SOLR-2452.diffSource.py.patch.zip, SOLR-2452.dir.reshuffle.sh, SOLR-2452.dir.reshuffle.sh

As discussed some in SOLR-2002 (but that issue is long and hard to follow), I think we should rewrite the solr build system. It's slow, cumbersome, and messy, and makes it hard for us to improve things.
[jira] [Commented] (SOLR-2616) Include jdk14 logging configuration file
[ https://issues.apache.org/jira/browse/SOLR-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062000#comment-13062000 ]

Mark Miller commented on SOLR-2616:

bq. The logging configuration file I provided does not log to a file nor does it suppress logging to the console.

The question in my mind is not what the patch does, but what we should do.

If we want this as an example that is not hooked up, my preference would be to let the user know he should use -D to hook up the sample log file - not configure it in jetty.xml - we should still stay somewhat logging framework agnostic.

In both cases I would prefer that the default log.properties file use the FileHandler rather than ConsoleHandler though. We should give something close to what you actually might want to use - which is not to set up logging to log to the console.

First I'm gathering feedback from others though.

My current leaning is to doc the wiki and what not to mention the sample log props and use of -D to put it in action, and to set up the default log props to log to the ./logs dir.
[jira] [Commented] (SOLR-2616) Include jdk14 logging configuration file
[ https://issues.apache.org/jira/browse/SOLR-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062002#comment-13062002 ]

David Smiley commented on SOLR-2616:

Ok. The main thing I wanted to accomplish in this patch was to make it easy for me to enable debug logging for a particular logger and to actually see the results. Before this patch (the current state), I could use the logging admin page to enable debug logging for a known Solr logger, but the debug output wouldn't go anywhere because the default threshold for the console logger is INFO. This patch includes a commented line to lower the console threshold.

FYI I still *hate* JDK14 logging (aka JUL); but nonetheless it's the default as provided with Solr.
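The effect described above, where a logger is set to FINE but nothing shows up, comes from java.util.logging applying two thresholds: one on the logger and one on each handler. The sketch below demonstrates this with a capturing handler in place of ConsoleHandler so the result can be inspected; the class names and logger names are invented for the example and are not part of the attached patch. In a properties file the handler threshold is the key java.util.logging.ConsoleHandler.level, and the file is selected with the java.util.logging.config.file system property.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Sketch: a logger at FINE is not enough if the handler threshold stays at INFO.
final class HandlerThresholdDemo {

    // Stand-in for ConsoleHandler that records what it would have printed.
    static final class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            // Handlers must apply their own level; isLoggable does that check.
            if (isLoggable(record)) {
                messages.add(record.getMessage());
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    // Logs one FINE message through a handler with the given threshold
    // and returns what the handler accepted.
    static List<String> logFine(Level handlerLevel) {
        Logger logger = Logger.getLogger("demo.threshold." + handlerLevel.getName());
        logger.setUseParentHandlers(false); // keep the real console quiet
        logger.setLevel(Level.FINE);        // what the admin page would change
        CapturingHandler handler = new CapturingHandler();
        handler.setLevel(handlerLevel);     // what the config file would change
        logger.addHandler(handler);
        logger.fine("debug detail");
        logger.removeHandler(handler);
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println("handler at INFO: " + logFine(Level.INFO));
        System.out.println("handler at FINE: " + logFine(Level.FINE));
    }
}
```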
[jira] [Commented] (SOLR-2615) Have LogUpdateProcessor log each command (add, delete, ...) at debug/FINE level
[ https://issues.apache.org/jira/browse/SOLR-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062003#comment-13062003 ]

Yonik Seeley commented on SOLR-2615:

bq. Yonik, if I instead use a doDebug boolean flag initialized in the constructor, would that sufficiently satisfy you to commit this?

Yep, I think so...

Have LogUpdateProcessor log each command (add, delete, ...) at debug/FINE level

Key: SOLR-2615
URL: https://issues.apache.org/jira/browse/SOLR-2615
Project: Solr
Issue Type: Improvement
Components: update
Reporter: David Smiley
Priority: Minor
Fix For: 3.4, 4.0
Attachments: SOLR-2615_LogUpdateProcessor_debug_logging.patch

It would be great if the LogUpdateProcessor logged each command (add, delete, ...) at debug (Fine) level. Presently it only logs a summary of 8 commands and it does so at the very end. The attached patch implements this.
* I moved the LogUpdateProcessor ahead of RunUpdateProcessor so that the debug level log happens before Solr does anything with it. It should not affect the ordering of the existing summary log which happens at finish().
* I changed UpdateRequestProcessor's static log variable to be an instance variable that uses the current class name. I think this makes much more sense since I want to be able to alter logging levels for a specific processor without doing it for all of them. This change did require me to tweak the factory's detection of the log level which avoids creating the LogUpdateProcessor.
* There was an NPE bug in AddUpdateCommand.getPrintableId() in the event there is no schema unique field. I fixed that.

You may notice I use SLF4J's nifty log.debug("message blah {} blah", var) syntax, which is both performant and concise as there's no point in guarding the debug message with an isDebugEnabled() since debug() will internally check this anyway and there is no string concatenation if debug isn't enabled.
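The "doDebug boolean flag initialized in the constructor" mentioned in the comment could look roughly like the following. This is a sketch under assumptions, not the attached patch: it uses java.util.logging instead of SLF4J so that it runs with the JDK alone, and the class, field and method names are invented for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of caching the debug check once per processor instance.
final class DebugFlagSketch {
    private final Logger log;
    private final boolean doDebug;   // evaluated once, in the constructor
    private int messagesBuilt;       // counts how often a message string was built

    DebugFlagSketch(Logger log) {
        this.log = log;
        this.doDebug = log.isLoggable(Level.FINE);
    }

    // Hypothetical per-command hook; only pays for the message when debugging.
    void processAdd(String id) {
        if (doDebug) {
            messagesBuilt++;
            log.fine("add " + id);
        }
    }

    int getMessagesBuilt() {
        return messagesBuilt;
    }

    // Helper for the demo: a logger at the given level that prints nothing.
    static Logger quietLogger(String name, Level level) {
        Logger logger = Logger.getLogger(name);
        logger.setUseParentHandlers(false);
        logger.setLevel(level);
        return logger;
    }

    public static void main(String[] args) {
        DebugFlagSketch off = new DebugFlagSketch(quietLogger("demo.flag.off", Level.INFO));
        DebugFlagSketch on = new DebugFlagSketch(quietLogger("demo.flag.on", Level.FINE));
        off.processAdd("doc1");
        on.processAdd("doc1");
        System.out.println("built with INFO: " + off.getMessagesBuilt());
        System.out.println("built with FINE: " + on.getMessagesBuilt());
    }
}
```

The trade-off of a cached flag is that a level change made after construction is not seen by that instance, which is acceptable when a new processor is created per request.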
[jira] [Commented] (SOLR-2616) Include jdk14 logging configuration file
[ https://issues.apache.org/jira/browse/SOLR-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062004#comment-13062004 ]

Robert Muir commented on SOLR-2616:

{quote}
My current leaning is to doc the wiki and what not to mention the sample log props and use of -D to put it in action, and to setup the default log props to log to the ./logs dir.
{quote}

Yeah, as long as we don't somehow create test meddling, I'm happy!

There are already some hacks in the build somehow related to this:

{noformat}
in lucene/common-build.xml:
<property name="tests.loggingfile" value="/dev/null"/>

and in the JUnitResultFormatter to reboot logging for each test case:
try {
  LogManager.getLogManager().readConfiguration();
} catch (Exception e) {}
{noformat}
[jira] [Commented] (SOLR-2616) Include jdk14 logging configuration file
[ https://issues.apache.org/jira/browse/SOLR-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062006#comment-13062006 ]

Yonik Seeley commented on SOLR-2616:

bq. My current leaning is to doc the wiki and what not to mention the sample log props and use of -D to put it in action, and to setup the default log props to log to the ./logs dir.

That's a good plan I think. It does seem important for newbies to get the instant console feedback of "address already in use" or other exceptions. I actually find it pretty useful myself (when I forget that I already have an instance running, or just for seeing requests come in by default, etc).

We can also document it right in the example/README.txt!
[jira] [Commented] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062012#comment-13062012 ]

Mark Miller commented on SOLR-2633:

The heck if I know. The comment says:

{code}
/**
 * Sets the alternate path for multicore handling:
 * This is used in case there is a registered unnamed core (aka name is "") to
 * declare an alternate way of accessing named cores.
 * This can also be used in a pseudo single-core environment so admins can prepare
 * a new version before swapping.
 * @param path
 */
{code}

But the code is:

{code}
// check for management path
String alternate = cores.getManagementPath();
if (alternate != null && path.startsWith(alternate)) {
  path = path.substring(0, alternate.length());
}
{code}

This simply checks if the path starts with your management path (say /manage), and then sets the path to the management path - I don't see how this triggers or does anything later though...

Does anyone out there use this or know if it does/did work? Perhaps it should just go away.

Make SolrDispatchFilter testable and add tests

Key: SOLR-2633
URL: https://issues.apache.org/jira/browse/SOLR-2633
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 3.1, 3.2, 3.3
Reporter: Edoardo Tosca
Assignee: Mark Miller
Priority: Minor
Fix For: 3.4, 4.0
Attachments: SOLR-2633-tests-only.patch, SOLR-2633-tests-only.patch

I have ideas for possible extensions/enhancements to the SolrDispatchFilter. However, as it doesn't have any tests, making safe enhancements is difficult. Given its monolithic nature, it is hard to test. Therefore, I am proposing to refactor it to make it testable, and to provide tests for it.
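To make the observation about the management-path check concrete, the quoted lines can be isolated into a pure function and run on sample paths. This is a sketch only: the class and method names are hypothetical, and cores.getManagementPath() is replaced by a plain parameter.

```java
// Isolates the quoted check so its effect on a path can be observed directly.
final class ManagementPathSketch {

    // Same logic as the quoted snippet, with the management path passed in.
    static String rewrite(String path, String alternate) {
        if (alternate != null && path.startsWith(alternate)) {
            // Keeps only the prefix: everything after the management path is dropped.
            path = path.substring(0, alternate.length());
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(rewrite("/manage/core0/select", "/manage"));
        System.out.println(rewrite("/core0/select", "/manage"));
        System.out.println(rewrite("/core0/select", null));
    }
}
```

As the comment in the thread says, a matching path is simply reduced to the management path itself; the part that would name a core or handler is discarded, so whatever acts on the result later cannot see it.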
[jira] [Commented] (SOLR-2588) Make Velocity an optional dependency in SolrCore
[ https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062016#comment-13062016 ]

David Smiley commented on SOLR-2588:

I'm surprised velocity became a core dependency, but nonetheless I think it should be possible to use Solr in an embedded fashion without pulling in extraneous dependencies like velocity and others.

What if these response writers were initialized on-demand? This would increase startup time and decrease memory usage just a little since most people aren't actually going to use all response writers that Solr supports. I'm willing to put together a patch.

Make Velocity an optional dependency in SolrCore

Key: SOLR-2588
URL: https://issues.apache.org/jira/browse/SOLR-2588
Project: Solr
Issue Type: Wish
Affects Versions: 3.2
Reporter: Gunnar Wagenknecht
Priority: Minor
Fix For: 3.4, 4.0

In 1.4. it was fine to run Solr without Velocity on the classpath. However, in 3.2. SolrCore won't load because of a hard reference to the Velocity response writer in a static initializer.

{noformat}
...
ERROR org.apache.solr.core.CoreContainer - java.lang.NoClassDefFoundError: org/apache/velocity/context/Context
	at org.apache.solr.core.SolrCore.<clinit>(SolrCore.java:1447)
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463)
	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
{noformat}
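The on-demand initialization suggested in the comment can be sketched as a registry that stores factories and only creates a writer the first time it is requested. All names below (LazyWriterRegistry, ResponseWriter, NamedWriter) are hypothetical stand-ins; this is not SolrCore code and not a proposed patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Sketch of lazily created response writers; all types here are stand-ins.
final class LazyWriterRegistry {

    // Hypothetical minimal writer interface.
    interface ResponseWriter {
        String name();
    }

    // Trivial writer used by the demo.
    static final class NamedWriter implements ResponseWriter {
        private final String name;

        NamedWriter(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }
    }

    private final Map<String, Supplier<ResponseWriter>> factories = new ConcurrentHashMap<>();
    private final Map<String, ResponseWriter> instances = new ConcurrentHashMap<>();

    // Registering costs nothing: the writer class is not touched yet.
    void register(String name, Supplier<ResponseWriter> factory) {
        factories.put(name, factory);
    }

    // The writer is created on first use and reused afterwards.
    ResponseWriter get(String name) {
        Supplier<ResponseWriter> factory = factories.get(name);
        if (factory == null) {
            return null;
        }
        return instances.computeIfAbsent(name, n -> factory.get());
    }

    int instantiatedCount() {
        return instances.size();
    }

    public static void main(String[] args) {
        LazyWriterRegistry registry = new LazyWriterRegistry();
        registry.register("xml", () -> new NamedWriter("xml"));
        registry.register("velocity", () -> new NamedWriter("velocity"));
        System.out.println("before first use: " + registry.instantiatedCount());
        registry.get("xml");
        System.out.println("after using xml:  " + registry.instantiatedCount());
    }
}
```

Because a factory is only invoked on first request, a writer whose dependencies are missing from the classpath would fail at that point rather than when the core loads.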
[jira] [Commented] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062033#comment-13062033 ]

David Smiley commented on SOLR-2633:

I think it's great that this issue is going to make it more testable. But why is SolrDispatchFilter a filter in the first place instead of a Servlet? This is somewhat off-topic, but if perhaps it should be a servlet then this issue is majorly disrupted by such a change.
[jira] [Commented] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062034#comment-13062034 ]

Edoardo Tosca commented on SOLR-2633:

{quote}
This simply checks if the path starts with your management path (say /manage), and then sets the path to the management path - I don't see how this triggers or does anything later though...
{quote}

Exactly right. I saw the comment (I forgot to paste it previously). I tried to add a managementPath="/manage" attribute to solr.xml to see what it could trigger, but I haven't discovered anything :(

thanks
[jira] [Commented] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062037#comment-13062037 ]

Mark Miller commented on SOLR-2633:

bq. But why is SolrDispatchFilter a filter in the first place instead of a Servlet?

It used to be a Servlet once. I cannot remember the history of the change - hossman probably does.
[jira] [Commented] (SOLR-2588) Make Velocity an optional dependency in SolrCore
[ https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062044#comment-13062044 ]

Erik Hatcher commented on SOLR-2588:

Sorry - I missed this when it first got posted, and David's comment bumped it... It was intentional to make Velocity a core component, the idea being that we'd use it for a built-in admin UI. So far we're only using it for the /browse interface though. I get the argument that Velocity ideally shouldn't be required to embed Solr. I'm ok with the Velocity writer creation either being in the try/catch as Ryan posted, or pulling it out of the default writers and having it be explicitly configured in solrconfig.xml for our example app.

Make Velocity an optional dependency in SolrCore
Key: SOLR-2588
URL: https://issues.apache.org/jira/browse/SOLR-2588
Project: Solr
Issue Type: Wish
Affects Versions: 3.2
Reporter: Gunnar Wagenknecht
Priority: Minor
Fix For: 3.4, 4.0

In 1.4 it was fine to run Solr without Velocity on the classpath. However, in 3.2 SolrCore won't load because of a hard reference to the Velocity response writer in a static initializer.

{noformat}
... ERROR org.apache.solr.core.CoreContainer - java.lang.NoClassDefFoundError: org/apache/velocity/context/Context
	at org.apache.solr.core.SolrCore.<clinit>(SolrCore.java:1447)
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463)
	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
{noformat}
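The try/catch option mentioned in the SOLR-2588 comment above can be sketched roughly as follows. This is only an illustration of the general pattern (instantiate an optional class reflectively and skip it when it, or one of its dependencies, is missing), not the patch Ryan posted. The class name OptionalWriterRegistry and its methods are hypothetical and are not part of SolrCore.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not Solr code: registers only the writers whose
// classes can actually be loaded from the current classpath.
final class OptionalWriterRegistry {

  /** Returns a new instance of className, or null if it cannot be loaded or created. */
  static Object tryInstantiate(String className) {
    try {
      return Class.forName(className).getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException | LinkageError e) {
      // ClassNotFoundException: the class itself is absent.
      // NoClassDefFoundError (a LinkageError): one of its dependencies is absent.
      return null;
    }
  }

  /** Builds a writer map, silently skipping entries whose class is unavailable. */
  static Map<String, Object> loadAvailable(Map<String, String> classNamesByWriterName) {
    Map<String, Object> writers = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : classNamesByWriterName.entrySet()) {
      Object writer = tryInstantiate(e.getValue());
      if (writer != null) {
        writers.put(e.getKey(), writer);
      }
    }
    return writers;
  }
}
```

The difference from a hard reference in a static initializer is that a missing optional class only removes one entry from the map instead of preventing the owning class from loading.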
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062056#comment-13062056 ]

Robert Muir commented on SOLR-2399:

It's great to see all this progress here! I had one suggestion (I felt this way about Version 1980 too...): should we default the verbose checkbox for analysis to on? I could be in the minority here; am I the only one who clicks verbose every time when using this?

Solr Admin Interface, reworked
Key: SOLR-2399
URL: https://issues.apache.org/jira/browse/SOLR-2399
Project: Solr
Issue Type: Improvement
Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Ryan McKinley
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, SOLR-2399-110606.patch, SOLR-2399-110622.patch, SOLR-2399-110702.patch, SOLR-2399-110702.patch, SOLR-2399-admin-interface.patch, SOLR-2399-analysis-stopwords.patch, SOLR-2399-fluid-width.patch, SOLR-2399-sorting-fields.patch, SOLR-2399-wip-notice.patch, SOLR-2399.patch

*The idea was to create a new, fresh (and hopefully clean) Solr Admin Interface.* [Based on this [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]

*Features:*
* [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
* [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
* [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
* [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, SOLR-2400)
* [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
* [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
* [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
* [Replication|http://files.mathe.is/solr-admin/10_replication.png]
* [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
* [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
** Stub (using static data)

Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI

I've quickly created a Github-Repository (Just for me, to keep track of the changes) » https://github.com/steffkes/solr-admin
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062058#comment-13062058 ]

Erik Hatcher commented on SOLR-2399:

verbose default to on please, yes: +1 - I always check that myself, and teach it that way to others.
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062061#comment-13062061 ]

Uwe Schindler commented on SOLR-2399:

verbose on: +1
RE: FindBugs & PMD ?
Just a stupid question: once you add those annotations, wouldn't the JAR file then require this annotations.jar? Or are none of them retained at runtime?

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-Original Message-
From: Smiley, David W. [mailto:dsmi...@mitre.org]
Sent: Friday, July 08, 2011 4:30 PM
To: dev@lucene.apache.org
Subject: Re: FindBugs & PMD ?

Rob, there is an ASL 2.0 licensed implementation here: https://github.com/stephenc/findbugs-annotations

~ David

On Jul 8, 2011, at 10:12 AM, Robert Muir wrote:

On Fri, Jul 8, 2011 at 10:08 AM, Smiley, David W. dsmi...@mitre.org wrote:

Developers, Any thoughts on using FindBugs & PMD to catch more bugs in Lucene/Solr? Jenkins could be configured to run FindBugs & PMD analysis nightly. It would have helped find this: (LUCENE-3294) Some code still compares string equality instead using equals

I am aware there is a high degree of false positives, but there are ways of dealing with them, such as @SuppressWarnings("PMD") and //NOPMD. For FindBugs, there is @edu.umd.cs.findbugs.annotations.SuppressWarnings(), and there's a fairly detailed configuration file for FindBugs to really control it and to make exceptions. I'd also really like to see use of FindBugs concurrency annotations @GuardedBy, @Immutable, @NotThreadSafe, @ThreadSafe.

I think it's a good idea for nightly, but I am strongly against linking to an LGPL library for these annotations. I would prefer PMD instead, because of the license.
[jira] [Updated] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Matheis (steffkes) updated SOLR-2399:

Attachment: SOLR-2399-110702.patch

bq. verbose on: +1

I've updated the last Patch, now based on SVN-Rev {{1144392}} -- verbose activated by default
[jira] [Resolved] (SOLR-2541) Plugininfo tries to load nodes of type long
[ https://issues.apache.org/jira/browse/SOLR-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man resolved SOLR-2541.

Resolution: Fixed
Fix Version/s: 4.0 3.4
Assignee: Hoss Man

Frank: your assumption was spot on, definitely a bug in the way PluginInfo was ignoring long. Thank you so much for the test!

Committed revision 1144415. - trunk
Committed revision 1144417. - 3x

Plugininfo tries to load nodes of type long
Key: SOLR-2541
URL: https://issues.apache.org/jira/browse/SOLR-2541
Project: Solr
Issue Type: Bug
Affects Versions: 3.1
Environment: all
Reporter: Frank Wesemann
Assignee: Hoss Man
Fix For: 3.4, 4.0
Attachments: PlugininfoTest.java, Solr-2541.patch

As of version 3.1 Plugininfo adds all nodes whose types are not lst, str, int, bool, arr, float or double to the children list. The type long is missing in the NL_TAGS set. I assume this is a bug because DOMUtil recognizes this type, so I consider it a valid tag in solrconfig.xml. Maybe it's time for a dtd? Or one may define SolrConfig.nodetypes somewhere. I'll add a patch that extends the NL_TAGS Set.
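As a rough illustration of the rule described in SOLR-2541 above: a node counts as a child plugin unless its tag is one of the plain value types, so leaving long out of the set makes every long element look like a plugin. The sketch below is a stand-alone approximation based only on the tag list in the issue text. The class NamedListTags and the method isChildPlugin are hypothetical names, not the actual PluginInfo code, and searchComponent is used only as an example of a non-value tag.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the tag check described in the issue.
final class NamedListTags {

  // Value tags from the issue description, with "long" added as the fix proposes.
  static final Set<String> NL_TAGS = Collections.unmodifiableSet(
      new HashSet<>(Arrays.asList(
          "lst", "arr", "bool", "str", "int", "long", "float", "double")));

  /** A node is treated as a child plugin only if its tag is not a plain value tag. */
  static boolean isChildPlugin(String tagName) {
    return !NL_TAGS.contains(tagName);
  }
}
```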
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9438 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9438/

1 tests failed.
REGRESSION: org.apache.lucene.facet.search.CategoryListIteratorTest.testPayloadIntDecodingIterator

Error Message:
expected category not found: 3

Stack Trace:
junit.framework.AssertionFailedError: expected category not found: 3
	at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1277)
	at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1195)
	at org.apache.lucene.facet.search.CategoryListIteratorTest.testPayloadIntDecodingIterator(CategoryListIteratorTest.java:125)

Build Log (for compile errors):
[...truncated 8788 lines...]
[jira] [Updated] (LUCENE-3295) BitVector never skips fully populated bytes when writing ClearedDgaps
[ https://issues.apache.org/jira/browse/LUCENE-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3295:

Attachment: LUCENE-3295.patch

Egads, thanks Simon! I found a few more crazy problems with BitVector (patch attached, merged with the first patch), and added some asserts and a few more test cases.

BitVector never skips fully populated bytes when writing ClearedDgaps
Key: LUCENE-3295
URL: https://issues.apache.org/jira/browse/LUCENE-3295
Project: Lucene - Java
Issue Type: Bug
Components: core/other
Affects Versions: 3.4, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Minor
Fix For: 3.4, 4.0
Attachments: LUCENE-3295.patch, LUCENE-3295.patch

When writing cleared DGaps in BitVector we compare a byte against 0xFF (255), yet the byte is promoted to an int (-1 for a fully populated byte), so the comparison never succeeds. We should mask the byte with 0xFF before comparing, or compare against -1.
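The comparison problem described in LUCENE-3295 above can be reproduced in isolation. The following is a minimal sketch of the Java language behaviour only; ByteCompareDemo and its method names are made up for illustration and are not taken from BitVector.

```java
// Stand-alone demonstration of comparing a signed byte with 0xFF.
final class ByteCompareDemo {

  /** Broken check: b is promoted to an int in [-128, 127], so it can never equal 255. */
  static boolean isFullBroken(byte b) {
    return b == 0xFF;
  }

  /** Fix 1: mask to the unsigned value before comparing. */
  static boolean isFullMasked(byte b) {
    return (b & 0xFF) == 0xFF;
  }

  /** Fix 2: compare against the signed value of a byte with all bits set. */
  static boolean isFullSigned(byte b) {
    return b == -1;
  }
}
```

Both fixes are equivalent for a byte; masking tends to read more clearly when the surrounding code thinks in terms of unsigned bit patterns.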
[jira] [Commented] (LUCENE-3292) IOContext should be part of the SegmentReader cache key
[ https://issues.apache.org/jira/browse/LUCENE-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062119#comment-13062119 ]

Michael McCandless commented on LUCENE-3292:

Right now the ReaderPool.readerMap is a Map<SegmentInfo,SegmentReader>. I think we just need to change that to Map<SegmentInfoAndIOContext,SegmentReader> instead, where SegmentInfoAndIOContext is a new struct holding SegmentInfo and IOContext.Context and implementing hashCode/equals by delegating to the SegmentInfo and IOContext.Context.

IOContext should be part of the SegmentReader cache key
Key: LUCENE-3292
URL: https://issues.apache.org/jira/browse/LUCENE-3292
Project: Lucene - Java
Issue Type: Task
Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Varun Thacker
Priority: Minor
Fix For: 4.0

Once IOContext (LUCENE-2793) has landed, the IOContext should be part of the key used to cache that reader in the pool.
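A composite cache key of the kind proposed in the LUCENE-3292 comment above might look roughly like this. It is a sketch under stated assumptions: a String segment name stands in for SegmentInfo, a local enum stands in for IOContext.Context, and the names SegmentCacheKeyDemo and Key are hypothetical. Only the idea of delegating hashCode/equals to both parts comes from the comment.

```java
import java.util.Objects;

// Hypothetical stand-ins; the real types would be SegmentInfo and IOContext.Context.
final class SegmentCacheKeyDemo {

  enum Context { READ, MERGE, FLUSH, DEFAULT }

  /** Immutable key: two keys are equal only if both segment and context match. */
  static final class Key {
    final String segment;
    final Context context;

    Key(String segment, Context context) {
      this.segment = Objects.requireNonNull(segment);
      this.context = Objects.requireNonNull(context);
    }

    @Override
    public boolean equals(Object other) {
      if (this == other) return true;
      if (!(other instanceof Key)) return false;
      Key that = (Key) other;
      return segment.equals(that.segment) && context == that.context;
    }

    @Override
    public int hashCode() {
      return Objects.hash(segment, context);
    }
  }
}
```

With such a key, a reader opened for merging and a reader opened for searching the same segment occupy separate map entries, which is the point of making the context part of the key.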
Re: FindBugs & PMD ?
The annotations defined by FindBugs are marked with CLASS retention, which means there shouldn't be a runtime dependency. However, the JCIP (Java Concurrency In Practice, a book) annotations, such as @ThreadSafe, are unfortunately marked with RUNTIME retention. Information I've found leads me to believe that in Java 6, there is no runtime or compile time dependency for 3rd party libraries using Lucene/Solr if there are annotations there, but Java 5 has problems with it: https://issues.apache.org/jira/browse/HTTPCLIENT-866

Just now I messaged the maintainer of the ASL licensed cleanroom port of the findbugs annotations to see if he'll do the same for the JCIP ones.

~ David

On Jul 8, 2011, at 1:16 PM, Uwe Schindler wrote:

Just a stupid question: Once you add those annotations, wouldn't the JAR file not require then this annotations.jar? Or are all of them not available to runtime?

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-Original Message-
From: Smiley, David W. [mailto:dsmi...@mitre.org]
Sent: Friday, July 08, 2011 4:30 PM
To: dev@lucene.apache.org
Subject: Re: FindBugs & PMD ?

Rob, there is an ASL 2.0 licensed implementation here: https://github.com/stephenc/findbugs-annotations

~ David

On Jul 8, 2011, at 10:12 AM, Robert Muir wrote:

On Fri, Jul 8, 2011 at 10:08 AM, Smiley, David W. dsmi...@mitre.org wrote:

Developers, Any thoughts on using FindBugs & PMD to catch more bugs in Lucene/Solr? Jenkins could be configured to run FindBugs & PMD analysis nightly. It would have helped find this: (LUCENE-3294) Some code still compares string equality instead using equals

I am aware there is a high degree of false positives, but there are ways of dealing with them, such as @SuppressWarnings("PMD") and //NOPMD. For FindBugs, there is @edu.umd.cs.findbugs.annotations.SuppressWarnings(), and there's a fairly detailed configuration file for FindBugs to really control it and to make exceptions. I'd also really like to see use of FindBugs concurrency annotations @GuardedBy, @Immutable, @NotThreadSafe, @ThreadSafe.

I think it's a good idea for nightly, but I am strongly against linking to an LGPL library for these annotations. I would prefer PMD instead, because of the license.
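For readers unfamiliar with the CLASS versus RUNTIME distinction discussed in the reply above, here is a small self-contained sketch. It only shows what reflection can see for each retention policy. It does not demonstrate the classpath behaviour on Java 5 or Java 6 that the message refers to, and the annotation names are invented for the example rather than taken from FindBugs or JCIP.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical annotations that differ only in their retention policy.
final class RetentionDemo {

  @Retention(RetentionPolicy.CLASS)   // recorded in the class file, not exposed to reflection
  @interface ClassRetained {}

  @Retention(RetentionPolicy.RUNTIME) // recorded in the class file and exposed to reflection
  @interface RuntimeRetained {}

  @ClassRetained
  @RuntimeRetained
  static final class Annotated {}

  /** True if the given annotation type can be observed on Annotated via reflection. */
  static boolean visibleAtRuntime(Class<? extends Annotation> annotationType) {
    return Annotated.class.isAnnotationPresent(annotationType);
  }
}
```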
Re: FindBugs & PMD ?
For build integration, I'd like to do this on the maven side. It's easier there, and it should not matter that it's not the official build, since it only needs to be run by Jenkins, which is already running the maven build anyway.

~ David

On Jul 8, 2011, at 10:08 AM, Smiley, David W. wrote:

Developers, Any thoughts on using FindBugs & PMD to catch more bugs in Lucene/Solr? Jenkins could be configured to run FindBugs & PMD analysis nightly. It would have helped find this: (LUCENE-3294) Some code still compares string equality instead using equals

I am aware there is a high degree of false positives, but there are ways of dealing with them, such as @SuppressWarnings("PMD") and //NOPMD. For FindBugs, there is @edu.umd.cs.findbugs.annotations.SuppressWarnings(), and there's a fairly detailed configuration file for FindBugs to really control it and to make exceptions. I'd also really like to see use of FindBugs concurrency annotations @GuardedBy, @Immutable, @NotThreadSafe, @ThreadSafe.

~ David Smiley
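The class of bug referenced above (LUCENE-3294, string equality compared with == rather than equals) is easy to show in isolation. This is a generic illustration of the pitfall, not code from Lucene; StringEqualityDemo and its methods are hypothetical.

```java
// Reference comparison versus content comparison for strings.
final class StringEqualityDemo {

  /** Compares object identity; only true when both variables point to the same String object. */
  static boolean sameReference(String a, String b) {
    return a == b;
  }

  /** Compares contents, with a null-safe guard. */
  static boolean sameContent(String a, String b) {
    return a == null ? b == null : a.equals(b);
  }
}
```

Identity comparison can appear to work when both strings happen to be interned literals, which is why static analysis is a reasonable way to catch it.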
[jira] [Commented] (LUCENE-3282) BlockJoinQuery: Allow to add a custom child collector, and customize the parent bitset extraction
[ https://issues.apache.org/jira/browse/LUCENE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062151#comment-13062151 ]

Michael McCandless commented on LUCENE-3282:

This looks great Shay!

What was the use case for subclassing to translate the filter into OBS? Is it a custom filter cache? Makes me nervous because the app really should create & reuse this OBS filter, usually...

On the Collector: we try to keep our Queries IR-state-free... so it makes me nervous to stick a Collector right on the Query. Can we add a CollectorProvider that the Query invokes when it makes the Weight/Scorer? Instead of NoOpCollector can we just check for null?

BlockJoinQuery: Allow to add a custom child collector, and customize the parent bitset extraction
Key: LUCENE-3282
URL: https://issues.apache.org/jira/browse/LUCENE-3282
Project: Lucene - Java
Issue Type: Improvement
Components: core/search
Affects Versions: 3.4, 4.0
Reporter: Shay Banon
Attachments: LUCENE-3282.patch

It would be nice to allow adding a custom child collector to the BlockJoinQuery, to be called on every matching doc (so we can do things with it, like counts and such). Also, allow extending BlockJoinQuery with custom code that converts the filter bitset to an OpenBitSet.
[jira] [Commented] (SOLR-2535) REGRESSION: in Solr 3.x and trunk the admin/file handler fails to show directory listings
[ https://issues.apache.org/jira/browse/SOLR-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062155#comment-13062155 ]

Erick Erickson commented on SOLR-2535:

OK, I've applied the patch to both 3x and trunk and it looks good. If nobody objects I'll commit this Monday.

REGRESSION: in Solr 3.x and trunk the admin/file handler fails to show directory listings
Key: SOLR-2535
URL: https://issues.apache.org/jira/browse/SOLR-2535
Project: Solr
Issue Type: Bug
Components: SearchComponents - other
Affects Versions: 3.1, 3.2, 4.0
Environment: java 1.6, jetty
Reporter: Peter Wolanin
Assignee: Erick Erickson
Fix For: 3.4, 4.0
Attachments: SOLR-2535.patch, SOLR-2535_fix_admin_file_handler_for_directory_listings.patch

In Solr 1.4.1, going to the path solr/admin/file I see an XML-formatted listing of the conf directory, like:

{noformat}
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int></lst>
<lst name="files">
<lst name="elevate.xml"><long name="size">1274</long><date name="modified">2011-03-06T20:42:54Z</date></lst>
...
</lst>
</response>
{noformat}

I can list the xslt sub-dir using solr/admin/files?file=/xslt

In Solr 3.1.0, both of these fail with a 500 error:

{noformat}
HTTP ERROR 500
Problem accessing /solr/admin/file/. Reason:
did not find a CONTENT object
java.io.IOException: did not find a CONTENT object
{noformat}

Looking at the code in class ShowFileRequestHandler, it seems like 3.1.0 should still handle directory listings if no file name is given, or if the file is a directory, so I am filing this as a bug.
[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile
[ https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062161#comment-13062161 ]

Shawn Heisey commented on SOLR-1972:

I tried to add a percentile of 100, so I could see the slowest query, and it didn't seem to do anything. I'll be changing the following line so it works:

if (percentile >= 0 && percentile < 100) {

Need additional query stats in admin interface - median, 95th and 99th percentile
Key: SOLR-1972
URL: https://issues.apache.org/jira/browse/SOLR-1972
Project: Solr
Issue Type: Improvement
Affects Versions: 1.4
Reporter: Shawn Heisey
Priority: Minor
Attachments: SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, elyograg-1972-3.2.patch, elyograg-1972-trunk.patch

I would like to see more detailed query statistics from the admin GUI. This is what you can get now:

requests : 809
errors : 0
timeouts : 0
totalTime : 70053
avgTimePerRequest : 86.59209
avgRequestsPerSecond : 0.8148785

I'd like to see more data on the time per request - median, 95th percentile, 99th percentile, and any other statistical function that makes sense to include. In my environment, the first bunch of queries after startup tend to take several seconds each. I find that the average value tends to be useless until it has several thousand queries under its belt and the caches are thoroughly warmed. The statistical functions I have mentioned would quickly eliminate the influence of those initial slow queries.

The system will have to store individual data about each query. I don't know if this is something Solr does already. It would be nice to have a configurable count of how many of the most recent data points are kept, to control the amount of memory the feature uses. The default value could be something like 1024 or 4096.
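The request in SOLR-1972 above (percentiles over a bounded window of the most recent query times) could be sketched along the following lines. This is an independent illustration using the nearest-rank method and a fixed-size ring buffer; it is not the code from any of the attached patches, and the class name RecentLatencyStats, its methods, and the choice of nearest-rank are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical sketch: keeps the last N request times and answers percentile queries.
final class RecentLatencyStats {
  private final long[] samples; // ring buffer of the most recent request times
  private int count;            // number of valid entries, at most samples.length
  private int next;             // index that the next sample will overwrite

  RecentLatencyStats(int capacity) {
    if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
    this.samples = new long[capacity];
  }

  /** Records one request time, evicting the oldest sample once the buffer is full. */
  void record(long millis) {
    samples[next] = millis;
    next = (next + 1) % samples.length;
    if (count < samples.length) count++;
  }

  /** Nearest-rank percentile for p in (0, 100]; p = 100 yields the slowest retained request. */
  long percentile(int p) {
    if (p <= 0 || p > 100) throw new IllegalArgumentException("p must be in (0, 100]");
    if (count == 0) throw new IllegalStateException("no samples recorded");
    long[] sorted = Arrays.copyOf(samples, count);
    Arrays.sort(sorted);
    int rank = (int) Math.ceil(p * (double) count / 100.0); // 1-based rank
    return sorted[rank - 1];
  }
}
```

Because old samples are overwritten, slow warm-up queries drop out of the statistics once enough newer requests have been recorded, and the capacity bounds the memory used.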
[jira] [Commented] (SOLR-2535) REGRESSION: in Solr 3.x and trunk the admin/file handler fails to show directory listings
[ https://issues.apache.org/jira/browse/SOLR-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062162#comment-13062162 ]

Simon Willnauer commented on SOLR-2535:

bq. OK, I've applied the patch to both 3x and trunk and it looks good. If nobody objects I'll commit this Monday.

Don't wait too long, no need to wait until Monday.
[jira] [Commented] (SOLR-2541) Plugininfo tries to load nodes of type long
[ https://issues.apache.org/jira/browse/SOLR-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062168#comment-13062168 ]

Frank Wesemann commented on SOLR-2541:

Thanks for taking this issue, Hoss. Btw: do you know the reason for this change? I regarded the old rule, "load/instantiate everything that has a class attribute", as a good practice.
[jira] [Updated] (LUCENE-3280) Add new bit set impl for caching filters
[ https://issues.apache.org/jira/browse/LUCENE-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3280:

Attachment: LUCENE-3280.patch

New patch, renaming to FixedBitSet, adding test (adapted from TestOBS's), adding getBits, hashCode, equals. I think it's ready to commit!

Add new bit set impl for caching filters
Key: LUCENE-3280
URL: https://issues.apache.org/jira/browse/LUCENE-3280
Project: Lucene - Java
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 3.4, 4.0
Attachments: LUCENE-3280.patch, LUCENE-3280.patch

I think OpenBitSet is trying to satisfy too many audiences, and it's confusing/error-prone as a result. It has int/long variants of many methods. Some methods require in-bound access, others don't; of those others, some methods auto-grow the bits, some don't. OpenBitSet doesn't always know its numBits.

I'd like to factor out a more focused bit set impl whose primary target usage is a cached Lucene Filter, i.e. a bit set indexed by docID (int, not long) whose size is known and fixed up front (backed by final long[]) and is always accessed in-bounds.
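To make the design goals in LUCENE-3280 above concrete, here is a minimal sketch of a bit set with those properties: int indices, a size fixed at construction, a final long[] backing array, and in-bounds access checked only by assertions. It is an illustration written for this digest under those assumptions, not the FixedBitSet from the attached patch; the class name SimpleFixedBitSet and its exact method set are hypothetical.

```java
// Hypothetical minimal fixed-size bit set, indexed by int.
final class SimpleFixedBitSet {
  private final long[] bits;  // 64 bits per word, never resized
  private final int numBits;  // always known, unlike an auto-growing bit set

  SimpleFixedBitSet(int numBits) {
    if (numBits < 0) throw new IllegalArgumentException("numBits must be >= 0");
    this.numBits = numBits;
    this.bits = new long[(numBits + 63) >>> 6];
  }

  int length() {
    return numBits;
  }

  boolean get(int index) {
    assert index >= 0 && index < numBits : "index out of bounds: " + index;
    // Java masks a long shift distance to its low 6 bits, so 1L << index selects the bit in the word.
    return (bits[index >> 6] & (1L << index)) != 0;
  }

  void set(int index) {
    assert index >= 0 && index < numBits : "index out of bounds: " + index;
    bits[index >> 6] |= 1L << index;
  }

  void clear(int index) {
    assert index >= 0 && index < numBits : "index out of bounds: " + index;
    bits[index >> 6] &= ~(1L << index);
  }

  /** Number of set bits. */
  int cardinality() {
    int total = 0;
    for (long word : bits) {
      total += Long.bitCount(word);
    }
    return total;
  }
}
```

Dropping the long-index variants and the auto-grow paths leaves a single code path per operation, which is the simplification the issue argues for.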
[jira] [Commented] (SOLR-2535) REGRESSION: in Solr 3.x and trunk the admin/file handler fails to show directory listings
[ https://issues.apache.org/jira/browse/SOLR-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13062180#comment-13062180 ] Yonik Seeley commented on SOLR-2535: bq. don't wait too long, no need to wait until monday. +1, commit it now! Esp for a bug fix, unless one thinks there is something likely controversial about it, or one is unsure about something and is thus requesting feedback. REGRESSION: in Solr 3.x and trunk the admin/file handler fails to show directory listings - Key: SOLR-2535 URL: https://issues.apache.org/jira/browse/SOLR-2535 Project: Solr Issue Type: Bug Components: SearchComponents - other Affects Versions: 3.1, 3.2, 4.0 Environment: java 1.6, jetty Reporter: Peter Wolanin Assignee: Erick Erickson Fix For: 3.4, 4.0 Attachments: SOLR-2535.patch, SOLR-2535_fix_admin_file_handler_for_directory_listings.patch In Solr 1.4.1, going to the path solr/admin/file I see an XML-formatted listing of the conf directory, like: {noformat} <response> <lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int></lst> <lst name="files"> <lst name="elevate.xml"><long name="size">1274</long><date name="modified">2011-03-06T20:42:54Z</date></lst> ... </lst> </response> {noformat} I can list the xslt sub-dir using solr/admin/files?file=/xslt In Solr 3.1.0, both of these fail with a 500 error: {noformat} HTTP ERROR 500 Problem accessing /solr/admin/file/. Reason: did not find a CONTENT object java.io.IOException: did not find a CONTENT object {noformat} Looking at the code in class ShowFileRequestHandler, it seems like 3.1.0 should still handle directory listings if no file name is given, or if the file is a directory, so I am filing this as a bug. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
I really hate to even mention formatting code, but I don't want to gaffe too badly on my first commit.
This is one of those topics that generates far more passion than it deserves, all I want to know is what the norms are. Personally, I tend to reformat the entire file. But then I didn't cut my eye teeth on code that a zillion other people work with and I fully appreciate that the diffs get hard to read, you can't easily separate the code changes from the format changes, so I'm not *proposing* reformatting the whole file... So do we take a page from Martin Fowler's Refactoring book and only reformat the parts that we're working on? What about reformatting the whole file and noting in the checkin notes "reformat only, no code changes"? (assuming an egregiously badly-formatted file that one is working on). I guess the more I think about it, the more sense only reformatting the bits we're working on makes. I'd guess that someone working on a large patch would...er...not appreciate merge conflicts because of reformatting, even though it's so easy. Again, I'm just looking for norms here. I suspect this topic has been...er...discussed upon occasion in loud tones with much table pounding... Thanks, Erick P.S. This is relative to my first checkin, SOLR-2535 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2452) rewrite solr build system
[ https://issues.apache.org/jira/browse/SOLR-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13062184#comment-13062184 ] Steven Rowe commented on SOLR-2452: --- Merged with trunk, committed in r1144510. (Forgot to include issue number in log comment.) rewrite solr build system - Key: SOLR-2452 URL: https://issues.apache.org/jira/browse/SOLR-2452 Project: Solr Issue Type: Task Components: Build Reporter: Robert Muir Assignee: Steven Rowe Fix For: 3.4, 4.0 Attachments: SOLR-2452-post-reshuffling.patch, SOLR-2452-post-reshuffling.patch, SOLR-2452-post-reshuffling.patch, SOLR-2452.diffSource.py.patch.zip, SOLR-2452.dir.reshuffle.sh, SOLR-2452.dir.reshuffle.sh As discussed some in SOLR-2002 (but that issue is long and hard to follow), I think we should rewrite the solr build system. Its slow, cumbersome, and messy, and makes it hard for us to improve things. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: I really hate to even mention formatting code, but I don't want to gaffe too badly on my first commit.
On Fri, Jul 8, 2011 at 5:06 PM, Erick Erickson erickerick...@gmail.com wrote: This is one of those topics that generates far more passion than it deserves, all I want to know is what the norms are. Personally, I tend to reformat the entire file. But then I didn't cut my eye teeth on code that a zillion other people work with and I fully appreciate that the diffs get hard to read, you can't easily separate the code changes from the format changes, so I'm not *proposing* reformatting the whole file... My opinion: just get rid of the shitty bad in whatever way is convenient for you. -- lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: I really hate to even mention formatting code, but I don't want to gaffe too badly on my first commit.
On Fri, Jul 8, 2011 at 5:11 PM, Robert Muir rcm...@gmail.com wrote: On Fri, Jul 8, 2011 at 5:06 PM, Erick Erickson erickerick...@gmail.com wrote: This is one of those topics that generates far more passion than it deserves, all I want to know is what the norms are. Personally, I tend to reformat the entire file. But then I didn't cut my eye teeth on code that a zillion other people work with and I fully appreciate that the diffs get hard to read, you can't easily separate the code changes from the format changes, so I'm not *proposing* reformatting the whole file... My opinion: just get rid of the shitty bad in whatever way is convenient for you. +1 Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-3290) add FieldInvertState.numUniqueTerms, Terms.sumDocFreq
[ https://issues.apache.org/jira/browse/LUCENE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-3290. - Resolution: Fixed Fix Version/s: 3.4 The FieldInvertState.numUniqueTerms portion is backported to 3.x (no collection level stats are in 3.x in general, seems tricky) add FieldInvertState.numUniqueTerms, Terms.sumDocFreq - Key: LUCENE-3290 URL: https://issues.apache.org/jira/browse/LUCENE-3290 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Robert Muir Assignee: Robert Muir Fix For: 3.4, 4.0 Attachments: LUCENE-3290.patch, LUCENE-3290.patch For scoring systems like lnu.ltc (http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf), we need to supply 3 stats: * average tf within d * # of unique terms within d * average number of unique terms across field If we add FieldInvertState.numUniqueTerms, you can incorporate the first two into your norms/docvalues (once we cut over), the average tf within d being length / numUniqueTerms. to compute the average across the field, we can just write the sum of all terms' docfreqs into the terms dictionary header, and you can then divide this by maxdoc to get the average. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
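The arithmetic in the LUCENE-3290 description can be written down as a small sketch. Every document adds 1 to the docFreq of each distinct term it contains, so the sum of all docFreqs equals the total number of unique terms summed over documents, and dividing by maxDoc gives the per-document average. The class and method names below are assumptions for illustration and are not Lucene API; only the input names (length, numUniqueTerms, sumDocFreq, maxDoc) follow the issue text.

```java
/** Illustrative helper for the statistics discussed in LUCENE-3290 (not Lucene API). */
final class LnuStats {
    private LnuStats() {}

    /** Average term frequency within one document: field length / number of unique terms. */
    static double averageTf(int length, int numUniqueTerms) {
        return numUniqueTerms == 0 ? 0.0 : (double) length / numUniqueTerms;
    }

    /** Average number of unique terms per document across the field: sumDocFreq / maxDoc. */
    static double averageUniqueTerms(long sumDocFreq, int maxDoc) {
        return maxDoc == 0 ? 0.0 : (double) sumDocFreq / maxDoc;
    }

    public static void main(String[] args) {
        // A 10-token document with 4 distinct terms.
        System.out.println("avg tf within d: " + averageTf(10, 4));
        // A field whose terms' docFreqs sum to 30 over 10 documents.
        System.out.println("avg unique terms: " + averageUniqueTerms(30L, 10));
    }
}
```

Note that dividing by maxDoc counts documents without the field (and deleted documents) as well, which is the sparse-field concern raised later in the thread.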
heads up: reindex trunk indexes
i just committed https://issues.apache.org/jira/browse/LUCENE-3290, you need to re-index. -- lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3290) add FieldInvertState.numUniqueTerms, Terms.sumDocFreq
[ https://issues.apache.org/jira/browse/LUCENE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13062202#comment-13062202 ] Yonik Seeley commented on LUCENE-3290: -- Is there currently a way to get the number of documents that have a value in the field? Then one could compute the average length of a (sparse) field via sumTotalTermFreq(field)/docsWithField(field). docsWithField(field) would also be useful in other contexts that want to know how sparse a field is (automatically selecting faceting algorithms, etc). add FieldInvertState.numUniqueTerms, Terms.sumDocFreq - Key: LUCENE-3290 URL: https://issues.apache.org/jira/browse/LUCENE-3290 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Robert Muir Assignee: Robert Muir Fix For: 3.4, 4.0 Attachments: LUCENE-3290.patch, LUCENE-3290.patch For scoring systems like lnu.ltc (http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf), we need to supply 3 stats: * average tf within d * # of unique terms within d * average number of unique terms across field If we add FieldInvertState.numUniqueTerms, you can incorporate the first two into your norms/docvalues (once we cut over), the average tf within d being length / numUniqueTerms. to compute the average across the field, we can just write the sum of all terms' docfreqs into the terms dictionary header, and you can then divide this by maxdoc to get the average. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: I really hate to even mention formatting code, but I don't want to gaffe too badly on my first commit.
: What about reformatting the whole file and noting in the checkin notes : reformat only, no code changes? (assuming an egregiously : badly-formatted file that one is working on). objection to these types of commits has historically been that it may make it harder to apply patches that other people submit or have submitted (ie: a patch in jira against trunk that hasn't been applied yet, or a patch someone makes against the 3.3 release that can no longer apply to the 3x branch because of the code reformatting, etc...) i used to agree with that as part of the general philosophy of "if it ain't broke, don't fix it" ... but i think over time my definition of "broke" has changed ... code you can't read because it isn't indented consistently is (to me) broken code. so fix away. but yes: we should at least isolate such changes. I'd much rather see these two commits: +2/-2 == fixed race condition in DocWriter +425/-410 == fixed consistent formatting in DocWriter ...than see this... +427/-412 == fixed race condition and formatting in DocWriter -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile
[ https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-1972: --- Attachment: elyograg-1972-trunk.patch elyograg-1972-3.2.patch Of course, adding support for a 100th percentile was NOT as easy as simply changing to =. New patches. Need additional query stats in admin interface - median, 95th and 99th percentile - Key: SOLR-1972 URL: https://issues.apache.org/jira/browse/SOLR-1972 Project: Solr Issue Type: Improvement Affects Versions: 1.4 Reporter: Shawn Heisey Priority: Minor Attachments: SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, elyograg-1972-3.2.patch, elyograg-1972-3.2.patch, elyograg-1972-trunk.patch, elyograg-1972-trunk.patch I would like to see more detailed query statistics from the admin GUI. This is what you can get now: requests : 809 errors : 0 timeouts : 0 totalTime : 70053 avgTimePerRequest : 86.59209 avgRequestsPerSecond : 0.8148785 I'd like to see more data on the time per request - median, 95th percentile, 99th percentile, and any other statistical function that makes sense to include. In my environment, the first bunch of queries after startup tend to take several seconds each. I find that the average value tends to be useless until it has several thousand queries under its belt and the caches are thoroughly warmed. The statistical functions I have mentioned would quickly eliminate the influence of those initial slow queries. The system will have to store individual data about each query. I don't know if this is something Solr does already. It would be nice to have a configurable count of how many of the most recent data points are kept, to control the amount of memory the feature uses. The default value could be something like 1024 or 4096. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
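The request in SOLR-1972 (median and percentile figures over a configurable number of recent data points) could be approached along these lines. This is a sketch only and is not taken from any of the attached patches: the class name RecentTimings, the ring-buffer layout, and the nearest-rank percentile definition are all assumptions chosen for illustration.

```java
import java.util.Arrays;

/** Illustrative bounded window of recent request times; not the code from the SOLR-1972 patches. */
final class RecentTimings {
    private final long[] window; // fixed capacity bounds the memory used by the feature
    private int next;            // slot that the next sample overwrites
    private int size;            // number of valid samples, at most window.length

    RecentTimings(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be > 0");
        }
        this.window = new long[capacity];
    }

    synchronized void record(long millis) {
        window[next] = millis;
        next = (next + 1) % window.length;
        if (size < window.length) {
            size++;
        }
    }

    /** Nearest-rank percentile over the retained samples; p must be in (0, 100]. */
    synchronized long percentile(double p) {
        if (size == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        // Until the buffer wraps, the valid samples occupy slots 0..size-1.
        long[] sorted = Arrays.copyOf(window, size);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * size); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        RecentTimings timings = new RecentTimings(1024);
        for (long t : new long[] {5000, 4000, 90, 80, 85, 70, 95}) {
            timings.record(t);
        }
        System.out.println("median: " + timings.percentile(50));
    }
}
```

With a capacity such as 1024 or 4096, the slow queries issued right after startup drop out of the window once enough later requests have been recorded, which is the effect the issue asks for. Sorting on every read is the simplest choice and may be too costly for a busy handler; that trade-off is left open here.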
[jira] [Commented] (LUCENE-3290) add FieldInvertState.numUniqueTerms, Terms.sumDocFreq
[ https://issues.apache.org/jira/browse/LUCENE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13062214#comment-13062214 ] Robert Muir commented on LUCENE-3290: - not at the moment, we would have to write this separately. add FieldInvertState.numUniqueTerms, Terms.sumDocFreq - Key: LUCENE-3290 URL: https://issues.apache.org/jira/browse/LUCENE-3290 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Robert Muir Assignee: Robert Muir Fix For: 3.4, 4.0 Attachments: LUCENE-3290.patch, LUCENE-3290.patch For scoring systems like lnu.ltc (http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf), we need to supply 3 stats: * average tf within d * # of unique terms within d * average number of unique terms across field If we add FieldInvertState.numUniqueTerms, you can incorporate the first two into your norms/docvalues (once we cut over), the average tf within d being length / numUniqueTerms. to compute the average across the field, we can just write the sum of all terms' docfreqs into the terms dictionary header, and you can then divide this by maxdoc to get the average. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1306#comment-1306 ] Hoss Man commented on SOLR-2633: the change to a filter was in SOLR-104. some of the history may be spelled out there, or on the mailing list around the same time, but i believe the crux of the issue was wanting to ensure we could support dispatching to request handlers using path based names (ie: http://host:8983/solr/admin/foo -> <requestHandler name="/admin/foo"/>) and still allow fallthrough to other jsps / servlets if a requestHandler with the specified name couldn't be found. using a Filter made this a little saner as i recall, and when multicore support was added, gave us the added bonus of being able to ensure that jsps could be used even when the base path was the core name. Make SolrDispatchFilter testable and add tests -- Key: SOLR-2633 URL: https://issues.apache.org/jira/browse/SOLR-2633 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1, 3.2, 3.3 Reporter: Edoardo Tosca Assignee: Mark Miller Priority: Minor Fix For: 3.4, 4.0 Attachments: SOLR-2633-tests-only.patch, SOLR-2633-tests-only.patch I have ideas for possible extensions/enhancements to the SolrDispatchFilter. However, as it doesn't have any tests, making safe enhancements is difficult. Given its monolithic nature, it is hard to test. Therefore, I am proposing to refactor it to make it testable, and to provide tests for it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3292) IOContext should be part of the SegmentReader cache key
[ https://issues.apache.org/jira/browse/LUCENE-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Thacker updated LUCENE-3292: -- Attachment: LUCENE-3292.patch Initial patch. There might be a few methods inside ReaderPool where I have added an IOContext which might not be correct. IOContext should be part of the SegmentReader cache key Key: LUCENE-3292 URL: https://issues.apache.org/jira/browse/LUCENE-3292 Project: Lucene - Java Issue Type: Task Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Varun Thacker Priority: Minor Fix For: 4.0 Attachments: LUCENE-3292.patch Once IOContext (LUCENE-2793) has landed, the IOContext should be part of the key used to cache that reader in the pool. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FindBugs PMD ?
: Any thoughts on using FindBugs PMD to catch more bugs in Lucene/Solr? : Jenkins could be configured to run FindBugs PMD analysis nightly. It : would have helped find this: I was a big fan of PMD for a while, particularly because of how easy it is to not only tweak/customize the rulesets and severities, but also to write new rules entirely from scratch based on the specific principles of your domain. Examples of things that i'm fairly certain would be straightforward PMD rules to write and might be invaluable to Lucene are along the lines of... * No class should use Arrays.copy * No class should call any of the following constructors: [ insert list of all IO related Constructors we can think of that rely on platform charset ] * all classes in the code base which implement CodecProvider must call Codecs.registerCodec(this) in at least one constructor * etc... I used to use PMD quite a bit in a former life, but kind of got out of the habit and frequently forget about it. somewhere in Jira is a patch i submitted to change the solr build system to use PMD, and generate an XML report that would then be parsed and trigger a failure if there were any violations above a specified threshold, but if i remember correctly there was some sort of licensing issue at the time (maybe with the xslts PMD used to generate readable reports from the XML?) I think all of that predated hudson/jenkins having its own plugin for rendering the XML results though, which is probably when 99% of the people would want to read it so i don't think there would be anything stopping us from using it now. -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9446 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9446/ 1 tests failed. REGRESSION: org.apache.lucene.index.TestIndexWriterOnDiskFull.testAddDocumentOnDiskFull Error Message: CFS has pending open files Stack Trace: java.lang.IllegalStateException: CFS has pending open files at org.apache.lucene.store.CompoundFileWriter.close(CompoundFileWriter.java:140) at org.apache.lucene.store.CompoundFileDirectory.close(CompoundFileDirectory.java:182) at org.apache.lucene.store.DefaultCompoundFileDirectory.close(DefaultCompoundFileDirectory.java:58) at org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java:139) at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4252) at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3863) at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:37) at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2715) at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2710) at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2706) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3513) at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2064) at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2031) at org.apache.lucene.index.TestIndexWriterOnDiskFull.addDoc(TestIndexWriterOnDiskFull.java:539) at org.apache.lucene.index.TestIndexWriterOnDiskFull.testAddDocumentOnDiskFull(TestIndexWriterOnDiskFull.java:74) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1277) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1195) Build Log (for compile errors): [...truncated 10581 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13062261#comment-13062261 ] Ryan McKinley commented on SOLR-2633: - Oh man... SOLR-104! The key reason was backwards compatibility. At the time there was a servlet mapping /select -- Changing to a filter let us map /* and optionally override the legacy servlet if that was configured in solrconfig (handleSelect=true) Make SolrDispatchFilter testable and add tests -- Key: SOLR-2633 URL: https://issues.apache.org/jira/browse/SOLR-2633 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1, 3.2, 3.3 Reporter: Edoardo Tosca Assignee: Mark Miller Priority: Minor Fix For: 3.4, 4.0 Attachments: SOLR-2633-tests-only.patch, SOLR-2633-tests-only.patch I have ideas for possible extensions/enhancements to the SolrDispatchFilter. However, as it doesn't have any tests, making safe enhancements is difficult. Given its monolithic nature, it is hard to test. Therefore, I am proposing to refactor it to make it testable, and to provide tests for it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2633) Make SolrDispatchFilter testable and add tests
[ https://issues.apache.org/jira/browse/SOLR-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13062264#comment-13062264 ] Ryan McKinley commented on SOLR-2633: - bq. ensure we could support dispatching to request handlers using path based names FYI, either servlets or filters would allow us to do this (assuming we map /solr/*) but a filter allows things to fall through to other servlets/filters if we don't match something with the path based request handlers. As an aside, this is how Apache Wicket works. Make SolrDispatchFilter testable and add tests -- Key: SOLR-2633 URL: https://issues.apache.org/jira/browse/SOLR-2633 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1, 3.2, 3.3 Reporter: Edoardo Tosca Assignee: Mark Miller Priority: Minor Fix For: 3.4, 4.0 Attachments: SOLR-2633-tests-only.patch, SOLR-2633-tests-only.patch I have ideas for possible extensions/enhancements to the SolrDispatchFilter. However, as it doesn't have any tests, making safe enhancements is difficult. Given its monolithic nature, it is hard to test. Therefore, I am proposing to refactor it to make it testable, and to provide tests for it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: heads up: reindex trunk indexes
Thanks for the heads up! how hard would it be to write something to upgrade the terms dictionary header? I just released software with a recent trunk :( ryan On Fri, Jul 8, 2011 at 5:28 PM, Robert Muir rcm...@gmail.com wrote: i just committed https://issues.apache.org/jira/browse/LUCENE-3290, you need to re-index. -- lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2027) SolrJ FacetField should never return null from getValues()
[ https://issues.apache.org/jira/browse/SOLR-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Male resolved SOLR-2027. -- Resolution: Fixed Fix Version/s: 4.0 Committed revision 1144561. SolrJ FacetField should never return null from getValues() -- Key: SOLR-2027 URL: https://issues.apache.org/jira/browse/SOLR-2027 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 1.4.1 Reporter: David Smiley Assignee: Chris Male Priority: Minor Fix For: 4.0 Attachments: SOLR-2027.patch In some circumstances, FacetField.getValues() will return null. I'd like for my iteration code to simply be: {code:java} for (FacetField.Count c : ff.getValues()) { // ... } {code} However this will throw an NPE if I don't wrap this in either a null check or this check: ff.getValueCount() > 0 I propose that getValues() return a Collections.EMPTY_LIST if the internal value is null. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
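The proposal in SOLR-2027 amounts to a null-safe getter. The sketch below is not the committed SolrJ change: FacetFieldSketch is a made-up stand-in that stores plain strings instead of FacetField.Count objects, and it uses the generic Collections.emptyList() where the issue text mentions Collections.EMPTY_LIST, purely to avoid a raw type in the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative stand-in for a facet field holder; not the SolrJ FacetField class. */
final class FacetFieldSketch {
    private List<String> values; // stays null until the first add()

    void add(String value) {
        if (values == null) {
            values = new ArrayList<String>();
        }
        values.add(value);
    }

    /** Never returns null, so callers can iterate without a guard. */
    List<String> getValues() {
        return values == null ? Collections.<String>emptyList() : values;
    }

    int getValueCount() {
        return values == null ? 0 : values.size();
    }

    public static void main(String[] args) {
        FacetFieldSketch ff = new FacetFieldSketch();
        for (String v : ff.getValues()) { // no NPE even though nothing was added
            System.out.println(v);
        }
        System.out.println("count: " + ff.getValueCount());
    }
}
```

One consequence of this design is that the returned empty list is immutable, so callers that previously mutated the result of getValues() would need to go through add() instead.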