[jira] Commented: (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007345#comment-13007345 ]

Bill Bell commented on SOLR-2242:
---------------------------------

OK, I did the required work. Can we get more feedback, or get it committed? What else is needed?

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR.2242.v2.patch
>
> When returning facet.field=<field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (the number of rows). Use with facet.limit=-1 and facet.mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
> Here is an example on field "hgid" (without namedistinct):
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int name="HGPY045FD36D4000A">1</int>
>     <int name="HGPY0FBC6690453A9">1</int>
>     <int name="HGPY1E44ED6C4FB3B">1</int>
>     <int name="HGPY1FA631034A1B8">1</int>
>     <int name="HGPY3317ABAC43B48">1</int>
>     <int name="HGPY3A17B2294CB5A">5</int>
>     <int name="HGPY3ADD2B3D48C39">1</int>
>   </lst>
> </lst>
> {code}
> With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39), this returns the number of rows (7), not the number of values (11):
> {code}
> <lst name="facet_fields">
>   <lst name="hgid">
>     <int>7</int>
>   </lst>
> </lst>
> {code}
> This actually works really well for getting the total number of groups for a group.field=hgid. Enjoy!

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
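[Editorial aside] The arithmetic in the example above can be sketched in a few lines of stand-alone Java (class and method names here are hypothetical, not from the patch): namedistinct reports the number of facet rows meeting mincount, while the plain facet counts sum to the number of matching documents.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NameDistinctDemo {
    // Number of facet rows meeting mincount -- what namedistinct reports.
    public static int nameDistinct(Map<String, Integer> facetCounts, int mincount) {
        return (int) facetCounts.values().stream().filter(c -> c >= mincount).count();
    }

    // Sum of the per-value counts -- the number of matching documents.
    public static int totalCount(Map<String, Integer> facetCounts) {
        return facetCounts.values().stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Integer> hgid = new LinkedHashMap<>();
        hgid.put("HGPY045FD36D4000A", 1);
        hgid.put("HGPY0FBC6690453A9", 1);
        hgid.put("HGPY1E44ED6C4FB3B", 1);
        hgid.put("HGPY1FA631034A1B8", 1);
        hgid.put("HGPY3317ABAC43B48", 1);
        hgid.put("HGPY3A17B2294CB5A", 5);
        hgid.put("HGPY3ADD2B3D48C39", 1);
        // 7 distinct values (rows) across 11 matching documents, as in the example.
        System.out.println(nameDistinct(hgid, 1) + " rows, " + totalCount(hgid) + " values");
    }
}
```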
Re: [jira] Resolved: (SOLR-2426) Build failing
This started working when I did the following:

# cd C:\Users\bbell\solr
# ant compile
# cd solr
# ant example

If I did a direct "ant example" it was giving the errors below. I'll double-check my Java version too.

On 3/15/11 5:53 AM, "Robert Muir (JIRA)" wrote:

> [ https://issues.apache.org/jira/browse/SOLR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Robert Muir resolved SOLR-2426.
> -------------------------------
>
>     Resolution: Not A Problem
>
> Trunk requires Java 6.
>
>> Build failing
>> -------------
>>
>>          Key: SOLR-2426
>>          URL: https://issues.apache.org/jira/browse/SOLR-2426
>>      Project: Solr
>>   Issue Type: Bug
>>     Reporter: Bill Bell
>>
>> ant clean
>> ant example
>> trunk
>>
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSetHitCollector.java:77: incompatible types
>> [javac] found   : org.apache.solr.search.BitDocSet
>> [javac] required: org.apache.solr.search.DocSet
>> [javac]     return new BitDocSet(bits,pos);
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSetHitCollector.java:132: incompatible types
>> [javac] found   : org.apache.solr.search.SortedIntDocSet
>> [javac] required: org.apache.solr.search.DocSet
>> [javac]     return new SortedIntDocSet(scratch, pos);
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSetHitCollector.java:136: incompatible types
>> [javac] found   : org.apache.solr.search.BitDocSet
>> [javac] required: org.apache.solr.search.DocSet
>> [javac]     return new BitDocSet(bits,pos);
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:26: org.apache.solr.search.DocSlice is not abstract and does not override abstract method getTopFilter() in org.apache.solr.search.DocSet
>> [javac] public class DocSlice extends DocSetBase implements DocList {
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:54: incompatible types
>> [javac] found   : org.apache.solr.search.DocSlice
>> [javac] required: org.apache.solr.search.DocList
>> [javac]     if (this.offset == offset && this.len==len) return this;
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:62: incompatible types
>> [javac] found   : org.apache.solr.search.DocSlice
>> [javac] required: org.apache.solr.search.DocList
>> [javac]     if (this.offset == offset && this.len == realLen) return this;
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:63: incompatible types
>> [javac] found   : org.apache.solr.search.DocSlice
>> [javac] required: org.apache.solr.search.DocList
>> [javac]     return new DocSlice(offset, realLen, docs, scores, matches, maxScore);
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:130: intersection(org.apache.solr.search.DocSet) in org.apache.solr.search.DocSet cannot be applied to (org.apache.solr.search.DocSlice)
>> [javac]     return other.intersection(this);
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:139: intersectionSize(org.apache.solr.search.DocSet) in org.apache.solr.search.DocSet cannot be applied to (org.apache.solr.search.DocSlice)
>> [javac]     return other.intersectionSize(this);
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\ExtendedDismaxQParserPlugin.java:829: warning: [unchecked] unchecked conversion
>> [javac] found   : java.util.List
>> [javac] required: java.util.List
>> [javac]     Query q = super.getBooleanQuery(clauses, disableCoord);
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\ExtendedDismaxQParserPlugin.java:845: warning: [unchecked] unchecked conversion
>> [javac] found   : java.util.List
>> [javac] required: java.util.List
>> [javac]     super.addClause(clauses, conj, mods, q);
>> [javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\FastLRUCache.java:107: warning: [unchecked] unchecked cast
>> [javac] found   : java.lang.Object
>> [javac] required: java.util.List> che.Stats>
>> [javac]     statsList = (List) persistence;
>> [javac] C:\Users\bbell\solr\solr\src\java\or
[jira] Commented: (SOLR-2429) ability to not cache a filter
[ https://issues.apache.org/jira/browse/SOLR-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007340#comment-13007340 ]

David Smiley commented on SOLR-2429:
------------------------------------

Heh, me too! I was pondering this last night; I know specific queries will needlessly pollute the cache. I was imagining a syntax such as this:

fq={!cache=no}queryhere

> ability to not cache a filter
> -----------------------------
>
>          Key: SOLR-2429
>          URL: https://issues.apache.org/jira/browse/SOLR-2429
>      Project: Solr
>   Issue Type: New Feature
>     Reporter: Yonik Seeley
>
> A user should be able to add {!cache=false} to a query or filter query.
[jira] Issue Comment Edited: (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
[ https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007314#comment-13007314 ]

Lance Norskog edited comment on SOLR-1499 at 3/16/11 3:52 AM:
--------------------------------------------------------------

Yes you can!
* The source index has to store all of the fields.
* I would do a series of short queries rather than one long one.

Thank you for thinking of this. It could also be used to recombine cores -- you could change your partitioning strategy, for example.

> SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
> ---------------------------------------------------------------------------------
>
>          Key: SOLR-1499
>          URL: https://issues.apache.org/jira/browse/SOLR-1499
>      Project: Solr
>   Issue Type: New Feature
>   Components: contrib - DataImportHandler
>     Reporter: Lance Norskog
>     Assignee: Erik Hatcher
>      Fix For: Next
>
>  Attachments: SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch
>
> The SolrEntityProcessor queries an external Solr instance. The Solr documents returned are unpacked and emitted as DIH fields.
> The SolrEntityProcessor uses the following attributes:
> * solr='http://localhost:8983/solr/sms'
> ** This gives the URL of the target Solr instance.
> *** Note: the connection to the target Solr uses the binary SolrJ format.
> * query='Jefferson&sort=id+asc'
> ** This gives the base query string to use with Solr. It can include any standard Solr request parameter. This attribute is processed under the variable resolution rules and can be driven in an inner stage of the indexing pipeline.
> * rows='10'
> ** This gives the number of rows to fetch per request.
> ** The SolrEntityProcessor always fetches every document that matches the request.
> * fields='id,tag'
> ** This selects the fields to be returned from the Solr request.
> ** These must also be declared as <field> elements.
> ** As with all fields, template processors can be used to alter the contents to be passed downwards.
> * timeout='30'
> ** This limits the query to 30 seconds. This can be used as a fail-safe to prevent the indexing session from freezing up. By default the timeout is 5 minutes.
> Limitations:
> * Solr errors are not handled correctly.
> * Loop control constructs have not been tested.
> * Multi-valued returned fields have not been tested.
> The unit tests give examples of how to use it as the root entity and an inner entity.
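[Editorial aside] Put together, the attributes listed in the issue description would appear in a DIH data-config roughly like this sketch (assembled from the attribute list above; the entity name and the exact field wiring are assumptions, not taken from the patch):

```xml
<dataConfig>
  <document>
    <!-- Entity name and field mapping below are hypothetical. -->
    <entity name="solrSource" processor="SolrEntityProcessor"
            solr="http://localhost:8983/solr/sms"
            query="Jefferson&amp;sort=id+asc"
            rows="10"
            fields="id,tag"
            timeout="30">
      <field column="id" name="id"/>
      <field column="tag" name="tag"/>
    </entity>
  </document>
</dataConfig>
```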
[jira] Commented: (SOLR-2429) ability to not cache a filter
[ https://issues.apache.org/jira/browse/SOLR-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007324#comment-13007324 ]

Otis Gospodnetic commented on SOLR-2429:
----------------------------------------

I'm with Hoss. For many months now, I've been dreaming about the possibility of telling Solr to execute a query without caching the results.

> ability to not cache a filter
> -----------------------------
>
>          Key: SOLR-2429
>          URL: https://issues.apache.org/jira/browse/SOLR-2429
>      Project: Solr
>   Issue Type: New Feature
>     Reporter: Yonik Seeley
>
> A user should be able to add {!cache=false} to a query or filter query.
[jira] Commented: (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
[ https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007314#comment-13007314 ]

Lance Norskog commented on SOLR-1499:
-------------------------------------

Yes you can!
* The source index has to store all of the fields.
* I would do a series of short queries rather than one long one.

Thank you for thinking of this.

> SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
> ---------------------------------------------------------------------------------
>
>          Key: SOLR-1499
>          URL: https://issues.apache.org/jira/browse/SOLR-1499
>      Project: Solr
>   Issue Type: New Feature
>   Components: contrib - DataImportHandler
>     Reporter: Lance Norskog
>     Assignee: Erik Hatcher
>      Fix For: Next
[jira] Created: (LUCENE-2969) fix two stopwords typos
fix two stopwords typos
-----------------------

                 Key: LUCENE-2969
                 URL: https://issues.apache.org/jira/browse/LUCENE-2969
             Project: Lucene - Java
          Issue Type: Bug
          Components: contrib/analyzers
            Reporter: Robert Muir
            Priority: Minor
         Attachments: LUCENE-2969.patch

See:
http://svn.tartarus.org/snowball?view=rev&revision=543
http://permalink.gmane.org/gmane.comp.search.snowball/1249
[jira] Updated: (LUCENE-2969) fix two stopwords typos
[ https://issues.apache.org/jira/browse/LUCENE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2969:
--------------------------------

    Attachment: LUCENE-2969.patch

> fix two stopwords typos
> -----------------------
>
>          Key: LUCENE-2969
>          URL: https://issues.apache.org/jira/browse/LUCENE-2969
>      Project: Lucene - Java
>   Issue Type: Bug
>   Components: contrib/analyzers
>     Reporter: Robert Muir
>     Priority: Minor
>  Attachments: LUCENE-2969.patch
>
> See:
> http://svn.tartarus.org/snowball?view=rev&revision=543
> http://permalink.gmane.org/gmane.comp.search.snowball/1249
[jira] Commented: (SOLR-2429) ability to not cache a filter
[ https://issues.apache.org/jira/browse/SOLR-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007265#comment-13007265 ]

Ryan McKinley commented on SOLR-2429:
-------------------------------------

I'm not sure this is related -- it could be -- I'm looking at writing a custom query from:

{code:java}
@Override
public Query getFieldQuery(QParser parser, SchemaField field, String externalVal)
{code}

and it would be great to know if this is used as a filter or not -- should it include scoring? Are there ways to build the query where some parts are cached and some are not?

> ability to not cache a filter
> -----------------------------
>
>          Key: SOLR-2429
>          URL: https://issues.apache.org/jira/browse/SOLR-2429
>      Project: Solr
>   Issue Type: New Feature
>     Reporter: Yonik Seeley
>
> A user should be able to add {!cache=false} to a query or filter query.
[jira] Commented: (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
[ https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007236#comment-13007236 ]

Ahmet Arslan commented on SOLR-1499:
------------------------------------

Hi,

Can I use this to upgrade a Solr version where the Lucene/Solr indices are not compatible?

Thanks,
Ahmet

> SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
> ---------------------------------------------------------------------------------
>
>          Key: SOLR-1499
>          URL: https://issues.apache.org/jira/browse/SOLR-1499
>      Project: Solr
>   Issue Type: New Feature
>   Components: contrib - DataImportHandler
>     Reporter: Lance Norskog
>     Assignee: Erik Hatcher
>      Fix For: Next
[jira] Commented: (LUCENE-2749) Co-occurrence filter
[ https://issues.apache.org/jira/browse/LUCENE-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007229#comment-13007229 ]

Steven Rowe commented on LUCENE-2749:
-------------------------------------

bq. this filter would definitely be something that i could use

What use case(s) are you thinking of?

> Co-occurrence filter
> --------------------
>
>             Key: LUCENE-2749
>             URL: https://issues.apache.org/jira/browse/LUCENE-2749
>         Project: Lucene - Java
>      Issue Type: New Feature
>      Components: Analysis
> Affects Versions: 3.1, 4.0
>        Reporter: Steven Rowe
>        Priority: Minor
>         Fix For: 4.0
>
> The co-occurrence filter to be developed here will output sets of tokens that co-occur within a given window onto a token stream.
> These token sets can be ordered either lexically (to allow order-independent matching/counting) or positionally (e.g. sliding windows of positionally ordered co-occurring terms that include all terms in the window are called n-grams or shingles).
> The parameters to this filter will be:
> * window size: this can be a fixed sequence length, sentence/paragraph context (these will require sentence/paragraph segmentation, which is not in Lucene yet), or the entire token stream (full field width)
> * minimum number of co-occurring terms: >= 2
> * maximum number of co-occurring terms: <= window size
> * token set ordering (lexical or positional)
> One use case for co-occurring token sets is as candidates for collocations.
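[Editorial aside] The lexical-ordering variant described in the issue can be made concrete with a small stand-alone sketch (hypothetical names, fixed-length windows only; this is not the filter under development): each window of the token stream yields its set of distinct terms in lexical order, so matching is order-independent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class CooccurrenceSketch {
    // Emit the lexically ordered set of distinct terms for each
    // fixed-length sliding window over the token stream.
    public static List<List<String>> cooccur(List<String> tokens, int window) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i + window <= tokens.size(); i++) {
            // TreeSet sorts the window's terms and drops duplicates.
            SortedSet<String> set = new TreeSet<>(tokens.subList(i, i + window));
            out.add(new ArrayList<>(set));
        }
        return out;
    }

    public static void main(String[] args) {
        // Order-independent: "quick brown" and "brown quick" yield the same set.
        System.out.println(cooccur(List.of("quick", "brown", "fox"), 2));
    }
}
```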
[jira] Created: (LUCENE-2968) SurroundQuery doesn't support SpanNot
SurroundQuery doesn't support SpanNot
-------------------------------------

                 Key: LUCENE-2968
                 URL: https://issues.apache.org/jira/browse/LUCENE-2968
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: Grant Ingersoll
            Priority: Minor

It would be nice if we could do SpanNot in the surround query, as such clauses are quite useful for keeping searches within a boundary (say, a sentence).
RE: [HUDSON] Lucene-Solr-tests-only-3.x - Build # 5964 - Failure
The build never made it past the initial pre-build "ant clean":

---
clean:
   [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build

BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/build.xml:114: The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/common-build.xml:191: Unable to delete file /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/backwards/test/6/index.TestLockFactory6.-6904310916879757798/_2b.fdx
---

> -----Original Message-----
> From: Apache Hudson Server [mailto:hud...@hudson.apache.org]
> Sent: Tuesday, March 15, 2011 5:56 PM
> To: dev@lucene.apache.org
> Subject: [HUDSON] Lucene-Solr-tests-only-3.x - Build # 5964 - Failure
>
> Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/5964/
>
> 6 tests failed.
>
> FAILED: TEST-org.apache.lucene.index.TestIndexWriter.xml.
> Stack Trace:
> Test report file /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/backwards/test/TEST-org.apache.lucene.index.TestIndexWriter.xml was length 0
>
> FAILED: TEST-org.apache.lucene.search.TestBoolean2.xml.
> Stack Trace:
> Test report file /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/backwards/test/TEST-org.apache.lucene.search.TestBoolean2.xml was length 0
>
> REGRESSION: org.apache.lucene.store.TestLockFactory.testStressLocks
> Error Message: IndexWriter hit unexpected exceptions
> Stack Trace:
> junit.framework.AssertionFailedError: IndexWriter hit unexpected exceptions
>     at org.apache.lucene.store.TestLockFactory._testStressLocks(TestLockFactory.java:172)
>     at org.apache.lucene.store.TestLockFactory.testStressLocks(TestLockFactory.java:142)
>     at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:255)
>
> FAILED: .org.apache.lucene.store.TestRAMDirectory
> Error Message: org.apache.lucene.store.TestRAMDirectory
> Stack Trace:
> java.lang.ClassNotFoundException: org.apache.lucene.store.TestRAMDirectory
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:186)
>
> FAILED: .org.apache.lucene.util.TestNumericUtils
> Error Message: org.apache.lucene.util.TestNumericUtils
> Stack Trace:
> java.lang.ClassNotFoundException: org.apache.lucene.util.TestNumericUtils
>     (same class-loader stack trace as above)
>
> FAILED: .org.apache.lucene.util.TestSmallFloat
> Error Message: org.apache.lucene.util.TestSmallFloat
> Stack Trace:
> java.lang.ClassNotFoundException: org.apache.lucene.util.TestSmallFloat
>     (same class-loader stack trace as above)
>
> Build Log (for compile errors):
> [...truncated 47 lines...]
[jira] Commented: (SOLR-2429) ability to not cache a filter
[ https://issues.apache.org/jira/browse/SOLR-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007217#comment-13007217 ]

Hoss Man commented on SOLR-2429:
--------------------------------

why not extend Query? ... it could actually rewrite to the Query it wraps, giving us the best of both worlds.

FWIW: it also seems like it would make sense for this type of syntax/decoration to work with the "q" param (skipping the queryResultCache)

> ability to not cache a filter
> -----------------------------
>
>          Key: SOLR-2429
>          URL: https://issues.apache.org/jira/browse/SOLR-2429
>      Project: Solr
>   Issue Type: New Feature
>     Reporter: Yonik Seeley
>
> A user should be able to add {!cache=false} to a query or filter query.
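[Editorial aside] Hoss's rewrite idea can be illustrated with a toy, self-contained sketch (deliberately not the real Lucene Query API; all names here are made up): the wrapper only carries the "do not cache" marker, and disappears at rewrite time so query execution sees the wrapped query unchanged.

```java
public class NonCachingSketch {
    /** Minimal stand-in for a query type (not Lucene's Query). */
    interface Q {
        Q rewrite();
    }

    /** A plain leaf query; rewrites to itself. */
    static final class TermQ implements Q {
        final String text;
        TermQ(String text) { this.text = text; }
        public Q rewrite() { return this; }
    }

    /** Decorator carrying cache=false; vanishes on rewrite. */
    static final class NonCaching implements Q {
        final Q wrapped;
        NonCaching(Q wrapped) { this.wrapped = wrapped; }
        public Q rewrite() { return wrapped.rewrite(); }
    }

    public static void main(String[] args) {
        Q q = new NonCaching(new TermQ("foo"));
        // A filter-cache layer can check for the marker before rewriting;
        // after rewrite only the inner query remains.
        System.out.println(q instanceof NonCaching);       // marker visible
        System.out.println(q.rewrite() instanceof TermQ);  // marker gone
    }
}
```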
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 5964 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/5964/

6 tests failed.

FAILED: TEST-org.apache.lucene.index.TestIndexWriter.xml.
Stack Trace:
Test report file /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/backwards/test/TEST-org.apache.lucene.index.TestIndexWriter.xml was length 0

FAILED: TEST-org.apache.lucene.search.TestBoolean2.xml.
Stack Trace:
Test report file /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/build/backwards/test/TEST-org.apache.lucene.search.TestBoolean2.xml was length 0

REGRESSION: org.apache.lucene.store.TestLockFactory.testStressLocks
Error Message: IndexWriter hit unexpected exceptions
Stack Trace:
junit.framework.AssertionFailedError: IndexWriter hit unexpected exceptions
    at org.apache.lucene.store.TestLockFactory._testStressLocks(TestLockFactory.java:172)
    at org.apache.lucene.store.TestLockFactory.testStressLocks(TestLockFactory.java:142)
    at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:255)

FAILED: .org.apache.lucene.store.TestRAMDirectory
Error Message: org.apache.lucene.store.TestRAMDirectory
Stack Trace:
java.lang.ClassNotFoundException: org.apache.lucene.store.TestRAMDirectory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:186)

FAILED: .org.apache.lucene.util.TestNumericUtils
Error Message: org.apache.lucene.util.TestNumericUtils
Stack Trace:
java.lang.ClassNotFoundException: org.apache.lucene.util.TestNumericUtils
    (same class-loader stack trace as above)

FAILED: .org.apache.lucene.util.TestSmallFloat
Error Message: org.apache.lucene.util.TestSmallFloat
Stack Trace:
java.lang.ClassNotFoundException: org.apache.lucene.util.TestSmallFloat
    (same class-loader stack trace as above)

Build Log (for compile errors):
[...truncated 47 lines...]
[jira] Updated: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated LUCENE-2952:
------------------------------------

    Attachment: LUCENE-2952.patch

This minimizes the number of calls to validate (there is still one extra call via the benchmark module, since it invokes the common Lucene compile target). It also splits the checking out into Lucene, Solr, and Modules. I'd consider it close to good enough at this point.

> Make license checking/maintenance easier/automated
> --------------------------------------------------
>
>          Key: LUCENE-2952
>          URL: https://issues.apache.org/jira/browse/LUCENE-2952
>      Project: Lucene - Java
>   Issue Type: Improvement
>     Reporter: Grant Ingersoll
>     Priority: Minor
>  Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch
>
> Instead of waiting until release to check that licenses are valid, we should make it a part of our build process to ensure that all dependencies have proper licenses, etc.
[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007205#comment-13007205 ]

Steven Rowe commented on LUCENE-2960:
-------------------------------------

bq. How about an IWC base class, extended by IWCinit and IWClive. IWCinit has setters for everything, and IW.getConfig() returns IWClive, which has no setters for things you can't set on the fly.

I tried to implement this, but couldn't figure out a way to avoid code and javadoc duplication and/or separation for the live setters, which need to be on both the init and live versions. Duplication/separation of this sort would be begging for trouble. (The live setters can't be on the base class because the init and live versions would have to return different types to allow for proper chaining.)

> Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
> ------------------------------------------------------------------------------
>
>          Key: LUCENE-2960
>          URL: https://issues.apache.org/jira/browse/LUCENE-2960
>      Project: Lucene - Java
>   Issue Type: Improvement
>   Components: Index
>     Reporter: Shay Banon
>     Priority: Blocker
>      Fix For: 3.1, 4.0
>
>  Attachments: LUCENE-2960.patch
>
> In 3.1 the ability to setRAMBufferSizeMB is deprecated, and removed in trunk. It would be great to be able to control that on a live IndexWriter. Two other methods that would be great to bring back are setTermIndexInterval and setReaderTermsIndexDivisor. Most of the other setters can actually be set on the MergePolicy itself, so no need for setters for those (I think).
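[Editorial aside] The chaining problem Steven describes is the one the self-type ("curiously recurring") generics idiom targets. The sketch below (hypothetical class and option names; not the actual IndexWriterConfig code) shows how live setters could live once on a base class while chaining still returns the concrete subtype. It trades the duplication for unchecked casts and generics in the public API, which may well be why it was not pursued.

```java
public class ConfigSketch {
    /** Base holds the "live" setters once; T is the concrete subtype. */
    static abstract class BaseConfig<T extends BaseConfig<T>> {
        double ramBufferSizeMB = 16.0;

        @SuppressWarnings("unchecked")
        final T self() { return (T) this; }

        /** Settable both at init time and on an open writer. */
        public T setRAMBufferSizeMB(double mb) {
            ramBufferSizeMB = mb;
            return self();
        }
    }

    /** Init-time config: adds setters for options fixed after open. */
    static final class InitConfig extends BaseConfig<InitConfig> {
        int termIndexInterval = 128;
        public InitConfig setTermIndexInterval(int n) {
            termIndexInterval = n;
            return this;
        }
    }

    /** Live config: exposes only the inherited live setters. */
    static final class LiveConfig extends BaseConfig<LiveConfig> { }

    public static void main(String[] args) {
        // Chaining keeps the concrete type, so an init-only setter can
        // follow a live setter without casts.
        InitConfig init = new InitConfig().setRAMBufferSizeMB(64.0).setTermIndexInterval(256);
        System.out.println(init.ramBufferSizeMB + " / " + init.termIndexInterval);
    }
}
```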
[jira] Commented: (LUCENE-2967) Use linear probing with an additional good bit avalanching function in FST's NodeHash.
[ https://issues.apache.org/jira/browse/LUCENE-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007164#comment-13007164 ] Dawid Weiss commented on LUCENE-2967: - Yes, now I see this difference on the 38M too: trunk: {noformat} 56.462 55.725 55.544 55.522 {noformat} w/patch: {noformat} 59.9 59.6 {noformat} I'll see if I can find out the problem here; I assume the collision ratio should be nearly identical... but who knows. This is of no priority, but interesting stuff. I'll close if I can't get it better than the trunk version. > Use linear probing with an additional good bit avalanching function in FST's > NodeHash. > -- > > Key: LUCENE-2967 > URL: https://issues.apache.org/jira/browse/LUCENE-2967 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: 4.0 > > Attachments: LUCENE-2967.patch > > > I recently had an interesting discussion with Sebastiano Vigna (fastutil), > who suggested that linear probing, given a hash mixing function with good > avalanche properties, is a way better method of constructing lookups in > associative arrays compared to quadratic probing. Indeed, with linear probing > you can implement removals from a hash map without removed slot markers and > linear probing has nice properties with respect to modern CPUs (caches). I've > reimplemented HPPC's hash maps to use linear probing and we observed a nice > speedup (the same applies for fastutils of course). > This patch changes NodeHash's implementation to use linear probing. The code > is a bit simpler (I think :). I also moved the load factor to a constant -- > 0.5 seems like a generous load factor, especially if we allow large FSTs to > be built. I don't see any significant speedup in constructing large automata, > but there is no slowdown either (I checked on one machine only for now, but > will verify on other machines too). 
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007139#comment-13007139 ] Robert Muir commented on LUCENE-2960: - You win the fact that this is such an expert thing, and it should not confuse 99% of users who won't need to change these settings in a live way. This is a central API to using Lucene; sorry, I would rather see IWConfig be reverted completely than see this deprecation/undeprecation loop. It would just cause too much confusion. > Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter > -- > > Key: LUCENE-2960 > URL: https://issues.apache.org/jira/browse/LUCENE-2960 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Shay Banon >Priority: Blocker > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2960.patch > > > In 3.1 the ability to setRAMBufferSizeMB is deprecated, and removed in trunk. > It would be great to be able to control that on a live IndexWriter. Other > possible two methods that would be great to bring back are > setTermIndexInterval and setReaderTermsIndexDivisor. Most of the other > setters can actually be set on the MergePolicy itself, so no need for setters > for those (I think). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007136#comment-13007136 ] Earwin Burrfoot commented on LUCENE-2960: - You avoid deprecation/undeprecation and binary incompatibility, while incompatibly changing semantics. What do you win? > Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter > -- > > Key: LUCENE-2960 > URL: https://issues.apache.org/jira/browse/LUCENE-2960 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Shay Banon >Priority: Blocker > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2960.patch > > > In 3.1 the ability to setRAMBufferSizeMB is deprecated, and removed in trunk. > It would be great to be able to control that on a live IndexWriter. Other > possible two methods that would be great to bring back are > setTermIndexInterval and setReaderTermsIndexDivisor. Most of the other > setters can actually be set on the MergePolicy itself, so no need for setters > for those (I think). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007123#comment-13007123 ] Robert Muir commented on LUCENE-2960: - It's exactly the lack of consensus we see here; that's why I am 100% against having the setter approach. I'm totally against some deprecation/undeprecation loop because in future releases another setting may want to be "live". It seems the only way we can avoid this is for javadoc to be the only specification as to whether a setting does or does not take effect "live". > Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter > -- > > Key: LUCENE-2960 > URL: https://issues.apache.org/jira/browse/LUCENE-2960 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Shay Banon >Priority: Blocker > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2960.patch > > > In 3.1 the ability to setRAMBufferSizeMB is deprecated, and removed in trunk. > It would be great to be able to control that on a live IndexWriter. Other > possible two methods that would be great to bring back are > setTermIndexInterval and setReaderTermsIndexDivisor. Most of the other > setters can actually be set on the MergePolicy itself, so no need for setters > for those (I think). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2429) ability to not cache a filter
[ https://issues.apache.org/jira/browse/SOLR-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007122#comment-13007122 ] Yonik Seeley commented on SOLR-2429: The annoying part here is we need more metadata than just "Query" that we use now for a filter. Unfortunately, SolrIndexSearcher uses List<Query> everywhere. We could create something like a SolrQuery extends Query that wrapped a normal query and added additional metadata (like cache options). That's a bit messier since we'd have instanceof checks and casts everywhere though. Another option is to create a SolrQuery class that does not extend Query - hence methods taking List<Query> would now need to take List<SolrQuery>. {code} class SolrQuery { Query q; QParser qparser; boolean cache; ... } {code} Thoughts? > ability to not cache a filter > - > > Key: SOLR-2429 > URL: https://issues.apache.org/jira/browse/SOLR-2429 > Project: Solr > Issue Type: New Feature >Reporter: Yonik Seeley > > A user should be able to add {!cache=false} to a query or filter query. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (SOLR-2429) ability to not cache a filter
ability to not cache a filter - Key: SOLR-2429 URL: https://issues.apache.org/jira/browse/SOLR-2429 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley A user should be able to add {!cache=false} to a query or filter query. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007085#comment-13007085 ] Steven Rowe commented on SOLR-2427: --- bq. I found the problem being a (damn) silent JVM update in Mac OS X which symlinked the 1.5 Java version to 1.6 Apple rocks! bq. However the uima-core version had to be switched to 2.3.1 release (the snapshot one was the first jar I uploaded just some days before the release). The manifest in {{solr/contrib/uima/lib/uima-core.jar}} listed the version as 2.3.1-SNAPSHOT, and when I did a diff with the jar from the maven central repo, all of the .class files were different. So I'm not sure what happened here, but the jar in Solr's source tree was definitely not the same as the released jar. Maybe the released 2.3.1 jar you posted was never committed? I don't know. Anyway, it's fixed now. bq. Thanks for taking care. No problem. > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Affects Versions: 3.1 >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Blocker > Fix For: 3.1, 3.2, 4.0 > > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007078#comment-13007078 ] Tommaso Teofili commented on SOLR-2427: --- Hello Steven, I found the problem being a (damn) silent JVM update in Mac OS X which symlinked the 1.5 Java version to 1.6 :( However the uima-core version had to be switched to the 2.3.1 release (the snapshot one was the first jar I uploaded just some days before the release). Thanks for taking care. > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Affects Versions: 3.1 >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Blocker > Fix For: 3.1, 3.2, 4.0 > > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007048#comment-13007048 ] Earwin Burrfoot commented on LUCENE-2960: - bq. Oh yeah. But then we'd clone the full IWC on every set... this seems like overkill in the name of "purity". So what? What exactly is overkill? Few wasted bytes and CPU ns for an object that's created a couple of times during application lifetime? There are also builders, which are very similar to what Steven is proposing. bq. Another thought is to offer all settings on the IWC for init convenience and exposure and then add javadoc about updaters on IW for those settings that can be changed on the fly That's exactly how I'd like to see it. > Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter > -- > > Key: LUCENE-2960 > URL: https://issues.apache.org/jira/browse/LUCENE-2960 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Shay Banon >Priority: Blocker > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2960.patch > > > In 3.1 the ability to setRAMBufferSizeMB is deprecated, and removed in trunk. > It would be great to be able to control that on a live IndexWriter. Other > possible two methods that would be great to bring back are > setTermIndexInterval and setReaderTermsIndexDivisor. Most of the other > setters can actually be set on the MergePolicy itself, so no need for setters > for those (I think). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
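[Editor's sketch] The copy-on-set idea referenced here ("setters can return modified immutable copy of 'this'") might look roughly like the following; this is an illustration of the pattern only, not Lucene's actual IndexWriterConfig:

```java
// Hypothetical immutable config: every setter returns a modified clone,
// so incremental chaining still works, but once an instance is handed to
// a writer it can never change underneath it.
final class ImmutableConfig {
    final double ramBufferMB;
    final int termIndexInterval;

    ImmutableConfig() { this(16.0, 128); }

    private ImmutableConfig(double ram, int interval) {
        ramBufferMB = ram;
        termIndexInterval = interval;
    }

    ImmutableConfig setRAMBufferSizeMB(double mb) {
        return new ImmutableConfig(mb, termIndexInterval);  // clone-with-change
    }

    ImmutableConfig setTermIndexInterval(int i) {
        return new ImmutableConfig(ramBufferMB, i);
    }
}
```

The cost Michael objects to is visible here: each set allocates a fresh object, though for a config created a handful of times per application that overhead is negligible, which is Earwin's point.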
[jira] Resolved: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe resolved SOLR-2427. --- Resolution: Fixed Fix Version/s: 4.0 3.2 Committed: - lucene_solr_3_1 revision 1081856 - branch_3x revision 1081860 - trunk revision 1081880 Ant build & tests succeed. Maven build & tests succeed. {{ant -Dversion=... -Dspecversion=... prepare-release sign-artifacts}} works and the generated Maven artifacts look good. > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Affects Versions: 3.1 >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Blocker > Fix For: 3.1, 3.2, 4.0 > > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007043#comment-13007043 ] Steven Rowe commented on LUCENE-2960: - How about an IWC base class, extended by IWCinit and IWClive. IWCinit has setters for everything, and IW.getConfig() returns IWClive, which has no setters for things you can't set on the fly. > Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter > -- > > Key: LUCENE-2960 > URL: https://issues.apache.org/jira/browse/LUCENE-2960 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Shay Banon >Priority: Blocker > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2960.patch > > > In 3.1 the ability to setRAMBufferSizeMB is deprecated, and removed in trunk. > It would be great to be able to control that on a live IndexWriter. Other > possible two methods that would be great to bring back are > setTermIndexInterval and setReaderTermsIndexDivisor. Most of the other > setters can actually be set on the MergePolicy itself, so no need for setters > for those (I think). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007036#comment-13007036 ] Mark Miller commented on LUCENE-2960: - {quote}I really don't like that this approach would split IW configuration into two places. Like you look at the javadocs for IWC and think that you cannot change the RAM buffer size. IWC should be the one place you go to see which settings you can change about the IW. That some of these settings take effect "live" while others do not is really an orthogonal (and I think, secondary, ie handled fine w/ jdocs) aspect/concern.{quote} You can just as easily argue that the javadocs for IWC could explain that live settings are on the IW. That pattern just smells wrong. {quote} But, if you want to change something live, you can IW.getConfig().setFoo(...). The config instance is a private clone to that IW. {quote} This is better than nothing. Another thought is to offer all settings on the IWC for init convenience and exposure and then add javadoc about updaters on IW for those settings that can be changed on the fly - or one update method and enums... > Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter > -- > > Key: LUCENE-2960 > URL: https://issues.apache.org/jira/browse/LUCENE-2960 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Shay Banon >Priority: Blocker > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2960.patch > > > In 3.1 the ability to setRAMBufferSizeMB is deprecated, and removed in trunk. > It would be great to be able to control that on a live IndexWriter. Other > possible two methods that would be great to bring back are > setTermIndexInterval and setReaderTermsIndexDivisor. Most of the other > setters can actually be set on the MergePolicy itself, so no need for setters > for those (I think). -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
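[Editor's sketch] The "private clone" compromise quoted above (IW.getConfig() returning a clone owned by that writer) can be sketched as follows; class and field names are hypothetical, not the real Lucene API:

```java
// Hypothetical sketch: the writer defensively clones the config it is
// given, so mutations through getConfig() take effect "live" for this
// writer only, and the caller's original config object is untouched.
class LiveConfig implements Cloneable {
    double ramBufferMB = 16.0;

    LiveConfig setRAMBufferSizeMB(double mb) { ramBufferMB = mb; return this; }

    @Override
    public LiveConfig clone() {
        try {
            return (LiveConfig) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);  // cannot happen: we implement Cloneable
        }
    }
}

class WriterSketch {
    private final LiveConfig config;

    WriterSketch(LiveConfig c) { config = c.clone(); }  // private copy

    LiveConfig getConfig() { return config; }
}
```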
[jira] Commented: (LUCENE-2967) Use linear probing with an additional good bit avalanching function in FST's NodeHash.
[ https://issues.apache.org/jira/browse/LUCENE-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007031#comment-13007031 ] Michael McCandless commented on LUCENE-2967: Hmm, unfortunately, I'm seeing the patch make FST building slower, at least in my env/test set. I built FST for the 38M wikipedia terms. I ran 6 times each, alternating trunk & patch. I also turned off saving the FST, and ran -noverify, so I'm only measuring time to build it. I run java -Xmx2g -Xms2g -Xbatch, and measure wall clock time. Times on trunk (seconds): {noformat} 43.795 43.493 44.343 44.045 43.645 43.846 {noformat} Times w/ patch: {noformat} 46.595 47.751 47.901 47.901 47.901 47.700 {noformat} We could also try less generous load factors... > Use linear probing with an additional good bit avalanching function in FST's > NodeHash. > -- > > Key: LUCENE-2967 > URL: https://issues.apache.org/jira/browse/LUCENE-2967 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: 4.0 > > Attachments: LUCENE-2967.patch > > > I recently had an interesting discussion with Sebastiano Vigna (fastutil), > who suggested that linear probing, given a hash mixing function with good > avalanche properties, is a way better method of constructing lookups in > associative arrays compared to quadratic probing. Indeed, with linear probing > you can implement removals from a hash map without removed slot markers and > linear probing has nice properties with respect to modern CPUs (caches). I've > reimplemented HPPC's hash maps to use linear probing and we observed a nice > speedup (the same applies for fastutils of course). > This patch changes NodeHash's implementation to use linear probing. The code > is a bit simpler (I think :). I also moved the load factor to a constant -- > 0.5 seems like a generous load factor, especially if we allow large FSTs to > be built. 
I don't see any significant speedup in constructing large automata, > but there is no slowdown either (I checked on one machine only for now, but > will verify on other machines too). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
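[Editor's sketch] For readers following the discussion, here is a minimal sketch of the technique at issue: linear probing guarded by a bit-avalanching mix function. This is illustrative only, not the actual NodeHash code; the mixer shown is the well-known MurmurHash3 64-bit finalizer, and a capacity comfortably above the key count (e.g. the 0.5 load factor mentioned in the issue) is assumed.

```java
// Open-addressed set of non-zero long keys using linear probing.
// The mix() step avalanches bits so that masking off the low bits
// still spreads keys evenly -- the property linear probing relies on.
class LinearProbeSet {
    private final long[] slots;  // 0 marks an empty slot
    private final int mask;

    LinearProbeSet(int capacityPowerOfTwo) {
        slots = new long[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    // MurmurHash3 64-bit finalizer: strong avalanche behavior.
    private static long mix(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    void add(long key) {
        int slot = (int) (mix(key) & mask);
        while (slots[slot] != 0 && slots[slot] != key) {
            slot = (slot + 1) & mask;  // step by one; wraps around
        }
        slots[slot] = key;
    }

    boolean contains(long key) {
        int slot = (int) (mix(key) & mask);
        while (slots[slot] != 0) {
            if (slots[slot] == key) return true;
            slot = (slot + 1) & mask;
        }
        return false;  // hit an empty slot: key absent
    }
}
```

Unlike quadratic probing, a miss walks consecutive slots, which is cache-friendly, and removals could be supported without tombstone markers, which is the motivation cited in the issue description.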
[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007011#comment-13007011 ] Michael McCandless commented on LUCENE-2960: bq. Hmmm, infoStream is just for debugging... should we really make it volatile? I'll remove its volatile... {quote} bq. IWC cannot be made immutable – you build it up incrementally (new IWC(...).setThis(...).setThat(...)). Its fields cannot be final. Setters can return modified immutable copy of 'this'. So you get both incremental building and immutability. {quote} Oh yeah. But then we'd clone the full IWC on every set... this seems like overkill in the name of "purity". {quote} What about earlier compromise mentioned by Shay, Mark, me? Keep setters for 'live' properties on IW. This clearly draws the line, and you don't have to consult Javadocs for each and every setting to know if you can change it live or not. {quote} I really don't like that this approach would split IW configuration into two places. Like you look at the javadocs for IWC and think that you cannot change the RAM buffer size. IWC should be the one place you go to see which settings you can change about the IW. That some of these settings take effect "live" while others do not is really an orthogonal (and I think, secondary, ie handled fine w/ jdocs) aspect/concern. > Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter > -- > > Key: LUCENE-2960 > URL: https://issues.apache.org/jira/browse/LUCENE-2960 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Shay Banon >Priority: Blocker > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2960.patch > > > In 3.1 the ability to setRAMBufferSizeMB is deprecated, and removed in trunk. > It would be great to be able to control that on a live IndexWriter. Other > possible two methods that would be great to bring back are > setTermIndexInterval and setReaderTermsIndexDivisor. 
Most of the other > setters can actually be set on the MergePolicy itself, so no need for setters > for those (I think). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2573) Tiered flushing of DWPTs by RAM with low/high water marks
[ https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007003#comment-13007003 ] Michael McCandless commented on LUCENE-2573: bq. it currently holds the ram usage for that DWPT when it was checked out so that I can reduce the flushBytes accordingly. We can maybe get rid of it entirely but I don't want to rely on the DWPT bytesUsed() though. Hmm, but, once a DWPT is pulled from production, its bytesUsed() should not be changing anymore? Like why can't we use it to hold its bytesUsed? bq. I generally don't like cluttering DocWriter and let it grow like IW. DocWriterSession might not be the ideal name for this class but its really a ram tracker for this DW. Yet, we can move out some parts that do not directly relate to mem tracking. Maybe DocWriterBytes? Well DocWriter is quite small now :) (On RT branch). And adding another class means we have to be careful about proper sync'ing (lock order, to avoid deadlock)... and I think it should get smaller if we can remove state[] array, FlushState enum, etc. but, OK I guess we can leave it as separate for now. How about DocumentsWriterRAMUsage? RAMTracker? {quote} bq. Instead of FlushPolicy.message, can't the policy call DW.message? I don't want to couple that API to DW. What would be the benefit beside from saving a single method? {quote} Hmm, good point. Though, it already has a SetOnce -- how come? Can the policy call IW.message? I just think FlushPolicy ought to be very lean, ie show you exactly what you need to implement... {quote} bq. On the by-RAM flush policies... when you hit the high water mark, we should 1) flush all DWPTs and 2) stall any other threads. Well I am not sure if we should do that. I don't really see why we should forcefully stop the world here. Incoming threads will pick up a flush immediately and if we have enough resources to index further why should we wait until all DWPT are flushed. 
if we stall I fear that we could queue up threads that could help flushing while stalling would simply stop them doing anything, right? You can still control this with the healthiness though. We currently do flush all DWPT btw. once we hit the HW. {quote} As long as we default the high mark to something "generous" (2X low mark), I think this approach should work well. Ie, we "begin" flushing as soon as low mark is crossed on active RAM. We pick the biggest DWPT and take it out of rotation, and immediately deduct its RAM usage from the active pool. If, while we are still flushing, active RAM again grows above the low mark, then we pull another DWPT, etc. But then if ever the total flushing + active exceeds the high mark, we stall. BTW why do we track flushPending RAM vs flushing RAM? Is that distinction necessary? (Can't we just track "flushing" RAM?). > Tiered flushing of DWPTs by RAM with low/high water marks > - > > Key: LUCENE-2573 > URL: https://issues.apache.org/jira/browse/LUCENE-2573 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Michael Busch >Assignee: Simon Willnauer >Priority: Minor > Fix For: Realtime Branch > > Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, > LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch > > > Now that we have DocumentsWriterPerThreads we need to track total consumed > RAM across all DWPTs. > A flushing strategy idea that was discussed in LUCENE-2324 was to use a > tiered approach: > - Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM) > - Flush all DWPTs at a high water mark (e.g. at 110%) > - Use linear steps in between high and low watermark: E.g. when 5 DWPTs are > used, flush at 90%, 95%, 100%, 105% and 110%. > Should we allow the user to configure the low and high water mark values > explicitly using total values (e.g. low water mark at 120MB, high water mark > at 140MB)? 
Or shall we keep for simplicity the single setRAMBufferSizeMB() > config method and use something like 90% and 110% for the water marks? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
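[Editor's sketch] The low/high water mark scheme discussed in this thread reduces to a small piece of RAM accounting. The sketch below uses hypothetical names and thresholds; the real policy also has to pick the largest DWPT and deal with concurrency, which is omitted here.

```java
// Two-pool RAM accounting: "active" bytes belong to DWPTs still indexing,
// "flushing" bytes to DWPTs already checked out for flushing. Flushing
// starts past the low mark; indexing threads stall only past the high mark.
class WaterMarkAccounting {
    final long lowMark, highMark;  // e.g. highMark = 2 * lowMark
    long activeBytes, flushingBytes;

    WaterMarkAccounting(long lowMark, long highMark) {
        this.lowMark = lowMark;
        this.highMark = highMark;
    }

    boolean shouldStartFlush() { return activeBytes > lowMark; }

    boolean shouldStall() { return activeBytes + flushingBytes > highMark; }

    // Take a DWPT out of rotation: its RAM moves from the active pool
    // to the flushing pool immediately, as described in the thread.
    void checkoutForFlush(long dwptBytes) {
        activeBytes -= dwptBytes;
        flushingBytes += dwptBytes;
    }

    void flushFinished(long dwptBytes) { flushingBytes -= dwptBytes; }
}
```

With a generous high mark, incoming threads keep indexing (or pick up flushes) while earlier flushes drain; only sustained ingest that outruns flushing crosses the high mark and stalls.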
[jira] Updated: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated SOLR-2427: -- Priority: Blocker (was: Trivial) Affects Version/s: 3.1 Fix Version/s: 3.1 > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Affects Versions: 3.1 >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Blocker > Fix For: 3.1 > > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006960#comment-13006960 ] Steven Rowe commented on SOLR-2427: --- It looks to me like the UIMA contrib was committed before uima-core 2.3.1 was released, using a 2.3.1-SNAPSHOT version of the jar, and then never upgraded after the release. I think it makes sense to switch the version of the uima-core jar in Solr's source tree to the released 2.3.1 version, and then stop publishing a Solr-specific uima-core jar. > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Trivial > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006952#comment-13006952 ] Steven Rowe commented on SOLR-2427: --- Crap, I got the uima-core situation exactly backward. The version in {{solr/contrib/uima/lib/}} was compiled, by you, Tommaso, using Java 1.6 (according to {{META-INF/MANIFEST.MF}}). However, since the clustering contrib tests succeed under Java 1.5, I assume that although the jar was compiled using Java 1.6, the target version was 1.5. The version in the maven central repository was actually compiled with 1.5 (again, according to {{META-INF/MANIFEST.MF}}). Tommaso, why is the version in Solr's source tree different from the maven version of the jar? > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Trivial > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006951#comment-13006951 ] Tommaso Teofili commented on SOLR-2427: --- That is unexpected as UIMA should've been deployed with 1.5. I'll check this out as soon as I can. > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Trivial > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006950#comment-13006950 ] Steven Rowe commented on SOLR-2427: --- Hmm, [uimaj-core-2.3.1.jar in the maven repository|http://repo1.maven.org/maven2/org/apache/uima/uimaj-core/2.3.1/] was compiled with Java 1.6, while the version in {{solr/contrib/uima/lib/}} was compiled with Java 1.5. Tommaso, do you know of a maven-hosted Java-1.5-compiled version of the uima-core jar? If not, I will leave things as they are now, continuing to publish a Solr-specific Java-1.5-compiled uima-core jar. > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Trivial > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006946#comment-13006946 ] Tommaso Teofili commented on SOLR-2427: --- bq. That makes little sense, though, now that I have reconsidered it, so I'll drop maven publishing of the Solr-specific uima-core jar. The other UIMA SNAPSHOT dependencies, however, will need to be published as Solr-specific versions, since the maven central repository rejects POMs with SNAPSHOT dependencies. +1 :) > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Trivial > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2952: Attachment: LUCENE-2952.patch This hooks it into compile-core, but has the unfortunate side-effect of being called a whole bunch of times, which is not good. Need to read up on how to avoid that in ant (or if anyone has suggestions, that would be great). Otherwise, I think the baseline functionality is ready to go. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
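[Editorial note: one common Ant idiom for the "called a whole bunch of times" problem above is to latch the check behind an up-to-date marker file, so repeated invocations (e.g. via subant/antcall) skip the work. The target, property, and file names below are illustrative, not from the LUCENE-2952 patch:]

```xml
<!-- Set licenses.checked only when the marker is newer than all jars. -->
<target name="license-check-uptodate">
  <uptodate property="licenses.checked"
            targetfile="${build.dir}/licenses.checked.marker">
    <srcfiles dir="lib" includes="**/*.jar"/>
  </uptodate>
</target>

<!-- Runs the real checker only when licenses.checked is NOT set,
     then touches the marker so later invocations are skipped. -->
<target name="check-licenses" depends="license-check-uptodate"
        unless="licenses.checked">
  <!-- invoke the actual license/dependency checker task here -->
  <touch file="${build.dir}/licenses.checked.marker" mkdirs="true"/>
</target>
```

Hooking compile-core's depends on check-licenses would then re-run the check only when a jar under lib changes.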
[jira] Commented: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006942#comment-13006942 ] Steven Rowe commented on SOLR-2427: --- Thanks Tommaso, I will rename them. Separately, although you previously said that uima-core.jar is the released 2.3.1 version, I still had been thinking that along with the other UIMA jars, its maven artifact should be published under the Apache Solr project. That makes little sense, though, now that I have reconsidered it, so I'll drop maven publishing of the Solr-specific uima-core jar. The other UIMA SNAPSHOT dependencies, however, will need to be published as Solr-specific versions, since the maven central repository rejects POMs with SNAPSHOT dependencies. > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Trivial > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (SOLR-2428) Upgrade carrot2-core dependency to a version with a Java 1.5-compiled jar
[ https://issues.apache.org/jira/browse/SOLR-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated SOLR-2428: -- Description: As of not-yet-released version 3.4.4, the carrot2-core jar will be published as a retrowoven 1.5 version (in addition to a Java-1.6-compiled version) - see Dawid Weiss's comment on [LUCENE-2957|https://issues.apache.org/jira/browse/LUCENE-2957?focusedCommentId=13006878&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13006878] (was: As of not-yet-released version 3.4.4, the carrot2-core will publish a retowoven 1.5 version of the jar - see Dawid Weiss's comment on [LUCENE-2957|https://issues.apache.org/jira/browse/LUCENE-2957?focusedCommentId=13006878&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13006878]) > Upgrade carrot2-core dependency to a version with a Java 1.5-compiled jar > - > > Key: SOLR-2428 > URL: https://issues.apache.org/jira/browse/SOLR-2428 > Project: Solr > Issue Type: Improvement > Components: contrib - Clustering >Affects Versions: 3.1.1, 3.2 >Reporter: Steven Rowe >Assignee: Dawid Weiss >Priority: Minor > Fix For: 3.1.1, 3.2 > > > As of not-yet-released version 3.4.4, the carrot2-core jar will be published > as a retrowoven 1.5 version (in addition to a Java-1.6-compiled version) - > see Dawid Weiss's comment on > [LUCENE-2957|https://issues.apache.org/jira/browse/LUCENE-2957?focusedCommentId=13006878&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13006878] -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Assigned: (SOLR-2428) Upgrade carrot2-core dependency to a version with a Java 1.5-compiled jar
[ https://issues.apache.org/jira/browse/SOLR-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reassigned SOLR-2428: - Assignee: Dawid Weiss > Upgrade carrot2-core dependency to a version with a Java 1.5-compiled jar > - > > Key: SOLR-2428 > URL: https://issues.apache.org/jira/browse/SOLR-2428 > Project: Solr > Issue Type: Improvement > Components: contrib - Clustering >Affects Versions: 3.1.1, 3.2 >Reporter: Steven Rowe >Assignee: Dawid Weiss >Priority: Minor > Fix For: 3.1.1, 3.2 > > > As of not-yet-released version 3.4.4, the carrot2-core will publish a > retrowoven 1.5 version of the jar - see Dawid Weiss's comment on > [LUCENE-2957|https://issues.apache.org/jira/browse/LUCENE-2957?focusedCommentId=13006878&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13006878] -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Assigned: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe reassigned SOLR-2427: - Assignee: Steven Rowe > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Reporter: Grant Ingersoll >Assignee: Steven Rowe >Priority: Trivial > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2957) generate-maven-artifacts target should include all non-Mavenized Lucene & Solr dependencies
[ https://issues.apache.org/jira/browse/LUCENE-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006937#comment-13006937 ] Steven Rowe commented on LUCENE-2957: - Thanks Dawid - I've created SOLR-2428 to track upgrading once 3.4.4 has been released. > generate-maven-artifacts target should include all non-Mavenized Lucene & > Solr dependencies > --- > > Key: LUCENE-2957 > URL: https://issues.apache.org/jira/browse/LUCENE-2957 > Project: Lucene - Java > Issue Type: Improvement > Components: Build >Affects Versions: 3.1, 3.2, 4.0 >Reporter: Steven Rowe >Assignee: Steven Rowe >Priority: Minor > Fix For: 3.1, 3.2, 4.0 > > Attachments: LUCENE-2923-part3.patch, LUCENE-2957-part2.patch, > LUCENE-2957.patch > > > Currently, in addition to deploying artifacts for all of the Lucene and Solr > modules to a repository (by default local), the {{generate-maven-artifacts}} > target also deploys artifacts for the following non-Mavenized Solr > dependencies (lucene_solr_3_1 version given here): > # {{solr/lib/commons-csv-1.0-SNAPSHOT-r966014.jar}} as > org.apache.solr:solr-commons-csv:3.1 > # {{solr/lib/apache-solr-noggit-r944541.jar}} as > org.apache.solr:solr-noggit:3.1 > \\ \\ > The following {{.jar}}'s should be added to the above list (lucene_solr_3_1 > version given here): > \\ \\ > # {{lucene/contrib/icu/lib/icu4j-4_6.jar}} > # > {{lucene/contrib/benchmark/lib/xercesImpl-2.9.1-patched-XERCESJ}}{{-1257.jar}} > # {{solr/contrib/clustering/lib/carrot2-core-3.4.2.jar}}** > # {{solr/contrib/uima/lib/uima-an-alchemy.jar}} > # {{solr/contrib/uima/lib/uima-an-calais.jar}} > # {{solr/contrib/uima/lib/uima-an-tagger.jar}} > # {{solr/contrib/uima/lib/uima-an-wst.jar}} > # {{solr/contrib/uima/lib/uima-core.jar}} > \\ \\ > I think it makes sense to follow the same model as the current non-Mavenized > dependencies: > \\ \\ > * {{groupId}} = {{org.apache.solr/.lucene}} > * {{artifactId}} = {{solr-/lucene-}}, > * {{version}} = . 
> **The carrot2-core jar doesn't need to be included in trunk's release > artifacts, since there already is a Mavenized Java6-compiled jar. branch_3x > and lucene_solr_3_1 will need this Solr-specific Java5-compiled maven > artifact, though. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006936#comment-13006936 ] Tommaso Teofili commented on SOLR-2427: --- The mentioned jars have the following versions and revisions: - uima-core.jar is 2.3.1 (released) - uima-an-alchemy.jar is 2.3.1-SNAPSHOT revision 1062868 - uima-an-calais.jar is 2.3.1-SNAPSHOT revision 1062868 - uima-an-tagger.jar is 2.3.1-SNAPSHOT revision 1062868 - uima-an-wst.jar is 2.3.1-SNAPSHOT revision 1076132 Hope this helps. > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Reporter: Grant Ingersoll >Priority: Trivial > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (SOLR-2428) Upgrade carrot2-core dependency to a version with a Java 1.5-compiled jar
Upgrade carrot2-core dependency to a version with a Java 1.5-compiled jar - Key: SOLR-2428 URL: https://issues.apache.org/jira/browse/SOLR-2428 Project: Solr Issue Type: Improvement Components: contrib - Clustering Affects Versions: 3.1.1, 3.2 Reporter: Steven Rowe Priority: Minor Fix For: 3.1.1, 3.2 As of not-yet-released version 3.4.4, the carrot2-core will publish a retrowoven 1.5 version of the jar - see Dawid Weiss's comment on [LUCENE-2957|https://issues.apache.org/jira/browse/LUCENE-2957?focusedCommentId=13006878&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13006878] -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2427) UIMA jars are missing version numbers
[ https://issues.apache.org/jira/browse/SOLR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006930#comment-13006930 ] Robert Muir commented on SOLR-2427: --- I agree, i think best would be to format them like the others in solr: for example commons-csv-1.0-SNAPSHOT-r966014.jar > UIMA jars are missing version numbers > - > > Key: SOLR-2427 > URL: https://issues.apache.org/jira/browse/SOLR-2427 > Project: Solr > Issue Type: Bug >Reporter: Grant Ingersoll >Priority: Trivial > > We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2952: Attachment: LUCENE-2952.patch Pretty close to standalone completion. Next step to hook it in. I'm going to commit the license naming normalization now but not the validation code yet. Also, renamed LicenseChecker to DependencyChecker as it might be useful for checking other things like that all jars have version numbers. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (SOLR-2427) UIMA jars are missing version numbers
UIMA jars are missing version numbers - Key: SOLR-2427 URL: https://issues.apache.org/jira/browse/SOLR-2427 Project: Solr Issue Type: Bug Reporter: Grant Ingersoll Priority: Trivial We should have version numbers on the UIMA jar files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-2967) Use linear probing with an additional good bit avalanching function in FST's NodeHash.
[ https://issues.apache.org/jira/browse/LUCENE-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-2967: Attachment: LUCENE-2967.patch Linear probing in NodeHash. > Use linear probing with an additional good bit avalanching function in FST's > NodeHash. > -- > > Key: LUCENE-2967 > URL: https://issues.apache.org/jira/browse/LUCENE-2967 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: 4.0 > > Attachments: LUCENE-2967.patch > > > I recently had an interesting discussion with Sebastiano Vigna (fastutil), > who suggested that linear probing, given a hash mixing function with good > avalanche properties, is a way better method of constructing lookups in > associative arrays compared to quadratic probing. Indeed, with linear probing > you can implement removals from a hash map without removed slot markers and > linear probing has nice properties with respect to modern CPUs (caches). I've > reimplemented HPPC's hash maps to use linear probing and we observed a nice > speedup (the same applies for fastutils of course). > This patch changes NodeHash's implementation to use linear probing. The code > is a bit simpler (I think :). I also moved the load factor to a constant -- > 0.5 seems like a generous load factor, especially if we allow large FSTs to > be built. I don't see any significant speedup in constructing large automata, > but there is no slowdown either (I checked on one machine only for now, but > will verify on other machines too). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (LUCENE-2967) Use linear probing with an additional good bit avalanching function in FST's NodeHash.
Use linear probing with an additional good bit avalanching function in FST's NodeHash. -- Key: LUCENE-2967 URL: https://issues.apache.org/jira/browse/LUCENE-2967 Project: Lucene - Java Issue Type: Improvement Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Trivial Fix For: 4.0 I recently had an interesting discussion with Sebastiano Vigna (fastutil), who suggested that linear probing, given a hash mixing function with good avalanche properties, is a way better method of constructing lookups in associative arrays compared to quadratic probing. Indeed, with linear probing you can implement removals from a hash map without removed slot markers and linear probing has nice properties with respect to modern CPUs (caches). I've reimplemented HPPC's hash maps to use linear probing and we observed a nice speedup (the same applies for fastutils of course). This patch changes NodeHash's implementation to use linear probing. The code is a bit simpler (I think :). I also moved the load factor to a constant -- 0.5 seems like a generous load factor, especially if we allow large FSTs to be built. I don't see any significant speedup in constructing large automata, but there is no slowdown either (I checked on one machine only for now, but will verify on other machines too). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
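[Editorial note: for readers unfamiliar with the technique described above, here is a minimal standalone sketch of a linear-probing hash set over long keys, using a murmur3-style 64-bit finalizer for avalanching and a 0.5 load factor. It is illustrative only — this is not the NodeHash patch, and the class/method names are invented:]

```java
class LinearProbingSketch {
    private long[] keys; // slot value 0 means "empty" (keys must be non-zero)
    private int mask;
    private int size;

    LinearProbingSketch(int capacity) { // capacity must be a power of two
        keys = new long[capacity];
        mask = capacity - 1;
    }

    // murmur3-style 64-bit finalizer: good avalanche properties, so nearby
    // keys land in well-spread slots and linear probe chains stay short.
    static long mix(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    // Linear probing: scan consecutive slots (cache-friendly) from the
    // mixed hash position until the key or an empty slot is found.
    boolean add(long key) {
        if (2 * (size + 1) > keys.length) rehash(); // keep load factor <= 0.5
        int slot = (int) (mix(key) & mask);
        while (keys[slot] != 0) {
            if (keys[slot] == key) return false; // already present
            slot = (slot + 1) & mask;            // next slot, wrapping around
        }
        keys[slot] = key;
        size++;
        return true;
    }

    boolean contains(long key) {
        int slot = (int) (mix(key) & mask);
        while (keys[slot] != 0) {
            if (keys[slot] == key) return true;
            slot = (slot + 1) & mask;
        }
        return false;
    }

    private void rehash() {
        long[] old = keys;
        keys = new long[old.length * 2];
        mask = keys.length - 1;
        size = 0;
        for (long k : old) if (k != 0) add(k);
    }
}
```

Note that with no removals (as when freezing FST nodes) linear probing needs no tombstone markers at all.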
Re: svn commit: r1081745 - /lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java
Thanks Mike :) Dawid On Tue, Mar 15, 2011 at 1:22 PM, Michael McCandless wrote: > Looks good Dawid! > > On Tue, Mar 15, 2011 at 8:20 AM, wrote: >> Author: dweiss >> Date: Tue Mar 15 12:20:03 2011 >> New Revision: 1081745 >> >> URL: http://svn.apache.org/viewvc?rev=1081745&view=rev >> Log: >> Adding -noverify and a little bit nicer output to TestFSTs. These are >> debugging/analysis utils that are not used anywhere, so I commit them >> without the patch. >> >> Modified: >> >> lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java >> >> Modified: >> lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java >> URL: >> http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java?rev=1081745&r1=1081744&r2=1081745&view=diff >> == >> --- >> lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java >> (original) >> +++ >> lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java >> Tue Mar 15 12:20:03 2011 >> @@ -25,16 +25,7 @@ import java.io.IOException; >> import java.io.InputStreamReader; >> import java.io.OutputStreamWriter; >> import java.io.Writer; >> -import java.util.ArrayList; >> -import java.util.Arrays; >> -import java.util.Collections; >> -import java.util.HashMap; >> -import java.util.HashSet; >> -import java.util.Iterator; >> -import java.util.List; >> -import java.util.Map; >> -import java.util.Random; >> -import java.util.Set; >> +import java.util.*; >> >> import org.apache.lucene.analysis.MockAnalyzer; >> import org.apache.lucene.document.Document; >> @@ -1098,7 +1089,7 @@ public class TestFSTs extends LuceneTest >> >> protected abstract T getOutput(IntsRef input, int ord) throws >> IOException; >> >> - public void run(int limit) throws IOException { >> + public void run(int limit, boolean verify) throws IOException { >> BufferedReader is = new BufferedReader(new InputStreamReader(new >> 
FileInputStream(wordsFileIn), "UTF-8"), 65536); >> try { >> final IntsRef intsRef = new IntsRef(10); >> @@ -1115,7 +1106,9 @@ public class TestFSTs extends LuceneTest >> >> ord++; >> if (ord % 50 == 0) { >> - System.out.println(((System.currentTimeMillis()-tStart)/1000.0) >> + "s: " + ord + "..."); >> + System.out.println( >> + String.format(Locale.ENGLISH, >> + "%6.2fs: %9d...", ((System.currentTimeMillis() - >> tStart) / 1000.0), ord)); >> } >> if (ord >= limit) { >> break; >> @@ -1144,6 +1137,10 @@ public class TestFSTs extends LuceneTest >> >> System.out.println("Saved FST to fst.bin."); >> >> + if (!verify) { >> + System.exit(0); >> + } >> + >> System.out.println("\nNow verify..."); >> >> is.close(); >> @@ -1194,6 +1191,7 @@ public class TestFSTs extends LuceneTest >> int inputMode = 0; // utf8 >> boolean storeOrds = false; >> boolean storeDocFreqs = false; >> + boolean verify = true; >> while(idx < args.length) { >> if (args[idx].equals("-prune")) { >> prune = Integer.valueOf(args[1+idx]); >> @@ -1215,6 +1213,9 @@ public class TestFSTs extends LuceneTest >> if (args[idx].equals("-ords")) { >> storeOrds = true; >> } >> + if (args[idx].equals("-noverify")) { >> + verify = false; >> + } >> idx++; >> } >> >> @@ -1235,7 +1236,7 @@ public class TestFSTs extends LuceneTest >> return new PairOutputs.Pair(o1.get(ord), >> >> o2.get(_TestUtil.nextInt(rand, 1, 5000))); >> } >> - }.run(limit); >> + }.run(limit, verify); >> } else if (storeOrds) { >> // Store only ords >> final PositiveIntOutputs outputs = >> PositiveIntOutputs.getSingleton(true); >> @@ -1244,7 +1245,7 @@ public class TestFSTs extends LuceneTest >> public Long getOutput(IntsRef input, int ord) { >> return outputs.get(ord); >> } >> - }.run(limit); >> + }.run(limit, verify); >> } else if (storeDocFreqs) { >> // Store only docFreq >> final PositiveIntOutputs outputs = >> PositiveIntOutputs.getSingleton(false); >> @@ -1257,7 +1258,7 @@ public class TestFSTs extends LuceneTest >> } >> return 
outputs.get(_TestUtil.nextInt(rand, 1, 5000)); >> } >> - }.run(limit); >> + }.run(limit, verify); >> } else { >> // Store nothing >> final NoOutputs outputs = NoOutputs.getSingleton(); >> @@ -1267,7 +1268,7 @@ public class TestFSTs extends LuceneTest >> public Object getOutput(IntsRef input, int ord) { >> return NO_OUTPUT; >> } >> - }.run(limit); >> + }.run(limit, verify); >> } >> } >> >
Re: svn commit: r1081745 - /lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java
Looks good Dawid! On Tue, Mar 15, 2011 at 8:20 AM, wrote: > Author: dweiss > Date: Tue Mar 15 12:20:03 2011 > New Revision: 1081745 > > URL: http://svn.apache.org/viewvc?rev=1081745&view=rev > Log: > Adding -noverify and a little bit nicer output to TestFSTs. These are > debugging/analysis utils that are not used anywhere, so I commit them without > the patch. > > Modified: > > lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java > > Modified: > lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java > URL: > http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java?rev=1081745&r1=1081744&r2=1081745&view=diff > == > --- > lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java > (original) > +++ > lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/fst/TestFSTs.java > Tue Mar 15 12:20:03 2011 > @@ -25,16 +25,7 @@ import java.io.IOException; > import java.io.InputStreamReader; > import java.io.OutputStreamWriter; > import java.io.Writer; > -import java.util.ArrayList; > -import java.util.Arrays; > -import java.util.Collections; > -import java.util.HashMap; > -import java.util.HashSet; > -import java.util.Iterator; > -import java.util.List; > -import java.util.Map; > -import java.util.Random; > -import java.util.Set; > +import java.util.*; > > import org.apache.lucene.analysis.MockAnalyzer; > import org.apache.lucene.document.Document; > @@ -1098,7 +1089,7 @@ public class TestFSTs extends LuceneTest > > protected abstract T getOutput(IntsRef input, int ord) throws IOException; > > - public void run(int limit) throws IOException { > + public void run(int limit, boolean verify) throws IOException { > BufferedReader is = new BufferedReader(new InputStreamReader(new > FileInputStream(wordsFileIn), "UTF-8"), 65536); > try { > final IntsRef intsRef = new IntsRef(10); > @@ -1115,7 +1106,9 @@ public class TestFSTs extends 
LuceneTest > > ord++; > if (ord % 50 == 0) { > - System.out.println(((System.currentTimeMillis()-tStart)/1000.0) > + "s: " + ord + "..."); > + System.out.println( > + String.format(Locale.ENGLISH, > + "%6.2fs: %9d...", ((System.currentTimeMillis() - tStart) > / 1000.0), ord)); > } > if (ord >= limit) { > break; > @@ -1144,6 +1137,10 @@ public class TestFSTs extends LuceneTest > > System.out.println("Saved FST to fst.bin."); > > + if (!verify) { > + System.exit(0); > + } > + > System.out.println("\nNow verify..."); > > is.close(); > @@ -1194,6 +1191,7 @@ public class TestFSTs extends LuceneTest > int inputMode = 0; // utf8 > boolean storeOrds = false; > boolean storeDocFreqs = false; > + boolean verify = true; > while(idx < args.length) { > if (args[idx].equals("-prune")) { > prune = Integer.valueOf(args[1+idx]); > @@ -1215,6 +1213,9 @@ public class TestFSTs extends LuceneTest > if (args[idx].equals("-ords")) { > storeOrds = true; > } > + if (args[idx].equals("-noverify")) { > + verify = false; > + } > idx++; > } > > @@ -1235,7 +1236,7 @@ public class TestFSTs extends LuceneTest > return new PairOutputs.Pair(o1.get(ord), > > o2.get(_TestUtil.nextInt(rand, 1, 5000))); > } > - }.run(limit); > + }.run(limit, verify); > } else if (storeOrds) { > // Store only ords > final PositiveIntOutputs outputs = > PositiveIntOutputs.getSingleton(true); > @@ -1244,7 +1245,7 @@ public class TestFSTs extends LuceneTest > public Long getOutput(IntsRef input, int ord) { > return outputs.get(ord); > } > - }.run(limit); > + }.run(limit, verify); > } else if (storeDocFreqs) { > // Store only docFreq > final PositiveIntOutputs outputs = > PositiveIntOutputs.getSingleton(false); > @@ -1257,7 +1258,7 @@ public class TestFSTs extends LuceneTest > } > return outputs.get(_TestUtil.nextInt(rand, 1, 5000)); > } > - }.run(limit); > + }.run(limit, verify); > } else { > // Store nothing > final NoOutputs outputs = NoOutputs.getSingleton(); > @@ -1267,7 +1268,7 @@ public class TestFSTs extends 
LuceneTest > public Object getOutput(IntsRef input, int ord) { > return NO_OUTPUT; > } > - }.run(limit); > + }.run(limit, verify); > } > } > > > > -- Mike http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucen
[jira] Resolved: (SOLR-2426) Build failing
[ https://issues.apache.org/jira/browse/SOLR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved SOLR-2426. --- Resolution: Not A Problem Trunk requires java 6. > Build failing > - > > Key: SOLR-2426 > URL: https://issues.apache.org/jira/browse/SOLR-2426 > Project: Solr > Issue Type: Bug >Reporter: Bill Bell > > ant clean > ant example > trunk > [javac] ^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSetHitCo > llector.java:77: incompatible types > [javac] found : org.apache.solr.search.BitDocSet > [javac] required: org.apache.solr.search.DocSet > [javac] return new BitDocSet(bits,pos); > [javac] ^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSetHitCo > llector.java:132: incompatible types > [javac] found : org.apache.solr.search.SortedIntDocSet > [javac] required: org.apache.solr.search.DocSet > [javac] return new SortedIntDocSet(scratch, pos); > [javac] ^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSetHitCo > llector.java:136: incompatible types > [javac] found : org.apache.solr.search.BitDocSet > [javac] required: org.apache.solr.search.DocSet > [javac] return new BitDocSet(bits,pos); > [javac] ^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.ja > va:26: org.apache.solr.search.DocSlice is not abstract and does not override > abs > tract method getTopFilter() in org.apache.solr.search.DocSet > [javac] public class DocSlice extends DocSetBase implements DocList { > [javac]^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.ja > va:54: incompatible types > [javac] found : org.apache.solr.search.DocSlice > [javac] required: org.apache.solr.search.DocList > [javac] if (this.offset == offset && this.len==len) return this; > [javac]^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.ja > va:62: incompatible types > [javac] found : org.apache.solr.search.DocSlice 
> [javac] required: org.apache.solr.search.DocList > [javac] if (this.offset == offset && this.len == realLen) return this; > [javac] ^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.ja > va:63: incompatible types > [javac] found : org.apache.solr.search.DocSlice > [javac] required: org.apache.solr.search.DocList > [javac] return new DocSlice(offset, realLen, docs, scores, matches, > maxS > core); > [javac]^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.ja > va:130: intersection(org.apache.solr.search.DocSet) in > org.apache.solr.search.Do > cSet cannot be applied to (org.apache.solr.search.DocSlice) > [javac] return other.intersection(this); > [javac] ^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.ja > va:139: intersectionSize(org.apache.solr.search.DocSet) in > org.apache.solr.searc > h.DocSet cannot be applied to (org.apache.solr.search.DocSlice) > [javac] return other.intersectionSize(this); > [javac] ^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\ExtendedDis > maxQParserPlugin.java:829: warning: [unchecked] unchecked conversion > [javac] found : java.util.List > [javac] required: java.util.List > [javac] Query q = super.getBooleanQuery(clauses, disableCoord); > [javac] ^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\ExtendedDis > maxQParserPlugin.java:845: warning: [unchecked] unchecked conversion > [javac] found : java.util.List > [javac] required: java.util.List > [javac] super.addClause(clauses, conj, mods, q); > [javac] ^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\FastLRUCach > e.java:107: warning: [unchecked] unchecked cast > [javac] found : java.lang.Object > [javac] required: > java.util.List che.Stats> > [javac] statsList = (List) persistence; > [javac] ^ > [javac] > C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\FastLRUCach > e.java:263: warning: [unchecked] unchecked cast > 
[javac] found : java.util.Set > [javac] required: java.util.Set > [javac] for (Map.Entry e : (Set )items.entrySet()) { > [javac] ^ > [javac] >
[jira] Commented: (LUCENE-2957) generate-maven-artifacts target should include all non-Mavenized Lucene & Solr dependencies
[ https://issues.apache.org/jira/browse/LUCENE-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006878#comment-13006878 ] Dawid Weiss commented on LUCENE-2957: - Hi Steven. This issue is closed, but just to mark it for the future: I've added a retrowoven version of Carrot2-core, it will be part of maintenance release 3.4.4: https://oss.sonatype.org/content/repositories/snapshots/org/carrot2/carrot2-core/3.4.4-SNAPSHOT/ The -jdk15 classifier is the one working with Java 1.5 (I checked with our examples and they work fine, so there should be no problems with it in SOLR). > generate-maven-artifacts target should include all non-Mavenized Lucene & > Solr dependencies > --- > > Key: LUCENE-2957 > URL: https://issues.apache.org/jira/browse/LUCENE-2957 > Project: Lucene - Java > Issue Type: Improvement > Components: Build >Affects Versions: 3.1, 3.2, 4.0 >Reporter: Steven Rowe >Assignee: Steven Rowe >Priority: Minor > Fix For: 3.1, 3.2, 4.0 > > Attachments: LUCENE-2923-part3.patch, LUCENE-2957-part2.patch, > LUCENE-2957.patch > > > Currently, in addition to deploying artifacts for all of the Lucene and Solr > modules to a repository (by default local), the {{generate-maven-artifacts}} > target also deploys artifacts for the following non-Mavenized Solr > dependencies (lucene_solr_3_1 version given here): > # {{solr/lib/commons-csv-1.0-SNAPSHOT-r966014.jar}} as > org.apache.solr:solr-commons-csv:3.1 > # {{solr/lib/apache-solr-noggit-r944541.jar}} as > org.apache.solr:solr-noggit:3.1 > \\ \\ > The following {{.jar}}'s should be added to the above list (lucene_solr_3_1 > version given here): > \\ \\ > # {{lucene/contrib/icu/lib/icu4j-4_6.jar}} > # > {{lucene/contrib/benchmark/lib/xercesImpl-2.9.1-patched-XERCESJ}}{{-1257.jar}} > # {{solr/contrib/clustering/lib/carrot2-core-3.4.2.jar}}** > # {{solr/contrib/uima/lib/uima-an-alchemy.jar}} > # {{solr/contrib/uima/lib/uima-an-calais.jar}} > # 
{{solr/contrib/uima/lib/uima-an-tagger.jar}} > # {{solr/contrib/uima/lib/uima-an-wst.jar}} > # {{solr/contrib/uima/lib/uima-core.jar}} > \\ \\ > I think it makes sense to follow the same model as the current non-Mavenized > dependencies: > \\ \\ > * {{groupId}} = {{org.apache.solr/.lucene}} > * {{artifactId}} = {{solr-<name>/lucene-<name>}}, where <name> is the jar's base name > * {{version}} = the Lucene/Solr release version. > **The carrot2-core jar doesn't need to be included in trunk's release > artifacts, since there already is a Mavenized Java6-compiled jar. branch_3x > and lucene_solr_3_1 will need this Solr-specific Java5-compiled maven > artifact, though. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
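As a sketch of the proposed naming model (the mapCoordinates helper below is hypothetical, not part of the build; the expected coordinates follow the solr-commons-csv example in this issue):

```java
public class JarCoordinates {
    // Proposed model: groupId = org.apache.solr or org.apache.lucene,
    // artifactId = the jar's base name prefixed with solr- or lucene-,
    // version = the Lucene/Solr release version.
    static String mapCoordinates(String project, String jarBaseName, String version) {
        return "org.apache." + project + ":" + project + "-" + jarBaseName + ":" + version;
    }

    public static void main(String[] args) {
        // matches the existing solr-commons-csv deployment from this issue:
        System.out.println(mapCoordinates("solr", "commons-csv", "3.1"));
        // -> org.apache.solr:solr-commons-csv:3.1
        System.out.println(mapCoordinates("lucene", "icu4j", "3.1"));
        // -> org.apache.lucene:lucene-icu4j:3.1
    }
}
```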
Solr query POST and not in GET
Hi, is it possible to change the method Solr uses to send queries from GET to POST? My query has a lot of OR..OR..OR clauses and the log tells me "Request URI too large". Where can I change it? Thanks -- Gastone Penzo www.solr-italia.it The first Italian blog about SOLR
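For what it's worth, Solr's /select handler also accepts the same parameters in a form-encoded POST body, which keeps the request URI short. The sketch below (plain Java, hypothetical helper name) only shows how such a body would be built; it does not talk to a real server:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PostQueryBody {
    // Build an application/x-www-form-urlencoded body; sent as the POST
    // entity, the parameters no longer count against the URI length limit.
    static String formEncode(String[][] params) {
        try {
            StringBuilder body = new StringBuilder();
            for (String[] p : params) {
                if (body.length() > 0) body.append('&');
                body.append(URLEncoder.encode(p[0], "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(p[1], "UTF-8"));
            }
            return body.toString();
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String body = formEncode(new String[][] {
            {"q", "field:a OR field:b OR field:c"},
            {"rows", "10"},
        });
        // POST this body to http://localhost:8983/solr/select with
        // Content-Type: application/x-www-form-urlencoded.
        System.out.println(body);
        // -> q=field%3Aa+OR+field%3Ab+OR+field%3Ac&rows=10
    }
}
```

With SolrJ the same switch amounts to issuing the query as a POST request rather than a GET; the exact call depends on the client version.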
[jira] Commented: (LUCENE-2573) Tiered flushing of DWPTs by RAM with low/high water marks
[ https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006871#comment-13006871 ] Simon Willnauer commented on LUCENE-2573: - bq. I still see a healtiness (mis-spelled) in DW. ugh. I will fix {quote} I'd rather not have the stalling/healthiness be baked into the API, at all. Can we put the hijack logic entirely private in the flush-by-ram policies? (Ie remove isStalled()/hijackThreadsForFlush()). {quote} I agree on the hijack part, but isStalled is something I might want to control. I mean, we can still open it up eventually, so let's rather make it private for now but keep a note on it. {quote} Can we move FlushSpecification out of FlushPolicy? Ie, it's a private impl detail of DW right? (Not part of FlushPolicy's API). Actually why do we need it? Can't we just return the DWPT? {quote} It currently holds the RAM usage for that DWPT when it was checked out, so that I can reduce the flushBytes accordingly. We can maybe get rid of it entirely, but I don't want to rely on the DWPT bytesUsed() though. We can certainly move it out - this inner class is a relic though. bq. Why do we have a separate DocWriterSession? Can't this be absorbed into DocWriter? I generally don't like cluttering DocWriter and letting it grow like IW. DocWriterSession might not be the ideal name for this class, but it's really a RAM tracker for this DW. Yet, we can move out some parts that do not directly relate to mem tracking. Maybe DocWriterBytes? bq. Be careful defaulting TermsHash.trackAllocations to true – eg term vectors wants this to be false. I need to go through the IndexingChain and check carefully where to track memory anyway. I haven't got to that yet, but it's good that you mention it; one could easily get lost. bq. Instead of FlushPolicy.message, can't the policy call DW.message? I don't want to couple that API to DW. What would be the benefit besides saving a single method?
{quote} On the by-RAM flush policies... when you hit the high water mark, we should 1) flush all DWPTs and 2) stall any other threads. {quote} Well, I am not sure if we should do that. I don't really see why we should forcefully stop the world here. Incoming threads will pick up a flush immediately, and if we have enough resources to index further, why should we wait until all DWPTs are flushed? If we stall, I fear that we could queue up threads that could help flushing, while stalling would simply stop them from doing anything, right? You can still control this with the healthiness though. Btw., we currently do flush all DWPTs once we hit the HW. {quote} Why do we dereference the DWPTs with their ord? EG, can't we just store their "state" (active or flushPending) on the DWPT instead of in a separate states[]? {quote} That is definitely an option. I will give that a go. {quote} Do we really need FlushState.Aborted? And if not... do we really need FlushState (since it just becomes 2 states, ie, Active or Flushing, which I think is then redundant w/ flushPending boolean?). {quote} This needs some more refactoring; I will attach another iteration. {quote} I think the default low water should be 1X of your RAM buffer? And high water maybe 2X? (For both flush-by-RAM policies). {quote} Hmm, I think we need to revise the maxRAMBufferMB Javadoc anyway, so we have all the freedom to do whatever we want. Yet, I think we should try to keep the RAM consumption similar to what it would have been in a previous release. If we say HW is 2x, then suddenly some apps might run out of memory. I am not sure if we should do that, or rather stick to the 90% to 110% for now. We need to find good defaults for this anyway.
> Tiered flushing of DWPTs by RAM with low/high water marks > - > > Key: LUCENE-2573 > URL: https://issues.apache.org/jira/browse/LUCENE-2573 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Michael Busch >Assignee: Simon Willnauer >Priority: Minor > Fix For: Realtime Branch > > Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, > LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch > > > Now that we have DocumentsWriterPerThreads we need to track total consumed > RAM across all DWPTs. > A flushing strategy idea that was discussed in LUCENE-2324 was to use a > tiered approach: > - Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM) > - Flush all DWPTs at a high water mark (e.g. at 110%) > - Use linear steps in between high and low watermark: E.g. when 5 DWPTs are > used, flush at 90%, 95%, 100%, 105% and 110%. > Should we allow the user to configure the low and high water mark values > explicitly using total values (e.g. low water mark at 120MB, high water m
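The linear-steps idea from the description can be sketched as follows; the class and method names are made up for illustration and are not the patch's API:

```java
import java.util.Locale;

public class TieredFlushThresholds {
    // Spread per-DWPT flush triggers linearly between the low and high
    // water marks, expressed as fractions of the allowed RAM buffer.
    static double[] thresholds(int dwpts, double low, double high) {
        double[] t = new double[dwpts];
        for (int i = 0; i < dwpts; i++) {
            t[i] = (dwpts == 1) ? high : low + i * (high - low) / (dwpts - 1);
        }
        return t;
    }

    public static void main(String[] args) {
        // 5 DWPTs, low water mark at 90%, high water mark at 110%:
        for (double t : thresholds(5, 0.90, 1.10)) {
            System.out.printf(Locale.ROOT, "%.2f%n", t);
        }
        // prints 0.90, 0.95, 1.00, 1.05, 1.10
    }
}
```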
Re: I want to take part in Google Summer Code 2011
I did one of the projects where I crawled data through Nutch-1.0 and indexed it into Apache Solr to build a search engine with a proper UI (autosuggest, spellcheck) running on a Tomcat server. Now we are extending the project to include novel fuzzy queries using OWA operators like "at least half", "as many as possible", etc. - this is different from the usual boolean search. We are referring to a paper presented by our respected Prof. M.M. Sufyan Beg. This will be implemented in Apache Solr. - Kumar Anurag
[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2242: Comment: was deleted (was: v2 of the release based on feedback. Note: SOLR-2242-distinctFacet.patch not needed (left for history)) > Get distinct count of names for a facet field > - > > Key: SOLR-2242 > URL: https://issues.apache.org/jira/browse/SOLR-2242 > Project: Solr > Issue Type: New Feature > Components: Response Writers >Affects Versions: 4.0 >Reporter: Bill Bell >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR.2242.v2.patch > > > When returning facet.field= you will get a list of matches for > distinct values. This is normal behavior. This patch tells you how many > distinct values you have (# of rows). Use with limit=-1 and mincount=1. > The feature is called "namedistinct". Here is an example: > http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1 > Here is an example on field "hgid" (without namedistinct): > {code} > - > - > 1 > 1 > 1 > 1 > 1 > 5 > 1 > > > {code} > With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, > HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, > HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns number of rows > (7), not the number of values (11). > {code} > - > - > 7 > > > {code} > This works actually really good to get total number of fields for a > group.field=hgid. Enjoy! -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2242: Comment: was deleted (was: New ver)
[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2242: Comment: was deleted (was: Maybe, but I thought all params were supposed to be lower case? I can easily change that ??)
[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2242: Attachment: (was: SOLR-2242-distinctFacet.patch)
[jira] Issue Comment Edited: (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006792#comment-13006792 ] Bill Bell edited comment on SOLR-2242 at 3/15/11 8:22 AM: -- I am going to use your suggestion. You will not have to set the limit. Getting the numFacetTerms will be optional, and you also will be able to NOT get the hgids as well. I propose this (please comment): This will ONLY output the numFacetTerms (no hgid facet counts): http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numFacetTerms=1 This assumes the count will be limit=-1 {code} 7 {code} This will output the numFacetTerms AND hgid: http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numFacetTerms=2 {code} 7 1 1 1 1 1 5 1 {code} was (Author: billnbell): I am going to use your suggestion. You will not have to set the limit. Getting the numFacetTerms will be optional, and you also will be able to NOT get the hgids as well. I propose this (please comment): This will ONLY output the numFacetTerms (no hgid facet counts): http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&f.hgid.facet.numfacetterms=1 This assumes the count will be limit=-1 {code} 7 {code} This will output the numFacetTerms AND hgid: http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numfacetterms=2 {code} 7 1 1 1 1 1 5 1 {code}
[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2242: Attachment: SOLR.2242.v2.patch New ver
[jira] Updated: (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2242: Attachment: (was: SOLR-2242.v2.patch)
[jira] Created: (SOLR-2426) Build failing
Build failing
-
Key: SOLR-2426
URL: https://issues.apache.org/jira/browse/SOLR-2426
Project: Solr
Issue Type: Bug
Reporter: Bill Bell

ant clean
ant example
trunk
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSetHitCollector.java:77: incompatible types
[javac] found : org.apache.solr.search.BitDocSet
[javac] required: org.apache.solr.search.DocSet
[javac] return new BitDocSet(bits,pos);
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSetHitCollector.java:132: incompatible types
[javac] found : org.apache.solr.search.SortedIntDocSet
[javac] required: org.apache.solr.search.DocSet
[javac] return new SortedIntDocSet(scratch, pos);
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSetHitCollector.java:136: incompatible types
[javac] found : org.apache.solr.search.BitDocSet
[javac] required: org.apache.solr.search.DocSet
[javac] return new BitDocSet(bits,pos);
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:26: org.apache.solr.search.DocSlice is not abstract and does not override abstract method getTopFilter() in org.apache.solr.search.DocSet
[javac] public class DocSlice extends DocSetBase implements DocList {
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:54: incompatible types
[javac] found : org.apache.solr.search.DocSlice
[javac] required: org.apache.solr.search.DocList
[javac] if (this.offset == offset && this.len==len) return this;
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:62: incompatible types
[javac] found : org.apache.solr.search.DocSlice
[javac] required: org.apache.solr.search.DocList
[javac] if (this.offset == offset && this.len == realLen) return this;
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:63: incompatible types
[javac] found : org.apache.solr.search.DocSlice
[javac] required: org.apache.solr.search.DocList
[javac] return new DocSlice(offset, realLen, docs, scores, matches, maxScore);
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:130: intersection(org.apache.solr.search.DocSet) in org.apache.solr.search.DocSet cannot be applied to (org.apache.solr.search.DocSlice)
[javac] return other.intersection(this);
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\DocSlice.java:139: intersectionSize(org.apache.solr.search.DocSet) in org.apache.solr.search.DocSet cannot be applied to (org.apache.solr.search.DocSlice)
[javac] return other.intersectionSize(this);
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\ExtendedDismaxQParserPlugin.java:829: warning: [unchecked] unchecked conversion
[javac] found : java.util.List
[javac] required: java.util.List
[javac] Query q = super.getBooleanQuery(clauses, disableCoord);
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\ExtendedDismaxQParserPlugin.java:845: warning: [unchecked] unchecked conversion
[javac] found : java.util.List
[javac] required: java.util.List
[javac] super.addClause(clauses, conj, mods, q);
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\FastLRUCache.java:107: warning: [unchecked] unchecked cast
[javac] found : java.lang.Object
[javac] required: java.util.List
[javac] statsList = (List) persistence;
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\FastLRUCache.java:263: warning: [unchecked] unchecked cast
[javac] found : java.util.Set
[javac] required: java.util.Set
[javac] for (Map.Entry e : (Set )items.entrySet()) {
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\Grouping.java:61: warning: [unchecked] unchecked call to add(java.lang.String,T) as a member of the raw type org.apache.solr.common.util.NamedList
[javac] grouped.add(key, groupResult); // grouped={ key={
[javac] ^
[javac] C:\Users\bbell\solr\solr\src\java\org\apache\solr\search\Grouping.java:64: warning: [unchecked] unchecked call to add(java.lang.String,T) as a member of the raw type o
ClassCastException SOLR 1709 Distributed Date Faceting
Folks, I applied the 4.x patch onto trunk and compiled. However, there seems to be a runtime exception, as below. Thanks Viswa

type Status report
message java.util.Date cannot be cast to java.lang.Integer
java.lang.ClassCastException: java.util.Date cannot be cast to java.lang.Integer
at org.apache.solr.handler.component.FacetComponent.countFacets(FacetComponent.java:294)
at org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:232)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:326)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1325)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
at java.lang.Thread.run(Unknown Source)
description The server encountered an internal error (java.util.Date cannot be cast to java.lang.Integer java.lang.ClassCastException:
java.util.Date cannot be cast to java.lang.Integer, with the same stack trace as above) that prevented it from fulfilling this request.
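For anyone hitting this: the cast fails because the distributed-merge path assumes every facet entry value is an Integer count, while date faceting contributes java.util.Date values. A minimal illustration (not the actual FacetComponent code) of the failure mode and a runtime type check that would avoid it:

```java
import java.util.Date;

public class FacetValueCast {
    // A merge loop that blindly casts facet values to Integer throws a
    // ClassCastException as soon as a Date entry from date faceting shows up.
    static boolean blindCastFails(Object value) {
        try {
            Integer count = (Integer) value; // throws for non-Integer values
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(blindCastFails(42));         // false: an Integer count
        System.out.println(blindCastFails(new Date())); // true: a date-facet value
        // A defensive merge branches on the runtime type instead:
        Object v = new Date();
        long count = (v instanceof Number) ? ((Number) v).longValue() : 0L;
        // non-numeric entries would be passed through rather than summed
    }
}
```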