Re: [Lucene.Net] Blog Setup
On 2012-02-15, Christopher Currens wrote:

> That's similar to a suggestion Stefan made in another email:
>
>> The only alternative would be [...] running a dynamic server on a dedicated VM. The latter would be easier to negotiate for a top level project.
>
> Though, his response seems to imply that it would need to stay hosted on Apache servers?

That's not what I meant. It's more an "if it stays on Apache infrastructure then". Personally I'd prefer to keep our stuff together in a single place, but there is no hard requirement.

Stefan
[Lucene.Net] Re: trouble getting cms content to work correctly
Modulo this particular bug affecting only your publication requests, massive documentation commits merely require some follow-through (to publication), as I've written about today here:

http://www.apache.org/dev/cmsref.html#mass-change

So the regularity with which you do this won't present any particular problems other than increasing the frequency of the subsequent pain you need to endure to walk mass-changes through to the live site. Small changes will normally induce small amounts of pain (modulo this particular bug).

HTH

From: Prescott Nasser geobmx...@hotmail.com
To: lucene-net-dev@lucene.apache.org
Cc: joe_schae...@yahoo.com
Sent: Wednesday, February 15, 2012 6:30 PM
Subject: FW: trouble getting cms content to work correctly

Took all day, but Joe was there babysitting and correcting things for us. Basically there is a bug in svn 1.6.17, which the CMS is based on, that is making our commits a pain at the moment. Once that gets upgraded it should be relatively smooth sailing. It won't help us, though, if we are still planning on updating massive amounts of documentation on a regular basis.

Thanks Joe, I can't thank you enough for the help today.

~Prescott

Date: Wed, 15 Feb 2012 14:49:48 -0800
From: joe_schae...@yahoo.com
Subject: Re: trouble getting cms content to work correctly
To: geobmx...@hotmail.com

After some testing it appears that this performance bug is fixed in svn 1.7, but the CMS is currently running 1.6.17. I hope to have the host upgraded within the next 30 days or so, but for now I still recommend using the script.

From: Prescott Nasser geobmx...@hotmail.com
To: joe_schae...@yahoo.com
Sent: Wednesday, February 15, 2012 5:28 PM
Subject: RE: trouble getting cms content to work correctly

Alright - sounds good. Thanks again!

~P

Date: Wed, 15 Feb 2012 14:25:45 -0800
From: joe_schae...@yahoo.com
Subject: Re: trouble getting cms content to work correctly
To: geobmx...@hotmail.com

I'm having some svn people look at the merge issues. Right now all I can suggest is that you publish using the publish.pl script on people.apache.org. It's taking me about 10 min total to carry that out, which is certainly too long given the nature of the changes it's merging, but I'll let you know what I find out.

From: Prescott Nasser geobmx...@hotmail.com
To: joe_schae...@yahoo.com
Sent: Wednesday, February 15, 2012 5:13 PM
Subject: RE: trouble getting cms content to work correctly

It's butt ugly - all in one directory, 8206 files. I'd prefer a more natural docs structure, but that's how it gets generated.

~P

Date: Wed, 15 Feb 2012 14:10:47 -0800
From: joe_schae...@yahoo.com
Subject: Re: trouble getting cms content to work correctly
To: geobmx...@hotmail.com

Ok lemme kill it and use the publish.pl script on people to see if I can get it to work right. Just curious tho - about how many files do you have within that docs dir - all in one dir I presume?

From: Prescott Nasser geobmx...@hotmail.com
To: joe_schae...@yahoo.com
Sent: Wednesday, February 15, 2012 5:08 PM
Subject: RE: trouble getting cms content to work correctly

I'm thinking still merge funk

Date: Wed, 15 Feb 2012 14:05:21 -0800
From: joe_schae...@yahoo.com
Subject: Re: trouble getting cms content to work correctly
To: geobmx...@hotmail.com

Looks like it just completed. Hmm, go ahead and publish and let's try this one more time.

From: Joe Schaefer joe_schae...@yahoo.com
To: Prescott Nasser geobmx...@hotmail.com
Sent: Wednesday, February 15, 2012 5:02 PM
Subject: Re: trouble getting cms content to work correctly

Yeah, more merge funk. Leave it run for now, but don't take any further action until you hear from me.

From: Prescott Nasser geobmx...@hotmail.com
To: joe_schae...@yahoo.com
Sent: Wednesday, February 15, 2012 4:59 PM
Subject: RE: trouble getting cms content to work correctly

I hate to be the bearer of bad news... still taking days to publish (I'm not sure if there is a merge error or not). Let me know and I'll kill this quick.

Date: Wed, 15 Feb 2012 13:54:52 -0800
From: joe_schae...@yahoo.com
Subject: Re: trouble getting cms content to work correctly
To: geobmx...@hotmail.com

Yeah, try out the webgui and edit/commit/publish a minor change. It should take you no more than a minute or so total.

From: Prescott Nasser geobmx...@hotmail.com
To: joe_schae...@yahoo.com
Sent: Wednesday, February 15, 2012 4:52 PM
Subject: RE: trouble getting cms content to work correctly

Man that sounds like a tool full of awesome! Ok - so for the moment no new docs, a
RE: Welcome James Dyer
A belated welcome! (I'm new too) - Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/Welcome-James-Dyer-tp3732469p3749495.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1279) ApostropheTokenizer
[ https://issues.apache.org/jira/browse/SOLR-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209231#comment-13209231 ]

Mauro Asprea commented on SOLR-1279:

I confirm this is working using the WordDelimiterFilterFactory like Robert said:

{code}
<filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="0" preserveOriginal="1" catenateAll="1"/>
{code}

Then using the Solr Admin Analysis page I get the following:

Value: McDonald's

||Indexed Term||
|McDonald's|
|Mc|
|Donald|
|s|
|McDonalds|

One thing: You have to be sure that no previous filters remove the trailing 's. In my case I had the StandardFilterFactory, which does remove trailing apostrophes.

ApostropheTokenizer
---
Key: SOLR-1279
URL: https://issues.apache.org/jira/browse/SOLR-1279
Project: Solr
Issue Type: New Feature
Components: Schema and Analysis
Reporter: Sergey Borisov
Priority: Minor
Fix For: 3.6, 4.0
Attachments: ApostropheTokenizer.zip

ApostropheTokenizer creates extra tokens during the analysis stage for fields containing apostrophes. The reason for adding this is to ensure that documents that differ only by an apostrophe have the same relevancy score. For example, if the document contains the string McDonald's, it will be tokenized as McDonald's McDonalds. This way, a search performed against either McDonald's or McDonalds will produce a similar score. This code handles up to two apostrophes in a token.

To use this tokenizer add the following line in schema.xml:

    <analyzer type="index">
      <filter class="org.apache.lucene.analysis.ApostropheTokenFactory"/>
      ...
    </analyzer>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
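The token expansion described in the issue (McDonald's indexed as both McDonald's and McDonalds) can be illustrated with a minimal sketch. This is not the code from ApostropheTokenizer.zip, which is not shown in this thread; the class name `ApostropheExpansionSketch` and the method `expand` are hypothetical, and the sketch does not model the "up to two apostrophes" limit mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the attached ApostropheTokenizer code
// and not a Lucene/Solr TokenFilter.
final class ApostropheExpansionSketch {

    /**
     * Returns the original token, followed by an apostrophe-free variant
     * when the token contains at least one apostrophe.
     */
    static List<String> expand(String token) {
        List<String> out = new ArrayList<>();
        out.add(token);
        if (token.indexOf('\'') >= 0) {
            out.add(token.replace("'", ""));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expand("McDonald's")); // prints [McDonald's, McDonalds]
        System.out.println(expand("McDonalds"));  // prints [McDonalds]
    }
}
```

Because both spellings end up as terms for the same document, a query for either one can match it, which is the effect the WordDelimiterFilterFactory settings above achieve through preserveOriginal and catenateAll.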
[jira] [Issue Comment Edited] (SOLR-1279) ApostropheTokenizer
[ https://issues.apache.org/jira/browse/SOLR-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209231#comment-13209231 ]

Mauro Asprea edited comment on SOLR-1279 at 2/16/12 9:02 AM:

I confirm this is working using the WordDelimiterFilterFactory like Robert said:

{code}
<filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="0" preserveOriginal="1" catenateAll="1"/>
{code}

Then using Solr Admin Analysis page I get the following:

Value: McDonald's

||Indexed Term||
|McDonald's|
|Mc|
|Donald|
|s|
|McDonalds|

One thing: You have to be sure that no previous filters remove the trailing 's. In my case I had the StandardFilterFactory, which does remove trailing apostrophes.

was (Author: brutuscat):

I confirm this is working using the WordDelimiterFilterFactory like Robert said:

{code}
<filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="0" preserveOriginal="1" catenateAll="1"/>
{code}

The using Solr Admin Analysis page I get the following:

Value: McDonald's

||Indexed Term||
|McDonald's|
|Mc|
|Donald|
|s|
|McDonalds|

One thing: You have to be sure that no previous filters remove the trailing 's. In my case I had the StandardFilterFactory, which does remove trailing apostrophes.
[jira] [Commented] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209247#comment-13209247 ]

Tommaso Teofili commented on LUCENE-3731:

Right, everything seems ok now. I also tried commenting out the

{noformat}
<property name="tests.threadspercpu" value="0" />
{noformat}

line in build.xml in order to execute tests in parallel. Multiple parallel test executions, also with -Dtests.multiplier=100, passed flawlessly with Java6; will see if that is the case for Java7 too.

Create a analysis/uima module for UIMA based tokenizers/analyzers
---
Key: LUCENE-3731
URL: https://issues.apache.org/jira/browse/LUCENE-3731
Project: Lucene - Java
Issue Type: Improvement
Components: modules/analysis
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
Fix For: 3.6, 4.0
Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch

As discussed in SOLR-3013, the UIMA Tokenizers/Analyzer should be refactored out into a separate module (modules/analysis/uima), as they can be used in plain Lucene. Then solr/contrib/uima will contain only the related factories.
[jira] [Created] (LUCENE-3792) Remove StringField
Remove StringField
---
Key: LUCENE-3792
URL: https://issues.apache.org/jira/browse/LUCENE-3792
Project: Lucene - Java
Issue Type: Task
Reporter: Robert Muir

Often on the mailing list there is confusion about NOT_ANALYZED. Besides being useless (just use KeywordAnalyzer instead), people trip up on this not being consistent at query time (you really need to configure KeywordAnalyzer for the field on your PerFieldAnalyzerWrapper so it will do the same thing at query time... oh wait, once you've done that, you don't need NOT_ANALYZED).

So I think StringField is a trap too for the same reasons, just under a different name; let's remove it.
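The trap described in the issue (a field indexed without analysis, then queried through an analyzing query parser) can be shown with a toy model. This is a sketch under stated assumptions, not Lucene code: `AnalysisMismatchSketch`, `STANDARD`, and `KEYWORD` are hypothetical stand-ins for an index, a full-text analyzer, and a keyword-style analyzer, and the value "Doc-42 A" is made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Toy model only: none of these types are Lucene classes.
final class AnalysisMismatchSketch {

    // Stand-in for a full-text analyzer: lowercases and splits on whitespace.
    static final Function<String, List<String>> STANDARD =
            text -> Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+"));

    // Stand-in for a keyword-style analyzer: the whole value is one unmodified term.
    static final Function<String, List<String>> KEYWORD =
            text -> Arrays.asList(text);

    private final Map<String, Set<String>> termsByField = new HashMap<>();

    // Index time: store whatever terms the given analyzer produces for the value.
    void index(String field, String value, Function<String, List<String>> analyzer) {
        termsByField.computeIfAbsent(field, f -> new HashSet<>()).addAll(analyzer.apply(value));
    }

    // Query time: a match requires every analyzed query term to exist in the field.
    boolean matches(String field, String queryText, Function<String, List<String>> analyzer) {
        Set<String> indexed = termsByField.getOrDefault(field, new HashSet<>());
        return indexed.containsAll(analyzer.apply(queryText));
    }

    public static void main(String[] args) {
        AnalysisMismatchSketch idx = new AnalysisMismatchSketch();
        idx.index("id", "Doc-42 A", KEYWORD); // indexed as a single, untouched term

        // Same text, but analyzed differently at query time: no match.
        System.out.println(idx.matches("id", "Doc-42 A", STANDARD)); // prints false
        // Same analysis on both sides: match.
        System.out.println(idx.matches("id", "Doc-42 A", KEYWORD));  // prints true
    }
}
```

The second lookup succeeds only because the query side uses the same per-field analysis as the index side, which is the consistency argument made in the issue description.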
[jira] [Updated] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3792:

Priority: Blocker (was: Major)
Affects Version/s: 4.0
Fix Version/s: 4.0

Setting this as blocker (sorry). It's a huge trap when someone sets the same Analyzer on IndexWriter and QueryParser but the analysis isn't actually the same.
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209254#comment-13209254 ]

Robert Muir commented on LUCENE-3792:

On 3.x, I'd like to deprecate NOT_ANALYZED for the same reasons. This at least steers people away from that trap there and toward using KeywordAnalyzer instead.
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209256#comment-13209256 ]

Uwe Schindler commented on LUCENE-3792:

The downside of this is that you now need to explicitly use a KeywordAnalyzer for primary key fields. If you don't run those through a query analyzer (e.g. you generally produce TermQuery directly), then you have lots of additional work. For simple lookup queries and indexing a PK key, this is a no-go.

-1 on removing that completely; it should simply be called something different. We should maybe add PKQuery and PKField for symmetry.
[jira] [Issue Comment Edited] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209256#comment-13209256 ]

Uwe Schindler edited comment on LUCENE-3792 at 2/16/12 10:46 AM:

The downside of this is that you now need to explicitly use a KeywordAnalyzer for primary key fields. If you don't run those through a query analyzer (e.g. you generally produce TermQuery directly), then you have lots of additional work. For simple lookup queries and indexing a PK key, this is a no-go.

-1 on removing that completely; it should simply be called something different. We should maybe add PKQuery (a ConstantScore TermQuery) and PKField for symmetry.

was (Author: thetaphi):

The downside of this is that you now need to explicitly use a KeywordAnalyzer for primary key fields. If you don't run those through a query analyzer (e.g. you generally produce TermQuery directly), then you have lots of additional work. For simple lookup queries and indexing a PK key, this is a no-go.

-1 on removing that completely; it should simply be called something different. We should maybe add PKQuery and PKField for symmetry.
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209258#comment-13209258 ]

Robert Muir commented on LUCENE-3792:

Well, we are at a standstill. We constantly get these problems on the users list from NOT_ANALYZED, and I don't like reintroducing the trap. So I'm -1 to StringField in 4.0.
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209260#comment-13209260 ]

Uwe Schindler commented on LUCENE-3792:

I said call it something different.
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209267#comment-13209267 ]

Robert Muir commented on LUCENE-3792:

{quote}
If you don't run those through a query analyzer (e.g. generally produce TermQuery directly) then you have lots of additional work.
{quote}

That's not true: because KeywordAnalyzer does nothing to the terms, you can continue to produce TermQuery directly and it will work. So expert users are fine. This issue isn't about expert users; it's about how our API traps people who are not expert users.
[jira] [Updated] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3792:

Priority: Major (was: Blocker)
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209269#comment-13209269 ]

Uwe Schindler commented on LUCENE-3792:

bq. Well we are at a standstill. We constantly get these problems on the users list from NOT_ANALYZED

You cannot prevent users from doing the wrong thing. If you remove StringField completely, please also remove NumericField and force users to use PerFieldAnalyzerWrapper with a NumericTokenStream.

If you add a numeric field, you cannot ask for it with the query parser. If you add a StringField, you cannot ask with QueryParser. Simple rule. It must just be clearly documented, and possibly StringField should be renamed.

People using primary keys or other untokenized values should simply not use QueryParser. Use a ConstantScore TermQuery and you are fine. This is all just a documentation problem, so I am completely against removing that. Not everybody is using Lucene purely as a full-text engine.
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209272#comment-13209272 ]

Robert Muir commented on LUCENE-3792:

{quote}
If you remove StringField completely, please also remove NumericField and force users to use PerFieldAnalyzerWrapper with a NumericTokenStream.
{quote}

I actually am not sure this is such a bad idea? If we were to enforce such a thing, it would also be possible to add a modification to the queryparser (instanceof NumericTokenStream) so that numeric fields then work nicely out of the box with the query parser?
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209273#comment-13209273 ] Robert Muir commented on LUCENE-3792: - {quote} Not everybody is using Lucene purely as a full-text engine. {quote} But we cannot let non-fulltext uses break the design for the intended use case (full-text).
Remove StringField -- Key: LUCENE-3792 URL: https://issues.apache.org/jira/browse/LUCENE-3792 Project: Lucene - Java Issue Type: Task Affects Versions: 4.0 Reporter: Robert Muir Fix For: 4.0
Often on the mailing list there is confusion about NOT_ANALYZED. Besides being useless (just use KeywordAnalyzer instead), people trip up on this not being consistent at query time (you really need to configure KeywordAnalyzer for the field on your PerFieldAnalyzerWrapper so it will do the same thing at query time... oh wait, once you've done that, you don't need NOT_ANALYZED). So I think StringField is a trap too for the same reasons, just under a different name; let's remove it.
[jira] [Created] (LUCENE-3793) Use ReferenceManager in DirectoryTaxonomyReader
Use ReferenceManager in DirectoryTaxonomyReader --- Key: LUCENE-3793 URL: https://issues.apache.org/jira/browse/LUCENE-3793 Project: Lucene - Java Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 3.6, 4.0

DirTaxoReader uses hairy code to protect its indexReader instance from being modified while threads use it. It maintains a ReentrantLock (indexReaderLock) which is obtained on every 'read' access, while refresh() locks it for 'write' operations (refreshing the IndexReader).

Instead of all that, now that we have ReferenceManager in place, I think that we can write a ReaderManager<IndexReader> which will be used by DirTR. Every method that requires access to the indexReader will acquire/release (not too different from obtaining/releasing the read lock), and refresh() will call ReaderManager.maybeRefresh(). It will simplify the code and remove some rather long comments that go to great lengths to explain why the code looks like that.

This ReaderManager cannot be used for every IndexReader, because DirTR's refresh() logic is special -- it reopens the indexReader, and then verifies that the createTime still matches on the reopened reader as well. Otherwise, it closes the reopened reader and fails with an exception. Therefore, this ReaderManager.refreshIfNeeded will need to take the createTime into consideration and fail if they do not match.

And while we're at it ... I wonder if we should have a manager for an IndexReader/ParentArray pair? I think that it makes sense because we don't want DirTR to use a ParentArray that does not match the IndexReader. Today this can happen in refresh() if e.g. after the indexReader instance has been replaced, parentArray.refresh(indexReader) fails. DirTR will be left with a newer IndexReader instance, but old (or worse, corrupt?) ParentArray ... I think it'll be good if we introduce clone() on ParentArray, or a new ctor which takes an int[].

I'll work on a patch once I finish with LUCENE-3786.
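The acquire/release plus createTime check described in LUCENE-3793 can be sketched without any Lucene dependency. This is only an illustration under stated assumptions, not the eventual patch: SketchReader, SketchReopener and SketchTaxoReaderManager are hypothetical names, and the reference counting is a simplified stand-in for what IndexReader and ReferenceManager actually do.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the taxonomy IndexReader; not a Lucene class. */
final class SketchReader {
    final String createTime; // plays the role of the index creation time
    final int version;
    private final AtomicInteger refCount = new AtomicInteger(1); // 1 = owner's reference

    SketchReader(String createTime, int version) {
        this.createTime = createTime;
        this.version = version;
    }

    /** Takes a reference unless the reader is already closed. */
    boolean tryIncRef() {
        for (;;) {
            int count = refCount.get();
            if (count <= 0) {
                return false;
            }
            if (refCount.compareAndSet(count, count + 1)) {
                return true;
            }
        }
    }

    /** Drops a reference; the reader counts as closed once it reaches zero. */
    void decRef() {
        refCount.decrementAndGet();
    }

    boolean isClosed() {
        return refCount.get() <= 0;
    }
}

/** Hypothetical hook that reopens the reader; returns null if nothing changed. */
interface SketchReopener {
    SketchReader reopenIfChanged(SketchReader current);
}

final class SketchTaxoReaderManager {
    private volatile SketchReader current;
    private final SketchReopener reopener;

    SketchTaxoReaderManager(SketchReader initial, SketchReopener reopener) {
        this.current = initial;
        this.reopener = reopener;
    }

    /** Every 'read' access brackets its work with acquire() and release(). */
    SketchReader acquire() {
        for (;;) {
            SketchReader reader = current;
            if (reader.tryIncRef()) {
                return reader;
            }
            // A concurrent refresh closed this reader; loop to pick up the new one.
        }
    }

    void release(SketchReader reader) {
        reader.decRef();
    }

    /** Swaps in a reopened reader, but only if its createTime still matches. */
    synchronized boolean maybeRefresh() {
        SketchReader old = current;
        SketchReader reopened = reopener.reopenIfChanged(old);
        if (reopened == null) {
            return false;
        }
        if (!old.createTime.equals(reopened.createTime)) {
            reopened.decRef(); // close the reopened reader and keep serving the old one
            throw new IllegalStateException("taxonomy was recreated; refresh is not possible");
        }
        current = reopened;
        old.decRef(); // drop the manager's reference; readers still in use stay open
        return true;
    }
}
```

A reader that a thread acquired before a refresh stays usable until that thread releases it, which is the property the explicit lock was protecting.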
[jira] [Created] (LUCENE-3794) DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing DirTaxoReader.refresh() to falsely succeed (or fail)
DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing DirTaxoReader.refresh() to falsely succeed (or fail) - Key: LUCENE-3794 URL: https://issues.apache.org/jira/browse/LUCENE-3794 Project: Lucene - Java Issue Type: Bug Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 3.6, 4.0

DirTaxoWriter sets createTime to null after it has put it in the commit data once. But that's wrong, because if one calls commit(Map) twice, the second commit doesn't record the creation time. Also, in the ctor, if an index exists and OpenMode is not CREATE, the creation time property is not read. I wrote a couple of unit tests that assert this, and modified DirTaxoWriter to always record the creation time (in every commit) -- that's the only safe way. Will upload a patch shortly.
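A small, dependency-free sketch of the fixed behavior described in LUCENE-3794 might look like this. It is not the attached patch: CreateTimeWriterSketch is a hypothetical class, the key string is an assumed value, and commit data is modeled as a plain Map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a writer that stamps every commit with its creation time. */
final class CreateTimeWriterSketch {
    // Assumed key name, for illustration only.
    static final String INDEX_CREATE_TIME = "index.create.time";

    private final String createTime;

    /**
     * @param existingCommitData commit data of an existing index, or null if there is none
     * @param create             true to start a fresh index (like OpenMode.CREATE)
     */
    CreateTimeWriterSketch(Map<String, String> existingCommitData, boolean create) {
        String existing = (existingCommitData == null)
                ? null
                : existingCommitData.get(INDEX_CREATE_TIME);
        if (!create && existing != null) {
            createTime = existing; // reopening: read the property back instead of losing it
        } else {
            createTime = Long.toString(System.nanoTime());
        }
    }

    /** Records the creation time in every commit, never only in the first one. */
    Map<String, String> commit(Map<String, String> userData) {
        Map<String, String> data = new HashMap<>();
        if (userData != null) {
            data.putAll(userData);
        }
        data.put(INDEX_CREATE_TIME, createTime);
        return data;
    }
}
```

The point of the design is that createTime is never cleared, so a reader comparing the property across commits sees the same value until the index is really recreated.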
[jira] [Updated] (LUCENE-3794) DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing DirTaxoReader.refresh() to falsely succeed (or fail)
[ https://issues.apache.org/jira/browse/LUCENE-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-3794: --- Attachment: LUCENE-3794.patch Patch fixes the bug + adds a couple of test cases to ensure the correct behavior. DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing DirTaxoReader.refresh() to falsely succeed (or fail) - Key: LUCENE-3794 URL: https://issues.apache.org/jira/browse/LUCENE-3794 Project: Lucene - Java Issue Type: Bug Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 3.6, 4.0 Attachments: LUCENE-3794.patch DirTaxoWriter sets createTime to null after it put it in the commit data once. But that's wrong because if one calls commit(Map) twice, the second time doesn't record the creation time. Also, in the ctor, if an index exists and OpenMode is not CREATE, the creation time property is not read. I wrote a couple of unit tests that assert this, and modified DirTaxoWriter to always record the creation time (in every commit) -- that's the only safe way. Will upload a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3792: Attachment: LUCENE-3792_javadocs_3x.patch Its obvious Uwe and I aren't going to agree here immediately, so here is a patch adding a big warning to 3.x javadocs. For now I'd like to apply the same warning to StringField in trunk (I just made the patch against 3.x) Remove StringField -- Key: LUCENE-3792 URL: https://issues.apache.org/jira/browse/LUCENE-3792 Project: Lucene - Java Issue Type: Task Affects Versions: 4.0 Reporter: Robert Muir Fix For: 4.0 Attachments: LUCENE-3792_javadocs_3x.patch, LUCENE-3792_javadocs_3x.patch Often on the mailing list there is confusion about NOT_ANALYZED. Besides being useless (Just use KeywordAnalyzer instead), people trip up on this not being consistent at query time (you really need to configure KeywordAnalyzer for the field on your PerFieldAnalyzerWrapper so it will do the same thing at query time... oh wait once you've done that, you dont need NOT_ANALYZED). So I think StringField is a trap too for the same reasons, just under a different name, lets remove it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3792: Attachment: LUCENE-3792_javadocs_3x.patch Sorry, incomplete wording (I forgot to save before svn diff). Remove StringField -- Key: LUCENE-3792 URL: https://issues.apache.org/jira/browse/LUCENE-3792 Project: Lucene - Java Issue Type: Task Affects Versions: 4.0 Reporter: Robert Muir Fix For: 4.0 Attachments: LUCENE-3792_javadocs_3x.patch, LUCENE-3792_javadocs_3x.patch Often on the mailing list there is confusion about NOT_ANALYZED. Besides being useless (Just use KeywordAnalyzer instead), people trip up on this not being consistent at query time (you really need to configure KeywordAnalyzer for the field on your PerFieldAnalyzerWrapper so it will do the same thing at query time... oh wait once you've done that, you dont need NOT_ANALYZED). So I think StringField is a trap too for the same reasons, just under a different name, lets remove it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209301#comment-13209301 ] Tommaso Teofili commented on LUCENE-3731: - Some performance improvement came from releasing the CAS and AE in the close() call:
{noformat}
@Override
public void close() throws IOException {
  super.close();
  // release UIMA resources
  cas.release();
  ae.destroy();
}
{noformat}
Now investigating the use of CASPool to improve throughput in high-usage scenarios.
Create a analysis/uima module for UIMA based tokenizers/analyzers - Key: LUCENE-3731 URL: https://issues.apache.org/jira/browse/LUCENE-3731 Project: Lucene - Java Issue Type: Improvement Components: modules/analysis Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 3.6, 4.0 Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch
As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored out in a separate module (modules/analysis/uima) as they can be used in plain Lucene. Then the solr/contrib/uima will contain only the related factories.
[jira] [Commented] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209304#comment-13209304 ] Robert Muir commented on LUCENE-3731: - Is that safe to do in Tokenizer.close()? Because Tokenizer.close() is misleading/confusing, the instance is still reused after this for subsequent documents... in other words Tokenizer.close() closes resources like the Reader itself... it just happens to be that CAS/AE don't complain about you continuing to use them after they are release()'ed/destroy()'ed :)
Create a analysis/uima module for UIMA based tokenizers/analyzers - Key: LUCENE-3731 URL: https://issues.apache.org/jira/browse/LUCENE-3731 Project: Lucene - Java Issue Type: Improvement Components: modules/analysis Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 3.6, 4.0 Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch
As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored out in a separate module (modules/analysis/uima) as they can be used in plain Lucene. Then the solr/contrib/uima will contain only the related factories.
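The lifecycle concern raised here (close() ends one document, but the same instance is reused for the next) can be illustrated with a toy tokenizer. This is a sketch under assumptions, not the UIMA tokenizer: ReusableTokenizerSketch and its destroy() method are hypothetical, and the engineAlive flag merely stands in for expensive state such as a CAS/AE.

```java
import java.io.IOException;
import java.io.Reader;

/** Hypothetical whitespace tokenizer that is reused across documents. */
final class ReusableTokenizerSketch {
    private Reader input;
    private boolean engineAlive = true; // stand-in for expensive state such as a CAS/AE
    private String[] tokens;
    private int pos;

    ReusableTokenizerSketch(Reader input) {
        this.input = input;
    }

    /** Returns the next token, or null when the current input is exhausted. */
    String next() throws IOException {
        if (!engineAlive) {
            throw new IllegalStateException("engine destroyed; instance cannot be reused");
        }
        if (input == null) {
            throw new IllegalStateException("input has been closed, please reset it");
        }
        if (tokens == null) {
            String text = readAll(input).trim();
            tokens = text.isEmpty() ? new String[0] : text.split("\\s+");
        }
        return pos < tokens.length ? tokens[pos++] : null;
    }

    /** Closes only the per-document input; the expensive engine state survives. */
    void close() throws IOException {
        if (input != null) {
            input.close();
            input = null;
        }
        tokens = null;
        pos = 0;
    }

    /** Prepares the same instance for the next document. */
    void reset(Reader newInput) {
        input = newInput;
        tokens = null;
        pos = 0;
    }

    /** Releases the expensive state for good; only valid when the instance is retired. */
    void destroy() {
        engineAlive = false;
    }

    private static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }
}
```

Keeping the engine teardown out of close() means a close()/reset() cycle stays cheap, while a true end-of-life call is explicit. Whether a separate teardown hook fits the real Tokenizer API is an open question in this thread.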
[jira] [Commented] (LUCENE-3109) Rename FieldsConsumer to InvertedFieldsConsumer
[ https://issues.apache.org/jira/browse/LUCENE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209305#comment-13209305 ] Iulius Curt commented on LUCENE-3109: - Is this still valid? (It looks like a good place for me to enter the community.) Should the *FieldsReader/Writer classes that derive from FieldsProducer/Consumer also become *InvertedFieldsReader/Writer?
Rename FieldsConsumer to InvertedFieldsConsumer --- Key: LUCENE-3109 URL: https://issues.apache.org/jira/browse/LUCENE-3109 Project: Lucene - Java Issue Type: Task Components: core/codecs Affects Versions: 4.0 Reporter: Simon Willnauer Priority: Minor Fix For: 4.0
The name FieldsConsumer is misleading here; it really is an InvertedFieldsConsumer, and since we are extending codecs to consume non-inverted Fields we should be clear here. Same applies to Fields.java as well as FieldsProducer.
[jira] [Commented] (LUCENE-3109) Rename FieldsConsumer to InvertedFieldsConsumer
[ https://issues.apache.org/jira/browse/LUCENE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209309#comment-13209309 ] Simon Willnauer commented on LUCENE-3109: - bq. Is this still valid? (It looks like a good place for me to enter the community) I think so; there should also be an InvertedFieldsProducer.
Rename FieldsConsumer to InvertedFieldsConsumer --- Key: LUCENE-3109 URL: https://issues.apache.org/jira/browse/LUCENE-3109 Project: Lucene - Java Issue Type: Task Components: core/codecs Affects Versions: 4.0 Reporter: Simon Willnauer Priority: Minor Fix For: 4.0
The name FieldsConsumer is misleading here; it really is an InvertedFieldsConsumer, and since we are extending codecs to consume non-inverted Fields we should be clear here. Same applies to Fields.java as well as FieldsProducer.
[jira] [Resolved] (LUCENE-3794) DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing DirTaxoReader.refresh() to falsely succeed (or fail)
[ https://issues.apache.org/jira/browse/LUCENE-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-3794. Resolution: Fixed Committed revision 1244960 (3x). Committed revision 1244964 (trunk). DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing DirTaxoReader.refresh() to falsely succeed (or fail) - Key: LUCENE-3794 URL: https://issues.apache.org/jira/browse/LUCENE-3794 Project: Lucene - Java Issue Type: Bug Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 3.6, 4.0 Attachments: LUCENE-3794.patch DirTaxoWriter sets createTime to null after it put it in the commit data once. But that's wrong because if one calls commit(Map) twice, the second time doesn't record the creation time. Also, in the ctor, if an index exists and OpenMode is not CREATE, the creation time property is not read. I wrote a couple of unit tests that assert this, and modified DirTaxoWriter to always record the creation time (in every commit) -- that's the only safe way. Will upload a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-3760) Cleanup DR.getCurrentVersion/DR.getUserData/DR.getIndexCommit().getUserData()
[ https://issues.apache.org/jira/browse/LUCENE-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-3760. Resolution: Fixed Lucene Fields: New,Patch Available (was: New) Resolving back ... looks like I'm the only one that it bothers. Cleanup DR.getCurrentVersion/DR.getUserData/DR.getIndexCommit().getUserData() - Key: LUCENE-3760 URL: https://issues.apache.org/jira/browse/LUCENE-3760 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.6, 4.0 Attachments: LUCENE-3760.patch, LUCENE-3760.patch Spinoff from Ryan's dev thread DR.getCommitUserData() vs DR.getIndexCommit().getUserData()... these methods are confusing/dups right now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager
[ https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-3776: -- Assignee: Michael McCandless NRTManager shouldn't expose its private SearcherManager --- Key: LUCENE-3776 URL: https://issues.apache.org/jira/browse/LUCENE-3776 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Priority: Blocker Fix For: 3.6, 4.0 Spinoff from LUCENE-3769. To actually obtain an IndexSearcher from NRTManager, it's a 2-step process now. You must .getSearcherManager(), then .acquire() from the returned SearcherManager. This is very trappy... because if the app incorrectly calls maybeReopen on that private SearcherManager (instead of NRTManager.maybeReopen) then it can unexpectedly cause threads to block forever, waiting for the necessary gen to become visible. This will be hard to debug... I don't like creating trappy APIs. Hopefully once LUCENE-3761 is in, we can fix NRTManager to no longer expose its private SM, instead subclassing ReferenceManaager. Or alternatively, or in addition, maybe we factor out a new interface (SearcherProvider or something...) that only has acquire and release methods, and both NRTManager and ReferenceManager/SM impl that, and we keep NRTManager's SM private. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager
[ https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3776: --- Attachment: LUCENE-3776.patch Patch, cutting over NRTManager to subclass ReferenceManager, and also some minor cleanups to ReferenceManager/SearcherManager. I added a method, afterRefresh, to ReferenceManager, which it calls after a refresh; NRTManager needs this so it can notify any waiting threads that the new gen is now live. NRTManager shouldn't expose its private SearcherManager --- Key: LUCENE-3776 URL: https://issues.apache.org/jira/browse/LUCENE-3776 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Priority: Blocker Fix For: 3.6, 4.0 Attachments: LUCENE-3776.patch Spinoff from LUCENE-3769. To actually obtain an IndexSearcher from NRTManager, it's a 2-step process now. You must .getSearcherManager(), then .acquire() from the returned SearcherManager. This is very trappy... because if the app incorrectly calls maybeReopen on that private SearcherManager (instead of NRTManager.maybeReopen) then it can unexpectedly cause threads to block forever, waiting for the necessary gen to become visible. This will be hard to debug... I don't like creating trappy APIs. Hopefully once LUCENE-3761 is in, we can fix NRTManager to no longer expose its private SM, instead subclassing ReferenceManaager. Or alternatively, or in addition, maybe we factor out a new interface (SearcherProvider or something...) that only has acquire and release methods, and both NRTManager and ReferenceManager/SM impl that, and we keep NRTManager's SM private. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12433 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12433/

1 tests failed.
REGRESSION: org.apache.solr.cloud.RecoveryZkTest.testDistribSearch

Error Message:
expected:<501> but was:<432>

Stack Trace:
junit.framework.AssertionFailedError: expected:<501> but was:<432>
at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:105)
at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:664)
at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700)
at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599)
at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504)
at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562)

Build Log (for compile errors):
[...truncated 8153 lines...]
[jira] [Commented] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager
[ https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209392#comment-13209392 ] Shai Erera commented on LUCENE-3776: Patch looks good!

*SearcherManager*

With these changes, if the app passes an IndexReader that is not a DirectoryReader, it will get a ClassCastException (if asserts are disabled). Is that ok? Perhaps it'd be better if you check that in SM's ctor and throw IllegalArgumentException? The problem is that the app cannot pass DirReader in 3x, so this will apply to trunk only. In fact, I think that for trunk it will be better if SM declared that it expects a DirectoryReader up front? We cannot avoid the cast in refreshIfNeeded because IR is obtained from IS, but at least the app won't hit ClassCastExceptions after it has created SM? That kinda makes SearcherManager a DirReader-only impl, which is unfortunate IMO. But I'm not sure if any IR can openIfChanged() anymore, so perhaps that's unavoidable.

*ReferenceManager*

About close() -- do you think it'll be better to keep close() final, and introduce a new protected closeResource()/closeInternal() that NRTManager can override? That way, RefManagers won't accidentally override close() and forget to call super.close()?

About afterRefresh() -- I'll admit that at first I didn't understand why you need it. Previously, it was used to warm an IndexSearcher, but now we say it's the responsibility of SearcherFactory. I can see why it's useful for NRTManager, and it might even help me in LUCENE-3793! Do you think that we should declare that it can throw IOE? I know that if I use it in LUCENE-3793, I'll need that, and I'd hate to throw RuntimeException. NRTManager can still override it and not declare that. I'm just thinking that since almost all methods declare throwing IOE, it won't be odd if we declare it on afterRefresh() too, and it's not unlikely that afterRefresh() will do something that throws exceptions.

*NRTManager*

About openIfNeeded:
# Can you cast to DirectoryReader once? I don't know if the assert is better than a ClassCastException ... with how the code is written, ClassCastException is better than assert because at least it will tell the user what went wrong?
# How critical is it to declare newSearcher final? If you didn't, you could init it to null, and only change it if newReader != null. Saving 4 lines of code (improves readability IMO -- something that I know you care about :)).

NRTManager shouldn't expose its private SearcherManager --- Key: LUCENE-3776 URL: https://issues.apache.org/jira/browse/LUCENE-3776 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Priority: Blocker Fix For: 3.6, 4.0 Attachments: LUCENE-3776.patch
Spinoff from LUCENE-3769. To actually obtain an IndexSearcher from NRTManager, it's a 2-step process now. You must .getSearcherManager(), then .acquire() from the returned SearcherManager. This is very trappy... because if the app incorrectly calls maybeReopen on that private SearcherManager (instead of NRTManager.maybeReopen) then it can unexpectedly cause threads to block forever, waiting for the necessary gen to become visible. This will be hard to debug... I don't like creating trappy APIs. Hopefully once LUCENE-3761 is in, we can fix NRTManager to no longer expose its private SM, instead subclassing ReferenceManager. Or alternatively, or in addition, maybe we factor out a new interface (SearcherProvider or something...) that only has acquire and release methods, and both NRTManager and ReferenceManager/SM impl that, and we keep NRTManager's SM private.
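The two ReferenceManager suggestions in this review (a final close() with a protected hook, and an afterRefresh() that is declared to throw IOException) can be sketched as follows. The class names are hypothetical, closeInternal() is one of the two names floated in the comment, and the refresh logic is reduced to a stub, so this shows the shape of the API rather than the patch itself.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical base class showing the template-method shape of close(). */
abstract class RefManagerSketch implements Closeable {
    private boolean closed;

    /** Final, so subclasses cannot override it and forget to call super.close(). */
    @Override
    public final synchronized void close() throws IOException {
        if (closed) {
            return; // closing twice is a no-op
        }
        closed = true;
        closeInternal();
    }

    /** Subclass hook for releasing extra resources; runs at most once. */
    protected void closeInternal() throws IOException {
    }

    /** Subclass hook called after each refresh; declared to throw IOException. */
    protected void afterRefresh() throws IOException {
    }

    /** Stub refresh: the real reopen logic is omitted from this sketch. */
    public final synchronized void maybeRefresh() throws IOException {
        if (closed) {
            throw new IllegalStateException("this manager is closed");
        }
        afterRefresh();
    }
}

/** Hypothetical NRT-style subclass; its overrides drop the throws clause. */
final class NrtLikeManagerSketch extends RefManagerSketch {
    int notifications;
    int closeCalls;

    @Override
    protected void afterRefresh() {
        notifications++; // e.g. wake up threads waiting for a new generation
    }

    @Override
    protected void closeInternal() {
        closeCalls++;
    }
}
```

Java allows an override to declare fewer checked exceptions than the method it overrides, which is why a subclass can stay free of IOException while another one is still able to throw it.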
[jira] [Updated] (SOLR-3079) Backport of Solr-1431 (CommComponent abstracted)
[ https://issues.apache.org/jira/browse/SOLR-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-3079: - Attachment: SOLR-3079.patch The patch isn't in SVN format; it looks like you made it with Git? The git repo is a shadow repository, not used for released code as far as I know. Through the magic of IntelliJ, I managed to apply the patch and I'm uploading that version. Can you take a look and see if it made it through the transformations OK? And any Git people out there: is there magic to make Git produce an SVN-compatible patch? Seems like a good addition to the How to contribute page, lots of people seem to be using Git... Beyond that, I'll run the tests with it and report back if there's a problem. I'd really like someone who knows what this is all about to take a look before committing. Meanwhile, keep prompting G
Backport of Solr-1431 (CommComponent abstracted) Key: SOLR-3079 URL: https://issues.apache.org/jira/browse/SOLR-3079 Project: Solr Issue Type: New Feature Components: search Affects Versions: 3.5 Reporter: Greg Bowyer Attachments: 0001-Initial-backport-of-solr-cloud-ShardHandler.patch, SOLR-3079.patch
Initial attempt at backporting the work done for Solr-1431 into the 3.x series
[jira] [Commented] (SOLR-3079) Backport of Solr-1431 (CommComponent abstracted)
[ https://issues.apache.org/jira/browse/SOLR-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209399#comment-13209399 ] Robert Muir commented on SOLR-3079: --- {quote} And any Git people out there: is there magic to make Git produce an SVN-compatible patch? Seems like a good addition to the How to contribute page, lots of people seem to be using Git... {quote} I just use patch -p1 when I want to apply git patches... (Eclipse has a checkbox or some other GUI toggle for -p if you prefer GUIs)
Backport of Solr-1431 (CommComponent abstracted) Key: SOLR-3079 URL: https://issues.apache.org/jira/browse/SOLR-3079 Project: Solr Issue Type: New Feature Components: search Affects Versions: 3.5 Reporter: Greg Bowyer Attachments: 0001-Initial-backport-of-solr-cloud-ShardHandler.patch, SOLR-3079.patch
Initial attempt at backporting the work done for Solr-1431 into the 3.x series
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209430#comment-13209430 ] Robert Muir commented on LUCENE-3792: - OK, I think seriously it would take major work to do something here that would make everyone happy. I still don't like the situation, but unless there are serious objections, I'd like to commit the javadocs, just to hopefully reduce the number of times this trap has to be answered on the user list.
Remove StringField -- Key: LUCENE-3792 URL: https://issues.apache.org/jira/browse/LUCENE-3792 Project: Lucene - Java Issue Type: Task Affects Versions: 4.0 Reporter: Robert Muir Fix For: 4.0 Attachments: LUCENE-3792_javadocs_3x.patch, LUCENE-3792_javadocs_3x.patch
Often on the mailing list there is confusion about NOT_ANALYZED. Besides being useless (just use KeywordAnalyzer instead), people trip up on this not being consistent at query time (you really need to configure KeywordAnalyzer for the field on your PerFieldAnalyzerWrapper so it will do the same thing at query time... oh wait, once you've done that, you don't need NOT_ANALYZED). So I think StringField is a trap too for the same reasons, just under a different name; let's remove it.
[jira] [Updated] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tommaso Teofili updated LUCENE-3731:
Attachment: LUCENE-3731_rsrel.patch

bq. Because Tokenizer.close() is misleading/confusing, the instance is still reused after this for subsequent documents.

After calling close(), it looks like the correct way to reuse that Tokenizer instance is to call reset(someOtherInput) before doing anything else. So, after adding

{code}
assert reader != null : "input has been closed, please reset it";
{code}

as the first line inside the toString(Reader reader) method in BaseUIMATokenizer, I tried this test:

{code}
@Test
public void testSetReaderAndClose() throws Exception {
  StringReader input = new StringReader("the big brown fox jumped on the wood");
  Tokenizer t = new UIMAAnnotationsTokenizer("/uima/AggregateSentenceAE.xml", "org.apache.uima.TokenAnnotation", input);
  assertTokenStreamContents(t, new String[]{"the", "big", "brown", "fox", "jumped", "on", "the", "wood"});
  t.close();
  try {
    t.incrementToken();
    fail("should've been failed as reader is not set");
  } catch (AssertionError error) {
    // ok
  }
  input = new StringReader("hi oh my");
  t = new UIMAAnnotationsTokenizer("/uima/TestAggregateSentenceAE.xml", "org.apache.lucene.uima.ts.TokenAnnotation", input);
  assertTrue("should've been incremented ", t.incrementToken());
  t.close();
  try {
    t.incrementToken();
    fail("should've been failed as reader is not set");
  } catch (AssertionError error) {
    // ok
  }
  t.reset(new StringReader("hey what do you say"));
  assertTrue("should've been incremented ", t.incrementToken());
}
{code}

and it looks to me like it's behaving correctly. I'm still working on improving it and trying to catch possible corner cases.
Create a analysis/uima module for UIMA based tokenizers/analyzers - Key: LUCENE-3731 URL: https://issues.apache.org/jira/browse/LUCENE-3731 Project: Lucene - Java Issue Type: Improvement Components: modules/analysis Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 3.6, 4.0 Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_rsrel.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored out in a separate module (modules/analysis/uima) as they can be used in plain Lucene. Then the solr/contrib/uima will contain only the related factories. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209439#comment-13209439 ]

Tommaso Teofili edited comment on LUCENE-3731 at 2/16/12 3:40 PM:

bq. Because Tokenizer.close() is misleading/confusing, the instance is still reused after this for subsequent documents.

After calling close(), it looks like the correct way to reuse that Tokenizer instance is to call reset(someOtherInput) before doing anything else. So, after adding

{code}
assert reader != null : "input has been closed, please reset it";
{code}

as the first line inside the toString(Reader reader) method in BaseUIMATokenizer, I tried this test:

{code}
@Test
public void testSetReaderAndClose() throws Exception {
  StringReader input = new StringReader("the big brown fox jumped on the wood");
  Tokenizer t = new UIMAAnnotationsTokenizer("/uima/AggregateSentenceAE.xml", "org.apache.uima.TokenAnnotation", input);
  assertTokenStreamContents(t, new String[]{"the", "big", "brown", "fox", "jumped", "on", "the", "wood"});
  t.close();
  try {
    t.incrementToken();
    fail("should've been failing as reader is not set");
  } catch (AssertionError error) {
    // ok
  }
  input = new StringReader("hi oh my");
  t = new UIMAAnnotationsTokenizer("/uima/TestAggregateSentenceAE.xml", "org.apache.lucene.uima.ts.TokenAnnotation", input);
  assertTrue("should've been incremented ", t.incrementToken());
  t.close();
  try {
    t.incrementToken();
    fail("should've been failing as reader is not set");
  } catch (AssertionError error) {
    // ok
  }
  t.reset(new StringReader("hey what do you say"));
  assertTrue("should've been incremented ", t.incrementToken());
}
{code}

and it looks to me like it's behaving correctly. I'm still working on improving it and trying to catch possible corner cases.
Create a analysis/uima module for UIMA based tokenizers/analyzers - Key: LUCENE-3731 URL: https://issues.apache.org/jira/browse/LUCENE-3731 Project: Lucene - Java Issue Type: Improvement Components: modules/analysis Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 3.6, 4.0 Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_rsrel.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored out in a separate module (modules/analysis/uima) as they can be used in plain Lucene. Then the solr/contrib/uima will contain only the related factories. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
[jira] [Resolved] (SOLR-2947) DIH caching bug - EntityRunner destroys child entity processor
[ https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer resolved SOLR-2947. -- Resolution: Fixed Trunk Only: r1245014 r1245018. Thank you Mikhail (now to the next one :) ). DIH caching bug - EntityRunner destroys child entity processor -- Key: SOLR-2947 URL: https://issues.apache.org/jira/browse/SOLR-2947 Project: Solr Issue Type: Sub-task Components: contrib - DataImportHandler Affects Versions: 4.0 Reporter: Mikhail Khludnev Assignee: James Dyer Labels: noob Fix For: 4.0 Attachments: SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, dih-cache-destroy-on-threads-fix.patch, dih-cache-threads-enabling-bug.patch My intention is fix multithread import with SQL cache. Here is the 2nd stage. If I enable DocBuilder.EntityRunner flow even for single thread, it breaks the pretty basic functionality: parent-child join. the reason is [line 473 entityProcessor.destroy();|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DocBuilder.java?revision=1201659view=markup] breaks children entityProcessor. see attachement comments for more details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12434 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12434/ 1 tests failed. FAILED: org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: shard2 is not consistent, expected:52 and got:51 Stack Trace: junit.framework.AssertionFailedError: shard2 is not consistent, expected:52 and got:51 at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) at org.apache.solr.cloud.FullSolrCloudTest.checkShardConsistency(FullSolrCloudTest.java:1062) at org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:114) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:664) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562) Build Log (for compile errors): [...truncated 7501 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-2933) DIHCacheSupport ignores left side of where=xid=x.id attribute
[ https://issues.apache.org/jira/browse/SOLR-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer reassigned SOLR-2933: Assignee: James Dyer DIHCacheSupport ignores left side of where=xid=x.id attribute --- Key: SOLR-2933 URL: https://issues.apache.org/jira/browse/SOLR-2933 Project: Solr Issue Type: Sub-task Components: contrib - DataImportHandler Affects Versions: 4.0 Reporter: Mikhail Khludnev Assignee: James Dyer Priority: Minor Labels: noob, random Attachments: AbstractDataImportHandlerTestCase.java-choose-map-randomly.patch, SOLR-2933.patch Original Estimate: 1h Remaining Estimate: 1h DIHCacheSupport introduced at SOLR-2382 uses new config attributes cachePk and cacheLookup. But support old one where=xid=x.id is broken by [DIHCacheSupport.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DIHCacheSupport.java?view=markup] - it never put where= sides into the context, but it revealed by [SortedMapBackedCache.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/SortedMapBackedCache.java?view=markup], which takes just first column as a primary key. That's why all tests are green. To reproduce the issue I need just reorder entry at [line 219|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/TestCachedSqlEntityProcessor.java?revision=1201659view=markup] and make desc first and picked up as a primary key. To do that I propose to chose concrete map class randomly for all DIH test cases at [createMap()|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/AbstractDataImportHandlerTestCase.java?revision=1149600view=markup]. I'm attaching test breaking patch and seed. -- This message is automatically generated by JIRA. 
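The reason the SOLR-2933 tests stayed green can be reproduced with plain collections. The sketch below is an illustration under stated assumptions, not the DIH code: `FirstColumnKeyDemo`, `firstColumn`, `row`, and this `createMap` are hypothetical names, and the "first column is the primary key" rule is modelled simply as "first key the map iterates over".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

public class FirstColumnKeyDemo {
    // Models the behaviour described in the issue: with no explicit key in the
    // context, whatever column comes first in the row is used as the primary key.
    static String firstColumn(Map<String, Object> row) {
        return row.keySet().iterator().next();
    }

    // Fills the same logical row into a caller-chosen Map implementation.
    static Map<String, Object> row(Map<String, Object> target) {
        target.put("id", 1);
        target.put("desc", "one");
        return target;
    }

    // Hypothetical version of the proposed test helper: pick the concrete Map
    // class at random so that tests cannot depend on one iteration order.
    static Map<String, Object> createMap(Random random) {
        return random.nextBoolean()
            ? new LinkedHashMap<String, Object>()
            : new TreeMap<String, Object>();
    }

    public static void main(String[] args) {
        // Insertion order keeps "id" first, so the implicit key happens to be right.
        System.out.println(firstColumn(row(new LinkedHashMap<String, Object>()))); // prints "id"

        // Sorted order puts "desc" before "id", so the implicit key is silently wrong.
        System.out.println(firstColumn(row(new TreeMap<String, Object>())));       // prints "desc"

        // With a random choice, a test relying on the first column fails for some seeds.
        System.out.println(firstColumn(row(createMap(new Random()))));
    }
}
```

Randomizing the map class, as the report proposes, turns an accidental dependency on column order into a failure that a seed can reproduce.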
[jira] [Updated] (SOLR-2933) DIHCacheSupport ignores left side of where=xid=x.id attribute
[ https://issues.apache.org/jira/browse/SOLR-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer updated SOLR-2933: - Fix Version/s: 4.0 3.6 for 3.6, we should backport the test improvement (only). DIHCacheSupport ignores left side of where=xid=x.id attribute --- Key: SOLR-2933 URL: https://issues.apache.org/jira/browse/SOLR-2933 Project: Solr Issue Type: Sub-task Components: contrib - DataImportHandler Affects Versions: 4.0 Reporter: Mikhail Khludnev Assignee: James Dyer Priority: Minor Labels: noob, random Fix For: 3.6, 4.0 Attachments: AbstractDataImportHandlerTestCase.java-choose-map-randomly.patch, SOLR-2933.patch Original Estimate: 1h Remaining Estimate: 1h DIHCacheSupport introduced at SOLR-2382 uses new config attributes cachePk and cacheLookup. But support old one where=xid=x.id is broken by [DIHCacheSupport.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DIHCacheSupport.java?view=markup] - it never put where= sides into the context, but it revealed by [SortedMapBackedCache.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/SortedMapBackedCache.java?view=markup], which takes just first column as a primary key. That's why all tests are green. To reproduce the issue I need just reorder entry at [line 219|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/TestCachedSqlEntityProcessor.java?revision=1201659view=markup] and make desc first and picked up as a primary key. To do that I propose to chose concrete map class randomly for all DIH test cases at [createMap()|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/AbstractDataImportHandlerTestCase.java?revision=1149600view=markup]. I'm attaching test breaking patch and seed. 
[jira] [Commented] (SOLR-2933) DIHCacheSupport ignores left side of where=xid=x.id attribute
[ https://issues.apache.org/jira/browse/SOLR-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209462#comment-13209462 ] James Dyer commented on SOLR-2933: -- I will commit this one shortly. DIHCacheSupport ignores left side of where=xid=x.id attribute --- Key: SOLR-2933 URL: https://issues.apache.org/jira/browse/SOLR-2933 Project: Solr Issue Type: Sub-task Components: contrib - DataImportHandler Affects Versions: 4.0 Reporter: Mikhail Khludnev Assignee: James Dyer Priority: Minor Labels: noob, random Fix For: 3.6, 4.0 Attachments: AbstractDataImportHandlerTestCase.java-choose-map-randomly.patch, SOLR-2933.patch Original Estimate: 1h Remaining Estimate: 1h DIHCacheSupport introduced at SOLR-2382 uses new config attributes cachePk and cacheLookup. But support old one where=xid=x.id is broken by [DIHCacheSupport.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DIHCacheSupport.java?view=markup] - it never put where= sides into the context, but it revealed by [SortedMapBackedCache.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/SortedMapBackedCache.java?view=markup], which takes just first column as a primary key. That's why all tests are green. To reproduce the issue I need just reorder entry at [line 219|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/TestCachedSqlEntityProcessor.java?revision=1201659view=markup] and make desc first and picked up as a primary key. To do that I propose to chose concrete map class randomly for all DIH test cases at [createMap()|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/AbstractDataImportHandlerTestCase.java?revision=1149600view=markup]. I'm attaching test breaking patch and seed. 
[jira] [Commented] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209474#comment-13209474 ]

Robert Muir commented on LUCENE-3731:

Right, after you reset(Reader) you set a new reader. But the question is: is it safe to use the CAS/AE after you call release()/destroy() on them? close() is called on tokenstreams after each invocation; in other words:

{noformat}
Tokenizer t = new Tokenizer(reader);
... stuff ...
t.close();
t.reset(someOtherReader);
... stuff ...
t.close();
{noformat}

So what does CAS.release() really mean? If it means you should not use the CAS again afterwards, then we cannot have it in TokenStream.close(), and the same goes for AE.destroy().

Create a analysis/uima module for UIMA based tokenizers/analyzers
Key: LUCENE-3731
URL: https://issues.apache.org/jira/browse/LUCENE-3731
Project: Lucene - Java
Issue Type: Improvement
Components: modules/analysis
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
Fix For: 3.6, 4.0
Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_rsrel.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch

As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored out in a separate module (modules/analysis/uima) as they can be used in plain Lucene. Then the solr/contrib/uima will contain only the related factories.
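The close()/reset() cycle in the comment above implies that close() may only release per-document state. The following is a rough sketch of one way to arrange that, in plain Java with no Lucene or UIMA dependency. `ReuseLifecycleDemo`, `Engine`, `ReusableTokenizer`, and `readAll` are all hypothetical names standing in for the real classes, and this is not claimed to be what any LUCENE-3731 patch does.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReuseLifecycleDemo {
    // Hypothetical stand-in for a heavyweight resource such as a UIMA CAS or AE.
    static class Engine {
        private boolean destroyed;

        String process(String text) {
            if (destroyed) {
                throw new IllegalStateException("engine used after destroy()");
            }
            return text.trim();
        }

        void destroy() {
            destroyed = true;
        }
    }

    // Hypothetical tokenizer-like class that is reused across documents.
    static class ReusableTokenizer implements Closeable {
        private final Engine engine = new Engine();
        private Reader input;

        ReusableTokenizer(Reader input) {
            this.input = input;
        }

        // Supplies the input for the next document.
        void reset(Reader newInput) {
            this.input = newInput;
        }

        // Consumes the current input and runs it through the long-lived engine.
        String readAll() throws IOException {
            if (input == null) {
                throw new IllegalStateException("input has been closed, please reset it");
            }
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                sb.append((char) c);
            }
            return engine.process(sb.toString());
        }

        // Releases only the per-document input; the engine deliberately survives,
        // because the instance will be reset and used again.
        @Override
        public void close() throws IOException {
            if (input != null) {
                input.close();
                input = null;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ReusableTokenizer t = new ReusableTokenizer(new StringReader(" first doc "));
        System.out.println(t.readAll()); // prints "first doc"
        t.close();
        t.reset(new StringReader(" second doc "));
        System.out.println(t.readAll()); // prints "second doc"
        t.close();
    }
}
```

Had close() called engine.destroy(), the second readAll() would throw, which is the failure mode the comment is asking about.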
[jira] [Commented] (SOLR-2947) DIH caching bug - EntityRunner destroys child entity processor
[ https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209477#comment-13209477 ] Robert Muir commented on SOLR-2947: --- Hi James: I think we should add a CHANGES.txt entry for this fix? DIH caching bug - EntityRunner destroys child entity processor -- Key: SOLR-2947 URL: https://issues.apache.org/jira/browse/SOLR-2947 Project: Solr Issue Type: Sub-task Components: contrib - DataImportHandler Affects Versions: 4.0 Reporter: Mikhail Khludnev Assignee: James Dyer Labels: noob Fix For: 4.0 Attachments: SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, dih-cache-destroy-on-threads-fix.patch, dih-cache-threads-enabling-bug.patch My intention is fix multithread import with SQL cache. Here is the 2nd stage. If I enable DocBuilder.EntityRunner flow even for single thread, it breaks the pretty basic functionality: parent-child join. the reason is [line 473 entityProcessor.destroy();|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DocBuilder.java?revision=1201659view=markup] breaks children entityProcessor. see attachement comments for more details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
Replace spatial contrib module with LSP's spatial-lucene module --- Key: LUCENE-3795 URL: https://issues.apache.org/jira/browse/LUCENE-3795 Project: Lucene - Java Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.0 I propose that Lucene's spatial contrib module be replaced with the spatial-lucene module within Lucene Spatial Playground (LSP). LSP has been in development for approximately 1 year by David Smiley, Ryan McKinley, and Chris Male and we feel it is ready. LSP is here: http://code.google.com/p/lucene-spatial-playground/ and the spatial-lucene module is intuitively in svn/trunk/spatial-lucene/. I'll add more comments to prevent the issue description from being too long. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2947) DIH caching bug - EntityRunner destroys child entity processor
[ https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209489#comment-13209489 ] James Dyer commented on SOLR-2947: -- This bug was caused by SOLR-2382 which is trunk-only. Do we need a CHANGES.txt entry for that? DIH caching bug - EntityRunner destroys child entity processor -- Key: SOLR-2947 URL: https://issues.apache.org/jira/browse/SOLR-2947 Project: Solr Issue Type: Sub-task Components: contrib - DataImportHandler Affects Versions: 4.0 Reporter: Mikhail Khludnev Assignee: James Dyer Labels: noob Fix For: 4.0 Attachments: SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, dih-cache-destroy-on-threads-fix.patch, dih-cache-threads-enabling-bug.patch My intention is fix multithread import with SQL cache. Here is the 2nd stage. If I enable DocBuilder.EntityRunner flow even for single thread, it breaks the pretty basic functionality: parent-child join. the reason is [line 473 entityProcessor.destroy();|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DocBuilder.java?revision=1201659view=markup] breaks children entityProcessor. see attachement comments for more details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209490#comment-13209490 ] Tommaso Teofili commented on LUCENE-3731: - bq. But the question is: is it safe to use CAS/AE after you call release()/destroy() on them? no it isn't, so you're right: those methods should not be inside the close() method. Create a analysis/uima module for UIMA based tokenizers/analyzers - Key: LUCENE-3731 URL: https://issues.apache.org/jira/browse/LUCENE-3731 Project: Lucene - Java Issue Type: Improvement Components: modules/analysis Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 3.6, 4.0 Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_rsrel.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored out in a separate module (modules/analysis/uima) as they can be used in plain Lucene. Then the solr/contrib/uima will contain only the related factories. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2947) DIH caching bug - EntityRunner destroys child entity processor
[ https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209495#comment-13209495 ] Robert Muir commented on SOLR-2947: --- Sorry James, my mistake, I didn't realize that! DIH caching bug - EntityRunner destroys child entity processor -- Key: SOLR-2947 URL: https://issues.apache.org/jira/browse/SOLR-2947 Project: Solr Issue Type: Sub-task Components: contrib - DataImportHandler Affects Versions: 4.0 Reporter: Mikhail Khludnev Assignee: James Dyer Labels: noob Fix For: 4.0 Attachments: SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, dih-cache-destroy-on-threads-fix.patch, dih-cache-threads-enabling-bug.patch My intention is fix multithread import with SQL cache. Here is the 2nd stage. If I enable DocBuilder.EntityRunner flow even for single thread, it breaks the pretty basic functionality: parent-child join. the reason is [line 473 entityProcessor.destroy();|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DocBuilder.java?revision=1201659view=markup] breaks children entityProcessor. see attachement comments for more details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209497#comment-13209497 ] David Smiley commented on LUCENE-3795: -- LSP is comprised of several modules: * spatial-lucene: The heart of the project. * spatial-solr: Solr support, notably field types using spatial-lucene. * spatial-extras: An extension of spatial-lucene that uses JTS (LGPL licensed) for polygon support. * spatial-demo: A demonstration web UI using OpenLayers, Solr, Wicket, and the other LSP modules. The spatial-solr module of LSP can be considered in another issue following the conclusion of this one. The other modules aren't being considered for incorporation into Lucene/Solr. LSP is largely new code although some of it originated using chunks of the existing Lucene spatial contrib module and SOLR-2155 (A recursive PrefixTree/Trie algorithm using geohashes). It's fair to say this is a superset and descendent of SOLR-2155 but with a real framework around it and plenty of refactorings and tests. I ran Atlassian's Clover code coverage to get some statistics of this spatial-lucene module of LSP: * LOC: 6,605, NCLOC: 3,959 * Packages: 18, Classes: 70 * Code coverage: 53% The code coverage surprises me a little... perhaps the number is higher when the spatial-solr module gets involved which uses more of the classes then the tests do here alone. Replace spatial contrib module with LSP's spatial-lucene module --- Key: LUCENE-3795 URL: https://issues.apache.org/jira/browse/LUCENE-3795 Project: Lucene - Java Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.0 I propose that Lucene's spatial contrib module be replaced with the spatial-lucene module within Lucene Spatial Playground (LSP). LSP has been in development for approximately 1 year by David Smiley, Ryan McKinley, and Chris Male and we feel it is ready. 
LSP is here: http://code.google.com/p/lucene-spatial-playground/ and the spatial-lucene module is intuitively in svn/trunk/spatial-lucene/. I'll add more comments to prevent the issue description from being too long. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3137) When solr.xml is persisted, you lose all system property substitution that was used.
[ https://issues.apache.org/jira/browse/SOLR-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-3137: -- Attachment: SOLR-3137.patch updates patch - close to done I think - I don't handle properties because of some oddity I have not figured out - they appear to stored un-sys-subbed, but then when written out they are subbed? I'm not sure they are that important to handle anyway? When solr.xml is persisted, you lose all system property substitution that was used. - Key: SOLR-3137 URL: https://issues.apache.org/jira/browse/SOLR-3137 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.0 Attachments: SOLR-3137.patch, SOLR-3137.patch A lesser issue is that we also write out properties that where not originally in the file with the defaults they picked up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
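One general way to avoid losing substitutions on persist is to keep the raw attribute text next to its resolved value and write the raw text back out. The sketch below only illustrates that idea and assumes a `${name:default}` placeholder syntax. `PropertySubstitutionDemo`, `substitute`, and `Attr` are hypothetical names, and nothing here is taken from the attached SOLR-3137 patch.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PropertySubstitutionDemo {
    // Matches ${name} and ${name:default}; group 1 is the name, group 2 the optional default.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    // Resolves placeholders against the given properties.
    // An unknown name with no default resolves to the empty string in this sketch.
    static String substitute(String raw, Properties props) {
        Matcher m = PLACEHOLDER.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String fallback = m.group(2) != null ? m.group(2) : "";
            String value = props.getProperty(m.group(1), fallback);
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Holds both forms of an attribute: "resolved" is used at runtime,
    // "raw" is what gets written back when the config is persisted.
    static final class Attr {
        final String raw;
        final String resolved;

        Attr(String raw, Properties props) {
            this.raw = raw;
            this.resolved = substitute(raw, props);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("hostPort", "8983");

        Attr port = new Attr("${hostPort:8080}", props);
        System.out.println(port.resolved); // prints "8983"
        System.out.println(port.raw);      // prints "${hostPort:8080}"
    }
}
```

Persisting `raw` instead of `resolved` keeps the placeholder in the file, so a later start with a different system property still picks up the new value.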
[jira] [Commented] (LUCENE-3767) Explore streaming Viterbi search in Kuromoji
[ https://issues.apache.org/jira/browse/LUCENE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209540#comment-13209540 ] Michael McCandless commented on LUCENE-3767: I think the branch is ready to land... I'll post an applyable patch soon. In Mode.SEARCH the tokenizer produces the same tokens as current trunk. The only real end-user visible change is the addition of Mode.SEARCH_WITH_COMPOUNDS, which can produce two paths (compound token + its segmentation). This mode uses the new PositionLengthAttribute to record how long the compound token is. In this mode, the Viterbi search first runs without penalties, but then, if a too-long token (a token where the penalty would have been 0) is in the best path, we effectively re-run the Viterbi under that compound token, this time with penalties included. If this results in a different backtrace, we add that into the output tokens as well. Note that this will not produce congruent results as Mode.SEARCH, because the 2nd segmentation runs in context of the best path, meaning the chosen best wordID before and after the compound token are enforced in the 2nd segmentation. Sometimes this results in still picking only the compound token where trunk today would have split it up. From TestQuality, the total number of edits was 4418 vs trunk's 4828. I didn't explore this, but, we may want to use harsher penalties in SEARCH_WITH_COMPOUNDS mode, ie, since we're going to output the compound as well we may as well try harder to produce the 2nd best segmentation. I left the default mode as Mode.SEARCH... maybe if we can somehow run some relevance tests we can make the default SEARCH_WITH_COMPOUNDS. But it'd also be tricky at query time... It looks like the rolling Viterbi is a bit faster (~16%: 1460 bytes/msec vs 1700 bytes/msec on TestQuality.testSingleText). 
Explore streaming Viterbi search in Kuromoji Key: LUCENE-3767 URL: https://issues.apache.org/jira/browse/LUCENE-3767 Project: Lucene - Java Issue Type: Improvement Components: modules/analysis Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.6, 4.0 Attachments: LUCENE-3767.patch, LUCENE-3767.patch I've been playing with the idea of changing the Kuromoji viterbi search to be 2 passes (intersect, backtrace) instead of 4 passes (break into sentences, intersect, score, backtrace)... this is very much a work in progress, so I'm just getting my current state up. It's got tons of nocommits, doesn't properly handle the user dict nor extended modes yet, etc. One thing I'm playing with is to add a double backtrace for the long compound tokens, ie, instead of penalizing these tokens so that shorter tokens are picked, leave the scores unchanged but on backtrace take that penalty and use it as a threshold for a 2nd best segmentation... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
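The two-pass shape discussed in this issue (a forward pass that records the best cost to reach each position, then a backtrace) and the effect of penalizing long compound tokens can be sketched with a toy segmenter. This is only an illustration under stated assumptions, not Kuromoji's code: the class name ViterbiSketch, the flat per-word costs, the fixed length threshold and the ASCII dictionary are all made up for the example, and the real tokenizer also scores connections between adjacent tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Toy minimum-cost segmentation: one forward pass plus one backtrace (hypothetical, not Kuromoji). */
class ViterbiSketch {

    /** Assumption for this sketch: tokens longer than this many chars count as "too long". */
    static final int LONG_TOKEN_LENGTH = 7;

    /**
     * Returns the cheapest segmentation of text into dictionary words.
     * longTokenPenalty is added to the cost of every token longer than LONG_TOKEN_LENGTH.
     * Returns an empty list if the text cannot be covered by the dictionary.
     */
    static List<String> segment(String text, Map<String, Integer> wordCosts, int longTokenPenalty) {
        int n = text.length();
        int[] bestCost = new int[n + 1]; // bestCost[i]: cheapest cost to cover text[0, i)
        int[] backPtr = new int[n + 1];  // backPtr[i]: start offset of the last token on that path
        Arrays.fill(bestCost, Integer.MAX_VALUE);
        bestCost[0] = 0;

        // Pass 1 (forward): extend every reachable position with each word starting there.
        for (int start = 0; start < n; start++) {
            if (bestCost[start] == Integer.MAX_VALUE) {
                continue; // position not reachable
            }
            for (int end = start + 1; end <= n; end++) {
                Integer cost = wordCosts.get(text.substring(start, end));
                if (cost == null) {
                    continue; // not a dictionary word
                }
                int penalty = (end - start > LONG_TOKEN_LENGTH) ? longTokenPenalty : 0;
                int total = bestCost[start] + cost + penalty;
                if (total < bestCost[end]) {
                    bestCost[end] = total;
                    backPtr[end] = start;
                }
            }
        }

        // Pass 2 (backtrace): follow the back pointers from the end of the text.
        List<String> tokens = new ArrayList<>();
        if (bestCost[n] == Integer.MAX_VALUE) {
            return tokens;
        }
        for (int end = n; end > 0; end = backPtr[end]) {
            tokens.add(text.substring(backPtr[end], end));
        }
        Collections.reverse(tokens);
        return tokens;
    }
}
```

With a dictionary where "kansaiairport" costs 10 and "kansai" and "airport" cost 6 each, a penalty of 0 keeps the compound as a single token, while a penalty of 5 makes the split path cheaper. Running both and emitting the second result alongside the first is roughly the idea behind outputting a compound together with its segmentation.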
[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1743 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1743/ 1 tests failed. FAILED: org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: shard1 is not consistent, expected:62 and got:63 Stack Trace: junit.framework.AssertionFailedError: shard1 is not consistent, expected:62 and got:63 at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) at org.apache.solr.cloud.FullSolrCloudTest.checkShardConsistency(FullSolrCloudTest.java:1062) at org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:114) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:664) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562) Build Log (for compile errors): [...truncated 10291 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12436 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12436/ 1 tests failed. REGRESSION: org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: shard3 is not consistent, expected:59 and got:58 Stack Trace: junit.framework.AssertionFailedError: shard3 is not consistent, expected:59 and got:58 at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) at org.apache.solr.cloud.FullSolrCloudTest.checkShardConsistency(FullSolrCloudTest.java:1062) at org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:114) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:664) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562) Build Log (for compile errors): [...truncated 7674 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209588#comment-13209588 ] David Smiley edited comment on LUCENE-3795 at 2/16/12 6:38 PM: ---
h3. Features
The main goal of LSP is to be a great framework to plug in spatial search algorithms and shape implementations. It of course includes good implementations of these key abstractions. Here are some key features, most of which relate to using RecursivePrefixTreeStrategy with geohashes:
* Multi-valued fields
* Index shapes that have area (e.g. not just points). Tests have yet to be added for this.
* No special RAM caches for filtering, just the standard term index. This is unlike Solr's LatLonType, which needs to cache all points in RAM if the query shape is a circle.
* Fast filtering. Although SOLR-2155 has been proven, technically LSP hasn't. 3rd party anecdotes reinforce this claim.
* Multi-value sort, based on the closest indexed point to the center of the query shape. Distances are returned via the score of an LSP query.
* Specify precision of query shape and index shape, thereby allowing for faster filtering and tunable precision
* Multiple distance algorithms:
** Spherical: Law of Cosines, Haversine, Vincenty
** Cartesian: Pythagorean Theorem
* Cartesian (2d flat) and Geospatial sphere models
h3. Todo
There are many things I want to improve and add, but in my view there isn't anything truly making this non-committable. Chris has raised concerns that the other committers will want to see benchmark results before accepting this. I'll leave that for you (the other committers) to decide. I have also heard that some committers are unsure whether Lucene should have a spatial module at all. However, there is certainly demand for it, at least at the Solr level. Furthermore, there are some non-spatial use cases of the spatial module. One interesting use case is RecursivePrefixTreeStrategy's (RPTS) unique ability to index shapes with area. If you had a requirement to index a variable number of time durations, then unlike Lucene's trie numeric support, in which only discrete numbers are supported, RPTS could be used with x being time and y being unused. By the way, PrefixTree and Trie are synonymous.
Replace spatial contrib module with LSP's spatial-lucene module --- Key: LUCENE-3795 URL: https://issues.apache.org/jira/browse/LUCENE-3795 Project: Lucene - Java Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.0 I propose that Lucene's spatial contrib module be replaced with the spatial-lucene module
LUCENE-3795 Replace spatial contrib module with LSP's spatial-lucene module
I made a major proposal to Lucene to replace its spatial contrib module with one in LSP -- a project that Chris Male, Ryan McKinley and I have been working on. In case you guys missed the JIRA issue, here it is: https://issues.apache.org/jira/browse/LUCENE-3795 I ask for any input, assuming you have an opinion. ~ David - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3778) Create a grouping convenience class
[ https://issues.apache.org/jira/browse/LUCENE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209594#comment-13209594 ] Michael McCandless commented on LUCENE-3778: {quote} bq. Would you also handle block (single pass) grouping with the same class...? I think we can do this. The block grouping returns TopGroups as result. {quote} Nice. {quote} bq. I guess you'd then .getAllGroups(), .getAllGroupHeads() after .search(...)? Yes, we need that. In the case of getAllGroups() the TopGroups#totalGroupCount field can be used when the user is only interested in the number of matching groups. {quote} OK. {quote} bq. Hmm would we try to handle Term/BytesRef and Function/MutableValue with the same class? With generics? {quote} I think so... but I think it may get tricky. Like, I think you should specify up front (to GroupingSearch ctor) the required things about your request (block join OR group field OR field + DV type OR function VS/ctx map), setters for the numerous optional things (sort, groupSort, getScores, getMaxScores, maxDocsPerGroup) and maybe params to search for the per-requesty things (topNGroups, groupOffset, withinGroupOffset). But then the T will depend on which ctor you used...? Not sure how it'd work... bq. Maybe distributed grouping needs its own class? Since the usage is different from a non distributed grouping. Yeah... Maybe we can do this for join module too! Create a grouping convenience class --- Key: LUCENE-3778 URL: https://issues.apache.org/jira/browse/LUCENE-3778 Project: Lucene - Java Issue Type: Improvement Components: modules/grouping Reporter: Martijn van Groningen Currently the grouping module has many collector classes with a lot of different options per class. I think it would be a good idea to have a GroupUtil (Or another name?) convenience class. 
I think this could be a builder, because of the many options (sort,sortWithinGroup,groupOffset,groupCount and more) and implementations (term/dv/function) grouping has. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
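The split proposed in this thread (required inputs in the constructor, optional settings through setters, per-request values as arguments to search) can be illustrated with a small in-memory sketch. Everything here is hypothetical: ToyGroupingSearch, its method names and the Map-based documents are invented for the example and are not the grouping module's API, and no sorting, scoring or generics over group value types is attempted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical convenience class illustrating the ctor / setter / search-argument split. */
class ToyGroupingSearch {
    private final String groupField;  // required: fixed up front in the constructor
    private int maxDocsPerGroup = 1;  // optional: changed through a chained setter

    ToyGroupingSearch(String groupField) {
        this.groupField = groupField;
    }

    ToyGroupingSearch setMaxDocsPerGroup(int maxDocsPerGroup) {
        this.maxDocsPerGroup = maxDocsPerGroup;
        return this;
    }

    /** Helper for building a two-field toy document. */
    static Map<String, String> doc(String author, String title) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("author", author);
        d.put("title", title);
        return d;
    }

    /** Per-request values (groupOffset, topNGroups) are passed on each call. */
    Map<String, List<Map<String, String>>> search(
            List<Map<String, String>> docs, int groupOffset, int topNGroups) {
        // Collect groups in encounter order, keeping at most maxDocsPerGroup docs each.
        Map<String, List<Map<String, String>>> all = new LinkedHashMap<>();
        for (Map<String, String> d : docs) {
            String key = d.get(groupField);
            if (key == null) {
                continue; // documents without the group field are skipped in this sketch
            }
            List<Map<String, String>> group = all.computeIfAbsent(key, k -> new ArrayList<>());
            if (group.size() < maxDocsPerGroup) {
                group.add(d);
            }
        }
        // Apply group-level paging.
        Map<String, List<Map<String, String>>> page = new LinkedHashMap<>();
        int index = 0;
        for (Map.Entry<String, List<Map<String, String>>> e : all.entrySet()) {
            if (index >= groupOffset && page.size() < topNGroups) {
                page.put(e.getKey(), e.getValue());
            }
            index++;
        }
        return page;
    }
}
```

A call would then read `new ToyGroupingSearch("author").setMaxDocsPerGroup(2).search(docs, 0, 10)`. The open question raised above, how the result type should depend on which constructor was used, is deliberately sidestepped here by using a single String-keyed result.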
[jira] [Commented] (LUCENE-3750) Convert Versioned docs to Markdown/New CMS
[ https://issues.apache.org/jira/browse/LUCENE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209602#comment-13209602 ] Yonik Seeley commented on LUCENE-3750: --
bq. if 1 out of N committers who have tried doing local www site builds can't get it to work
+1 more. I've spent the last few hours trying to get it to work on my OS-X (lion) box... I figured out how to install the cpan perl modules (not being a perl person myself), and the python modules installed fine, but now the daemon just won't run:
{code}
/opt/code/cms/build$ python --version
Python 2.7.1
/opt/code/cms/build$ export MARKDOWN_SOCKET=`pwd`/markdown.socket PYTHONPATH=`pwd`
/opt/code/cms/build$ python markdownd.py
/opt/code/cms/build$ No handlers could be found for logger "MARKDOWN"
Traceback (most recent call last):
  File "markdownd.py", line 41, in <module>
    'codehilite', 'elementid', 'footnotes', 'abbr'])
  File "build/bdist.macosx-10.7-intel/egg/markdown/__init__.py", line 395, in markdown
  File "build/bdist.macosx-10.7-intel/egg/markdown/__init__.py", line 134, in __init__
  File "build/bdist.macosx-10.7-intel/egg/markdown/__init__.py", line 166, in registerExtensions
ValueError: Extension __builtin__.NoneType must be of type: markdown.Extension.
{code}
Convert Versioned docs to Markdown/New CMS -- Key: LUCENE-3750 URL: https://issues.apache.org/jira/browse/LUCENE-3750 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll Priority: Minor
Since we are moving our main site to the ASF CMS (LUCENE-2748), we should bring in any new versioned Lucene docs into the same format so that we don't have to deal w/ Forrest anymore.
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209605#comment-13209605 ] Simon Willnauer commented on LUCENE-3795: -
Wow, this is a lot of stuff. We certainly need a code donation for this. Without getting into details, +1 from my side. I think Lucene desperately needs spatial support... it should be a module IMO. We should drop the stuff we have and get this in shape, i.e. into a module. I am not sure about the LGPL stuff. I guess we should try to integrate everything else, and if we really want to, or if there is a way to integrate the LGPL stuff, we can take care of this later!
Replace spatial contrib module with LSP's spatial-lucene module --- Key: LUCENE-3795 URL: https://issues.apache.org/jira/browse/LUCENE-3795 Project: Lucene - Java Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.0
I propose that Lucene's spatial contrib module be replaced with the spatial-lucene module within Lucene Spatial Playground (LSP). LSP has been in development for approximately 1 year by David Smiley, Ryan McKinley, and Chris Male and we feel it is ready. LSP is here: http://code.google.com/p/lucene-spatial-playground/ and the spatial-lucene module is intuitively in svn/trunk/spatial-lucene/. I'll add more comments to prevent the issue description from being too long.
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209613#comment-13209613 ] David Smiley commented on SOLR-2155:
If someone watching this issue has an interest in this capability winding its way into Solr out of the box, then I suggest you vote for (and maybe watch) LUCENE-3795. That issue is the first step; the subsequent step is a follow-on issue that will bring in LSP's spatial-solr module, which uses spatial-lucene (LUCENE-3795). I don't intend or support committing SOLR-2155 as is. Spatial done right should involve a good framework; SOLR-2155 isn't a framework, and Lucene's existing defunct spatial-contrib module isn't good. That's where LSP comes in, and LUCENE-3795 is the first step to get it incorporated into Lucene/Solr.
Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch
There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (i.e. via a gazetteer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document, and users want to limit their search results to those occurring in a user-specified area.
I've implemented this by furthering the GeoHash based work in Lucene/Solr with a geohash prefix based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if they match. I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details.
This work was presented at LuceneRevolution in Boston on October 8th.
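The prefix property that the filter relies on can be seen in a minimal geohash encoder. This is a sketch of the standard public geohash scheme (interleaved longitude/latitude bits, 5 bits per base-32 character), written from scratch for illustration; the class name GeohashSketch is hypothetical and this is not the GeoHashUtils code from the patch.

```java
/** Minimal standard geohash encoder, for illustration only (not Lucene's GeoHashUtils). */
class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a latitude/longitude pair into a geohash of the requested length. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90;
        double lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean lonBit = true; // bits alternate between longitude and latitude, longitude first
        int bits = 0;
        int value = 0;
        while (hash.length() < length) {
            if (lonBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) {
                    value = (value << 1) | 1;
                    lonMin = mid;
                } else {
                    value <<= 1;
                    lonMax = mid;
                }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) {
                    value = (value << 1) | 1;
                    latMin = mid;
                } else {
                    value <<= 1;
                    latMax = mid;
                }
            }
            lonBit = !lonBit;
            if (++bits == 5) { // every 5 bits become one base-32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        String cell = GeohashSketch.encode(57.64911, 10.40744, 3);
        String point = GeohashSketch.encode(57.64911, 10.40744, 6);
        // A point's hash always starts with the hash of every grid cell that contains it.
        System.out.println(cell + " is a prefix of " + point + ": " + point.startsWith(cell));
    }
}
```

Because each character carries 5 bits, it splits its cell into 32 sub-cells: 3 longitude bits and 2 latitude bits at odd positions, 2 and 3 at even positions, which gives the 8x4 or 4x8 grid mentioned above. Since indexed terms sharing a prefix sort next to each other, seeking the terms enumeration to a prefix lands on the points inside that cell.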
Re: Welcome David Smiley
Thanks Mikhail. Here's why: https://issues.apache.org/jira/browse/SOLR-2155?focusedCommentId=13209613&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13209613
~ David
On Feb 10, 2012, at 1:20 PM, Mikhail Khludnev [via Lucene] wrote:
I join all the congratulations above! Btw, now that you have a password, why not commit SOLR-2155?
Regards
On Mon, Feb 6, 2012 at 10:54 AM, David Smiley (@MITRE.org) [hidden email] wrote:
Wow! It is truly an honor to be selected by the Lucene PMC to join the committer ranks. You are a top notch team of coders working on one of the most important open-source projects.
About me: My technical background is all tiers of web development with a focus on the middle tier and Java. Of course I have expertise in Lucene and Solr but I also focus on geospatial matters as well as threading / concurrency. I like solving hard interesting problems. I am employed full time by The MITRE Corporation, a US federally funded non-profit organization in which I mostly work in the defense sector. I've been with MITRE for ~14 years. I've been fortunate lately to work on projects that fund my open-source geospatial work. I conduct Solr training at MITRE (1 day and 2-day classes), and I'm sort of a search consultant within MITRE, advising MITRE and its government clients. For 6 months, I have also been working part-time for OpenSource Connections as a search consultant.
At home, I'm married with two kids: Adeline who is 10 months old (she's in my arms sleeping as I write this) and Camille who is 2 years 10 months old. I don't know how I found the time to write a book, but now that it's done, I'm on full parental duty when at home. For fun, I like to follow Starcraft 2 professional e-sports. It's conveniently something I can do while I hold a baby; playing the game isn't, unfortunately.
I look forward to meeting you all at Lucene Revolution in May! I live close by in Lowell.
Cheers, David Smiley
- Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
-- View this message in context: http://lucene.472066.n3.nabble.com/Welcome-David-Smiley-tp3717248p3718969.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
-- Sincerely yours Mikhail Khludnev Lucid Certified Apache Lucene/Solr Developer Grid Dynamics http://www.griddynamics.com/
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209622#comment-13209622 ] David Smiley commented on LUCENE-3795: -- What constitutes a code donation? By the way, I've gone through the proper channels with my employer with regard to SOLR-2155 and LSP. MITRE has no copyright on this code; I've marked it all as ASF. Replace spatial contrib module with LSP's spatial-lucene module --- Key: LUCENE-3795 URL: https://issues.apache.org/jira/browse/LUCENE-3795 Project: Lucene - Java Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.0 I propose that Lucene's spatial contrib module be replaced with the spatial-lucene module within Lucene Spatial Playground (LSP). LSP has been in development for approximately 1 year by David Smiley, Ryan McKinley, and Chris Male and we feel it is ready. LSP is here: http://code.google.com/p/lucene-spatial-playground/ and the spatial-lucene module is intuitively in svn/trunk/spatial-lucene/. I'll add more comments to prevent the issue description from being too long. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209630#comment-13209630 ] Hoss Man commented on LUCENE-3792: --
StrawMan suggestion off the top of my head:
* rename NOT_ANALYZED to something like KEYWORD_ANALYZED
* document KEYWORD_ANALYZED as being a convenience flag (and/or optimization?) for achieving equivalent behavior as using PerFieldAnalyzer with KeywordAnalyzer for this field, and keep using / re-word rmuir's patch warning to make it clear that if you use this at index time, any attempts to construct queries against it using the QueryParser will need KeywordAnalyzer.
...would that flag name == analyzer name equivalence help people remember not to get trapped by this?
Remove StringField -- Key: LUCENE-3792 URL: https://issues.apache.org/jira/browse/LUCENE-3792 Project: Lucene - Java Issue Type: Task Affects Versions: 4.0 Reporter: Robert Muir Fix For: 4.0 Attachments: LUCENE-3792_javadocs_3x.patch, LUCENE-3792_javadocs_3x.patch
Often on the mailing list there is confusion about NOT_ANALYZED. Besides being useless (just use KeywordAnalyzer instead), people trip up on this not being consistent at query time (you really need to configure KeywordAnalyzer for the field on your PerFieldAnalyzerWrapper so it will do the same thing at query time... oh wait, once you've done that, you don't need NOT_ANALYZED). So I think StringField is a trap too for the same reasons, just under a different name; let's remove it.
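The trap described in this issue, a value indexed verbatim but analyzed at query time, can be reproduced without Lucene in a few lines. The sketch below is a toy model under stated assumptions: AnalysisMismatchSketch and its two "analyzers" are hypothetical stand-ins (plain functions from a string to tokens), not Lucene's Analyzer, KeywordAnalyzer or PerFieldAnalyzerWrapper classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

/** Toy model of index-time vs query-time analysis (hypothetical, not Lucene's API). */
class AnalysisMismatchSketch {

    /** Keyword-style analysis: the whole value is a single token, unchanged. */
    static final Function<String, List<String>> KEYWORD =
            value -> Collections.singletonList(value);

    /** Stand-in for a typical text analyzer: lowercase, then split on whitespace. */
    static final Function<String, List<String>> LOWERCASE_WHITESPACE =
            value -> Arrays.asList(value.toLowerCase(Locale.ROOT).trim().split("\\s+"));

    /** Returns the set of terms an analyzer would put in the index for one field value. */
    static Set<String> index(String value, Function<String, List<String>> analyzer) {
        return new HashSet<>(analyzer.apply(value));
    }

    /** A query matches only if every token it produces is present in the indexed terms. */
    static boolean matches(Set<String> indexedTerms, String query,
                           Function<String, List<String>> queryAnalyzer) {
        return indexedTerms.containsAll(queryAnalyzer.apply(query));
    }

    public static void main(String[] args) {
        Set<String> terms = index("New York", KEYWORD); // indexed as one verbatim term
        // Query analyzed differently from the index: tokens "new" and "york" are not in the index.
        System.out.println(matches(terms, "New York", LOWERCASE_WHITESPACE));
        // Query analyzed the same way as the index: the single term "New York" is found.
        System.out.println(matches(terms, "New York", KEYWORD));
    }
}
```

The point of the toy is only that the fix lives on the query side: once the same keyword analysis is configured for the field at query time, a separate index-time "not analyzed" switch adds nothing.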
[Lucene.Net] Lucene.Net Blog
Hey All, We've got a blog up and running: https://blogs.apache.org/lucenenet/. Right now we are taking the latest 3 articles and posting them on our main Lucene.Net site as news. But I'd like to try to get more regular content up on the blog. If you happen to write an article (or want to write an article) about Lucene.Net, we'd like to have it on the blog (or at least a slug for your article). If anyone is interested, just shoot us an email here. ~Prescott
[jira] [Assigned] (SOLR-3033) numberToKeep on replication handler does not work with backupAfter
[ https://issues.apache.org/jira/browse/SOLR-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer reassigned SOLR-3033: Assignee: James Dyer
numberToKeep on replication handler does not work with backupAfter -- Key: SOLR-3033 URL: https://issues.apache.org/jira/browse/SOLR-3033 Project: Solr Issue Type: Bug Components: replication (java) Affects Versions: 3.5 Environment: openjdk 1.6, linux 3.x Reporter: Torsten Krah Assignee: James Dyer Attachments: SOLR-3033.patch
Configured my replication handler like this:
    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="replicateAfter">startup</str>
        <str name="replicateAfter">commit</str>
        <str name="replicateAfter">optimize</str>
        <str name="confFiles">elevate.xml,schema.xml,spellings.txt,stopwords.txt,stopwords_de.txt,stopwords_en.txt,synonyms_de.txt,synonyms.txt</str>
        <str name="backupAfter">optimize</str>
        <str name="numberToKeep">1</str>
      </lst>
    </requestHandler>
So after optimize a snapshot should be taken; this works. But numberToKeep is ignored: snapshots are increasing with each call to optimize and are kept forever. It seems this setting has no effect.
[jira] [Commented] (SOLR-2933) DIHCacheSupport ignores left side of where="xid=x.id" attribute
[ https://issues.apache.org/jira/browse/SOLR-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209648#comment-13209648 ]
Mikhail Khludnev commented on SOLR-2933:
Great, James! Thank you. Let me refresh the SOLR-3011 patch next week; I would also like to think about the same thread-safe paging for the plain JDBC EntityProcessor (w/o caches).
DIHCacheSupport ignores left side of where="xid=x.id" attribute
Key: SOLR-2933
URL: https://issues.apache.org/jira/browse/SOLR-2933
Project: Solr
Issue Type: Sub-task
Components: contrib - DataImportHandler
Affects Versions: 4.0
Reporter: Mikhail Khludnev
Assignee: James Dyer
Priority: Minor
Labels: noob, random
Fix For: 3.6, 4.0
Attachments: AbstractDataImportHandlerTestCase.java-choose-map-randomly.patch, SOLR-2933.patch
Original Estimate: 1h
Remaining Estimate: 1h
DIHCacheSupport, introduced in SOLR-2382, uses the new config attributes cachePk and cacheLookup. But support for the old where="xid=x.id" attribute is broken by [DIHCacheSupport.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DIHCacheSupport.java?view=markup]: it never puts the where= sides into the context. The breakage is masked by [SortedMapBackedCache.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/SortedMapBackedCache.java?view=markup], which takes just the first column as the primary key. That's why all tests are green. To reproduce the issue I just need to reorder the entries at [line 219|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/TestCachedSqlEntityProcessor.java?revision=1201659&view=markup] so that desc comes first and is picked up as the primary key.
To do that I propose choosing the concrete map class randomly for all DIH test cases in [createMap()|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/AbstractDataImportHandlerTestCase.java?revision=1149600&view=markup]. I'm attaching a test-breaking patch and seed.
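Editor's note: the issue turns on splitting the old where attribute into its two sides. A minimal sketch of that split follows. It assumes the left side corresponds to cachePk and the right side to cacheLookup, which the report implies but does not spell out, and the helper class is hypothetical rather than part of DIH.

```java
// Hypothetical helper; the names are illustrative and not part of DIH.
class WhereAttributeSketch {
    final String cachePk;      // left side: key column of the cached entity (assumed mapping)
    final String cacheLookup;  // right side: value looked up from the parent row (assumed mapping)

    private WhereAttributeSketch(String cachePk, String cacheLookup) {
        this.cachePk = cachePk;
        this.cacheLookup = cacheLookup;
    }

    // Splits "left=right" at the first '=' and trims both sides.
    static WhereAttributeSketch parse(String where) {
        int eq = where.indexOf('=');
        if (eq <= 0 || eq == where.length() - 1) {
            throw new IllegalArgumentException("expected 'left=right' but got: " + where);
        }
        return new WhereAttributeSketch(
                where.substring(0, eq).trim(),
                where.substring(eq + 1).trim());
    }

    public static void main(String[] args) {
        WhereAttributeSketch w = parse("xid=x.id");
        System.out.println(w.cachePk + " / " + w.cacheLookup);
    }
}
```

If the left side is never stored, a cache that falls back to "first column is the key" only works while the key column happens to come first, which is the coincidence the randomized map test is meant to break.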
[jira] [Updated] (LUCENE-3109) Rename FieldsConsumer to InvertedFieldsConsumer
[ https://issues.apache.org/jira/browse/LUCENE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Iulius Curt updated LUCENE-3109:
Attachment: LUCENE-3109.patch
Attached a patch with the refactoring of Fields, FieldsProducer, FieldsConsumer and any other related classes. It turned out to be fairly large (it also affected Solr). Please give some feedback if something is wrong.
Rename FieldsConsumer to InvertedFieldsConsumer
Key: LUCENE-3109
URL: https://issues.apache.org/jira/browse/LUCENE-3109
Project: Lucene - Java
Issue Type: Task
Components: core/codecs
Affects Versions: 4.0
Reporter: Simon Willnauer
Priority: Minor
Fix For: 4.0
Attachments: LUCENE-3109.patch
The name FieldsConsumer is misleading here: it really is an InvertedFieldsConsumer, and since we are extending codecs to consume non-inverted Fields we should be clear here. The same applies to Fields.java as well as FieldsProducer.
[jira] [Commented] (LUCENE-3750) Convert Versioned docs to Markdown/New CMS
[ https://issues.apache.org/jira/browse/LUCENE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209655#comment-13209655 ]
Yonik Seeley commented on LUCENE-3750:
Heh - and I just tried the bookmarklet, and it crashed Chrome as soon as I tried to do an edit.
Convert Versioned docs to Markdown/New CMS
Key: LUCENE-3750
URL: https://issues.apache.org/jira/browse/LUCENE-3750
Project: Lucene - Java
Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor
Since we are moving our main site to the ASF CMS (LUCENE-2748), we should bring in any new versioned Lucene docs into the same format so that we don't have to deal w/ Forrest anymore.
[jira] [Updated] (SOLR-3138) Add node roles to core admin handler 'create core' and solrj.
[ https://issues.apache.org/jira/browse/SOLR-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller updated SOLR-3138:
Attachment: SOLR-3138.patch
Simple patch - I'll commit it shortly.
Add node roles to core admin handler 'create core' and solrj.
Key: SOLR-3138
URL: https://issues.apache.org/jira/browse/SOLR-3138
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
Fix For: 4.0
Attachments: SOLR-3138.patch
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12437 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12437/ 5 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.ChaosMonkeySafeLeaderTest Error Message: ERROR: SolrIndexSearcher opens=39 closes=11 Stack Trace: junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=39 closes=11 at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:152) at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76) REGRESSION: org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.testDistribSearch Error Message: null Stack Trace: org.apache.solr.common.cloud.ZooKeeperException: at org.apache.solr.client.solrj.impl.CloudSolrServer.connect(CloudSolrServer.java:123) at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:133) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:104) at org.apache.solr.cloud.FullSolrCloudTest.indexDoc(FullSolrCloudTest.java:464) at org.apache.solr.BaseDistributedSearchTestCase.indexr(BaseDistributedSearchTestCase.java:283) at org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.addUpdateDelete(FullSolrCloudDistribCmdsTest.java:200) at org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.doTest(FullSolrCloudDistribCmdsTest.java:74) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:664) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599) at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) 
Caused by: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 127.0.0.1:27995/solr within 1 ms at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:129) at org.apache.solr.common.cloud.SolrZkClient.init(SolrZkClient.java:142) at org.apache.solr.common.cloud.SolrZkClient.init(SolrZkClient.java:90) at org.apache.solr.common.cloud.ZkStateReader.init(ZkStateReader.java:137) at org.apache.solr.client.solrj.impl.CloudSolrServer.connect(CloudSolrServer.java:108) REGRESSION: org.apache.solr.cloud.OverseerTest.testDoubleAssignment Error Message: KeeperErrorCode = NoNode for /solr/node_states/localhost:89825_solr Stack Trace: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /solr/node_states/localhost:89825_solr at org.apache.zookeeper.KeeperException.create(KeeperException.java:102) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:734) at org.apache.solr.common.cloud.SolrZkClient$2.execute(SolrZkClient.java:166) at org.apache.solr.common.cloud.SolrZkClient$2.execute(SolrZkClient.java:163) at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65) at org.apache.solr.common.cloud.SolrZkClient.delete(SolrZkClient.java:163) at org.apache.solr.cloud.AbstractZkTestCase.tryCleanPath(AbstractZkTestCase.java:152) at org.apache.solr.cloud.AbstractZkTestCase.tryCleanPath(AbstractZkTestCase.java:150) at org.apache.solr.cloud.AbstractZkTestCase.tryCleanPath(AbstractZkTestCase.java:150) at org.apache.solr.cloud.AbstractZkTestCase.tryCleanSolrZkNode(AbstractZkTestCase.java:142) at org.apache.solr.cloud.OverseerTest.testDoubleAssignment(OverseerTest.java:554) at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700) at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599) at 
org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) FAILED: junit.framework.TestSuite.org.apache.solr.cloud.OverseerTest Error Message: ERROR: SolrZkClient opens=209 closes=206 Stack Trace: junit.framework.AssertionFailedError: ERROR: SolrZkClient opens=209 closes=206 at
[jira] [Commented] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager
[ https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209693#comment-13209693 ]
Michael McCandless commented on LUCENE-3776:
Thanks Shai.
bq. with these changes, if the app passes an IndexReader that is not DirectoryReader, it will get ClassCastException (if asserts are disabled).
Hang on -- SM now takes either IW or Directory, from which we always pull a DirectoryReader, right (we call DR.open ourselves)? Won't we always have a DR in SM...? Hmm, or do you mean the SearcherFactory could make some other reader...? Hmm maybe we should have a hard check for that (SearcherFactory shouldn't do that...?)
bq. About close() – do you think it'll be better to keep close() final, and introduce a new protected closeResource()/closeInternal() that NRTManager can override? That way, RefManagers won't accidentally override close() and forget to call super.close()?
Good idea... I'll add afterClose (matches afterRefresh).
bq. About afterRefresh() – I'll admit that first I didn't understand why you need it. Previously, it was used to warm an IndexSearcher, but now we say it's the responsibility of SearcherFactory. I can see why it's useful for NRTManager, and it might even help me in LUCENE-3793 ! Do you think that we should declare that it can throw IOE? I know that if I'll use it in LUCENE-3793, I'll need that and I'd hate to throw RuntimeException. NRTManager can still override and not declare that. I'm just thinking that since almost all methods declare throwing IOE, it won't be odd if we declare it too on afterRefresh(), and it's not unlikely that afterRefresh() will do something that throws exceptions.
Good, I'll add throws IOE.
{quote} About openIfNeeded: Can you cast to DirectoryReader once? {quote}
Will do.
bq. I don't know if the assert is better than a ClassCastException ... with how the code is written, ClassCastException is better than assert because at least it will tell the user what went wrong?
I *think* there's no way a non-DirReader can get into NRTManager (like SM), except for SearcherFactory.
bq. How critical it is to declare newSearcher final? If you didn't, you could init it to null, and only change if newReader != null. Saving 4 lines of code (improves readability IMO – something that I know you care about ).
Not critical! Good idea...
NRTManager shouldn't expose its private SearcherManager
Key: LUCENE-3776
URL: https://issues.apache.org/jira/browse/LUCENE-3776
Project: Lucene - Java
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Blocker
Fix For: 3.6, 4.0
Attachments: LUCENE-3776.patch
Spinoff from LUCENE-3769. To actually obtain an IndexSearcher from NRTManager, it's a 2-step process now. You must .getSearcherManager(), then .acquire() from the returned SearcherManager. This is very trappy... because if the app incorrectly calls maybeReopen on that private SearcherManager (instead of NRTManager.maybeReopen) then it can unexpectedly cause threads to block forever, waiting for the necessary gen to become visible. This will be hard to debug... I don't like creating trappy APIs. Hopefully once LUCENE-3761 is in, we can fix NRTManager to no longer expose its private SM, instead subclassing ReferenceManager. Or alternatively, or in addition, maybe we factor out a new interface (SearcherProvider or something...) that only has acquire and release methods, and both NRTManager and ReferenceManager/SM impl that, and we keep NRTManager's SM private.
-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
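Editor's note: the "final close() plus afterClose() hook" idea discussed in the comment above is the template-method pattern. A minimal sketch follows, assuming hypothetical class names (ManagerSketch, NrtManagerSketch) that only stand in for ReferenceManager and NRTManager. It is not the code in the attached patch.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical stand-in for a reference manager base class.
abstract class ManagerSketch implements Closeable {
    private boolean closed;
    int baseCleanups; // counts how often the base cleanup ran

    // final: subclasses cannot override close() and forget the base cleanup.
    @Override
    public final synchronized void close() throws IOException {
        if (closed) {
            return; // closing twice is a no-op
        }
        closed = true;
        baseCleanups++; // stands in for releasing the current reference
        afterClose();   // subclass hook runs after the base cleanup
    }

    // Hook for subclasses; declared to throw IOException so overrides may.
    protected void afterClose() throws IOException {
    }
}

// Hypothetical subclass playing the role of NRTManager.
class NrtManagerSketch extends ManagerSketch {
    int waitersReleased;

    // An override is free to drop the throws clause entirely.
    @Override
    protected void afterClose() {
        waitersReleased++; // stands in for waking threads waiting on a generation
    }

    public static void main(String[] args) throws IOException {
        NrtManagerSketch manager = new NrtManagerSketch();
        manager.close();
        manager.close();
        System.out.println(manager.baseCleanups + " " + manager.waitersReleased);
    }
}
```

The same shape applies to the afterRefresh() point in the thread: declaring the exception on the hook costs subclasses nothing, since an override may declare fewer checked exceptions than the method it overrides.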
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209697#comment-13209697 ]
Robert Muir commented on LUCENE-3792:
Hossman, I think KEYWORD_ANALYZED is the ideal name for 3.x actually. I think in combination with the javadocs it would be more clear. This still leaves the question for trunk (currently StringField): positives are that it's actually a nice name, concise and to the point. Another positive is that StringField omits things like positions, and in trunk we don't silently fail if you form a phrase from this. One negative is that both StringField and TextField confusingly take String in their ctors (I've chosen the wrong one myself before by accident). Basically to me, this is a combination of traps. Trunk is somewhat better because it throws exceptions for positional queries if you actually excluded positions... in all cases in 3.x, the wrong 'configuration' here creates a situation where the user just 'does not get results' and they have no idea why... despite the fact they used the same Analyzer at query-time and index-time like a good user. That's what I find so frustrating.
Remove StringField
Key: LUCENE-3792
URL: https://issues.apache.org/jira/browse/LUCENE-3792
Project: Lucene - Java
Issue Type: Task
Affects Versions: 4.0
Reporter: Robert Muir
Fix For: 4.0
Attachments: LUCENE-3792_javadocs_3x.patch, LUCENE-3792_javadocs_3x.patch
Often on the mailing list there is confusion about NOT_ANALYZED. Besides being useless (just use KeywordAnalyzer instead), people trip up on this not being consistent at query time (you really need to configure KeywordAnalyzer for the field on your PerFieldAnalyzerWrapper so it will do the same thing at query time... oh wait, once you've done that, you don't need NOT_ANALYZED). So I think StringField is a trap too for the same reasons, just under a different name; let's remove it.
-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
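Editor's note: the "user just does not get results" trap described above can be shown with a toy model that uses no Lucene classes at all. The two analyze methods below are hypothetical stand-ins for a lowercasing text analyzer and for keyword (not analyzed) handling. This only illustrates the index-time versus query-time mismatch, not how Lucene is implemented.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of index-time vs query-time analysis; no Lucene classes involved.
class AnalysisMismatchSketch {

    // Stand-in for a typical text analyzer: lowercase, split on whitespace.
    static List<String> analyzeText(String value) {
        return Arrays.asList(value.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // Stand-in for keyword analysis: the whole value is one token, unchanged.
    static List<String> analyzeKeyword(String value) {
        return Collections.singletonList(value);
    }

    // A query matches if every query token is an indexed term.
    static boolean matches(Set<String> indexedTerms, List<String> queryTokens) {
        return indexedTerms.containsAll(queryTokens);
    }

    public static void main(String[] args) {
        // The field was indexed without analysis, so the only term is the raw value.
        Set<String> indexed = new HashSet<>(analyzeKeyword("Part ABC-123"));

        // Query analyzed as text: tokens "part" and "abc-123" are not in the index.
        System.out.println(matches(indexed, analyzeText("Part ABC-123")));

        // Query analyzed the same way as the index: the single raw term is found.
        System.out.println(matches(indexed, analyzeKeyword("Part ABC-123")));
    }
}
```

Nothing fails loudly in the first case; the query simply returns nothing, which is the silent failure the comment objects to.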
[jira] [Updated] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager
[ https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-3776:
Attachment: LUCENE-3776.patch
New patch folding in Shai's suggestions (thanks!). I didn't yet add a hard check for an evil SearcherFactory...
NRTManager shouldn't expose its private SearcherManager
Key: LUCENE-3776
URL: https://issues.apache.org/jira/browse/LUCENE-3776
Project: Lucene - Java
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Blocker
Fix For: 3.6, 4.0
Attachments: LUCENE-3776.patch, LUCENE-3776.patch
Spinoff from LUCENE-3769. To actually obtain an IndexSearcher from NRTManager, it's a 2-step process now. You must .getSearcherManager(), then .acquire() from the returned SearcherManager. This is very trappy... because if the app incorrectly calls maybeReopen on that private SearcherManager (instead of NRTManager.maybeReopen) then it can unexpectedly cause threads to block forever, waiting for the necessary gen to become visible. This will be hard to debug... I don't like creating trappy APIs. Hopefully once LUCENE-3761 is in, we can fix NRTManager to no longer expose its private SM, instead subclassing ReferenceManager. Or alternatively, or in addition, maybe we factor out a new interface (SearcherProvider or something...) that only has acquire and release methods, and both NRTManager and ReferenceManager/SM impl that, and we keep NRTManager's SM private.
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209707#comment-13209707 ]
Robert Muir commented on LUCENE-3795:
Simon, do we really need a code grant here? It's my understanding (correct me if I am wrong) that the developers involved (David, Ryan, Chris) are all committers with an iCLA on file, so is it really any different from any other patch from that perspective? As far as LGPL goes, according to David's description and the title of this JIRA issue (possibly I did not interpret it correctly; correct me if so), he wants to replace lucene/contrib/spatial with the spatial-lucene project, which has no LGPL ties at all (only spatial-extras does). Without looking at any code myself, if that's really the case I'm +1 on principle, because it means we basically have an improved spatial module for Lucene core with no catch at all. The current code has not seen much maintenance. (And I agree, we should be shooting for a proper module/ here, not a contrib.)
Replace spatial contrib module with LSP's spatial-lucene module
Key: LUCENE-3795
URL: https://issues.apache.org/jira/browse/LUCENE-3795
Project: Lucene - Java
Issue Type: New Feature
Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
Fix For: 4.0
I propose that Lucene's spatial contrib module be replaced with the spatial-lucene module within Lucene Spatial Playground (LSP). LSP has been in development for approximately 1 year by David Smiley, Ryan McKinley, and Chris Male and we feel it is ready. LSP is here: http://code.google.com/p/lucene-spatial-playground/ and the spatial-lucene module is intuitively in svn/trunk/spatial-lucene/. I'll add more comments to prevent the issue description from being too long.
[jira] [Updated] (LUCENE-3777) trapping overloaded ctors/setters in Field/NumericField/DocValuesField
[ https://issues.apache.org/jira/browse/LUCENE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-3777:
Attachment: LUCENE-3777.patch
Patch, splitting NF into Int/Long/Float/DoubleField, and changing Field.setValue(T value) -> Field.setTValue(T value). Tests pass... I'd like to commit this first (big, rote patch, will conflict soon) and then do DocValuesField separately...
trapping overloaded ctors/setters in Field/NumericField/DocValuesField
Key: LUCENE-3777
URL: https://issues.apache.org/jira/browse/LUCENE-3777
Project: Lucene - Java
Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Michael McCandless
Priority: Blocker
Attachments: LUCENE-3777.patch
In trunk, these apis let you easily create a field, but my concern is this:
{code}
public NumericField(String name, int value)
public NumericField(String name, long value)
..
public Field(String name, int value, FieldType type)
public Field(String name, long value, FieldType type)
..
public void setValue(int value)
public void setValue(long value)
..
public DocValuesField(String name, int value, DocValues.Type docValueType)
public DocValuesField(String name, long value, DocValues.Type docValueType)
{code}
I really don't like overloaded ctors/setters where the compiler can type-promote you; I think it makes the apis hard to use. Instead, for the setters I think we should have setIntValue, setLongValue, ... For the ctors, I see two other options:
# factories like DocValuesField.newIntField()
# subclasses like IntField
I don't have any patch for this, but I think we should discuss and fix before these apis are released.
-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
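Editor's note: the type-promotion trap the issue describes is plain Java overload resolution, and it can be reproduced without Lucene. The sketch below uses a hypothetical OverloadTrapSketch class whose setValue overloads simply report which one the compiler picked.

```java
// Hypothetical class; each overload reports which one the compiler selected.
class OverloadTrapSketch {

    static String setValue(int value) {
        return "int";
    }

    static String setValue(long value) {
        return "long";
    }

    static String setValue(float value) {
        return "float";
    }

    static String setValue(double value) {
        return "double";
    }

    public static void main(String[] args) {
        int count = 42;
        System.out.println(setValue(count));       // int argument selects the int overload
        System.out.println(setValue(count * 2L));  // int * long is a long expression
        System.out.println(setValue(count / 2f));  // int / float is a float expression
        System.out.println(setValue(count * 1.0)); // int * double is a double expression
        System.out.println(setValue('a'));         // char widens to int
    }
}
```

A small change to an arithmetic expression silently changes which overload runs, and therefore which field type would be stored. With explicitly named setters such as the proposed setIntValue and setLongValue, a call like setIntValue(count * 2L) would fail to compile instead of quietly switching type.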
[jira] [Resolved] (SOLR-3138) Add node roles to core admin handler 'create core' and solrj.
[ https://issues.apache.org/jira/browse/SOLR-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-3138. --- Resolution: Fixed Add node roles to core admin handler 'create core' and solrj. - Key: SOLR-3138 URL: https://issues.apache.org/jira/browse/SOLR-3138 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.0 Attachments: SOLR-3138.patch
[jira] [Commented] (LUCENE-3750) Convert Versioned docs to Markdown/New CMS
[ https://issues.apache.org/jira/browse/LUCENE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209794#comment-13209794 ]
Mark Miller commented on LUCENE-3750:
I'm in a similar boat with site updates at the moment - while I struggled through the setup in the past, I had things working smoothly at one point - but since I've updated recently, I can no longer build the site - I get two errors about the JIRA additions in the sidebar.
{noformat}
. at /Users/markrmiller/Workspaces/lucid/cms/lib/view.pm line 46 File content/solr/index.mdtext had processing errors: Error while rendering output to string get http://s.apache.org/solrjira failed.
. at /Users/markrmiller/Workspaces/lucid/cms/lib/view.pm line 46 File content/core/index.mdtext had processing errors: Error while rendering output to string get http://s.apache.org/corejira failed.
{noformat}
Convert Versioned docs to Markdown/New CMS
Key: LUCENE-3750
URL: https://issues.apache.org/jira/browse/LUCENE-3750
Project: Lucene - Java
Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor
Since we are moving our main site to the ASF CMS (LUCENE-2748), we should bring in any new versioned Lucene docs into the same format so that we don't have to deal w/ Forrest anymore.
[jira] [Commented] (SOLR-2907) java.lang.IllegalArgumentException: deltaQuery has no column to resolve to declared primary key pk='ITEM_ID, CATEGORY_ID'
[ https://issues.apache.org/jira/browse/SOLR-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209845#comment-13209845 ]
Adam Lane commented on SOLR-2907:
Upgraded to 3.5 and confirmed same problem.
java.lang.IllegalArgumentException: deltaQuery has no column to resolve to declared primary key pk='ITEM_ID, CATEGORY_ID'
Key: SOLR-2907
URL: https://issues.apache.org/jira/browse/SOLR-2907
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler, Schema and Analysis
Affects Versions: 3.4
Reporter: Alan Baker
We are using Solr for our site and ran into this error in our own schema, and I was able to reproduce it using the dataimport example code in the Solr project. We do not get this error in Solr 1.4; we only started seeing it as we are working to upgrade to 3.4.0. It fails when delta-importing linked tables. Complete trace:
Nov 18, 2011 5:21:02 PM org.apache.solr.handler.dataimport.DataImporter doDeltaImport
SEVERE: Delta Import Failed
java.lang.IllegalArgumentException: deltaQuery has no column to resolve to declared primary key pk='ITEM_ID, CATEGORY_ID'
    at org.apache.solr.handler.dataimport.DocBuilder.findMatchingPkColumn(DocBuilder.java:849)
    at org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:900)
    at org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:879)
    at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:285)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:179)
    at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:390)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:429)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
I used this dataConfig from the wiki on the data import:
    <dataConfig>
      <dataSource driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:./example-DIH/hsqldb/ex" user="sa" />
      <document>
        <entity name="item" pk="ID"
                query="select * from item"
                deltaImportQuery="select * from item where ID=='${dataimporter.delta.id}'"
                deltaQuery="select id from item where last_modified &gt; '${dataimporter.last_index_time}'">
          <entity name="item_category" pk="ITEM_ID, CATEGORY_ID"
                  query="select CATEGORY_ID from item_category where ITEM_ID='${item.ID}'"
                  deltaQuery="select ITEM_ID, CATEGORY_ID from item_category where last_modified &gt; '${dataimporter.last_index_time}'"
                  parentDeltaQuery="select ID from item where ID=${item_category.ITEM_ID}">
            <entity name="category" pk="ID"
                    query="select DESCRIPTION as cat from category where ID = '${item_category.CATEGORY_ID}'"
                    deltaQuery="select ID from category where last_modified &gt; '${dataimporter.last_index_time}'"
                    parentDeltaQuery="select ITEM_ID, CATEGORY_ID from item_category where CATEGORY_ID=${category.ID}"/>
          </entity>
        </entity>
      </document>
    </dataConfig>
To reproduce, use the data config from above and set the dataimport.properties last update times to before the last_modified date in the example data. In my case I had to set the year to 1969. Then run a delta-import and the exception occurs. Thanks.
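Editor's note: one way to read the exception is that the declared pk is treated as a single column name, so the composite string 'ITEM_ID, CATEGORY_ID' matches neither column returned by the deltaQuery. The sketch below is a simplified model of that reading. It is an assumption for illustration, not the logic of DocBuilder.findMatchingPkColumn, and the class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of resolving a declared pk against a delta row.
// This is not Solr's DocBuilder code; it only illustrates the failure mode.
class PkResolutionSketch {

    // Returns the row column whose name equals the declared pk (ignoring case),
    // or throws if no single column matches.
    static String resolve(String declaredPk, Map<String, Object> deltaRow) {
        for (String column : deltaRow.keySet()) {
            if (column.equalsIgnoreCase(declaredPk)) {
                return column;
            }
        }
        throw new IllegalArgumentException(
                "deltaQuery has no column to resolve to declared primary key pk='"
                        + declaredPk + "'");
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ITEM_ID", 1);
        row.put("CATEGORY_ID", 7);

        System.out.println(resolve("ITEM_ID", row)); // a single-column pk resolves

        try {
            resolve("ITEM_ID, CATEGORY_ID", row);    // the composite string matches no column
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, it would explain why the same config worked before a stricter pk check was introduced, but the report itself does not confirm the cause.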
[jira] [Resolved] (SOLR-3137) When solr.xml is persisted, you lose all system property substitution that was used.
[ https://issues.apache.org/jira/browse/SOLR-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Miller resolved SOLR-3137.
Resolution: Fixed
When solr.xml is persisted, you lose all system property substitution that was used.
Key: SOLR-3137
URL: https://issues.apache.org/jira/browse/SOLR-3137
Project: Solr
Issue Type: Bug
Reporter: Mark Miller
Assignee: Mark Miller
Fix For: 4.0
Attachments: SOLR-3137.patch, SOLR-3137.patch
A lesser issue is that we also write out properties that were not originally in the file with the defaults they picked up.
[jira] [Resolved] (SOLR-2354) CoreAdminRequest#createCore should allow you to specify the data dir
[ https://issues.apache.org/jira/browse/SOLR-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-2354. --- Resolution: Invalid CoreAdminRequest#createCore should allow you to specify the data dir Key: SOLR-2354 URL: https://issues.apache.org/jira/browse/SOLR-2354 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 4.0
[jira] [Commented] (SOLR-3131) details command fails when a replication is forced with a fetchIndex command on a non-slave server
[ https://issues.apache.org/jira/browse/SOLR-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209847#comment-13209847 ]
Mark Miller commented on SOLR-3131:
Committed to trunk - I'll add changes and back port to 3.6 as well.
details command fails when a replication is forced with a fetchIndex command on a non-slave server
Key: SOLR-3131
URL: https://issues.apache.org/jira/browse/SOLR-3131
Project: Solr
Issue Type: Bug
Components: replication (java)
Affects Versions: 3.5
Reporter: Tomás Fernández Löbbe
Assignee: Mark Miller
Priority: Minor
Fix For: 3.6, 4.0
Attachments: SOLR-3131.patch
Steps to reproduce the problem:
1) Start a master Solr instance (called A)
2) Start a Solr instance with replication handler configured, but with no slave configuration (called B)
3) Issue the request http://B:port/solr/replication?command=fetchindex&masterUrl=http://A:port/solr/replication
4) While B is fetching the index, issue the request: http://B:port/solr/replication?command=details
Expected behavior: See the replication details as usual.
Getting an exception instead:

java.lang.NullPointerException
    at org.apache.solr.handler.ReplicationHandler.isPollingDisabled(ReplicationHandler.java:447)
    at org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:611)
    at org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:211)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1523)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:234)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3131) details command fails when a replication is forced with a fetchIndex command on a non-slave server
[ https://issues.apache.org/jira/browse/SOLR-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-3131:

Affects Version/s: (was: 4.0) 3.5
Fix Version/s: 4.0 3.6

details command fails when a replication is forced with a fetchIndex command on a non-slave server

Key: SOLR-3131
URL: https://issues.apache.org/jira/browse/SOLR-3131
Project: Solr
Issue Type: Bug
Components: replication (java)
Affects Versions: 3.5
Reporter: Tomás Fernández Löbbe
Assignee: Mark Miller
Priority: Minor
Fix For: 3.6, 4.0
Attachments: SOLR-3131.patch

Steps to reproduce the problem:
1) Start a master Solr instance (called A)
2) Start a Solr instance with replication handler configured, but with no slave configuration. (called B)
3) Issue the request http://B:port/solr/replication?command=fetchindex&masterUrl=http://A:port/solr/replication
4) While B is fetching the index, issue the request: http://B:port/solr/replication?command=details

Expected behavior: See the replication details as usual.

Getting an exception instead:

java.lang.NullPointerException
    at org.apache.solr.handler.ReplicationHandler.isPollingDisabled(ReplicationHandler.java:447)
    at org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:611)
    at org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:211)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1523)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:234)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
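For anyone scripting the reproduction, steps 3 and 4 of SOLR-3131 can be sketched as two URL builders. This is only an illustration: the class and method names are made up, port 8983 is an assumed placeholder for "port", and it assumes the two query parameters are separated by `&` with masterUrl percent-encoded as a query value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of Solr: builds the two requests from the report.
class ReplicationUrlSketch {

    /** Step 3: ask instance B to fetch the index from master A. */
    static String fetchIndexUrl(String targetBase, String masterBase) {
        String masterUrl = masterBase + "/replication";
        try {
            // Encode masterUrl so its own ':' and '/' cannot be misread as part of the outer URL.
            return targetBase + "/replication?command=fetchindex&masterUrl="
                    + URLEncoder.encode(masterUrl, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Step 4: the details request that fails while B is still fetching. */
    static String detailsUrl(String targetBase) {
        return targetBase + "/replication?command=details";
    }

    public static void main(String[] args) {
        String a = "http://A:8983/solr"; // assumed placeholder host and port
        String b = "http://B:8983/solr"; // assumed placeholder host and port
        System.out.println(fetchIndexUrl(b, a));
        System.out.println(detailsUrl(b));
    }
}
```

Issuing the second URL while the first request is still copying files is what the report says triggers the NullPointerException.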
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209862#comment-13209862 ]

Jan Høydahl commented on LUCENE-3795:

Impressive piece of work! Given license stuff is ok, here is my +1

Replace spatial contrib module with LSP's spatial-lucene module

Key: LUCENE-3795
URL: https://issues.apache.org/jira/browse/LUCENE-3795
Project: Lucene - Java
Issue Type: New Feature
Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
Fix For: 4.0

I propose that Lucene's spatial contrib module be replaced with the spatial-lucene module within Lucene Spatial Playground (LSP). LSP has been in development for approximately 1 year by David Smiley, Ryan McKinley, and Chris Male and we feel it is ready. LSP is here: http://code.google.com/p/lucene-spatial-playground/ and the spatial-lucene module is intuitively in svn/trunk/spatial-lucene/. I'll add more comments to prevent the issue description from being too long.
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209888#comment-13209888 ]

Uwe Schindler commented on LUCENE-3795:

Cool work! I scanned the code quickly and it seems to fit much better than the current spatial! I have some suggestions regarding performance, BooleanQuery usage, and the related inconsistency with BQ scoring (with coord) in the different strategies. I also found some caching problems (the AtomicReader is the cache key, not AtomicReader.getCoreCacheKey, so new deleted docs after reopen invalidate the cache), but I would prefer to discuss that here once the patch is provided on Lucene's JIRA.
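The caching remark in Uwe's comment can be illustrated with a toy model. The sketch below does not use Lucene at all: SegmentView and ShapeCache are hypothetical stand-ins, and the only point is that a cache keyed on a stable per-segment key survives a reopen that merely changes deletes, whereas a cache keyed on the reader instance would reload.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Toy model only; none of these types exist in Lucene.
class CoreKeyCacheSketch {

    /** Hypothetical stand-in for a per-segment reader. */
    static final class SegmentView {
        final Object coreKey;  // shared by every view over the same segment data
        final int deletedDocs; // differs after a reopen that picked up new deletes

        SegmentView(Object coreKey, int deletedDocs) {
            this.coreKey = coreKey;
            this.deletedDocs = deletedDocs;
        }
    }

    /** Caches per-segment values under the stable core key, not the view instance. */
    static final class ShapeCache {
        private final Map<Object, double[]> byCoreKey = new WeakHashMap<Object, double[]>();
        int loads = 0; // counts how often the expensive load ran

        synchronized double[] get(SegmentView view) {
            double[] values = byCoreKey.get(view.coreKey);
            if (values == null) {
                loads++;
                values = new double[] {1.0, 2.0}; // placeholder for an expensive load
                byCoreKey.put(view.coreKey, values);
            }
            return values;
        }
    }

    public static void main(String[] args) {
        Object core = new Object();
        ShapeCache cache = new ShapeCache();
        cache.get(new SegmentView(core, 0));
        cache.get(new SegmentView(core, 3)); // reopened view over the same segment data
        System.out.println("loads = " + cache.loads);
    }
}
```

Keying the map on the SegmentView itself would make the second lookup miss, which is the invalidation Uwe describes.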
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209966#comment-13209966 ]

Chris Male commented on LUCENE-3795:

Huge +1

Thanks so much David for opening this issue and getting the code to a point where it can be contributed. I'm really excited to see this brought into the fold and glad to see support from others.

{quote}
As far as LGPL, according to David's description and the title of this jira issue (possible i did not interpret it correctly, correct me if so), he wants to replace lucene/contrib/spatial with the spatial-lucene project, and that it has no LGPL ties at all, (only spatial-extras does).
{quote}

Absolutely. The portion of the codebase which uses LGPL code is entirely optional and decoupled from the rest of the code. From a functional perspective, as David says, it's only really related to polygon support, which is hugely powerful but can exist somewhere else if need be.
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209971#comment-13209971 ]

Chris Male commented on LUCENE-3795:

{quote}
but I would prefer to discuss that here once the patch is provided on Lucene's JIRA.
{quote}

Is it best to create a patch here and iterate on any problems, or create a branch and work through them there?
[jira] [Commented] (SOLR-2332) TikaEntityProcessor retrieves only File Names from Zip extraction
[ https://issues.apache.org/jira/browse/SOLR-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210033#comment-13210033 ]

Lance Norskog commented on SOLR-2332:

Unpacking a zip file is a very narrow, focused operation. This could also be done with a separate UpdateRequestHandler that does nothing but unpack zip files. It would use the basic JDK zip file code, not Tika. You configure the Tika handler beneath it.

Another use case is a ZIP file full of solr update xml files, which TIKA does not know about. To do this, you want an UpdateRequestHandler stack like this: zip unpacker - XmlUpdateRequestHandler

TikaEntityProcessor retrieves only File Names from Zip extraction

Key: SOLR-2332
URL: https://issues.apache.org/jira/browse/SOLR-2332
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Reporter: Jayendra Patil
Fix For: 3.6, 4.0
Attachments: SOLR-2332.patch, solr-word.zip

Extraction of Zip files using TikaEntityProcessor results in only the names of the files. It does not extract the contents of the files in the Zip.
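The "basic JDK zip file code" Lance mentions could look roughly like the following. This is a minimal sketch using only java.util.zip; the class and method names are hypothetical, it reads everything into memory for simplicity, and how the unpacked entries would be handed to a downstream handler is deliberately left out because the thread does not specify it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: unpacks a zip stream with the JDK only, no Tika.
class ZipUnpackSketch {

    /** Reads every non-directory entry into memory, keyed by entry name, in archive order. */
    static Map<String, byte[]> unpack(InputStream in) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<String, byte[]>();
        ZipInputStream zip = new ZipInputStream(in);
        try {
            byte[] buf = new byte[8192];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(buf)) != -1) { // -1 marks the end of the current entry
                    out.write(buf, 0, n);
                }
                entries.put(entry.getName(), out.toByteArray());
            }
        } finally {
            zip.close();
        }
        return entries;
    }

    /** Builds a two-entry zip in memory so the demo needs no files on disk. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ZipOutputStream zip = new ZipOutputStream(bytes);
        zip.putNextEntry(new ZipEntry("a.xml"));
        zip.write("<add/>".getBytes("UTF-8"));
        zip.closeEntry();
        zip.putNextEntry(new ZipEntry("b.txt"));
        zip.write("hello".getBytes("UTF-8"));
        zip.closeEntry();
        zip.close();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> entries = unpack(new ByteArrayInputStream(sampleZip()));
        for (Map.Entry<String, byte[]> e : entries.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().length + " bytes");
        }
    }
}
```

In a handler stack like the one described, each entry's bytes would then be passed to whatever sits beneath the unpacker; a real implementation would likely stream entries rather than buffer them.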
[jira] [Commented] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager
[ https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210064#comment-13210064 ]

Shai Erera commented on LUCENE-3776:

bq. Hang on – SM now takes either IW or Director

You're right, I missed that. For some reason I had the impression it takes an IR, which is obviously wrong, since it won't be allowed to close it.

bq. do you mean the SearcherFactory could make some other reader

I'm less worried about that. We give SF an IndexReader, I can only expect that it will return an IndexSearcher on top of it. Maybe we can assert that IndexSearcher.getIndexReader == newReader in refreshIfNeeded?

bq. I think there's no way a non-DirReader can get into NRTManager

You're right. If you keep the assert, maybe add a nice msg to it?

bq. I didn't yet add a hard check for an evil SearcherFactory...

I think that's ok to assume that SearcherFactory is not evil. Maybe the assert I suggested above would be enough?

NRTManager shouldn't expose its private SearcherManager

Key: LUCENE-3776
URL: https://issues.apache.org/jira/browse/LUCENE-3776
Project: Lucene - Java
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Blocker
Fix For: 3.6, 4.0
Attachments: LUCENE-3776.patch, LUCENE-3776.patch

Spinoff from LUCENE-3769.

To actually obtain an IndexSearcher from NRTManager, it's a 2-step process now. You must .getSearcherManager(), then .acquire() from the returned SearcherManager. This is very trappy... because if the app incorrectly calls maybeReopen on that private SearcherManager (instead of NRTManager.maybeReopen) then it can unexpectedly cause threads to block forever, waiting for the necessary gen to become visible. This will be hard to debug... I don't like creating trappy APIs.

Hopefully once LUCENE-3761 is in, we can fix NRTManager to no longer expose its private SM, instead subclassing ReferenceManager. Or alternatively, or in addition, maybe we factor out a new interface (SearcherProvider or something...) that only has acquire and release methods, and both NRTManager and ReferenceManager/SM impl that, and we keep NRTManager's SM private.
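The "SearcherProvider or something" idea from the issue description can be sketched without Lucene. Everything below is hypothetical: FakeSearcher, InnerManager and OuterManager are toy stand-ins, and the generation counter only gestures at the bookkeeping the issue says must not be bypassed. The sketch shows the shape of the API: callers see acquire and release, while the inner manager and its reopen method stay private.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of the proposed narrow interface; not Lucene's actual classes.
class SearcherProviderSketch {

    /** The narrow interface floated in the issue: acquire and release only. */
    interface SearcherProvider<S> {
        S acquire();
        void release(S searcher);
    }

    /** Stand-in for a searcher; tracks outstanding references. */
    static final class FakeSearcher {
        final AtomicInteger refs = new AtomicInteger();
    }

    /** Stand-in for the inner manager, which also has its own reopen method. */
    static final class InnerManager implements SearcherProvider<FakeSearcher> {
        private volatile FakeSearcher current = new FakeSearcher();

        public FakeSearcher acquire() {
            FakeSearcher s = current;
            s.refs.incrementAndGet();
            return s;
        }

        public void release(FakeSearcher s) {
            s.refs.decrementAndGet();
        }

        void maybeReopen() {
            current = new FakeSearcher();
        }
    }

    /** Outer manager: the inner one is never handed out, so its reopen cannot be called directly. */
    static final class OuterManager implements SearcherProvider<FakeSearcher> {
        private final InnerManager inner = new InnerManager();
        private long generation = 0;

        public FakeSearcher acquire() {
            return inner.acquire();
        }

        public void release(FakeSearcher s) {
            inner.release(s);
        }

        /** The only reopen path, so the generation bookkeeping always runs. */
        synchronized long maybeReopen() {
            inner.maybeReopen();
            return ++generation;
        }
    }

    public static void main(String[] args) {
        OuterManager manager = new OuterManager();
        FakeSearcher s = manager.acquire();
        try {
            System.out.println("refs while held: " + s.refs.get());
        } finally {
            manager.release(s);
        }
        System.out.println("generation after reopen: " + manager.maybeReopen());
    }
}
```

With this shape, the one-step acquire replaces the getSearcherManager().acquire() sequence that the issue calls trappy.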
[jira] [Commented] (SOLR-3096) Add book information to the new website
[ https://issues.apache.org/jira/browse/SOLR-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210066#comment-13210066 ]

David Smiley commented on SOLR-3096:

I committed this just now. It'd be nice to have fellow committers comment on the results before it winds up getting published from staging. http://lucene.staging.apache.org/solr/books.html

Add book information to the new website

Key: SOLR-3096
URL: https://issues.apache.org/jira/browse/SOLR-3096
Project: Solr
Issue Type: Task
Reporter: David Smiley
Assignee: David Smiley
Attachments: website_books.patch

The attached patch modifies the new website design to incorporate the book information. It adds a header mantle slideshow entry with both book images (just the 2 current books), and it adds a book page with the 3 books published (this includes the 1st edition that is out of date now). The image files referenced are the same actual binary images on the current website but I chose a more consistent naming convention.
[jira] [Commented] (SOLR-3096) Add book information to the new website
[ https://issues.apache.org/jira/browse/SOLR-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210068#comment-13210068 ]

Chris Male commented on SOLR-3096:

Did a quick glance, looks great, +1
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210070#comment-13210070 ]

David Smiley commented on LUCENE-3795:

FYI the code coverage figure is erroneous; Clover didn't recognize some inner classes extending other tests as tests. Using IntelliJ IDEA Ultimate's built-in coverage, it's 63% (as counted per line), and I believe it's higher once the spatial-solr module is brought into the mix, which has a bunch of tests.

Uwe, I'm very interested in your input on anything to make the code better. Given the volume of code, I believe a feature branch makes the most sense instead of a humungous patch file.
[jira] [Commented] (LUCENE-3792) Remove StringField
[ https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210105#comment-13210105 ]

Robert Muir commented on LUCENE-3792:

{quote}
NOT_ANALYZED has two variants - with and without norms.
{quote}

You are right, I forgot about this. For NOT_ANALYZED with norms, we should probably just throw CoderMalfunctionError()

Remove StringField

Key: LUCENE-3792
URL: https://issues.apache.org/jira/browse/LUCENE-3792
Project: Lucene - Java
Issue Type: Task
Affects Versions: 4.0
Reporter: Robert Muir
Fix For: 4.0
Attachments: LUCENE-3792_javadocs_3x.patch, LUCENE-3792_javadocs_3x.patch

Often on the mailing list there is confusion about NOT_ANALYZED. Besides being useless (just use KeywordAnalyzer instead), people trip up on this not being consistent at query time (you really need to configure KeywordAnalyzer for the field on your PerFieldAnalyzerWrapper so it will do the same thing at query time... oh wait, once you've done that, you don't need NOT_ANALYZED).

So I think StringField is a trap too for the same reasons, just under a different name; let's remove it.
[jira] [Created] (LUCENE-3796) Disallow setBoost() on StringField, throw exception if boosts are set if norms are omitted
Disallow setBoost() on StringField, throw exception if boosts are set if norms are omitted

Key: LUCENE-3796
URL: https://issues.apache.org/jira/browse/LUCENE-3796
Project: Lucene - Java
Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
Fix For: 4.0

Occasionally users are confused why index-time boosts are not applied to their norms-omitted fields. This is because we silently discard the boost: there is no reason for this!

The most absurd part: in 4.0 you can make a StringField and call setBoost and nothing complains... (more reasons to remove StringField totally in my opinion)
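The behaviour LUCENE-3796 asks for, failing loudly instead of silently discarding the boost, might be sketched like this. SketchField is a hypothetical toy class, not Lucene's Field or StringField, and the choice of IllegalArgumentException is an assumption; the issue only says an exception should be thrown.

```java
// Toy illustration of "throw if a boost is set on a norms-omitted field"; not Lucene code.
class BoostCheckSketch {

    /** Hypothetical field type carrying only what the check needs. */
    static final class SketchField {
        private final String name;
        private final boolean omitNorms;
        private float boost = 1.0f;

        SketchField(String name, boolean omitNorms) {
            this.name = name;
            this.omitNorms = omitNorms;
        }

        /** Rejects any non-default boost when norms are omitted, since it could not be stored. */
        void setBoost(float boost) {
            if (omitNorms && boost != 1.0f) {
                throw new IllegalArgumentException(
                        "cannot set an index-time boost on field '" + name
                                + "' because it omits norms");
            }
            this.boost = boost;
        }

        float boost() {
            return boost;
        }
    }

    public static void main(String[] args) {
        SketchField body = new SketchField("body", false);
        body.setBoost(2.0f); // accepted: norms are kept
        try {
            new SketchField("id", true).setBoost(2.0f);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Allowing the default boost of 1.0 through keeps code that sets it unconditionally working, which is a design choice of this sketch rather than something the issue states.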