Filter based on the sum of values of two fields
Hello,

We have documents with many numerical fields. In some search scenarios, we would like to create a filter based on the sum of the values of two fields. For example, given fields F1 and F2, we would like to find all documents that satisfy the condition F1+F2 5.0. This filter may be combined with other filters to form a BooleanFilter.

Is there any way to construct an efficient filter for this? We know it is possible to pre-compute another field F3 holding the sum of the corresponding F1 and F2 values and then filter on F3. However, we have too many possible pairs of numerical fields, which would lead to a large number of aggregated fields such as F3. Can we use the values of F1 and F2 directly to create a filter?

Thanks,
Wei

- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org
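To make the question concrete, here is a minimal sketch of the per-document computation such a filter would have to perform. It is plain Java and does not use the Lucene API: the two arrays stand in for per-document values of F1 and F2 (for example, values loaded once per index reader), and SumFilterSketch and matchingDocs are hypothetical names. The comparison against 5.0 is not spelled out in the message above, so it is passed in as a parameter.

```java
import java.util.BitSet;
import java.util.function.DoublePredicate;

/** Hypothetical sketch, not Lucene code: marks documents whose F1+F2 satisfies a condition. */
final class SumFilterSketch {
    private SumFilterSketch() {}

    /**
     * f1[doc] and f2[doc] are assumed to hold the values of fields F1 and F2
     * for document number doc. The condition is applied to their sum.
     */
    static BitSet matchingDocs(double[] f1, double[] f2, DoublePredicate condition) {
        if (f1.length != f2.length) {
            throw new IllegalArgumentException("both arrays must cover the same documents");
        }
        BitSet bits = new BitSet(f1.length);
        for (int doc = 0; doc < f1.length; doc++) {
            // No pre-computed F3 field is needed: the sum is formed on the fly.
            if (condition.test(f1[doc] + f2[doc])) {
                bits.set(doc);
            }
        }
        return bits;
    }
}
```

A bit set produced this way could then be intersected with the bit sets of other filters, for example with BitSet.and. Whether this is fast enough depends on how cheaply the per-document values can be read, which this sketch does not address.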
Assert / NPE using MultiFieldQueryParser
I'm using MultiFieldQueryParser to parse search queries. I find that certain query strings (e.g., "/study/" without the quotes) cause MultiFieldQueryParser.parse() to throw an AssertionError if asserts are enabled. In production, parse() returns a Query, but it seems to be corrupt; using it to search my index results in an NPE.

This seems related to regular expressions. That query string is probably invalid regex syntax, but shouldn't MultiFieldQueryParser throw a ParseException in this case?

Here's a simple example that reproduces the assertion:

    // Turn on asserts
    ClassLoader loader = ClassLoader.getSystemClassLoader();
    loader.setDefaultAssertionStatus(true);

    try
    {
        Analyzer analyzer = new WhitespaceAnalyzer(Version.LUCENE_41);
        QueryParser parser = new MultiFieldQueryParser(Version.LUCENE_41, new String[]{"title", "body"}, analyzer);
        Query query = parser.parse("/study/");
    }
    catch (ParseException e)
    {
        System.out.println("Syntax error, please rephrase your query");
    }

This produces:

    Exception in thread "main" java.lang.AssertionError
        at org.apache.lucene.search.MultiTermQuery.<init>(MultiTermQuery.java:252)
        at org.apache.lucene.search.AutomatonQuery.<init>(AutomatonQuery.java:65)
        at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:90)
        at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:79)
        at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:69)
        at org.apache.lucene.queryparser.classic.QueryParserBase.newRegexpQuery(QueryParserBase.java:790)
        at org.apache.lucene.queryparser.classic.QueryParserBase.getRegexpQuery(QueryParserBase.java:1005)
        at org.apache.lucene.queryparser.classic.QueryParserBase.handleBareTokenQuery(QueryParserBase.java:1075)
        at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:359)
        at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:258)
        at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:182)
        at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:171)
        at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:120)
        at QueryParserException.main(QueryParserException.java:21)

Turn off the asserts and parse() returns successfully, but subsequent use of that Query instance results in NPEs such as:

    java.lang.NullPointerException
        at java.util.TreeMap.getEntry(TreeMap.java:342)
        at java.util.TreeMap.get(TreeMap.java:273)
        at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.terms(PerFieldPostingsFormat.java:215)
        at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:58)
        at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
        at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
        at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:286)
        at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:429)
        at org.apache.lucene.search.FilteredQuery.rewrite(FilteredQuery.java:334)
        at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:616)
        at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:663)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
        at org.labkey.search.model.LuceneSearchServiceImpl.search(LuceneSearchServiceImpl.java:1160)

This is appearing on production deployments with reasonable (from a user's perspective) search queries (e.g., "http://labkey.org/study/xml" without the quotes). I'd like to either turn off regex parsing altogether or detect the syntax error at parse time so I can provide my standard syntax guidance back to the user.

Thanks,
Adam
Re: Assert / NPE using MultiFieldQueryParser
Hey,

this is in fact a bug in MultiFieldQueryParser. Can you please open a ticket for this in our bug tracker? MultiFieldQueryParser should override getRegexpQuery, but it doesn't.

simon

On Sun, Mar 24, 2013 at 3:57 PM, Adam Rauch a...@labkey.com wrote:

I'm using MultiFieldQueryParser to parse search queries. I find that certain query strings (e.g., "/study/" without the quotes) cause MultiFieldQueryParser.parse() to throw an AssertionError if asserts are enabled. In production, parse() returns a Query, but it seems to be corrupt; using it to search my index results in an NPE.

[...]

I'd like to either turn off regex parsing altogether or detect the syntax error at parse time so I can provide my standard syntax guidance back to the user.

Thanks,
Adam
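Until the parser itself is fixed, one possible way to get the "turn off regex parsing" behaviour might be to escape forward slashes in the user's input before handing it to parse(). The sketch below is plain Java and assumes that, in the classic query parser syntax, a backslash in front of "/" makes the parser treat it as a literal character rather than as a regex delimiter. SlashEscaper and escapeSlashes are hypothetical names, not part of Lucene.

```java
/** Hypothetical helper, not part of Lucene: escapes '/' in raw user input. */
final class SlashEscaper {
    private SlashEscaper() {}

    /** Returns the query with every unescaped '/' replaced by "\/". */
    static String escapeSlashes(String userQuery) {
        StringBuilder sb = new StringBuilder(userQuery.length() + 8);
        for (int i = 0; i < userQuery.length(); i++) {
            char c = userQuery.charAt(i);
            if (c == '\\' && i + 1 < userQuery.length()) {
                // Already-escaped character: copy the backslash and the next char unchanged.
                sb.append(c).append(userQuery.charAt(++i));
            } else if (c == '/') {
                // Unescaped slash: prefix it with a backslash.
                sb.append('\\').append(c);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

The escaped string would then be passed to parser.parse() in place of the raw input. Only slashes are handled here; other special characters in the query (such as the colon in a URL) are left exactly as the user typed them. QueryParser.escape(String) escapes every special character, which is likely more than is wanted when users are still supposed to be able to use the rest of the query syntax.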
Using AnalyzingSuggester with stopwords
Hey there,

I am trying to get a working example of the AnalyzingSuggester with stopwords, as is done in the corresponding unit test. I thought I could build the AnalyzingSuggester from a HighFrequencyDictionary using a non_analyzed field, and then pass a stopword analyzer to the constructor. However, this has not worked: I couldn't get any suggestions for stopword foo...

I guess I am doing something totally wrong because I have not understood how the AnalyzingSuggester works, so thanks for any help.

--Alexander
Re: Using AnalyzingSuggester with stopwords
Alex,

did you try to get it working with a single term, like adding "the foobar" and then drawing suggestions for "the foo"?

simon

On Sun, Mar 24, 2013 at 8:51 PM, Alexander Reelsen a...@spinscale.de wrote:

Hey there,

I am trying to get a working example of the AnalyzingSuggester with stopwords, as is done in the corresponding unit test. I thought I could build the AnalyzingSuggester from a HighFrequencyDictionary using a non_analyzed field, and then pass a stopword analyzer to the constructor. However, this has not worked: I couldn't get any suggestions for stopword foo...

I guess I am doing something totally wrong because I have not understood how the AnalyzingSuggester works, so thanks for any help.

--Alexander
Re: Accent insensitive analyzer
ISOLatin1AccentFilter has been deprecated for quite some time; ASCIIFoldingFilter is preferred.

Best
Erick

On Fri, Mar 22, 2013 at 2:59 PM, Jerome Blouin jblo...@expedia.com wrote:

Thanks. I'll check that later.

-----Original Message-----
From: Sujit Pal [mailto:sujitatgt...@gmail.com] On Behalf Of SUJIT PAL
Sent: Friday, March 22, 2013 2:52 PM
To: java-user@lucene.apache.org
Subject: Re: Accent insensitive analyzer

Hi Jerome,

How about this one?
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ISOLatin1AccentFilterFactory

Regards,
Sujit

On Mar 22, 2013, at 9:22 AM, Jerome Blouin wrote:

Hello,

I'm looking for an analyzer that allows accent-insensitive search in Latin languages. I'm currently using the StandardAnalyzer, but it doesn't fulfill this need. Could you please point me to the one I need to use? I've checked the javadoc for the various analyzer packages but can't find one. Do I need to implement my own analyzer?

Regards,
Jerome
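For readers unsure what "accent insensitive" means at the token level, the following plain-Java sketch illustrates the idea. It is not ASCIIFoldingFilter and does not use Lucene: it swaps in the JDK's java.text.Normalizer to decompose each character and then drops the combining marks. AccentFoldingSketch and foldAccents are hypothetical names.

```java
import java.text.Normalizer;

/** Hypothetical illustration, not Lucene code: strips accents using the JDK only. */
final class AccentFoldingSketch {
    private AccentFoldingSketch() {}

    /** Decomposes accented characters (NFD) and removes the combining marks. */
    static String foldAccents(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches Unicode combining marks, e.g. the acute accent split off by NFD.
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

If the same folding is applied to both the indexed text and the query text, an unaccented query can match an accented word and vice versa. This sketch only covers characters that decompose into a base letter plus marks; letters with no such decomposition are left unchanged, so a dedicated folding filter in the analysis chain is likely the better choice in practice.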
Re: PayloadFunctions don't work the same since 4.1
I appreciate your help with this! I was attempting to follow your advice when I noticed another odd behavior that leads me to believe the payloads are not being stored correctly. If I add two documents at once using Solr's update handler like so:

    [{"id":1,"foo_ap":"bar|50"},{"id":2,"foo_ap":"bar|75"}]

it appears to store the value 50 for BOTH documents. This doesn't happen when I send them in one at a time.

I'm going to try to root this problem out, but I'm really not sure where to start looking. Would this happen in the DelimitedPayloadTokenFilter?

Thanks.
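As a reference point for what each document is expected to end up with, here is a plain-Java sketch of splitting a delimited token such as bar|50 into a term and a float payload. It is not the DelimitedPayloadTokenFilter implementation; DelimitedPayloadSketch, termOf and payloadOf are hypothetical names, and splitting at the first delimiter is an assumption of this sketch.

```java
/** Hypothetical illustration, not Lucene code: splits "term|payload" tokens. */
final class DelimitedPayloadSketch {
    private DelimitedPayloadSketch() {}

    /** Returns the part before the first delimiter, or the whole token if there is none. */
    static String termOf(String token, char delimiter) {
        int idx = token.indexOf(delimiter);
        return idx < 0 ? token : token.substring(0, idx);
    }

    /** Returns the float after the first delimiter, or null if the token has no payload. */
    static Float payloadOf(String token, char delimiter) {
        int idx = token.indexOf(delimiter);
        return idx < 0 ? null : Float.valueOf(token.substring(idx + 1));
    }
}
```

With the two documents from the message above, the term is bar in both cases while the payloads should differ (50 for the first document, 75 for the second), which is the expectation the reported behavior violates.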
Re: Update DocValues and Query Time Join performance on DocValues
Unfortunately, updatable DocValues are not supported yet, but as far as I know they could become available in the future using the stacked update approach:

https://issues.apache.org/jira/browse/SOLR-3855
https://issues.apache.org/jira/browse/LUCENE-4258

On Sat, Mar 23, 2013 at 9:52 AM, Pablo Guerrero sir...@gmail.com wrote:

Hello everyone,

I've seen in a couple of old presentations that DocValues will be updatable (without updating the whole document), but I cannot find anything recent on this. Is this currently possible in 4.2? Is there any example of how to do it?

Also, I have the impression that Query Time Joins should be really fast using DocValues, as you have random access, but I would like to know the real cost of traversing a relationship: is it constant, logarithmic, or linear?

Thanks in advance,
Pablo
[ANNOUNCE] Wiki editing change
The wiki at http://wiki.apache.org/lucene-java/ has come under attack by spammers more frequently of late, so the PMC has decided to lock it down in an attempt to reduce the work involved in tracking and removing spam.

From now on, only people who appear on http://wiki.apache.org/lucene-java/ContributorsGroup will be able to create, modify, or delete wiki pages.

Please send a request to either java-user@lucene.apache.org or d...@lucene.apache.org to have your wiki username added to the ContributorsGroup page; this is a one-time step.

Steve