Token Stream with Offsets (Token Sources class)
Hi, I have the following code snippet where I'm trying to extract the weighted span terms from a query (I do have term vectors enabled on the fields):

    File path = new File("");  // index directory path elided
    FSDirectory directory = FSDirectory.open(path);
    IndexReader indexReader = DirectoryReader.open(directory);
    Map<String, WeightedSpanTerm> allWeightedSpanTerms = new HashMap<>();
    WeightedSpanTermExtractor extractor = new WeightedSpanTermExtractor();
    TokenStream tokenStream =
        TokenSources.getTokenStreamWithOffsets(indexReader, 0, "name");
    allWeightedSpanTerms.putAll(extractor.getWeightedSpanTerms(q, tokenStream));

In the end, the map "allWeightedSpanTerms" contains no weighted span terms. When I debugged the code, I found that while building the TermContext the statement "fields.terms(field)" returns null, which I don't understand.

My query is the phrase query "Running Apple", and my doc contents are:

    name: Running Apple 60 GB iPod with Video Playback Black - Apple

Please let me know what I'm doing wrong. Thanks, Phani.

--
View this message in context: http://lucene.472066.n3.nabble.com/Token-Stream-with-Offsets-Token-Sources-class-tp4054384.html
Sent from the Solr - User mailing list archive at Nabble.com.
To get Term Offsets of a term per document
Hello, is there a way to get the term offsets of a given term per document without enabling term vectors? Is it correct that the Lucene index stores the positions but not the offsets by default? Thanks, Phani.

--
View this message in context: http://lucene.472066.n3.nabble.com/To-get-Term-Offsets-of-a-term-per-document-tp4041696.html
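(Yes, by default the postings carry positions but not character offsets, so without term vectors the usual fallback is to re-analyze the stored field text and read the offsets off the token stream. A minimal sketch of that idea in plain Python, with a hypothetical whitespace tokenizer standing in for the field's actual analyzer:)

```python
def tokenize_with_offsets(text):
    """Yield (term, start_offset, end_offset) triples, the way a Lucene
    analyzer exposes OffsetAttribute when re-analyzing stored text."""
    offset = 0
    for token in text.split():
        start = text.index(token, offset)  # first occurrence at or after `offset`
        end = start + len(token)
        offset = end
        yield (token.lower(), start, end)

def offsets_of(term, text):
    # Collect the character offsets of every occurrence of `term`.
    return [(s, e) for t, s, e in tokenize_with_offsets(text) if t == term]

print(offsets_of("apple", "Running Apple iPod Apple"))  # [(8, 13), (19, 24)]
```

The re-analysis must of course use the same analysis chain as at index time, or the recovered offsets will not line up with what was indexed.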
Solr-UIMA integration : analyzing multi-fields
Hello all, how do I analyze multiple fields with UIMA when I add the UIMA update chain to the update handler, and how do I map which source field gets analyzed into which target field?

For instance, say there are two text fields, text1 and text2, for which I need to generate POS tags using UIMA. In the analyzeFields section I can certainly do this:

    <bool name="merge">false</bool>
    <arr name="fields"><str>text1</str><str>text2</str></arr>

and in the fieldMappings map org.apache.uima.TokenAnnotation's posTag feature to postags1. But how do I specify that I need POS tags for field text2 too, and into a postags2 field? If there is any schema/DTD for these configuration settings, please let me know.

Also, how can I change the code, or is there a way to configure it, to generate POS tags after getting the token stream from an analyzer? Currently the update processor takes the text from the input field and generates POS tags into the postags1 field using the WhitespaceTokenizer defined by default in the XML configuration files. How can I change the tokenizer so that it uses a Solr analyzer/tokenizer?

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-UIMA-integration-analyzing-multi-fields-tp4017890.html
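(For reference, the usual shape of these settings in solrconfig.xml is sketched below, based on the example solr-uima configuration; the field names text1/postags1 are this thread's. Note that the mapping is keyed by annotation type, not by source field, which appears to be exactly why a second per-field target such as postags2 is hard to express:)

```xml
<lst name="uimaConfig">
  <lst name="analyzeFields">
    <bool name="merge">false</bool>
    <arr name="fields"><str>text1</str><str>text2</str></arr>
  </lst>
  <lst name="fieldMappings">
    <lst name="type">
      <str name="name">org.apache.uima.TokenAnnotation</str>
      <lst name="mapping">
        <str name="feature">posTag</str>
        <str name="field">postags1</str>
      </lst>
    </lst>
  </lst>
</lst>
```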
StandardTokenizer generation from JFlex grammar
Hello, I'm trying to regenerate the standard tokenizer from the JFlex specification (StandardTokenizerImpl.jflex), but I'm not able to do so due to some errors. (I would like to create my own JFlex file based on the standard tokenizer, which is why I'm first trying to regenerate it to get the hang of things.) I'm using JFlex 1.4.3 and I ran into the following error:

    Error in file "" (line 64): Syntax error.
    HangulEx = (!(!\p{Script:Hangul}|!\p{WB:ALetter})) ({Format} | {Extend})*

Also, I tried installing an Eclipse plugin from http://cup-lex-eclipse.sourceforge.net/, which I thought would provide options similar to JavaCC (http://eclipse-javacc.sourceforge.net/) for generating the classes within Eclipse, but I had no luck. Any help would be appreciated. Regards, Phani.

--
View this message in context: http://lucene.472066.n3.nabble.com/StandardTokenizer-generation-from-JFlex-grammar-tp4011941.html
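(A side note on that grammar line: the `(!(!\p{Script:Hangul}|!\p{WB:ALetter}))` pattern is just De Morgan's law. Negating a union of complements yields an intersection, a trick used because JFlex's class syntax offers negation and union but no direct intersection operator. A quick sketch of the identity with Python sets standing in for the two character classes:)

```python
# Two hypothetical character classes, modeled as sets of characters.
hangul = set("가나다ab")
aletter = set("abc가")

# JFlex's !(!A | !B), taken over some universe of characters...
universe = hangul | aletter | set("xyz")
complement = lambda s: universe - s
demorgan = complement(complement(hangul) | complement(aletter))

# ...is exactly the intersection A & B (De Morgan's law).
assert demorgan == hangul & aletter
print(sorted(demorgan))
```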
Re: SpanNearQuery distance issue
Shoot me. Thanks, I did not notice that the doc has ".. e a .." in the content. Thanks again for the immediate reply :)

--
View this message in context: http://lucene.472066.n3.nabble.com/SpanNearQuery-distance-issue-tp4008973p4008978.html
SpanNearQuery distance issue
Hello all, I have an issue with the distance measure of SpanNearQuery in Lucene. Say I have the following two documents:

    DocID: 6, content: "1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1001 1002 1003 1004 1005 1006 1007 1008 1009 1100"
    DocID: 7, content: "a b c d e a b c f g h i j k l m l k j z z z"

a) If my span query is "3n(a,e)", it matches doc 7.
b) But "3n(1,5)" does not match doc 6.
c) And "4n(1,5)" does match doc 6.

I have no clue why (a) matches but (b) does not. I tried to debug the code, but couldn't figure it out. Any help?

--
View this message in context: http://lucene.472066.n3.nabble.com/SpanNearQuery-distance-issue-tp4008973.html
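(A toy model that reproduces all three observations, under the assumption that the effective distance is the absolute position difference between the two matched terms. This is a sketch, not Lucene's NearSpans code, and the exact slop arithmetic has varied across Lucene versions. Under this reading, (a) matches only because doc 7 also contains the adjacent pair "e a":)

```python
def near_matches(tokens, term_a, term_b, n):
    """Toy unordered two-term near check: some pair of occurrences of the
    two terms must lie within n positions of each other."""
    pos_a = [i for i, t in enumerate(tokens) if t == term_a]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b]
    return any(abs(a - b) <= n for a in pos_a for b in pos_b)

doc6 = "1 2 3 4 5 6 7 8 9 10".split()
doc7 = "a b c d e a b c f g h i j k l m l k j z z z".split()

print(near_matches(doc7, "a", "e", 3))  # (a) True: via the adjacent "e a" at positions 4, 5
print(near_matches(doc6, "1", "5", 3))  # (b) False: "1" and "5" are 4 positions apart
print(near_matches(doc6, "1", "5", 4))  # (c) True
```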
Grammar for ComplexPhraseQueryParser
Hello, does anyone have the grammar file (.jj file) for the ComplexPhraseQueryParser? The patch from https://issues.apache.org/jira/browse/SOLR-1604 does not include the grammar file. Thanks, Phani.

--
View this message in context: http://lucene.472066.n3.nabble.com/Grammar-for-ComplexPhraseQueryParser-tp4002263.html
RE: solr.xml entries got deleted when powered off
No, I'm not keeping them in /tmp.

--
View this message in context: http://lucene.472066.n3.nabble.com/solr-xml-entries-got-deleted-when-powered-off-tp4001496p4001506.html
Re: solr.xml entries got deleted when powered off
It happens when I don't do a clean shutdown. Are there any other scenarios in which it might happen?

--
View this message in context: http://lucene.472066.n3.nabble.com/solr-xml-entries-got-deleted-when-powered-off-tp4001496p4001503.html
Re: solr.xml entries got deleted when powered off
No, there is a good amount of space left on the disk.

--
View this message in context: http://lucene.472066.n3.nabble.com/solr-xml-entries-got-deleted-when-powered-off-tp4001496p4001502.html
solr.xml entries got deleted when powered off
Hello, I created an index, and the schema.xml and solrconfig.xml files were created with content (I checked that the XML files have contents). But if I power off the system and restart, the contents of the files are gone: they are 0-byte files. Even the solr.xml file, which was updated when I created the new index (with a core), is 0 bytes, and all the previous entries are lost too. I'm using Solr 4.0. Does anyone have any idea in which scenarios this might happen? Thanks.

--
View this message in context: http://lucene.472066.n3.nabble.com/solr-xml-entries-got-deleted-when-powered-off-tp4001496.html
Solr Ping Request Handler Response problem
Hello, I have a problem with the Solr 4.0 Alpha ping request handler. With many cores, if I start all the Solr instances and they come up and run successfully, a create-index request issued immediately afterwards fails, with logs saying that one of the instances is down. I really don't know why this happens, since starting all the Solr instances succeeded. But if I allow a couple of seconds of wait time, I am able to create the index without any problem.

When I did a little debugging in the Solr code, the response from the PingRequestHandler seems to be empty when creating an index fails just after all the Solr instances started successfully. I have absolutely no idea why this happens. Is it really necessary to wait a few seconds before trying to create an index right after successfully starting all the Solr instances?

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Ping-Request-Handler-Response-problem-tp3999694.html
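(A common workaround is to poll the ping handler until it answers before sending the first index request. A generic sketch of that retry loop; the `ping` callable is a stand-in for an actual HTTP request to each core's /admin/ping handler:)

```python
import time

def wait_until_ready(ping, timeout=30.0, interval=0.5):
    """Call `ping` repeatedly until it returns truthy or `timeout` elapses.
    Returns True once the ping succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if ping():
                return True
        except Exception:
            pass  # treat connection errors as "not ready yet"
        time.sleep(interval)
    return False

# Example: a fake ping that only succeeds on the third attempt.
attempts = {"n": 0}
def fake_ping():
    attempts["n"] += 1
    return attempts["n"] >= 3

print(wait_until_ready(fake_ping, timeout=5.0, interval=0.01))  # True
```

Polling like this avoids hard-coding an arbitrary sleep: the indexing client proceeds as soon as every instance actually answers, and fails fast with a clear timeout otherwise.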