Token Stream with Offsets (Token Sources class)

2013-04-07 Thread vempap
Hi,

  I've the following snippet code where I'm trying to extract weighted span
terms from the query (I do have term vectors enabled on the fields):

File path = new File(""); // index directory path elided
FSDirectory directory = FSDirectory.open(path);
IndexReader indexReader = DirectoryReader.open(directory);

Map<String, WeightedSpanTerm> allWeightedSpanTerms =
    new HashMap<String, WeightedSpanTerm>();

WeightedSpanTermExtractor extractor = new WeightedSpanTermExtractor();
// term vectors with offsets are enabled on the "name" field
TokenStream tokenStream =
    TokenSources.getTokenStreamWithOffsets(indexReader, 0, "name");
allWeightedSpanTerms.putAll(extractor.getWeightedSpanTerms(q, tokenStream));

In the end, the map "allWeightedSpanTerms" is empty. When I debugged the
code, I found that while building the TermContext the statement
"fields.terms(field);" returns "null", which I don't understand.

My query is : "Running Apple" (a phrase query)
my doc contents are :
name : Running Apple 60 GB iPod with Video Playback Black - Apple

Please let me know if I'm doing anything wrong.

Thanks.
Phani.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Token-Stream-with-Offsets-Token-Sources-class-tp4054384.html
Sent from the Solr - User mailing list archive at Nabble.com.


To get Term Offsets of a term per document

2013-02-20 Thread vempap
Hello,

  Is there a way to get the term offsets of a given term per document without
enabling term vectors?

Is it correct that the Lucene index stores positions, but not offsets, by
default?
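To make the question concrete: when offsets are not stored, the fallback I'm using is to recompute them from the stored field value on the client side, along these lines (a plain-Java sketch with naive whitespace tokenization; `startOffsets` is just an illustrative helper, not a Lucene API):

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetScan {
    // Returns the start character offsets of `term` in a
    // whitespace-tokenized `text` (no analysis chain applied).
    static List<Integer> startOffsets(String text, String term) {
        List<Integer> offsets = new ArrayList<Integer>();
        int pos = 0;
        for (String tok : text.split("\\s+")) {
            int start = text.indexOf(tok, pos);
            if (tok.equals(term)) {
                offsets.add(start);
            }
            pos = start + tok.length();
        }
        return offsets;
    }

    public static void main(String[] args) {
        // "Apple" occurs at character offsets 8 and 19
        System.out.println(startOffsets("Running Apple iPod Apple", "Apple"));
    }
}
```

Obviously this only agrees with the index if the indexed field was tokenized on whitespace too, which is why I'd rather get the offsets from Lucene itself.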

Thanks,
Phani.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/To-get-Term-Offsets-of-a-term-per-document-tp4041696.html


Solr-UIMA integration : analyzing multi-fields

2012-11-02 Thread vempap
Hello all,

  how do I analyze multiple fields using UIMA when the UIMA update chain is
added to the update handler? And how do I map each input field to the output
field that receives its analysis?

For instance,

let's say there are two text fields, text1 & text2, for which I need to
generate pos-tags using UIMA. In the analyzeFields section I can definitely do
this:


<lst name="analyzeFields">
  <bool name="merge">false</bool>
  <arr name="fields">
    <str>text1</str>
    <str>text2</str>
  </arr>
</lst>



and in the fieldMappings :


<lst name="type">
  <str name="name">org.apache.uima.TokenAnnotation</str>
  <lst name="mapping">
    <str name="feature">posTag</str>
    <str name="field">postags1</str>
  </lst>
</lst>



but how do I specify that I also need pos-tags for field text2, written to a
postags2 field? If there is any schema/DTD for these configuration settings,
please let me know.
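What I was hoping for is something along these lines, by analogy with the existing mapping. To be clear, the second <lst name="type"> entry below is my guess, not from any documented schema, and I don't see anything in it that would scope the mapping to annotations produced from text2 rather than text1:

```xml
<lst name="fieldMappings">
  <lst name="type">
    <str name="name">org.apache.uima.TokenAnnotation</str>
    <lst name="mapping">
      <str name="feature">posTag</str>
      <str name="field">postags1</str>
    </lst>
  </lst>
  <!-- hypothetical second mapping, intended for text2 -->
  <lst name="type">
    <str name="name">org.apache.uima.TokenAnnotation</str>
    <lst name="mapping">
      <str name="feature">posTag</str>
      <str name="field">postags2</str>
    </lst>
  </lst>
</lst>
```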

Also, how can I change the code (or is there a way to configure it) to
generate pos-tags from the token stream produced by an analyzer? Currently the
update processor takes the text from the input field and generates pos-tags
into the postags1 field using the WhitespaceTokenizer defined by default in
the XML configuration files. How can I change the tokenizer so that it uses a
Solr Analyzer/Tokenizer?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-UIMA-integration-analyzing-multi-fields-tp4017890.html


StandardTokenizer generation from JFlex grammar

2012-10-04 Thread vempap
Hello,

  I'm trying to regenerate the standard tokenizer from the JFlex
specification (StandardTokenizerImpl.jflex), but I'm not able to do so due to
some errors. (I'd like to write my own JFlex file based on the standard
tokenizer, which is why I'm first trying to regenerate the stock one to get
the hang of things.)

I'm using jflex 1.4.3 and I ran into the following error:

Error in file "" (line 64): 
Syntax error.
HangulEx   = (!(!\p{Script:Hangul}|!\p{WB:ALetter})) ({Format} |
{Extend})*


Also, I tried installing an Eclipse plugin from
http://cup-lex-eclipse.sourceforge.net/, which I thought would provide options
similar to JavaCC (http://eclipse-javacc.sourceforge.net/) for generating the
classes within Eclipse - but had no luck.

Any help would be appreciated.

Regards,
Phani.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/StandardTokenizer-generation-from-JFlex-grammar-tp4011941.html


Re: SpanNearQuery distance issue

2012-09-19 Thread vempap
Shoot me. Thanks, I did not notice that the doc has ".. e a .." in the
content. Thanks again for the immediate reply :)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SpanNearQuery-distance-issue-tp4008973p4008978.html


SpanNearQuery distance issue

2012-09-19 Thread vempap
Hello All,

I have an issue with the distance measure of SpanNearQuery in Lucene. Let's
say I have the following two documents:

DocID: 6, content: "1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1001
1002 1003 1004 1005 1006 1007 1008 1009 1100"
DocID: 7, content: "a b c d e a b c f g h i j k l m l k j z z z"

If my span query is:
a) "3n(a,e)" - it matches doc 7.
But if it is:
b) "3n(1,5)" - it does not match doc 6.
If the query is:
c) "4n(1,5)" - it matches doc 6.

I have no clue why a) works but b) does not. I tried to debug the code, but
couldn't figure it out.

Any help?
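In case it helps: my current working theory (an assumption on my part, not something I've confirmed in the parser code) is that "Nn(x,y)" maps to a SpanNearQuery slop of N - 1, so for two single terms the match condition would reduce to this check:

```java
public class NearDistance {
    // Assumed semantics: "Nn(x,y)" allows at most N - 1 positions of slop,
    // and two single terms at positions posX < posY consume a slop equal to
    // the number of token positions strictly between them.
    static boolean matches(int posX, int posY, int n) {
        int gap = posY - posX - 1; // tokens strictly between x and y
        return gap <= n - 1;
    }

    public static void main(String[] args) {
        // Doc 6: "1" is at position 0 and "5" at position 4.
        System.out.println(matches(0, 4, 3)); // 3n(1,5): gap 3 > 2, no match
        System.out.println(matches(0, 4, 4)); // 4n(1,5): gap 3 <= 3, match
    }
}
```

That would explain why doc 6 needs 4n rather than 3n, while doc 7 still matches 3n(a,e) via the adjacent ".. e a .." pair, where the gap is 0.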



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SpanNearQuery-distance-issue-tp4008973.html


Grammar for ComplexPhraseQueryParser

2012-08-20 Thread vempap
Hello,

   Does anyone have the grammar file (.jj file) for the complex phrase query
parser? The patch from https://issues.apache.org/jira/browse/SOLR-1604 does
not include the grammar file.

Thanks,
Phani.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Grammar-for-ComplexPhraseQueryParser-tp4002263.html


RE: solr.xml entries got deleted when powered off

2012-08-15 Thread vempap
No, I'm not keeping them in /tmp



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-xml-entries-got-deleted-when-powered-off-tp4001496p4001506.html


Re: solr.xml entries got deleted when powered off

2012-08-15 Thread vempap
It happens when I don't do a clean shutdown. Are there any other scenarios
where it might happen?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-xml-entries-got-deleted-when-powered-off-tp4001496p4001503.html


Re: solr.xml entries got deleted when powered off

2012-08-15 Thread vempap
Nope, there is a good amount of space left on the disk.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-xml-entries-got-deleted-when-powered-off-tp4001496p4001502.html


solr.xml entries got deleted when powered off

2012-08-15 Thread vempap
Hello,

  I created an index, and the schema.xml & solrconfig.xml files were created
with content (I checked that the XML files were not empty). But if I power
off the system and restart, the contents of the files are gone; they are
0-byte files.

Even the solr.xml file, which got updated when I created a new index (with a
core), is 0 bytes, and all its previous entries are lost too.

I'm using Solr 4.0.

Does anyone have any idea about the scenarios where this might happen?

Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-xml-entries-got-deleted-when-powered-off-tp4001496.html


Solr Ping Request Handler Response problem

2012-08-07 Thread vempap
Hello,

  I have a problem with the Solr 4.0 Alpha ping request handler. There are
many cores, and if I start all the Solr instances and they come up
successfully, creating an index immediately afterwards fails, with logs
saying that one of the instances is down. I don't know why this happens,
since starting all the Solr instances succeeded. If I allow a couple of
seconds of wait time, I can then create the index without any problem. When I
did a little debugging in the Solr code, the response from the
PingRequestHandler is empty when index creation fails right after all the
instances have started.

I have no idea why this happens. Is it really necessary to wait a few seconds
before trying to create an index right after successfully starting all the
Solr instances?
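In the meantime I'm working around it by polling each core's ping handler until it reports OK before issuing the index creation. Roughly like this (the URL and the `"status":"OK"` check are assumptions about my setup, using `wt=json`):

```shell
# Poll a ping URL until it reports OK, or give up after `attempts` tries.
wait_for_ping() {
  url="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -s --max-time 2 "$url" | grep -q '"status":"OK"'; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timeout: $url"
  return 1
}

# Example (hypothetical host/core names):
# wait_for_ping "http://localhost:8983/solr/core1/admin/ping?wt=json"
```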




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Ping-Request-Handler-Response-problem-tp3999694.html