Re: Multiphrase Query in Lucene 4.3

VIGNESH S Wed, 02 Oct 2013 22:56:29 -0700

Hi Ian,

Is there a default analyzer in Lucene that ignores only whitespace?
It should preserve everything else: numbers, punctuation, dates.


I created my own analyzer. Its tokenizer accepts a character when
Character.isDefined(cn) && !Character.isWhitespace(cn) is true, and the
analyzer applies a lowercase filter on top of that tokenizer. This works
perfectly in 3.6.
In 4.3 it produces wrong token offsets.
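
For reference, here is a minimal plain-Java sketch of the tokenization rule
described above, with the offsets each token would be expected to carry. It is
only a model and does not use any Lucene classes: WhitespaceOnlyTokens, Token
and tokenize are hypothetical names made up for illustration. Comparing these
offsets with what the 4.3 token stream reports may help to narrow down where
they start to differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Plain-Java model of the analyzer described above; not Lucene code. */
final class WhitespaceOnlyTokens {

    /** One lowercased token plus its character offsets in the original text. */
    static final class Token {
        final String text;
        final int start; // inclusive
        final int end;   // exclusive

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return text + "[" + start + "," + end + ")";
        }
    }

    /** Same predicate as in the message: defined and not whitespace. */
    static boolean isTokenChar(int codePoint) {
        return Character.isDefined(codePoint) && !Character.isWhitespace(codePoint);
    }

    /** Splits on whitespace only and lowercases each token. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int start = -1; // -1 means "not inside a token"
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);
            if (isTokenChar(cp)) {
                if (start < 0) {
                    start = i; // first character of a new token
                }
            } else if (start >= 0) {
                // A separator ends the current token; offsets refer to the input.
                tokens.add(new Token(
                        input.substring(start, i).toLowerCase(Locale.ROOT), start, i));
                start = -1;
            }
            i += Character.charCount(cp);
        }
        if (start >= 0) {
            // Flush the last token if the input does not end with a separator.
            tokens.add(new Token(
                    input.substring(start).toLowerCase(Locale.ROOT), start, input.length()));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Geoffrey  Romer, 2013-10-02"));
    }
}
```

Punctuation and dates stay attached to their tokens here, so "Romer," keeps
its comma, which is what "ignore only spaces" implies.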




On Mon, Sep 30, 2013 at 8:21 PM, Ian Lea <[email protected]> wrote:

> Whenever someone says they are using a custom analyzer that has to be
> a suspect.  Does it work if you use one of the core lucene analyzers
> instead?  Have you used Luke to verify that the index holds what you
> think it does?
>
>
> --
> Ian.
>
>
> On Mon, Sep 30, 2013 at 3:21 PM, VIGNESH S <[email protected]>
> wrote:
> > Hi,
> >
> > It is not a problem with case, because I am using LowerCaseFilter.
> >
> > My analyzer is a custom analyzer that ignores only whitespace. It keeps
> > numbers, dates, and other special characters. The same analyzer works
> > in Lucene 3.6.
> >
> >
> > When I do a single-term query for "Geoffrey" it gives hits, but when the
> > term is part of a MultiPhraseQuery it is not found. When the code below
> > is executed with, say, word = "Geoffrey", it does not find the word
> > itself.
> >
> > if (TermsEnum.SeekStatus.FOUND == trm.seekCeil(new BytesRef(word))) {
> >   do {
> >     String s = trm.term().utf8ToString();
> >     if (s.equals(word)) {
> >       termsWithPrefix.add(new Term("content", s));
> >     } else {
> >       break;
> >     }
> >   } while (trm.next() != null);
> > }
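
A possible explanation for the lookup above, assuming the analyzer lowercased
the terms at index time: seekCeil compares the bytes it is given without
running any analysis, and positions the enumerator on the smallest term that
sorts at or after the target. An uppercase "G" sorts before every lowercase
letter, so seeking "Geoffrey" in a lowercased term dictionary would report
NOT_FOUND and land on some unrelated term. The sketch below models this in
plain Java with a TreeSet in place of the term dictionary. SeekCeilModel and
its methods are hypothetical names, and String ordering is used here as a
stand-in for Lucene's byte ordering (the two agree for ASCII terms).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Plain-Java stand-in for a sorted term dictionary; not Lucene code. */
final class SeekCeilModel {

    private final TreeSet<String> terms = new TreeSet<>();

    SeekCeilModel(String... indexedTerms) {
        terms.addAll(Arrays.asList(indexedTerms));
    }

    /** Smallest indexed term that sorts at or after target, or null if none. */
    String ceiling(String target) {
        return terms.ceiling(target);
    }

    /** All indexed terms that start with prefix, in sorted order. */
    List<String> termsWithPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        // Terms sharing a prefix are contiguous, starting at the ceiling of the prefix.
        for (String t : terms.tailSet(prefix, true)) {
            if (!t.startsWith(prefix)) {
                break;
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        SeekCeilModel model = new SeekCeilModel(
                "app", "apple", "application", "geoffrey", "microsoft", "romer");
        System.out.println(model.ceiling("Geoffrey"));   // not the term that was asked for
        System.out.println(model.ceiling("geoffrey"));   // exact match after lowercasing
        System.out.println(model.termsWithPrefix("app"));
    }
}
```

If this is the cause, lowercasing the word before building the BytesRef, or
running it through the same analyzer used at index time, should make the
lookup find the term.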
> >
> >
> >
> > On Mon, Sep 30, 2013 at 3:01 PM, Ian Lea <[email protected]> wrote:
> >
> >> Whenever someone says something along the lines of a search for
> >> "geoffrey" not matching "Geoffrey" the case difference springs out,
> >> Can't recall what if anything you said about the analysis side of
> >> things but that could be the cause.  See
> >>
> >>
> >> http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2F_incorrect_hits.3F
> >>
> >> If on the other hand the problem is more obscure, and only related to
> >> the multi phrase stuff, I suggest you build a tiny but complete
> >> RAMDirectory based program or test case that shows the problem and
> >> post it here.
> >>
> >>
> >> --
> >> Ian.
> >>
> >>
> >>
> >> On Mon, Sep 30, 2013 at 6:46 AM, VIGNESH S <[email protected]>
> >> wrote:
> >> > Hi,
> >> >
> >> > Thanks for your reply. The problem I face is that the phrase
> >> > "Geoffrey Romer" is in my field.
> >> >
> >> > I am forming a MultiPhraseQuery object properly, like "Geoffrey Romer",
> >> > but when I search, it returns no hits. The problem does not occur for
> >> > all phrases, only for a few.
> >> >
> >> > When I do a single query like Geoffrey it gives a hit, but in a
> >> > MultiPhraseQuery it is not able to find "geoffrey". I confirmed this by
> >> > calling trm.seekCeil(new BytesRef("Geoffrey")) and then
> >> > String s = trm.term().utf8ToString(). It points to a different word
> >> > instead of geoffrey. seekCeil works properly for many phrases, though.
> >> >
> >> > What could be the problem? Please kindly suggest.
> >> >
> >> >
> >> >
> >> > On Fri, Sep 27, 2013 at 6:58 PM, Allison, Timothy B. <[email protected]> wrote:
> >> >
> >> >> 1) An alternate method to your original question would be to do something
> >> >> like this (I haven't compiled or tested this!):
> >> >>
> >> >> Query q = new PrefixQuery(new Term("field", "app"));
> >> >>
> >> >> q = q.rewrite(indexReader) ;
> >> >> Set<Term> terms = new HashSet<Term>();
> >> >> q.extractTerms(terms);
> >> >> Term[] arr = terms.toArray(new Term[terms.size()]);
> >> >> MultiPhraseQuery mpq = new MultiPhraseQuery();
> >> >> mpq.add(new Term("field", "microsoft"));
> >> >> mpq.add(arr);
> >> >>
> >> >>
> >> >> 2) At a higher level, do you need to generate your query programmatically?
> >> >>  Here are three parsers that could handle this:
> >> >>   a) ComplexPhraseQueryParser
> >> >>   b) SurroundQueryParser: oal.queryparser.surround.parser.QueryParser
> >> >>   c) experimental: <self_promotion degree="shameless">
> >> >> http://issues.apache.org/jira/browse/LUCENE-5205</self_promotion>
> >> >>
> >> >>
> >> >> -----Original Message-----
> >> >> From: VIGNESH S [mailto:[email protected]]
> >> >> Sent: Friday, September 27, 2013 3:33 AM
> >> >> To: [email protected]
> >> >> Subject: Re: Multiphrase Query in Lucene 4.3
> >> >>
> >> >> Hi,
> >> >>
> >> >> The phrase I am giving is "Romer Geoffrey". The phrase is in the field.
> >> >>
> >> >> I call trm.seekCeil(new BytesRef("Geoffrey")) and then
> >> >> String s = trm.term().utf8ToString();
> >> >>
> >> >> It gives a different word. I think this is why my MultiPhraseQuery is
> >> >> not giving the desired results.
> >> >>
> >> >> What may be the reason?
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On Fri, Sep 27, 2013 at 11:49 AM, VIGNESH S <[email protected]>
> >> >> wrote:
> >> >>
> >> >> > Hi Ian,
> >> >> >
> >> >> > Thanks for your reply.
> >> >> >
> >> >> > I am doing something similar to this. The actual phrase goes into the
> >> >> > MultiPhraseQuery object correctly, but it does not return any hits.
> >> >> >
> >> >> > In Lucene 3.6, I implemented the same logic and it works.
> >> >> >
> >> >> > In Lucene 4.3, I built the index for that using
> >> >> >
> >> >> >  FieldType offsetsType = new FieldType(TextField.TYPE_STORED);
> >> >> >  offsetsType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
> >> >> >
> >> >> > For MultiPhraseQuery, do I need to set any other parameter in
> >> >> > addition to this while indexing?
> >> >> >
> >> >> > Is there a MultiPhraseQueryTest Java file for Lucene 4.3? I checked
> >> >> > the Lucene branch and was not able to find one. Please kindly help.
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Thu, Sep 26, 2013 at 2:55 PM, Ian Lea <[email protected]> wrote:
> >> >> >
> >> >> >> I use the code below to do something like this.  Not exactly what you
> >> >> >> want but should be easy to adapt.
> >> >> >>
> >> >> >>
> >> >> >> public List<String> findTerms(IndexReader _reader,
> >> >> >>                               String _field) throws IOException {
> >> >> >>   List<String> l = new ArrayList<String>();
> >> >> >>   Fields ff = MultiFields.getFields(_reader);
> >> >> >>   Terms trms = ff.terms(_field);
> >> >> >>   TermsEnum te = trms.iterator(null);
> >> >> >>   BytesRef br;
> >> >> >>   while ((br = te.next()) != null) {
> >> >> >>     l.add(br.utf8ToString());
> >> >> >>   }
> >> >> >>   return l;
> >> >> >> }
> >> >> >>
> >> >> >> --
> >> >> >> Ian.
> >> >> >>
> >> >> >> On Wed, Sep 25, 2013 at 3:04 PM, VIGNESH S <[email protected]>
> >> >> >> wrote:
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > In the example for MultiPhraseQuery it is mentioned:
> >> >> >> >
> >> >> >> > "To use this class, to search for the phrase "Microsoft app*" first use
> >> >> >> > add(Term) on the term "Microsoft", then find all terms that have "app" as
> >> >> >> > prefix using IndexReader.terms(Term), and use MultiPhraseQuery.add(Term[]
> >> >> >> > terms) to add them to the query"
> >> >> >> >
> >> >> >> >
> >> >> >> > How can I replicate the same in Lucene 4.3, since IndexReader.terms(Term) is
> >> >> >> > no longer available?
> >> >> >> >
> >> >> >> > --
> >> >> >> > Thanks and Regards
> >> >> >> > Vignesh Srinivasan
> >> >> >>
> >> >> >>
> >> >> >> ---------------------------------------------------------------------
> >> >> >> To unsubscribe, e-mail: [email protected]
> >> >> >> For additional commands, e-mail: [email protected]
> >> >> >>
> >> >> >>
> >> >> >
> >> >> >
> >> >> > --
> >> >> > Thanks and Regards
> >> >> > Vignesh Srinivasan
> >> >> > 9739135640
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Thanks and Regards
> >> >> Vignesh Srinivasan
> >> >> 9739135640
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >> > --
> >> > Thanks and Regards
> >> > Vignesh Srinivasan
> >> > 9739135640
> >>
> >>
> >>
> >
> >
> > --
> > Thanks and Regards
> > Vignesh Srinivasan
> > 9739135640
>
>
>


-- 
Thanks and Regards
Vignesh Srinivasan
9739135640
