
auto complete search - slow response

2019-07-22 Thread Satya Marivada

Hi All,

We have a SolrCloud cluster on Solr 6.3.0 that has been running stably. We use
an auto-complete search feature in which a search is issued after 3 keystrokes.
Recently we ran into an issue (very slow response from Solr) when one of our
users triggered an auto-complete search by repeatedly typing the initial
letters of the weekdays.

Something like "S M T W T F S S M T W T F S .."

Is there a recommended approach, from the Solr perspective, to prevent such
scenarios? We are thinking of rejecting any search that contains more than 4 or
5 single-letter terms, as that is not a use case for us.

Any recommendations would be welcome.

Thanks,
Satya

Solr suggest, auto complete & spellcheck

2016-01-04 Thread Steven White

Hi,

I'm trying to understand the differences between Solr suggest,
auto complete & spellcheck. Isn't each a function of the UI?  If not, can
you provide me with links that show an end-to-end example of setting up Solr
to get all 3 features?

I'm on Solr 5.2.

Thanks

Steve

Re: Solr suggest, auto complete & spellcheck

2016-01-04 Thread Erick Erickson

Here's a writeup on suggester:
https://lucidworks.com/blog/2015/03/04/solr-suggester/

The biggest difference is that spellcheck returns individual _terms_
whereas suggesters can return entire fields.

Neither is "a function of the UI" any more than searching is a
function of the UI. In both cases you have to do something
user-friendly with the return.

Best,
Erick

On Mon, Jan 4, 2016 at 2:06 PM, Steven White <swhite4...@gmail.com> wrote:
> Hi,
>
> I'm trying to understand what are the differences between Solr suggest,
> auto complete & spellcheck?  Isn't each a function of the UI?  If not, can
> you provide me with links that show end-to-end example setting up Solr to
> get all of the 3 features?
>
> I'm on Solr 5.2.
>
> Thanks
>
> Steve

Re: Solr Auto-Complete

2015-12-08 Thread Salman Ansari

Thanks Alexandre. I think it is clear.

On Sun, Dec 6, 2015 at 5:21 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> For suffix matches, you copy text the field and in the different type add
> string reversal for both index and query portions. So you are doing prefix
> matching algorithm but on reversed strings.
>
> I can dig up an example if it is not clear.

Re: Solr Auto-Complete

2015-12-06 Thread Alexandre Rafalovitch

For suffix matches, you copy the text to another field and, in that field's
(different) type, add string reversal to both the index and query analyzers.
So you are running the same prefix-matching algorithm, but on reversed strings.

I can dig up an example if it is not clear.
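
The reversal idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not the example offered above: the type name `text_suggestion_suffix`, the field names `name` and `name_suffix`, the choice of KeywordTokenizerFactory, and the gram sizes are all hypothetical.

```xml
<!-- Hypothetical suffix-matching type: reverse the value, then index its prefixes. -->
<fieldType name="text_suggestion_suffix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Keep the whole value as a single token. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- "barbuda" becomes "adubrab". -->
    <filter class="solr.ReverseStringFilterFactory"/>
    <!-- Leading substrings of the reversed token: "a", "ad", "adu", ... -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- The query is reversed too, but not n-grammed: "uda" becomes "adu". -->
    <filter class="solr.ReverseStringFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field names: copy the original text into the reversed field. -->
<field name="name_suffix" type="text_suggestion_suffix" indexed="true" stored="false"/>
<copyField source="name" dest="name_suffix"/>
```

With a setup like this, a query for "uda" is reversed to "adu", which is one of the indexed grams of the reversed "barbuda", so the suffix matches through ordinary prefix logic.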
On 6 Dec 2015 8:06 am, "Salman Ansari" <salman.rah...@gmail.com> wrote:

> That is right. I am actually looking for phrase prefixes not each term
> prefix within the phrase. That satisfies my requirements. However, my
> additional question was how do I manipulate the filedType to later allow
> for suffix matches as well? or will that be a completely different
> fieldType definition?
>
> Regards,
> Salman
>
>

Re: Solr Auto-Complete

2015-12-06 Thread Andrea Gazzarini

Do you mean "phrase" or "term" prefixes? If you put a field value (two or
more terms) into the analysis page, you will see what the index analyzer
chain (of my example field type) is doing. The whole value is handled as a
single n-grammed token, so you will get only a phrase-prefix search, as in
your request.

If you also want to handle term prefixes, I would index another field as well
(similar to the example you posted); the search handler with (e)dismax would
then have something like this:

   
text_suggestion_phrase_prefix_search^b1
text_suggestion_terms_prefix_search^b2

b1 and b2 values strictly depend on your search logic.
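
For illustration, the two boosted fields would typically sit in the `qf` parameter of an edismax handler. The following is a sketch under assumptions: the handler name `/autocomplete` is hypothetical, `qf` is assumed to be the parameter meant here, and the boosts 10 and 1 are placeholders standing in for b1 and b2.

```xml
<!-- Hypothetical handler name; the boost values are placeholders for b1 and b2. -->
<requestHandler name="/autocomplete" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- Rank whole-phrase prefix matches above per-term prefix matches. -->
    <str name="qf">text_suggestion_phrase_prefix_search^10 text_suggestion_terms_prefix_search^1</str>
  </lst>
</requestHandler>
```

Giving the phrase-prefix field the larger boost makes documents whose whole value starts with the typed text rank first, while per-term matches still appear lower in the list.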

Is that close that what you were looking for?

Best,
Andrea



2015-12-06 11:53 GMT+01:00 Salman Ansari <salman.rah...@gmail.com>:

> Thanks a lot Andrea. It did work.
>
> However, just for my understanding, can you please explain more how did you
> make it work for prefixes. I know you mentioned using another Tokenizer but
> for example, if I want to tweak it later on to work on suffixes or within
> phrases how should I go about that?
>
> Thanks again for your help.
>
> Regards,
> Salman
>
>
Re: Solr Auto-Complete

2015-12-06 Thread Salman Ansari

Hi,



I have updated my schema.xml as mentioned in the previous posts using

















This does the auto-complete, but it does it at every portion of the text
(not just at the beginning) (prefix). So searching for "And" in my field
for locations returns both of the following documents.





1
AD
*And*orra
أندورا
1519794717684924416

5
AG
Antigua *and* Barbuda
أنتيجوا وبربودا
1519794717701701633





I have read about this and at first I thought I need to add side="front"
but after adding that, Solr returned an error (when creating a collection)
indicating "Unknown parameters

Re: Solr Auto-Complete

2015-12-06 Thread Andrea Gazzarini

Hi Salman,
that's because you're using a StandardTokenizer. Try with something like
this (copied, pasted and changed using my phone so probably with a lot of
mistakes ;) but you should be able to get what I mean). BTW I don't know if
that's the case but I would also put a MappingCharFilterFactory
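
The markup of the field type did not survive in this archive, so the following is a reconstruction sketch rather than the original text. The attribute values that are still visible in the quoted copies (positionIncrementGap, the FoldToASCII mapping, the word-delimiter flags, maxGramSize) are kept; the type name `text_suggestion_phrase_prefix`, the KeywordTokenizerFactory, the lowercase filter, and minGramSize="1" are assumptions. A StandardTokenizer splits "Antigua and Barbuda" into three words and n-grams each one, which is why "And" matched in the middle of the value; a tokenizer that keeps the whole value as one token avoids that.

```xml
<!-- Hypothetical type name; adapt to your schema. -->
<fieldType name="text_suggestion_phrase_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Fold accented characters to ASCII before tokenizing. -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
    <!-- Assumption: keep the entire field value as a single token. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drop delimiters and concatenate all parts into one token. -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateAll="1" splitOnCaseChange="0"/>
    <!-- Index the leading substrings of that single token only. -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Same normalization as at index time, but no n-gramming of the query. -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateAll="1" splitOnCaseChange="0"/>
  </analyzer>
</fieldType>
```

Under these assumptions, "Antigua and Barbuda" is indexed as prefixes of the single token "antiguaandbarbuda", so a query for "And" matches "Andorra" but no longer matches the "and" inside the longer name.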










2015-12-06 9:36 GMT+01:00 Salman Ansari <salman.rah...@gmail.com>:

> Hi,
>
>
>
> I have updated my schema.xml as mentioned in the previous posts using
>
>
>
>  positionIncrementGap="100">
> 
> 
> 
>  maxGramSize="20"/>
> 
>     
>     
> 
> 
> 
>
>
>
> This does the auto-complete, but it does it at every portion of the text
> (not just at the beginning) (prefix). So searching for "And" in my field
> for locations returns both of the following documents.
>
>
>
> 
>
> 1
>
> AD
>
> *And*orra
>
> أندورا
>
> 1519794717684924416
>
> 
>
> 
>
> 5
>
> AG
>
> Antigua *and* Barbuda
>
> أنتيجوا وبربودا
>
> 1519794717701701633
>
> 
>
>
>
> I have read about this and at first I thought I need to add side="front"
> but after adding that, Solr returned an error (when creating a collection)
> indicating "Unknown parameters

Re: Solr Auto-Complete

2015-12-06 Thread Salman Ansari

Thanks a lot Andrea. It did work.

However, just for my understanding, could you explain in more detail how you
made it work for prefixes? I know you mentioned using another tokenizer, but
if, for example, I later want to tweak it to work on suffixes or within
phrases, how should I go about that?

Thanks again for your help.

Regards,
Salman


On Sun, Dec 6, 2015 at 1:24 PM, Andrea Gazzarini <a.gazzar...@gmail.com>
wrote:

> Hi Salman,
> that's because you're using a StandardTokenizer. Try with something like
> this (copied, pasted and changed using my phone so probably with a lot of
> mistakes ;) but you should be able to get what I mean). BTW I don't know if
> that's the case but I would also put a MappingCharFilterFactory
>
>  positionIncrementGap="100">
> 
> * mapping="mapping-FoldToASCII.txt"/>*
> 
> 
>  generateWordParts="0" generateNumberParts="0" catenateAll="1"
> splitOnCaseChange="0" />
>  maxGramSize="20"/>
> 
> 
> * mapping="mapping-FoldToASCII.txt"/>*
> 
> 
>  generateWordParts="0" generateNumberParts="0" catenateAll="1"
> splitOnCaseChange="0" />
> 
> 
>
>

Re: Solr Auto-Complete

2015-12-06 Thread Andrea Gazzarini

Sorry, my damned mobile: "Is that close to what you were looking for?"

2015-12-06 12:07 GMT+01:00 Andrea Gazzarini <a.gazzar...@gmail.com>:

> Do you mean "phrase" or "term" prefixes? If you try to put a field value
> (two or more terms) in the analysis page you will see what the index
> analyzer chain (of my example field type) is doing. The whole value is
> managed as a single-ngrammed token, so you will get only a phrase prefix
> search, as in your request.
>
> If you want to manage also terms prefixes, I would also index another
> field (similar to the example you posted); then, the search handler with
> e(dismax) would have something like this:
>
>
>>
> text_suggestion_phrase_prefix_search^b1
> text_suggestion_terms_prefix_search^b2
>
> 
>
>
> b1 and b2 values strictly depend on your search logic.
>
> Is that close that what you were looking for?
>
> Best,
> Andrea
>
>
>

Re: Solr Auto-Complete

2015-12-06 Thread Salman Ansari

That is right. I am actually looking for phrase prefixes, not for each term
prefix within the phrase, so that satisfies my requirements. However, my
additional question was: how do I change the fieldType to later allow
suffix matches as well? Or would that be a completely different
fieldType definition?

Regards,
Salman


On Sun, Dec 6, 2015 at 2:12 PM, Andrea Gazzarini <a.gazzar...@gmail.com>
wrote:

> Sorry, my damned mobile: "Is that close to what you were looking for?"
>

Re: Solr Auto-Complete

2015-12-04 Thread Alexandre Rafalovitch

You can see an example of similar use at:
http://www.solr-start.com/javadoc/solr-lucene/index.html (search box).

The corresponding schema is here:
https://github.com/arafalov/Solr-Javadoc/blob/master/JavadocIndex/JavadocCollection/conf/schema.xml#L24
It has some extra special-case handling to allow searching by fragments,
but the general use case is the same.

Regards,
   Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 4 December 2015 at 10:11, Salman Ansari <salman.rah...@gmail.com> wrote:
> Thanks Alan, Alessandaro and Andrea for your great explanations. I will
> follow the path of adding edge ngrams to the field type for my use case.
>
> Regards,
> Salman
>

Re: Solr Auto-Complete

2015-12-04 Thread Salman Ansari

Thanks Alan, Alessandro and Andrea for your great explanations. I will
follow the path of adding edge ngrams to the field type for my use case.

Regards,
Salman

On Thu, Dec 3, 2015 at 12:23 PM, Alessandro Benedetti <abenede...@apache.org
> wrote:

> "Sounds good but I heard "/suggest" component is the recommended way of
> doing auto-complete"
>
> This sounds fantastic :)
> We "heard" that as well, we know what the suggest component does.
> The point is that you would like to retrieve the suggestions + some
> consistent payload in different fields.
> Current suggest component offers some effort in providing a payload, but
> almost all the suggester implementation are based on an FST approach which
> aim to be as fast and memory efficient as possible.
> Honestly you could experiment and even contribute a customisation if you
> want to add a new feature to the suggest component able to return complex
> payloads together with the suggestions.
> Apart that, it strictly depends of how you want to provide the
> autocompletion, there are plenty of different lookups implementation and
> plenty of tokenizer/token filters to combine .
> So I would confirm what we already said and that Andrea confirmed.
>
> If anyone has played with the suggester suggestions payload, his feedback
> is welcome!
>
> Cheers
>
>

Re: Solr Auto-Complete

2015-12-03 Thread Alessandro Benedetti

"Sounds good but I heard "/suggest" component is the recommended way of
doing auto-complete"

This sounds fantastic :)
We "heard" that as well, and we know what the suggest component does.
The point is that you would like to retrieve the suggestions plus some
consistent payload from different fields.
The current suggest component makes some effort to provide a payload, but
almost all the suggester implementations are based on an FST approach, which
aims to be as fast and memory-efficient as possible.
Honestly, you could experiment and even contribute a customisation if you
want to add a new feature to the suggest component that returns complex
payloads together with the suggestions.
Apart from that, it strictly depends on how you want to provide the
autocompletion; there are plenty of different lookup implementations and
plenty of tokenizers/token filters to combine.
So I would confirm what we already said and that Andrea confirmed.
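
As a rough sketch of the approach being confirmed here (a normal search against a field indexed with edge n-grams), a field type along these lines is commonly used. The type name `text_suggest` and the specific tokenizer and lowercase filter are assumptions, since the original markup was stripped from this archive; only positionIncrementGap and the gram sizes are visible in the quoted copies.

```xml
<!-- Hypothetical type name; the tokenizer and lowercase filter are assumed. -->
<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Index the leading substrings of every word, 1 to 20 characters long. -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <!-- The typed text is only tokenized and lowercased, never n-grammed. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because this is an ordinary search, each match is a full document, so any stored fields requested through `fl` come back alongside the suggestion text, which is the payload the suggest component does not readily provide.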

If anyone has played with the suggester suggestions payload, his feedback
is welcome!

Cheers


On 3 December 2015 at 06:21, Andrea Gazzarini <a.gazzar...@gmail.com> wrote:

> Hi Salman,
> few months ago I have been involved in a project similar to
> map.geoadmin.ch
> and there, I had your same need (I also sent an email to this list).
>
> From my side I can furtherly confirm what Alan and Alessandro already
> explained, I followed that approach.
>
> IMHO, that is the "recommended way" if the component's features meet your
> needs (i.e. do not reinvent the wheel) but it seems you're out of those
> bounds.
>
> Best,
> Andrea
> On 2 Dec 2015 21:51, "Salman Ansari" <salman.rah...@gmail.com> wrote:
>
> > Sounds good but I heard "/suggest" component is the recommended way of
> > doing auto-complete in the new versions of Solr. Something along the
> lines
> > of this article
> > https://cwiki.apache.org/confluence/display/solr/Suggester
> >
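> > <searchComponent name="suggest" class="solr.SuggestComponent">
> >   <lst name="suggester">
> >     <str name="name">mySuggester</str>
> >     <str name="lookupImpl">FuzzyLookupFactory</str>
> >     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> >     <str name="field">cat</str>
> >     <str name="weightField">price</str>
> >     <str name="suggestAnalyzerFieldType">string</str>
> >     <str name="buildOnStartup">false</str>
> >   </lst>
> > </searchComponent>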
> >
> > Can someone confirm this?
> >
> > Regards,
> > Salman
> >
> >
> > On Wed, Dec 2, 2015 at 1:14 PM, Alessandro Benedetti <
> > abenede...@apache.org>
> > wrote:
> >
> > > Hi Salman,
> > > I agree with Alan.
> > > Just configure your schema with the proper analysers .
> > > For the field you want to use for suggestions you are likely to need
> > simply
> > > this fieldType :
> > >
> > >  > > positionIncrementGap="100">
> > > 
> > > 
> > > 
> > >  > > maxGramSize="20"/>
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > >
> > > This is a very sample example, please adapt it to your use case.
> > >
> > > Cheers
> > >
> > > On 2 December 2015 at 09:41, Alan Woodward <a...@flax.co.uk> wrote:
> > >
> > > > Hi Salman,
> > > >
> > > > It sounds as though you want to do a normal search against a special
> > > > 'suggest' field, that's been indexed with edge ngrams.
> > > >
> > > > Alan Woodward
> > > > www.flax.co.uk
> > > >
> > > >
> > > > On 2 Dec 2015, at 09:31, Salman Ansari wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I am looking for auto-complete in Solr but on top of just auto
> > > complete I
> > > > > want as well to return the data completely (not just suggestions),
> > so I
> > > > > want to get back the ids, and other fields in the whole document. I
> > > tried
> > > > > the following 2 approaches but each had issues
> > > > >
> > > > > 1) Used the /suggest component but that returns a very specific
> > format
> > > > > which looks like I cannot customize. I want to return the whole
> > > document
> > > > > that has a matching field and not only the suggestion list. So for
> > > > example,
> > > > > if I write "hard" it returns the results in a specific format as
> > > follows
> > > > >
> > > > >   hard drive
> > > > > hard disk
> > > > >
> > > > > Is there a way to get back additional fields with suggestions?
> > > > >
> > > > > 2) Tried the normal /select component but that does not do
> > > auto-complete
> > > > on
> > > > > portion of the word. So, for example, if I write the query as
> "bara"
> > it
> > > > > DOES NOT return "barack obama". Any suggestions how to solve this?
> > > > >
> > > > >
> > > > > Regards,
> > > > > Salman
> > > >
> > > >
> > >
> > >
> > > --
> > > --
> > >
> > > Benedetti Alessandro
> > > Visiting card : http://about.me/alessandro_benedetti
> > >
> > > "Tyger, tyger burning bright
> > > In the forests of the night,
> > > What immortal hand or eye
> > > Could frame thy fearful symmetry?"
> > >
> > > William Blake - Songs of Experience -1794 England
> > >
> >
>



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: Solr Auto-Complete

2015-12-02 Thread Salman Ansari

Sounds good but I heard "/suggest" component is the recommended way of
doing auto-complete in the new versions of Solr. Something along the lines
of this article
https://cwiki.apache.org/confluence/display/solr/Suggester


  
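<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">cat</str>
    <str name="weightField">price</str>
    <str name="suggestAnalyzerFieldType">string</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>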
  


Can someone confirm this?

Regards,
Salman


On Wed, Dec 2, 2015 at 1:14 PM, Alessandro Benedetti <abenede...@apache.org>
wrote:

> Hi Salman,
> I agree with Alan.
> Just configure your schema with the proper analysers .
> For the field you want to use for suggestions you are likely to need simply
> this fieldType :
>
>  positionIncrementGap="100">
> 
> 
> 
>  maxGramSize="20"/>
> 
> 
> 
> 
> 
> 
>
> This is a very sample example, please adapt it to your use case.
>
> Cheers
>
> On 2 December 2015 at 09:41, Alan Woodward <a...@flax.co.uk> wrote:
>
> > Hi Salman,
> >
> > It sounds as though you want to do a normal search against a special
> > 'suggest' field, that's been indexed with edge ngrams.
> >
> > Alan Woodward
> > www.flax.co.uk
> >
> >
> > On 2 Dec 2015, at 09:31, Salman Ansari wrote:
> >
> > > Hi,
> > >
> > > I am looking for auto-complete in Solr but on top of just auto
> complete I
> > > want as well to return the data completely (not just suggestions), so I
> > > want to get back the ids, and other fields in the whole document. I
> tried
> > > the following 2 approaches but each had issues
> > >
> > > 1) Used the /suggest component but that returns a very specific format
> > > which looks like I cannot customize. I want to return the whole
> document
> > > that has a matching field and not only the suggestion list. So for
> > example,
> > > if I write "hard" it returns the results in a specific format as
> follows
> > >
> > >   hard drive
> > > hard disk
> > >
> > > Is there a way to get back additional fields with suggestions?
> > >
> > > 2) Tried the normal /select component but that does not do
> auto-complete
> > on
> > > portion of the word. So, for example, if I write the query as "bara" it
> > > DOES NOT return "barack obama". Any suggestions how to solve this?
> > >
> > >
> > > Regards,
> > > Salman
> >
> >
>
>
> --
> --
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>

Re: Solr Auto-Complete

2015-12-02 Thread Andrea Gazzarini

Hi Salman,
a few months ago I was involved in a project similar to map.geoadmin.ch
and there I had the same need as you (I also sent an email to this list).

From my side I can further confirm what Alan and Alessandro already
explained; I followed that approach.

IMHO, /suggest is the "recommended way" if the component's features meet your
needs (i.e. do not reinvent the wheel), but it seems you're outside those
bounds.

Best,
Andrea
On 2 Dec 2015 21:51, "Salman Ansari" <salman.rah...@gmail.com> wrote:

> Sounds good but I heard "/suggest" component is the recommended way of
> doing auto-complete in the new versions of Solr. Something along the lines
> of this article
> https://cwiki.apache.org/confluence/display/solr/Suggester
>
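> <searchComponent name="suggest" class="solr.SuggestComponent">
>   <lst name="suggester">
>     <str name="name">mySuggester</str>
>     <str name="lookupImpl">FuzzyLookupFactory</str>
>     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>     <str name="field">cat</str>
>     <str name="weightField">price</str>
>     <str name="suggestAnalyzerFieldType">string</str>
>     <str name="buildOnStartup">false</str>
>   </lst>
> </searchComponent>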
>
> Can someone confirm this?
>
> Regards,
> Salman
>
>
> On Wed, Dec 2, 2015 at 1:14 PM, Alessandro Benedetti <
> abenede...@apache.org>
> wrote:
>
> > Hi Salman,
> > I agree with Alan.
> > Just configure your schema with the proper analysers .
> > For the field you want to use for suggestions you are likely to need
> simply
> > this fieldType :
> >
> >  > positionIncrementGap="100">
> > 
> > 
> > 
> >  > maxGramSize="20"/>
> > 
> > 
> > 
> > 
> > 
> > 
> >
> > This is a very sample example, please adapt it to your use case.
> >
> > Cheers
> >
> > On 2 December 2015 at 09:41, Alan Woodward <a...@flax.co.uk> wrote:
> >
> > > Hi Salman,
> > >
> > > It sounds as though you want to do a normal search against a special
> > > 'suggest' field, that's been indexed with edge ngrams.
> > >
> > > Alan Woodward
> > > www.flax.co.uk
> > >
> > >
> > > On 2 Dec 2015, at 09:31, Salman Ansari wrote:
> > >
> > > > Hi,
> > > >
> > > > I am looking for auto-complete in Solr but on top of just auto
> > complete I
> > > > want as well to return the data completely (not just suggestions),
> so I
> > > > want to get back the ids, and other fields in the whole document. I
> > tried
> > > > the following 2 approaches but each had issues
> > > >
> > > > 1) Used the /suggest component but that returns a very specific
> format
> > > > which looks like I cannot customize. I want to return the whole
> > document
> > > > that has a matching field and not only the suggestion list. So for
> > > example,
> > > > if I write "hard" it returns the results in a specific format as
> > follows
> > > >
> > > >   hard drive
> > > > hard disk
> > > >
> > > > Is there a way to get back additional fields with suggestions?
> > > >
> > > > 2) Tried the normal /select component but that does not do
> > auto-complete
> > > on
> > > > portion of the word. So, for example, if I write the query as "bara"
> it
> > > > DOES NOT return "barack obama". Any suggestions how to solve this?
> > > >
> > > >
> > > > Regards,
> > > > Salman
> > >
> > >
> >
> >
> > --
> > --
> >
> > Benedetti Alessandro
> > Visiting card : http://about.me/alessandro_benedetti
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England
> >
>

Solr Auto-Complete

2015-12-02 Thread Salman Ansari

Hi,

I am looking for auto-complete in Solr, but on top of plain auto-complete I
also want to return the complete data (not just suggestions), so I
want to get back the ids and other fields of the whole document. I tried
the following 2 approaches, but each had issues:

1) Used the /suggest component, but that returns a very specific format
which it looks like I cannot customize. I want to return the whole document
that has a matching field, and not only the suggestion list. So, for example,
if I write "hard" it returns the results in a specific format as follows:

  hard drive
  hard disk

Is there a way to get back additional fields with suggestions?

2) Tried the normal /select component, but that does not do auto-complete on
a portion of the word. So, for example, if I write the query as "bara" it
DOES NOT return "barack obama". Any suggestions on how to solve this?


Regards,
Salman

Re: Solr Auto-Complete

2015-12-02 Thread Alan Woodward

Hi Salman,

It sounds as though you want to do a normal search against a special 'suggest' 
field, that's been indexed with edge ngrams.
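
To make the edge n-gram idea concrete, here is a minimal sketch in plain Python (not Solr code). It only imitates what an edge n-gram filter does at index time; the function names, the sample documents and the "name"/"role" fields are assumptions made up for illustration.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading-edge n-grams of one token, shortest first."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def build_index(docs, field):
    """Map every edge n-gram of every lowercased word to the ids of matching docs."""
    index = {}
    for doc in docs:
        for word in doc[field].lower().split():
            for gram in edge_ngrams(word):
                index.setdefault(gram, set()).add(doc["id"])
    return index


def search(index, docs, prefix):
    """Look the typed prefix up as a plain term and return whole documents."""
    ids = index.get(prefix.lower(), set())
    return [doc for doc in docs if doc["id"] in ids]


# Hypothetical documents; only the "name" field is indexed with edge n-grams.
docs = [
    {"id": 1, "name": "Barack Obama", "role": "politician"},
    {"id": 2, "name": "hard drive", "role": "hardware"},
    {"id": 3, "name": "hard disk", "role": "hardware"},
]
index = build_index(docs, "name")
print(search(index, docs, "bara"))
```

Because the prefixes are already in the index, the query side is an ordinary term lookup, and the hit is the full document with all of its fields rather than a bare suggestion string.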

Alan Woodward
www.flax.co.uk


On 2 Dec 2015, at 09:31, Salman Ansari wrote:

> Hi,
> 
> I am looking for auto-complete in Solr but on top of just auto complete I
> want as well to return the data completely (not just suggestions), so I
> want to get back the ids, and other fields in the whole document. I tried
> the following 2 approaches but each had issues
> 
> 1) Used the /suggest component but that returns a very specific format
> which looks like I cannot customize. I want to return the whole document
> that has a matching field and not only the suggestion list. So for example,
> if I write "hard" it returns the results in a specific format as follows
> 
>   hard drive
> hard disk
> 
> Is there a way to get back additional fields with suggestions?
> 
> 2) Tried the normal /select component but that does not do auto-complete on
> portion of the word. So, for example, if I write the query as "bara" it
> DOES NOT return "barack obama". Any suggestions how to solve this?
> 
> 
> Regards,
> Salman

Re: Solr Auto-Complete

2015-12-02 Thread Alessandro Benedetti

Hi Salman,
I agree with Alan.
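Just configure your schema with the proper analysers.
For the field you want to use for suggestions you are likely to need simply
this fieldType:

[fieldType XML stripped by the mail archive; surviving attributes:
positionIncrementGap="100", minGramSize="1", maxGramSize="20"]

This is a very simple example, please adapt it to your use case.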

Cheers

On 2 December 2015 at 09:41, Alan Woodward <a...@flax.co.uk> wrote:

> Hi Salman,
>
> It sounds as though you want to do a normal search against a special
> 'suggest' field, that's been indexed with edge ngrams.
>
> Alan Woodward
> www.flax.co.uk
>
>
> On 2 Dec 2015, at 09:31, Salman Ansari wrote:
>
> > Hi,
> >
> > I am looking for auto-complete in Solr but on top of just auto complete I
> > want as well to return the data completely (not just suggestions), so I
> > want to get back the ids, and other fields in the whole document. I tried
> > the following 2 approaches but each had issues
> >
> > 1) Used the /suggest component but that returns a very specific format
> > which looks like I cannot customize. I want to return the whole document
> > that has a matching field and not only the suggestion list. So for
> example,
> > if I write "hard" it returns the results in a specific format as follows
> >
> >   hard drive
> > hard disk
> >
> > Is there a way to get back additional fields with suggestions?
> >
> > 2) Tried the normal /select component but that does not do auto-complete
> on
> > portion of the word. So, for example, if I write the query as "bara" it
> > DOES NOT return "barack obama". Any suggestions how to solve this?
> >
> >
> > Regards,
> > Salman
>
>


-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: How to implement Auto complete, suggestion client side

2015-01-28 Thread Olivier Austina

Hi,

Thank you Dan Davis and Alexandre Rafalovitch. This is very helpful for me.

Regards
Olivier


2015-01-27 0:51 GMT+01:00 Alexandre Rafalovitch arafa...@gmail.com:

 You've got a lot of options depending on what you want. But since you
 seem to just want _an_ example, you can use mine from
 http://www.solr-start.com/javadoc/solr-lucene/index.html (gray search
 box there).

 You can see the source for the test screen (using Spring Boot and
 Spring Data Solr as a middle-layer) and Select2 for the UI at:
 https://github.com/arafalov/Solr-Javadoc/tree/master/SearchServer.
 The Solr definition is at:

 https://github.com/arafalov/Solr-Javadoc/tree/master/JavadocIndex/JavadocCollection/conf

 Other implementation pieces are in that (and another) public
 repository as well, but it's all in Java. You'll probably want to do
 something similar in PHP.

 Regards,
Alex.
 
 Sign up for my Solr resources newsletter at http://www.solr-start.com/


 On 26 January 2015 at 17:11, Olivier Austina olivier.aust...@gmail.com
 wrote:
  Hi All,
 
  I would say I am new to web technology.
 
  I would like to implement auto complete/suggestion in the user search box
  as the user type in the search box (like Google for example). I am using
  Solr as database. Basically I am  familiar with Solr and I can formulate
  suggestion queries.
 
  But now I don't know how to implement suggestion in the User Interface.
  Which technologies should I need. The website is in PHP. Any suggestions,
  examples, basic tutorial is welcome. Thank you.
 
 
 
  Regards
  Olivier

How to implement Auto complete, suggestion client side

2015-01-26 Thread Olivier Austina

Hi All,

I would say I am new to web technology.

I would like to implement auto complete/suggestion in the user search box
as the user types (like Google, for example). I am using
Solr as the database. I am basically familiar with Solr and I can formulate
suggestion queries.

But I don't know how to implement suggestions in the user interface.
Which technologies do I need? The website is in PHP. Any suggestions,
examples, or basic tutorials are welcome. Thank you.



Regards
Olivier

Re: How to implement Auto complete, suggestion client side

2015-01-26 Thread Alexandre Rafalovitch

You've got a lot of options depending on what you want. But since you
seem to just want _an_ example, you can use mine from
http://www.solr-start.com/javadoc/solr-lucene/index.html (gray search
box there).

You can see the source for the test screen (using Spring Boot and
Spring Data Solr as a middle-layer) and Select2 for the UI at:
https://github.com/arafalov/Solr-Javadoc/tree/master/SearchServer.
The Solr definition is at:
https://github.com/arafalov/Solr-Javadoc/tree/master/JavadocIndex/JavadocCollection/conf

Other implementation pieces are in that (and another) public
repository as well, but it's all in Java. You'll probably want to do
something similar in PHP.

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 26 January 2015 at 17:11, Olivier Austina olivier.aust...@gmail.com wrote:
 Hi All,

 I would say I am new to web technology.

 I would like to implement auto complete/suggestion in the user search box
 as the user type in the search box (like Google for example). I am using
 Solr as database. Basically I am  familiar with Solr and I can formulate
 suggestion queries.

 But now I don't know how to implement suggestion in the User Interface.
 Which technologies should I need. The website is in PHP. Any suggestions,
 examples, basic tutorial is welcome. Thank you.



 Regards
 Olivier

Re: How to implement Auto complete, suggestion client side

2015-01-26 Thread Dan Davis

Cannot get any easier than jquery-ui's autocomplete widget -
http://jqueryui.com/autocomplete/

Basically, you set some classes and implement JavaScript that calls the
server to get the autocomplete data. I would never expose Solr to
browsers, so I would have the AJAX call go to a PHP script (or
function/method if you are using a web framework such as CakePHP or
Symfony).

Then, on the server, you make a request to Solr /suggest or /spell with
wt=json, and reformulate the result into a simple JSON response that is
just an array of options.

You can do this in stages:

   - Constant suggestions - you change your HTML and implement JavaScript
   that shows constant suggestions after, for instance, 2 seconds.
   - Constant suggestions from the server - you change your JavaScript to
   call the server, and have the server return a constant list.
   - Dynamic suggestions from the server - you implement the server side to
   query Solr and turn the return from /suggest or /spell into a JSON array.
   - Tuning, tuning, tuning - you work hard on tuning it so that you get
   high quality suggestions for a wide variety of inputs.

Note that the autocomplete I've described is basically the simplest
thing possible, since you say you are new to this. It is not based on data
mining of query and click-through logs, which is a very common pattern
these days. There is no bolding of the portion of the words that are new.
It is just a basic autocomplete widget with a delay.
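
The server-side step could look roughly like the sketch below. It is written in Python only to keep it short; the same shape would apply to a PHP script. The core URL, the "mySuggester" dictionary name and the canned response are assumptions, and the response layout is what I would expect from the suggest component with wt=json, so check it against what your own Solr returns.

```python
import json
from urllib.parse import urlencode


def suggest_url(base_url, dictionary, typed):
    """Build the request URL the server-side script would send to Solr."""
    params = {
        "suggest": "true",
        "suggest.dictionary": dictionary,
        "suggest.q": typed,
        "wt": "json",
    }
    return base_url + "/suggest?" + urlencode(params)


def to_options(solr_response, dictionary, typed):
    """Flatten a parsed suggester response into a plain list of terms."""
    by_query = solr_response.get("suggest", {}).get(dictionary, {})
    entries = by_query.get(typed, {}).get("suggestions", [])
    return [entry["term"] for entry in entries]


# Canned response standing in for what Solr might return for "hard" (assumed shape).
sample = json.loads("""
{"responseHeader": {"status": 0},
 "suggest": {"mySuggester": {"hard": {"numFound": 2, "suggestions": [
   {"term": "hard drive", "weight": 0, "payload": ""},
   {"term": "hard disk", "weight": 0, "payload": ""}]}}}}
""")

options = to_options(sample, "mySuggester", "hard")
print(suggest_url("http://localhost:8983/solr/mycore", "mySuggester", "hard"))
print(json.dumps(options))
```

The browser then only ever sees the flat JSON array, which is the format a widget such as jQuery UI autocomplete accepts as its source, and Solr itself stays behind the server.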

On Mon, Jan 26, 2015 at 5:11 PM, Olivier Austina olivier.aust...@gmail.com
wrote:

 Hi All,

 I would say I am new to web technology.

 I would like to implement auto complete/suggestion in the user search box
 as the user type in the search box (like Google for example). I am using
 Solr as database. Basically I am  familiar with Solr and I can formulate
 suggestion queries.

 But now I don't know how to implement suggestion in the User Interface.
 Which technologies should I need. The website is in PHP. Any suggestions,
 examples, basic tutorial is welcome. Thank you.



 Regards
 Olivier

Re: Auto Complete

2014-08-06 Thread benjelloun

Hello, thanks for the tutorial. I tested the whole schema, but it is not what
I need. What I need is autocomplete with autocorrection, as I said before:
q=gene --> autocomplete: genève (with accent)


2014-08-05 18:03 GMT+02:00 Michael Della Bitta-2 [via Lucene] 
ml-node+s472066n4151261...@n3.nabble.com:

 In this case, I recommend using the approach that this tutorial uses:

 http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/

 Basically the idea is you index the data a few different ways and then use
 edismax to query them all with different boosts. You'd use the stored
 version of you field for display, so your accented characters would not
 get
 stripped.

 Michael Della Bitta

 Applications Developer

 o: +1 646 532 3062

 appinions inc.

 “The Science of Influence Marketing”

 18 East 41st Street

 New York, NY 10017

 t: @appinions https://twitter.com/Appinions | g+:
 plus.google.com/appinions
 
 https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts

 w: appinions.com http://www.appinions.com/


 On Tue, Aug 5, 2014 at 9:32 AM, benjelloun [hidden email]
 http://user/SendEmail.jtp?type=nodenode=4151261i=0 wrote:

  yeah thats true i creat this index just for auto complete
  here is my schema:
 
  <dynamicField name="*_en" type="text_en" indexed="true" stored="false"
  required="false" multiValued="true"/>
  <dynamicField name="*_fr" type="text_fr" indexed="true" stored="false"
  required="false" multiValued="true"/>
  <dynamicField name="*_ar" type="text_ar" indexed="true" stored="false"
  required="false" multiValued="true"/>
 
  <copyField source="*_en" dest="suggestField"/>
  <copyField source="*_fr" dest="suggestField"/>
  <copyField source="*_ar" dest="suggestField"/>
 
  the i use suggestField for autocomplet like i mentioned above
  do you have any other configuration which can do what i need ?
 
 
 
  2014-08-05 15:19 GMT+02:00 Michael Della Bitta-2 [via Lucene] 
  [hidden email] http://user/SendEmail.jtp?type=nodenode=4151261i=1:
 
   Unless I'm mistaken, it seems like you've created this index
 specifically
   for autocomplete? Or is this index used for general search also?
  
   The easy way to understand this question: Is there one entry in your
  index
   for each term you want to autocomplete? Or are there multiple entries
  that
   might contain the same term?
  
   Michael Della Bitta
  
  
  
   On Tue, Aug 5, 2014 at 9:10 AM, benjelloun [hidden email]
   http://user/SendEmail.jtp?type=nodenode=4151216i=0 wrote:
  
hello,
   
did you find any solution to this problem ?
   
regards
   
   
2014-08-04 16:16 GMT+02:00 Michael Della Bitta-2 [via Lucene] 
[hidden email] http://user/SendEmail.jtp?type=nodenode=4151216i=1

  :
   
 How are you implementing autosuggest? I'm assuming you're querying
 an
 indexed field and getting a stored value back. But there are a
 wide
 variety
 of ways of doing it.

 Michael Della Bitta



 On Mon, Aug 4, 2014 at 10:10 AM, benjelloun [hidden email]
 http://user/SendEmail.jtp?type=nodenode=4150990i=0 wrote:

  hello you didnt enderstand well my probleme,
 
  i give exemple: i have document contain genève with accent
  when i do q=gene -- autoSuggest geneve because of
  ASCIIFoldingFilterFactory preserveOriginal=true
  when i do q=genè -- autoSuggest genève
  but what i need to is:
  q=gene without accent and get this result: genève with
 accent
 
 
 
  --
  View this message in context:
 
   
  http://lucene.472066.n3.nabble.com/Auto-Complete-tp4150987p4150989.html

  Sent from the Solr - User mailing list archive at Nabble.com.
 



Re: Auto Complete

2014-08-06 Thread Michael Della Bitta

You'd still need to modify that schema to use the ASCII folding filter.
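
As a rough illustration of why folding helps, here is a small Python sketch (not Solr code, and only an approximation of what an ASCII folding filter does). The idea is to fold accents on both the indexed prefixes and the typed input, but to hand back the original stored value. The function names and sample values are made up for the example.

```python
import unicodedata


def fold(text):
    """Lowercase and strip combining accents, roughly like ASCII folding."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_suggest_index(values):
    """Index each folded prefix, but remember the original (stored) value."""
    index = {}
    for value in values:
        folded = fold(value)
        for end in range(1, len(folded) + 1):
            bucket = index.setdefault(folded[:end], [])
            if value not in bucket:
                bucket.append(value)
    return index


def suggest(index, typed):
    """Fold the user's input the same way, then return stored values."""
    return index.get(fold(typed), [])


# Hypothetical field values; "genève" keeps its accent in the stored form.
index = build_suggest_index(["genève", "general", "Zürich"])
print(suggest(index, "gene"))
print(suggest(index, "genè"))
```

With this arrangement q=gene and q=genè both land on the same folded prefix, and the value shown to the user is the stored "genève" with its accent intact.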

Alternatively, if you want something off the shelf, you might check out
Sematext's autocomplete product:
http://www.sematext.com/products/autocomplete/index.html

Michael Della Bitta



On Wed, Aug 6, 2014 at 10:56 AM, benjelloun anass@gmail.com wrote:

 Hello thanks for the tutorial i test all schema but its not what i need.
 what i need is to auto complete with an autocorrection like i said before:
 q=gene --autocomplete genève with accent


 2014-08-05 18:03 GMT+02:00 Michael Della Bitta-2 [via Lucene] 
 ml-node+s472066n4151261...@n3.nabble.com:

  In this case, I recommend using the approach that this tutorial uses:
 
 
 http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/
 
  Basically the idea is you index the data a few different ways and then
 use
  edismax to query them all with different boosts. You'd use the stored
  version of you field for display, so your accented characters would not
  get
  stripped.
 
  Michael Della Bitta
 
 
 
  On Tue, Aug 5, 2014 at 9:32 AM, benjelloun [hidden email]
  http://user/SendEmail.jtp?type=nodenode=4151261i=0 wrote:
 
   yeah thats true i creat this index just for auto complete
   here is my schema:
  
   <dynamicField name="*_en" type="text_en" indexed="true" stored="false"
   required="false" multiValued="true"/>
   <dynamicField name="*_fr" type="text_fr" indexed="true" stored="false"
   required="false" multiValued="true"/>
   <dynamicField name="*_ar" type="text_ar" indexed="true" stored="false"
   required="false" multiValued="true"/>
  
   <copyField source="*_en" dest="suggestField"/>
   <copyField source="*_fr" dest="suggestField"/>
   <copyField source="*_ar" dest="suggestField"/>
  
   the i use suggestField for autocomplet like i mentioned above
   do you have any other configuration which can do what i need ?
  
  
  
   2014-08-05 15:19 GMT+02:00 Michael Della Bitta-2 [via Lucene] 
   [hidden email] http://user/SendEmail.jtp?type=nodenode=4151261i=1
 :
  
Unless I'm mistaken, it seems like you've created this index
  specifically
for autocomplete? Or is this index used for general search also?
   
The easy way to understand this question: Is there one entry in your
   index
for each term you want to autocomplete? Or are there multiple entries
   that
might contain the same term?
   
Michael Della Bitta
   
   
   
On Tue, Aug 5, 2014 at 9:10 AM, benjelloun [hidden email]
http://user/SendEmail.jtp?type=nodenode=4151216i=0 wrote:
   
 hello,

 did you find any solution to this problem ?

 regards


 2014-08-04 16:16 GMT+02:00 Michael Della Bitta-2 [via Lucene] 
 [hidden email] 
 http://user/SendEmail.jtp?type=nodenode=4151216i=1
 
   :

  How are you implementing autosuggest? I'm assuming you're
 querying
  an
  indexed field and getting a stored value back. But there are a
  wide
  variety
  of ways of doing it.
 
  Michael Della Bitta
 
 
 
  On Mon, Aug 4, 2014 at 10:10 AM, benjelloun [hidden email]
  http://user/SendEmail.jtp?type=nodenode=4150990i=0 wrote:
 
   hello you didnt enderstand well my probleme,
  
   i give exemple: i have document contain genève with accent
   when i do q=gene -- autoSuggest geneve because of
   ASCIIFoldingFilterFactory preserveOriginal=true
   when i do q=genè -- autoSuggest

Re: Auto Complete

2014-08-05 Thread benjelloun

Hello,

Did you find any solution to this problem?

Regards


2014-08-04 16:16 GMT+02:00 Michael Della Bitta-2 [via Lucene] 
ml-node+s472066n4150990...@n3.nabble.com:

 How are you implementing autosuggest? I'm assuming you're querying an
 indexed field and getting a stored value back. But there are a wide
 variety
 of ways of doing it.

 Michael Della Bitta



 On Mon, Aug 4, 2014 at 10:10 AM, benjelloun [hidden email]
 http://user/SendEmail.jtp?type=nodenode=4150990i=0 wrote:

  hello you didnt enderstand well my probleme,
 
  i give exemple: i have document contain genève with accent
  when i do q=gene -- autoSuggest geneve because of
  ASCIIFoldingFilterFactory preserveOriginal=true
  when i do q=genè -- autoSuggest genève
  but what i need to is:
  q=gene without accent and get this result: genève with accent
 
 
 
 







--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-Complete-tp4150987p4151211.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Auto Complete

2014-08-05 Thread Michael Della Bitta

Unless I'm mistaken, it seems like you've created this index specifically
for autocomplete? Or is this index used for general search also?

The easy way to understand this question: Is there one entry in your index
for each term you want to autocomplete? Or are there multiple entries that
might contain the same term?

Michael Della Bitta




Re: Auto Complete

2014-08-05 Thread benjelloun

Yes, that's true: I created this index just for autocomplete.
Here is my schema:

<dynamicField name="*_en" type="text_en" indexed="true" stored="false"
              required="false" multiValued="true"/>
<dynamicField name="*_fr" type="text_fr" indexed="true" stored="false"
              required="false" multiValued="true"/>
<dynamicField name="*_ar" type="text_ar" indexed="true" stored="false"
              required="false" multiValued="true"/>

<copyField source="*_en" dest="suggestField"/>
<copyField source="*_fr" dest="suggestField"/>
<copyField source="*_ar" dest="suggestField"/>

Then I use suggestField for autocomplete, as I mentioned above.
Do you have any other configuration that can do what I need?




Re: Auto Complete

2014-08-05 Thread benjelloun

I found this solution, but when I test it no suggestions are returned:

<searchComponent class="solr.SpellCheckComponent" name="fuzzySuggest">
  <lst name="spellchecker">
    <str name="name">fuzzySuggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FuzzyLookupFactory</str>
    <str name="field">suggestField</str>
    <str name="storeDir">suggestFolders</str>
    <str name="buildOnCommit">true</str>
    <bool name="exactMatchFirst">true</bool>
    <str name="suggestAnalyzerFieldType">texts</str>
    <bool name="preserveSep">false</bool>
    <int name="maxEdits">2</int>
    <str name="sourceLocation">suggestFolders/fuzzysuggest.txt</str>
  </lst>
  <str name="queryAnalyzerFieldType">phrase_suggest</str>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler"
                name="/fuzzySuggest">
  <lst name="defaults">
    <str name="name">fuzzySuggest</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">fuzzySuggest</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.maxCollations">10</str>
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="components">
    <str>fuzzySuggest</str>
  </arr>
</requestHandler>



Re: Auto Complete

2014-08-05 Thread Michael Della Bitta

In this case, I recommend using the approach that this tutorial uses:

http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/

Basically, the idea is that you index the data a few different ways and then use
edismax to query them all with different boosts. You'd use the stored
version of your field for display, so your accented characters would not get
stripped.
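As a rough illustration of the kind of request this approach leads to, here is a minimal sketch in Python. It assumes three hypothetical fields (name_exact, name_folded, name_ngram) that index the same text with different analysis, plus a hypothetical stored field name_display; none of these names come from the thread, and the boost values are only placeholders.

```python
from urllib.parse import urlencode, parse_qs

def build_autocomplete_params(user_input, rows=10):
    """Build edismax parameters that query several analyses of the same text.

    All field names below are hypothetical; replace them with the fields
    defined in your own schema.
    """
    return {
        "defType": "edismax",
        "q": user_input,
        # Exact matches rank first, accent-folded next, prefix (n-gram) last.
        "qf": "name_exact^10 name_folded^5 name_ngram^1",
        # Return the stored, unmodified value so accents survive for display.
        "fl": "name_display",
        "rows": str(rows),
        "wt": "json",
    }

# The /select path is an assumption about how the core is exposed.
print("/select?" + urlencode(build_autocomplete_params("gene")))
```

Matching happens on the analyzed fields, while the suggestion shown to the user comes from the stored value, which is never touched by the analysis chain.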

Michael Della Bitta


Auto Complete

2014-08-04 Thread benjelloun

Hello,

I have an index which contains genève.
I need to run the query q=gene and get this result in autocomplete:
genève  (e -> è)
I'm using StandardTokenizerFactory for the field and SpellCheckComponent for
the searchComponent.
All solutions are welcome,

Thanks,
Best regards,
Anass BENJELLOUN 






Re: Auto Complete

2014-08-04 Thread Michael Della Bitta

You need to use this filter in your analysis chain:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ASCIIFoldingFilterFactory
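To show the idea behind accent folding (not the filter's actual implementation), here is a small Python sketch. It approximates folding with Unicode decomposition, which covers far fewer characters than ASCIIFoldingFilterFactory, and it keeps the original value next to the folded key so that the accented form can still be displayed. The sample city names are made up for the illustration.

```python
import unicodedata

def fold_to_ascii(text):
    """Approximate accent folding: decompose characters, drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Map the folded, lowercased form (used for matching) to the original value
# (used for display).
originals = ["genève", "Zürich", "Lausanne"]
index = {fold_to_ascii(name).lower(): name for name in originals}

def suggest(prefix):
    """Return original values whose folded form starts with the folded prefix."""
    key = fold_to_ascii(prefix).lower()
    return [orig for folded, orig in index.items() if folded.startswith(key)]

print(suggest("gene"))  # matches on "geneve", displays "genève"
```

Folding both the indexed text and the query lets "gene" and "genè" reach the same entry; what is displayed depends on which value is returned, not on the folded term.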

Michael Della Bitta


Re: Auto Complete

2014-08-04 Thread benjelloun

Hello, you didn't understand my problem well.

Here is an example: I have a document containing genève (with accent).
When I query q=gene, autosuggest returns geneve, because of
ASCIIFoldingFilterFactory with preserveOriginal=true.
When I query q=genè, autosuggest returns genève.
But what I need is:
q=gene (without accent) should return genève (with accent).




Re: Auto Complete

2014-08-04 Thread Michael Della Bitta

How are you implementing autosuggest? I'm assuming you're querying an
indexed field and getting a stored value back. But there are a wide variety
of ways of doing it.

Michael Della Bitta


Re: Auto Complete

2014-08-04 Thread benjelloun

Here is my configuration:

<searchComponent class="solr.SpellCheckComponent" name="suggests">
  <lst name="spellchecker">
    <str name="name">suggestDic</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
    <str name="storeDir">suggestFolder</str>
    <str name="field">suggestField</str>
    <str name="buildOnCommit">true</str>
    <bool name="exactMatchFirst">true</bool>
    <str name="sourceLocation">suggestFolder/emptyDic.txt</str>
  </lst>
  <str name="queryAnalyzerFieldType">textSuggest</str>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler"
                name="/suggests">
  <lst name="defaults">
    <str name="name">suggests</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggestDic</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">6</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.maxCollations">6</str>
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="components">
    <str>suggests</str>
  </arr>
</requestHandler>




Re: Auto Complete

2014-08-04 Thread benjelloun

If you have another configuration that can solve this problem, please share it.
Thanks





Auto complete with 50TB of data - Need your inputs?

2014-06-05 Thread bbi123

We have a requirement for a large data set, billing data for example.
The business wants sorting and type-ahead functions for it. For
example, when I start typing “8164…”, they want to list ALL the unique numbers
along with the associated attributes (name, description, etc.).

We have about 50TB of files that need to be indexed. I haven't indexed this
much data before, so I thought I would ask for your valuable input. I am
thinking of using SolrCloud with SSDs for faster IO. I might need your
input on hardware requirements too.

I assume there is no limit on the maximum number of documents
that can be indexed in the latest version of Solr (4.8). Am I right?

Note:
I don't have the exact business requirements or data, but I assume that we
will be indexing just a couple of fields. I will try to get more information on
the document size soon.





Re: Auto complete with 50TB of data - Need your inputs?

2014-06-05 Thread Shawn Heisey

On 6/5/2014 10:55 AM, bbi123 wrote:
> We have a requirement for a large data set, billing data for example.
> The business wants sorting and type-ahead functions for it. For
> example, when I start typing “8164…”, they want to list ALL the unique numbers
> along with the associated attributes (name, description, etc.).
>
> We have about 50TB of files that need to be indexed. I haven't indexed this
> much data before, so I thought I would ask for your valuable input. I am
> thinking of using SolrCloud with SSDs for faster IO. I might need your
> input on hardware requirements too.

It's nearly impossible to give you a hardware requirement projection.
There are simply too many variables. One variable is that we cannot
know how much of that 50TB of data will actually end up in the Solr
index. The archive for my data is getting close to 300TB, but because
that is mostly photos and video, the total size of the resulting Solr
index is about 100GB. My actual data source is a MySQL database that's
probably about 250GB.

http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

The one thing that I can say is that RAM is king with Solr. Once you
know how big the Solr index contained on each server will actually be,
you'll have some idea of how much RAM you might need. Add up the Solr
heap size and the total index size on disk for each server. That is the
ideal total memory size for each server. You might not actually need
that much RAM, but if you have it, we can *almost guarantee* good
performance.
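The sizing rule above is simple arithmetic; a tiny sketch in Python, with made-up numbers for the heap and index size, could look like this:

```python
def ideal_ram_gb(heap_gb, index_size_gb):
    """Ideal memory per server: Solr heap plus the on-disk index size."""
    return heap_gb + index_size_gb

# Hypothetical server: 8 GB heap, 100 GB of index on disk.
print(ideal_ram_gb(8, 100))  # 108
```

The figures are illustrative only; the real inputs are the heap you configure and the index size you measure on each server.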

http://wiki.apache.org/solr/SolrPerformanceProblems

SSD will help performance, but it is not a complete substitute for RAM.
If you have the ideal RAM size, SSD is not required, because all the
important data will be in RAM, which is much faster than SSD.

> I assume there is no limit on the maximum number of documents
> that can be indexed in the latest version of Solr (4.8). Am I right?

Each shard has a limit of just over two billion documents. The actual
number is 2147483647, the maximum number a 32bit java integer can hold.
This includes deleted documents, so we recommend not going over 1
billion. SolrCloud has no limits, because the collection can have many
shards.
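As a back-of-the-envelope sketch of what these limits mean for a collection, the following Python snippet estimates the minimum number of shards for a given document count. The 5 billion figure is a made-up example, and real shard counts would also depend on index size and query load.

```python
MAX_DOCS_PER_SHARD = 2**31 - 1              # 2147483647, hard limit per shard
RECOMMENDED_DOCS_PER_SHARD = 1_000_000_000  # leaves headroom for deleted documents

def min_shards(total_docs, per_shard=RECOMMENDED_DOCS_PER_SHARD):
    """Smallest shard count that keeps every shard at or under per_shard documents."""
    # Integer ceiling division avoids floating-point rounding on large counts.
    return max(1, -(-total_docs // per_shard))

# Hypothetical collection of 5 billion documents.
print(min_shards(5_000_000_000))  # 5
```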

Thanks,
Shawn

RE: Auto complete with 50TB of data - Need your inputs?

2014-06-05 Thread Toke Eskildsen

bbi123 [bbar...@gmail.com] wrote:
> We have a requirement for a large data set, billing data for example.
> The business wants sorting and type-ahead functions for it. For
> example, when I start typing “8164…”, they want to list ALL the unique numbers
> along with the associated attributes (name, description, etc.).

So either a search for prefix or a lookup with TermsComponent? I do not like 
the ALL in the requirements though. What if the prefix matches 5M documents?
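For the TermsComponent option, a request could be assembled roughly as in the Python sketch below. The field name account_number and the /terms handler path are assumptions (the handler has to be configured in solrconfig.xml); terms.fl, terms.prefix and terms.limit are the standard TermsComponent parameters, and the limit is what keeps a broad prefix from returning millions of terms.

```python
from urllib.parse import urlencode

def build_terms_params(field, prefix, limit=20):
    """Parameters for a TermsComponent prefix lookup, capped at `limit` terms."""
    return {
        "terms": "true",
        "terms.fl": field,           # field whose indexed terms are listed
        "terms.prefix": prefix,      # only terms starting with this prefix
        "terms.limit": str(limit),   # cap the number of returned terms
        "wt": "json",
    }

# "account_number" is a hypothetical field name.
print("/terms?" + urlencode(build_terms_params("account_number", "8164")))
```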

> We have about 50TB of files that need to be indexed. I haven't indexed this
> much data before, so I thought I would ask for your valuable input. I am
> thinking of using SolrCloud with SSDs for faster IO. I might need your
> input on hardware requirements too.

The index size is next to impossible to predict without more knowledge. Try to
acquire just a few GB of content and experiment, so that you can get an idea of
the final index size. The estimated number of documents and unique values in
your lookup field are also very valuable to know.

As for storage, the question these days should be "Are there any reasons not to
use SSDs for index storage?" The amount of RAM needed will have to be
determined experimentally: type-ahead does require very low latency and might
need more caching than normal.

- Toke Eskildsen, State and University Library, Denmark

Limit the number of words in Auto complete using RE - not working

2014-01-27 Thread Developer

Hi,

I have a fieldType configured as below to index the autocomplete phrase.
Everything worked fine, except that some of the phrases were too
long, so we had to limit the maximum number of words in a phrase. I
added a regular expression that removes all words except the
first 3:

<filter class="solr.PatternReplaceFilterFactory"
        pattern="^((?:\S+\s+){2}\S+).*" replacement="$1"/>

This regular expression works fine when I analyze the field using the
Solr dashboard, but it doesn't actually limit the words during indexing. I am not
sure if I am doing anything wrong. Can someone help me figure out the
issue?
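To make the behaviour of the pattern concrete, here is a small Python sketch that applies the same expression to a whole phrase, which is what the filter sees after KeywordTokenizerFactory. Python's re module is used only as a stand-in for Java's regex engine (the replacement is written \1 instead of $1), and the sample phrases are invented.

```python
import re

# Same pattern as in the filter: capture the first three whitespace-separated words.
FIRST_THREE_WORDS = re.compile(r"^((?:\S+\s+){2}\S+).*")

def keep_first_three_words(phrase):
    """Drop everything after the third word; shorter phrases are left unchanged."""
    return FIRST_THREE_WORDS.sub(r"\1", phrase)

print(keep_first_three_words("Acme Deluxe Cordless Drill Kit"))  # Acme Deluxe Cordless
print(keep_first_three_words("two words"))                       # two words
```

One thing that may be worth checking: analysis filters change only the indexed terms, while the stored value returned in search results is always the original input, so a stored field will still show the full phrase.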

Field type:

<fieldType class="solr.TextField" name="textSpell" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^((?:\S+\s+){2}\S+).*" replacement="$1"/>
    <filter class="solr.StopFilterFactory"
            enablePositionIncrements="true" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>

Field:

<field indexed="true" multiValued="true" name="autocomplete_phrase"
       stored="true" type="textSpell"/>

Copy field:

<copyField dest="autocomplete_phrase" source="displayName"/>
<copyField dest="autocomplete_phrase" source="manufacturer"/>





SOLR 3.6.1 auto complete sorting

2013-09-06 Thread Poornima Jay

Hi, 

We had implemented Auto Complete feature in our site. Below are the solr config 
details.

schema.xml:

<fieldType class="solr.TextField" name="text_auto" positionIncrementGap="100">
  <analyzer type="index">
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="30" minGramSize="1"/>
  </analyzer>
  <analyzer type="query">
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="dams_id" type="string" indexed="true" stored="true"/>
<field name="published_date" type="date" indexed="true" stored="false"/>
<field name="ph_su" type="text_auto" indexed="true" stored="true"
       multiValued="true"/>

<!-- Copy fields Auto Complete -->
<copyField source="title" dest="ph_su"/>
<copyField source="product_catalogue" dest="ph_su"/>
<copyField source="product_category_name" dest="ph_su"/>

The Solr query is:

q=ph_su%3Aepub+&start=0&rows=10&fl=dams_id&wt=json&indent=on&hl=true&hl.fl=ph_su&hl.simple.pre=<b>&hl.simple.post=</b>

The requirement is to sort the results by relevance and by latest published
products for the search term.

I have tried the parameters below, but nothing worked:

sort = dams_id desc,published_date desc
order_by = dams_id desc,published_date desc

Please let me know how to sort the results by relevance and published date
descending.
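One way to express "relevance first, then newest" is Solr's sort parameter with score as the primary key, sketched below in Python. The field names come from the schema above; the /select path is an assumption. Note that the published date only breaks ties between documents with an identical score, so this is a starting point rather than a complete answer, and order_by is not a parameter Solr recognises.

```python
from urllib.parse import urlencode, parse_qs

def build_sorted_query(term, rows=10):
    """Query ph_su and sort by relevance, then by most recent published_date."""
    return {
        "q": f"ph_su:{term}",
        "fl": "dams_id",
        # score desc = relevance first; published_date desc breaks ties.
        "sort": "score desc,published_date desc",
        "start": "0",
        "rows": str(rows),
        "wt": "json",
    }

print("/select?" + urlencode(build_sorted_query("epub")))
```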

Thanks,
Poornima

Re: auto-complete with typo fuzzy suggests

2013-02-13 Thread Jack Krupansky


Try the spellchecker rather than the suggester/auto-complete:

http://wiki.apache.org/solr/SpellCheckComponent

-- Jack Krupansky

-Original Message- 
From: ALEX PKB 
Sent: Wednesday, February 13, 2013 2:34 PM 
To: solr-user@lucene.apache.org 
Subject: auto-complete with typo fuzzy suggests 


Hi,
I tried to implement auto-complete with some fuzzy matching. I've
tried phonetic and ngram analysis, but the results are too fuzzy. Is there any
analyzer that handles typos?
Thanks!

Re: Auto-complete phrase

2012-03-28 Thread Rémy Loubradou

Thanks Otis, but that's not an option for me. It should be pretty easy to do
this with Solr, so I will continue to work on it.

Great, William, I will give this method a try, thanks.

On 28 March 2012 06:11, William Bell billnb...@gmail.com wrote:

 I am also very confused at the use case for the Suggester component.
 With collate on, it will try to combine random words together not the
 actual phrases that are there.

 I get better mileage out of EDGE grams and tokenize on whitespace...
 Left to right... Since that is how most people think.

 However, I would like Suggester to work as follows:

 Index:
 Chris Smith
 Tony Dawson
 Chris Leaf
 Daddy Golucky

 Query:
 1. "Chris": it returns "Chris Leaf" but not both "Chris Smith" and "Chris Leaf".
 2. I seem to get collated results (take the first word and combine it with the
 second word), so I would see things like "Smith Leaf". Very strange and
 not what we expect. These are formal names.

 When I use Ngrams I can index:

 C
 Ch
 Chr
 Chri
 Chris
 S
 Sm
 Smi
 Smit
 Smith

 Thus if I search on Smi it will match Chris Smith, and if I search on Chr
 it will match both Chris Smith and Chris Leaf. Exactly what I want.




 On Tue, Mar 27, 2012 at 11:05 AM, Rémy Loubradou r...@hipsnip.com wrote:
  Hello, I am working on creating a auto-complete functionality for my
 field
  merchant_name present all over my documents. I am using the version 3.4
 of
  Solr and I am trying to take advantage of the Suggester functionality.
  Unfortunately so far I didn't figure out how to make it works as  I
  expected.
 
  If my list of merchants present in my documents is:(my real list is
 bigger
  than the following list, that's the reason why I don't use dictionnary
 and
  also because it will change often.)
  Redoute
  Suisse Trois
  Conforama
  But
  Cult Beauty
  Brother Trois
 
  I expect from the Suggester component to match words or part of them and
  return phrases where words or part of them have been matched.
  for example with /suggest?q=tro, I would like to get this:
 
  <response>
  <lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">0</int>
  </lst>
  <lst name="spellcheck">
  <lst name="suggestions">
  <lst name="tro">
  <int name="numFound">2</int>
  <int name="startOffset">0</int>
  <int name="endOffset">x</int>
  <arr name="suggestion">
  <str>Bother Trois</str>
  <str>Suisse Trois</str>
  </arr>
  </lst>
  </lst>
  </lst>
  </response>
 
  I experimented suggestion on a field configured with the tokenizer
  solr.KeywordTokenizerFactory or solr.WhitespaceTokenizerFactory.
  In my mind I have to find a way to handle 3 cases:
  /suggest?q=bo -(should return) bother trois
  /suggest?q=tro -(should return) bother trois, suisse trois
  /suggest?q=bo%20tro -(should return) bother trois
 
  With the solr.KeywordTokenizerFactory I get:
  /suggest?q=bo - bother trois
  /suggest?q=tro - nothing
  /suggest?q=bo%20tro - nothing
 
  With the solr.WhitespaceTokenizerFactory I get:
  /suggest?q=bo - bother
  /suggest?q=troi - trois
  /suggest?q=bo%20tro - bother, trois
 
  Not exactly what I want ... :(
 
  My configuration in the file solrconfig.xml for the suggester component:
 
  <searchComponent class="solr.SpellCheckComponent" name="suggestMerchant">
    <lst name="spellchecker">
      <str name="name">suggestMerchant</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
      <!-- Alternatives to lookupImpl:
           org.apache.solr.spelling.suggest.fst.FSTLookup [finite state automaton]
           org.apache.solr.spelling.suggest.fst.WFSTLookupFactory [weighted finite state automaton]
           org.apache.solr.spelling.suggest.jaspell.JaspellLookup [default, jaspell-based]
           org.apache.solr.spelling.suggest.tst.TSTLookup [ternary trees]
      -->
      <str name="field">merchant_name_autocomplete</str>  <!-- the indexed field to derive suggestions from -->
      <float name="threshold">0.0</float>
      <str name="buildOnCommit">true</str>
      <!--
      <str name="sourceLocation">american-english</str>
      -->
    </lst>
  </searchComponent>
  <requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest/merchant">
    <lst name="defaults">
      <str name="spellcheck">true</str>
      <str name="spellcheck.dictionary">suggestMerchant</str>
      <str name="spellcheck.onlyMorePopular">true</str>
      <str name="spellcheck.count">10</str>
      <str name="spellcheck.collate">true</str>
      <int name="spellcheck.maxCollations">10</int>
    </lst>
    <arr name="components">
      <str>suggestMerchant</str>
    </arr>
  </requestHandler>
 
  How can I implement autocomplete with the Suggester component to get
 what I
  expect? Thanks for your help, I really appreciate.



 --
 Bill Bell
 billnb...@gmail.com
 cell 720-256-8076

Auto-complete phrase

2012-03-27 Thread Rémy Loubradou

Hello, I am working on creating auto-complete functionality for my field
merchant_name, which is present in all my documents. I am using Solr 3.4
and I am trying to take advantage of the Suggester functionality.
Unfortunately, so far I haven't figured out how to make it work as I
expected.

If my list of merchants present in my documents is (my real list is bigger
than the following one, which is why I don't use a dictionary, and
also because it will change often):
Redoute
Suisse Trois
Conforama
But
Cult Beauty
Bother Trois

I expect from the Suggester component to match words or part of them and
return phrases where words or part of them have been matched.
for example with /suggest?q=tro, I would like to get this:

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="tro">
<int name="numFound">2</int>
<int name="startOffset">0</int>
<int name="endOffset">x</int>
<arr name="suggestion">
<str>Bother Trois</str>
<str>Suisse Trois</str>
</arr>
</lst>
</lst>
</lst>
</response>

I experimented with suggestions on a field configured with either the
solr.KeywordTokenizerFactory or the solr.WhitespaceTokenizerFactory tokenizer.
In my mind I have to find a way to handle 3 cases:
/suggest?q=bo -> (should return) bother trois
/suggest?q=tro -> (should return) bother trois, suisse trois
/suggest?q=bo%20tro -> (should return) bother trois

With the solr.KeywordTokenizerFactory I get:
/suggest?q=bo -> bother trois
/suggest?q=tro -> nothing
/suggest?q=bo%20tro -> nothing

With the solr.WhitespaceTokenizerFactory I get:
/suggest?q=bo -> bother
/suggest?q=troi -> trois
/suggest?q=bo%20tro -> bother, trois

Not exactly what I want ... :(

My configuration in the file solrconfig.xml for the suggester component:

<searchComponent class="solr.SpellCheckComponent" name="suggestMerchant">
  <lst name="spellchecker">
    <str name="name">suggestMerchant</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
    <!-- Alternatives to lookupImpl:
         org.apache.solr.spelling.suggest.fst.FSTLookup [finite state automaton]
         org.apache.solr.spelling.suggest.fst.WFSTLookupFactory [weighted finite state automaton]
         org.apache.solr.spelling.suggest.jaspell.JaspellLookup [default, jaspell-based]
         org.apache.solr.spelling.suggest.tst.TSTLookup [ternary trees]
    -->
    <str name="field">merchant_name_autocomplete</str>  <!-- the indexed field to derive suggestions from -->
    <float name="threshold">0.0</float>
    <str name="buildOnCommit">true</str>
    <!--
    <str name="sourceLocation">american-english</str>
    -->
  </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest/merchant">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggestMerchant</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.collate">true</str>
    <int name="spellcheck.maxCollations">10</int>
  </lst>
  <arr name="components">
    <str>suggestMerchant</str>
  </arr>
</requestHandler>

How can I implement autocomplete with the Suggester component to get what I
expect? Thanks for your help, I really appreciate it.
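
The three expected cases in the message above can be stated as a small matching rule: every word typed must be a prefix of some word in the merchant name, and the whole name is returned. The sketch below only illustrates that target behaviour in plain Python; it is not how the Suggester works internally, and it assumes the merchant is spelled "Bother Trois", as in the expected results.

```python
def suggest(query, phrases):
    """Return phrases in which every query word is a prefix of some word."""
    words = query.lower().split()
    matches = []
    for phrase in phrases:
        tokens = phrase.lower().split()
        # Each typed word must start at least one token of the phrase.
        if words and all(any(t.startswith(w) for t in tokens) for w in words):
            matches.append(phrase)
    return matches


MERCHANTS = ["Redoute", "Suisse Trois", "Conforama", "But",
             "Cult Beauty", "Bother Trois"]

print(suggest("bo", MERCHANTS))      # ['Bother Trois']
print(suggest("tro", MERCHANTS))     # ['Suisse Trois', 'Bother Trois']
print(suggest("bo tro", MERCHANTS))  # ['Bother Trois']
```

Read against the results reported above, the keyword tokenizer can only match from the start of the whole name, while the whitespace tokenizer matches individual words but returns the word rather than the name; the rule here combines word-level matching with phrase-level output.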

Re: Auto-complete phrase

2012-03-27 Thread William Bell

I am also very confused at the use case for the Suggester component.
With collate on, it will try to combine random words together not the
actual phrases that are there.

I get better mileage out of edge n-grams with whitespace tokenization,
matching left to right, since that is how most people think.

However, I would like Suggester to work as follows:

Index:
Chris Smith
Tony Dawson
Chris Leaf
Daddy Golucky

Query:
1. "Chris": it returns "Chris Leaf", but not both "Chris Smith" and "Chris Leaf".
2. I seem to get collated results (it takes the first word and combines it
with the second word), so I would see things like "Smith Leaf". Very strange and
not what we expect. These are formal names.

When I use Ngrams I can index:

C
Ch
Chr
Chri
Chris
S
Sm
Smi
Smit
Smith

Thus if I search on Smi it will match Chris Smith, and if I search on Chr
it will match both Chris Smith and Chris Leaf. Exactly what I want.
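
A rough sketch of the edge n-gram idea described above: split each name on whitespace, index every leading substring of every word, and look a typed prefix up directly. The function names are made up for illustration and this is plain Python, not Solr's analysis chain.

```python
from collections import defaultdict


def edge_ngrams(token, min_size=1, max_size=15):
    """Leading substrings of a token: 'smith' -> 's', 'sm', 'smi', ..."""
    return [token[:n] for n in range(min_size, min(len(token), max_size) + 1)]


def build_index(names):
    """Map each lowercase edge n-gram to the full names containing it."""
    index = defaultdict(set)
    for name in names:
        for token in name.lower().split():
            for gram in edge_ngrams(token):
                index[gram].add(name)
    return index


def lookup(index, prefix):
    """Return all names having a word that starts with the typed prefix."""
    return sorted(index.get(prefix.lower(), set()))


NAMES = ["Chris Smith", "Tony Dawson", "Chris Leaf", "Daddy Golucky"]
INDEX = build_index(NAMES)

print(lookup(INDEX, "Smi"))  # ['Chris Smith']
print(lookup(INDEX, "Chr"))  # ['Chris Leaf', 'Chris Smith']
```

In this sketch Smi matches only Chris Smith, while Chr matches both Chris names, and the full stored name is returned in each case rather than a single word.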




On Tue, Mar 27, 2012 at 11:05 AM, Rémy Loubradou r...@hipsnip.com wrote:
 Hello, I am working on creating a auto-complete functionality for my field
 merchant_name present all over my documents. I am using the version 3.4 of
 Solr and I am trying to take advantage of the Suggester functionality.
 Unfortunately so far I didn't figure out how to make it works as  I
 expected.

 If my list of merchants present in my documents is:(my real list is bigger
 than the following list, that's the reason why I don't use dictionnary and
 also because it will change often.)
 Redoute
 Suisse Trois
 Conforama
 But
 Cult Beauty
 Brother Trois

 I expect from the Suggester component to match words or part of them and
 return phrases where words or part of them have been matched.
 for example with /suggest?q=tro, I would like to get this:

 <response>
 <lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">0</int>
 </lst>
 <lst name="spellcheck">
 <lst name="suggestions">
 <lst name="tro">
 <int name="numFound">2</int>
 <int name="startOffset">0</int>
 <int name="endOffset">x</int>
 <arr name="suggestion">
 <str>Bother Trois</str>
 <str>Suisse Trois</str>
 </arr>
 </lst>
 </lst>
 </lst>
 </response>

 I experimented suggestion on a field configured with the tokenizer
 solr.KeywordTokenizerFactory or solr.WhitespaceTokenizerFactory.
 In my mind I have to find a way to handle 3 cases:
 /suggest?q=bo -(should return) bother trois
 /suggest?q=tro -(should return) bother trois, suisse trois
 /suggest?q=bo%20tro -(should return) bother trois

 With the solr.KeywordTokenizerFactory I get:
 /suggest?q=bo - bother trois
 /suggest?q=tro - nothing
 /suggest?q=bo%20tro - nothing

 With the solr.WhitespaceTokenizerFactory I get:
 /suggest?q=bo - bother
 /suggest?q=troi - trois
 /suggest?q=bo%20tro - bother, trois

 Not exactly what I want ... :(

 My configuration in the file solrconfig.xml for the suggester component:

 <searchComponent class="solr.SpellCheckComponent" name="suggestMerchant">
   <lst name="spellchecker">
     <str name="name">suggestMerchant</str>
     <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
     <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
     <!-- Alternatives to lookupImpl:
          org.apache.solr.spelling.suggest.fst.FSTLookup [finite state automaton]
          org.apache.solr.spelling.suggest.fst.WFSTLookupFactory [weighted finite state automaton]
          org.apache.solr.spelling.suggest.jaspell.JaspellLookup [default, jaspell-based]
          org.apache.solr.spelling.suggest.tst.TSTLookup [ternary trees]
     -->
     <str name="field">merchant_name_autocomplete</str>  <!-- the indexed field to derive suggestions from -->
     <float name="threshold">0.0</float>
     <str name="buildOnCommit">true</str>
     <!--
     <str name="sourceLocation">american-english</str>
     -->
   </lst>
 </searchComponent>
 <requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest/merchant">
   <lst name="defaults">
     <str name="spellcheck">true</str>
     <str name="spellcheck.dictionary">suggestMerchant</str>
     <str name="spellcheck.onlyMorePopular">true</str>
     <str name="spellcheck.count">10</str>
     <str name="spellcheck.collate">true</str>
     <int name="spellcheck.maxCollations">10</int>
   </lst>
   <arr name="components">
     <str>suggestMerchant</str>
   </arr>
 </requestHandler>

 How can I implement autocomplete with the Suggester component to get what I
 expect? Thanks for your help, I really appreciate.



-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076

Re: phrase auto-complete with suggester component

2012-01-25 Thread O. Klein


Tommy Chheng-2 wrote
 
 Thanks, I'll try out the custom class file. Any possibilities this
 class can be merged into solr? It seems like an expected behavior.
 
 
 On Tue, Jan 24, 2012 at 11:29 AM, O. Klein <klein@> wrote:
 You might wanna read
 http://lucene.472066.n3.nabble.com/suggester-issues-td3262718.html#a3264740
 which contains the solution to your problem.

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/phrase-auto-complete-with-suggester-component-tp3685572p3685730.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 -- 
 Tommy Chheng
 

I agree. Suggester could use some attention. Looking at Wiki there were some
features planned, but not much has happened lately.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/phrase-auto-complete-with-suggester-component-tp3685572p3687495.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: phrase auto-complete with suggester component

2012-01-25 Thread O. Klein


O. Klein wrote
 
 I agree. Suggester could use some attention. Looking at Wiki there were
 some features planned, but not much has happened lately.
 

Or check out this post
http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/
looking very promising as an alternative.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/phrase-auto-complete-with-suggester-component-tp3685572p3689240.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: phrase auto-complete with suggester component

2012-01-25 Thread Tommy Chheng

Thanks for the link, that's the approach I'm going to try.

On Wed, Jan 25, 2012 at 2:39 PM, O. Klein kl...@octoweb.nl wrote:

 O. Klein wrote

 I agree. Suggester could use some attention. Looking at Wiki there were
 some features planned, but not much has happened lately.


 Or check out this post
 http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/
 looking very promising as an alternative.

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/phrase-auto-complete-with-suggester-component-tp3685572p3689240.html
 Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Tommy Chheng

phrase auto-complete with suggester component

2012-01-24 Thread Tommy Chheng

I'm testing out the various auto-complete functionalities on the
Wikipedia dataset.

I first tried facet.prefix and found it slow at times. I'm now
looking at the Suggester component. Given a query like "new york", I
would like to get results like "New York" or "New York City".

When I tried using the suggest component, it suggested entries for each
word rather than for the phrase (even if I add quotes). How can I change my
config to get title matches and not have the query broken into individual
words?

<lst name="spellcheck">
<lst name="suggestions">
<lst name="new">
<int name="numFound">5</int>
<int name="startOffset">0</int>
<int name="endOffset">3</int>
<arr name="suggestion">
<str>newt</str>
<str>newwy patitta</str>
<str>newyddion</str>
<str>newyorker</str>
<str>newyork–presbyterian hospital</str>
</arr>
</lst>
<lst name="york">
<int name="numFound">5</int>
<int name="startOffset">4</int>
<int name="endOffset">8</int>
<arr name="suggestion">
<str>york</str>
<str>york–dauphin (septa station)</str>
<str>york—humber</str>
<str>york—scarborough</str>
<str>york—simcoe</str>
</arr>
</lst>
<str name="collation">newt york</str>
</lst>
</lst>

/solr/suggest?q=new%20york&omitHeader=true&spellcheck.count=5&spellcheck.collate=true

solrconfig.xml:
  <searchComponent name="suggest" class="solr.SpellCheckComponent">
   <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">title_autocomplete</str>
    <str name="buildOnCommit">true</str>
   </lst>
  </searchComponent>

  <requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
   <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">10</str>
   </lst>
   <arr name="components">
    <str>suggest</str>
   </arr>
  </requestHandler>

schema.xml:
    <fieldType name="text_auto" class="solr.TextField">
     <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
     </analyzer>
    </fieldType>

   <field name="title_autocomplete" type="text_auto" indexed="true" stored="false" multiValued="false" />


-- 
Tommy Chheng
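
To make the desired behaviour concrete: the schema above keeps each title as one lowercased token, so what is wanted is a prefix lookup of the whole typed phrase against whole titles, without splitting the query on spaces. The sketch below shows that lookup over a sorted list using Python's bisect module. The sample titles and function names are invented for illustration, and this does not describe the Suggester's internals.

```python
import bisect


def build_entries(titles):
    """Lowercase each whole title (no word splitting) and sort for prefix search."""
    return sorted((title.lower(), title) for title in titles)


def complete(entries, query, limit=5):
    """Return up to `limit` titles whose lowercase form starts with the query."""
    key = query.lower()
    # (key,) sorts before every (key + anything, title) tuple, so this finds
    # the first entry that could start with the typed phrase.
    start = bisect.bisect_left(entries, (key,))
    results = []
    for normalized, original in entries[start:]:
        if not normalized.startswith(key):
            break
        results.append(original)
        if len(results) == limit:
            break
    return results


# Hypothetical sample titles.
TITLES = ["Newt", "New York", "New York City", "York", "New Yorker"]
ENTRIES = build_entries(TITLES)

print(complete(ENTRIES, "new york"))
# ['New York', 'New York City', 'New Yorker']
```

The point of the sketch is that the query must reach the lookup as a single string; once it is split into "new" and "york", each word is completed on its own, which is what the response above shows.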

Re: phrase auto-complete with suggester component

2012-01-24 Thread O. Klein

You might wanna read
http://lucene.472066.n3.nabble.com/suggester-issues-td3262718.html#a3264740
which contains the solution to your problem.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/phrase-auto-complete-with-suggester-component-tp3685572p3685730.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: phrase auto-complete with suggester component

2012-01-24 Thread Tommy Chheng

Thanks, I'll try out the custom class file. Is there any possibility this
class could be merged into Solr? It seems like expected behavior.


On Tue, Jan 24, 2012 at 11:29 AM, O. Klein kl...@octoweb.nl wrote:
 You might wanna read
 http://lucene.472066.n3.nabble.com/suggester-issues-td3262718.html#a3264740
 which contains the solution to your problem.

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/phrase-auto-complete-with-suggester-component-tp3685572p3685730.html
 Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Tommy Chheng

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2011-02-09 Thread pravin


Hello,
Andy, did you get a final answer to your question?
I am also trying to do something similar. Please give me pointers if you
have any.
Basically I also need to use NGram with WhitespaceTokenizer; any help will be
appreciated.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/NGramFilterFactory-for-auto-complete-that-matches-the-middle-of-multi-lingual-tags-tp1619234p2459466.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-04 Thread Andy

  1) hyphens - if user types ema or e-ma I want to
  suggest email
  
  2) accents - if user types herme  want to suggest
  Hermès
 
 Accents can be removed by using MappingCharFilterFactory
 before the tokenizer (at both index and query time).
 
 <charFilter class="solr.MappingCharFilterFactory"
 mapping="mapping-ISOLatin1Accent.txt"/>
 
 I am not sure if this is the most elegant solution, but you can
 replace "-" with "" using MappingCharFilterFactory too. It
 satisfies what you describe in 1.
 
 But generally NGramFilterFactory produces a lot of tokens.
 I mean the query er can return hermes. Maybe
 EdgeNGramFilterFactory is more suitable for the
 auto-complete task. At least it guarantees that some word
 starts with that character sequence.
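
As a rough illustration of the normalisation suggested above, the sketch below folds accents and drops hyphens on both the stored tag and the typed text before comparing prefixes. It uses Python's unicodedata module as a stand-in for the character mapping file, so it is an analogy to the char filter approach rather than a description of what MappingCharFilterFactory does internally.

```python
import unicodedata


def normalize(text):
    """Lowercase, strip combining accents, and drop hyphens."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.replace("-", "")


def prefix_match(typed, tag):
    """True if the normalised tag starts with the normalised typed text."""
    return normalize(tag).startswith(normalize(typed))


print(prefix_match("herme", "Hermès"))  # True
print(prefix_match("e-ma", "email"))    # True
print(prefix_match("ema", "e-mail"))    # True
print(prefix_match("wif", "wi-fi"))     # True
```

The same normalisation has to run on both sides; applying it only at index time would leave a typed "e-ma" unable to match the stored "email".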

Thanks.

I agree with the issues with NGramFilterFactory you pointed out and I really 
want to avoid using it. But the problem is that I have Chinese tags like 电吉他 
and multi-lingual tags like electric吉他.

For tags like that WhitespaceTokenizerFactory wouldn't work. And if I use 
ChineseFilterFactory would it recognize that the electric in electric吉他 
isn't Chinese and shouldn't be split into individual characters?

Any ideas here are greatly appreciated.

In a related matter, I checked out 
http://lucene.apache.org/solr/api/org/apache/solr/analysis/package-tree.html 
and saw that there are:

EdgeNGramFilterFactory & EdgeNGramTokenizerFactory
NGramFilterFactory & NGramTokenizerFactory

What are the differences between *FilterFactory and *TokenizerFactory? In my 
case which one should I be using?

Thanks.

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-04 Thread Andy

 
 I got your point. You want to retrieve electric吉他
 with the query 吉他. That's why you don't want EdgeNGram.
 If this is the only reason for NGram, I think you can
 transform electric吉他 into two tokens electric
 吉他 in TokenFilter(s) and apply EdgeNGram approach.
 

What TokenFilters would split electric吉他 into electric and 吉他?

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-04 Thread Ahmet Arslan

 What TokenFilters would split electric吉他 into
 electric  吉他?

Is it possible to write a regex to capture Chinese text? (Unicode range?)

If yes, you can use PatternReplaceFilterFactory to transform electric吉他 into 
electric_吉他.

<filter class="solr.PatternReplaceFilterFactory"
pattern="(latin)(chinese)" replacement="$1_$2"/>

After that, WordDelimiterFilterFactory can produce two adjacent tokens.

But maybe using a custom filter would be easier.
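
A small sketch of the regex idea, written in Python rather than as a Solr filter: insert an underscore wherever a Latin letter or digit is directly followed by a Chinese character, or the other way round. The character ranges are assumptions chosen for illustration (basic ASCII letters and digits, and the CJK Unified Ideographs block only); Japanese kana or other scripts would need additional ranges.

```python
import re

CJK = r"\u4e00-\u9fff"  # CJK Unified Ideographs only (an assumption)
LATIN = r"A-Za-z0-9"    # basic ASCII letters and digits (an assumption)

# Zero-width positions where the script changes in either direction.
_BOUNDARY = re.compile(
    rf"(?<=[{LATIN}])(?=[{CJK}])|(?<=[{CJK}])(?=[{LATIN}])"
)


def mark_script_boundaries(tag):
    """Insert '_' between adjacent Latin and CJK runs."""
    return _BOUNDARY.sub("_", tag)


def split_scripts(tag):
    """Split a tag into Latin and CJK runs (assumes the tag has no '_')."""
    return mark_script_boundaries(tag).split("_")


print(mark_script_boundaries("electric吉他"))  # electric_吉他
print(split_scripts("吉他Hero"))               # ['吉他', 'Hero']
print(split_scripts("古典吉他"))               # ['古典吉他']
```

Once the tag is split this way, each part can be treated as its own token, which is what makes the left-to-right prefix approach usable for mixed-script tags.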

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-04 Thread Ahmet Arslan

 I agree with the issues with NGramFilterFactory you pointed
 out and I really want to avoid using it. But the problem is
 that I have Chinese tags like 电吉他 and multi-lingual
 tags like electric吉他.

I got your point. You want to retrieve electric吉他 with the query 吉他. That's 
why you don't want EdgeNGram.
If this is the only reason for NGram, I think you can transform electric吉他 
into two tokens electric 吉他 in TokenFilter(s) and apply EdgeNGram approach.

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-03 Thread Andy



--- On Sat, 10/2/10, Ahmet Arslan iori...@yahoo.com wrote:

  I don't understand. Many tags like electric吉他
 or
  古典吉他 have no whitespace at all, so how does
  WhitespaceTokenizer help?
 
 It makes sense for tags having more than one words. i.e.
 electric guitar
 
 If you tokenize this using whitespacetokenizer, you obtain
 two tokens.
 If you use keywordtokenizer, you obtain only one token,
 always.
 
 In other words, if you want query qui to return electric
 guitar you need whitespacetokenizer.


But I thought NGramFilterFactory would generate substrings that start in the 
middle, hence ensuring autocomplete matching in the middle.

So in the case of electric guitar, keywordtokenizer would create one token - 
electric guitar

NGramFilterFactory would then take that one token (electric guitar) and 
generate N-grams out of it. One of the ngrams would be guit because guit is 
a substring of electric guitar.

Or did I misunderstand how NGramFilterFactory works?

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-03 Thread Gert Brinkmann


On 03.10.2010 09:20, Andy wrote:

NGramFilterFactory would then take that one toke (electric guitar)
and generate N-grams out of it. One of the ngrams would be guit
because guit is a substring of electric guitar.


AFAIK it only produces prefix-strings like

gui
guit
guita
guitar

etc.
So that you can do a prefix search without a wildcard. So it is enough 
to search for guit and you do not need to search for guit*. The 
latter wildcard string can make trouble with stopwordfiltering and (at 
least in solr 1.3) with text snippet generating.


Greetings,
Gert

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-03 Thread Ahmet Arslan

 But I thought NGramFilterFactory would generate substrings
 that start in the middle, hence ensuring autocomplete
 matching in the middle.
 
 So in the case of electric guitar, keywordtokenizer would
 create one token - electric guitar
 
 NGramFilterFactory would then take that one toke (electric
 guitar) and generate N-grams out of it. One of the ngrams
 would be guit because guit is a substring of electric
 guitar.
 

Oops. You are correct, I am sorry. I confused it with *Edge*NGramFilterFactory.
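
The distinction being sorted out here can be shown with two small functions: an n-gram generator that emits every substring within the size limits, and an edge n-gram generator that emits only the leading ones. This is a plain-Python sketch of the general technique, not Solr's implementation, and it assumes edge grams are taken from the front of the token.

```python
def ngrams(text, min_size, max_size):
    """Every substring whose length is between min_size and max_size."""
    return {
        text[i:i + n]
        for n in range(min_size, max_size + 1)
        for i in range(len(text) - n + 1)
    }


def edge_ngrams(text, min_size, max_size):
    """Only the leading substrings (front edge) within the size limits."""
    return {text[:n] for n in range(min_size, min(max_size, len(text)) + 1)}


token = "electric guitar"  # one token, as a keyword tokenizer would produce

print("guit" in ngrams(token, 1, 15))       # True: substrings may start anywhere
print("guit" in edge_ngrams(token, 1, 15))  # False: only prefixes of the token
print("elec" in edge_ngrams(token, 1, 15))  # True
```

So with a single keyword token, plain n-grams allow matching in the middle, while edge n-grams only match from the start of the whole tag.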

RE: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-03 Thread Jonathan Rochkind

Huh, the NGramFilterFactory itself isn't listed on the analyzers wiki at: 
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

That wiki page seems to be restricted to certain users only. Anyone know if 
there's a way to send a 'patch' to the maintainers for the wiki, or if there's 
a process for getting editing privileges on that page?  I'd like to help out by 
adding documentation when I come across it. 

Jonathan

From: Ahmet Arslan [iori...@yahoo.com]
Sent: Sunday, October 03, 2010 6:26 AM
To: solr-user@lucene.apache.org
Subject: Re: NGramFilterFactory for auto-complete that matches the middle of 
multi-lingual tags?

 But I thought NGramFilterFactory would generate substrings
 that start in the middle, hence ensuring autocomplete
 matching in the middle.

 So in the case of electric guitar, keywordtokenizer would
 create one token - electric guitar

 NGramFilterFactory would then take that one toke (electric
 guitar) and generate N-grams out of it. One of the ngrams
 would be guit because guit is a substring of electric
 guitar.


Ups. You are correct, I am sorry. I mixed it with *Edge*NGramFilterFActory.

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-03 Thread Robert Muir

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters?action=newaccount

On Sun, Oct 3, 2010 at 2:40 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

 Huh, the NGramFilterFactory itself isn't listed on the the analyzers wiki
 at: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

 That wiki page seems to be protected to certain users only. Anyone know if
 there's a way to send a 'patch' to the maintainers for the wiki, or if
 there's a process for getting editing privileges on that page?  I'd like to
 help out by adding documentation when I come accross it.

 Jonahtan
 
 From: Ahmet Arslan [iori...@yahoo.com]
 Sent: Sunday, October 03, 2010 6:26 AM
 To: solr-user@lucene.apache.org
 Subject: Re: NGramFilterFactory for auto-complete that matches the middle
 of multi-lingual tags?

  But I thought NGramFilterFactory would generate substrings
  that start in the middle, hence ensuring autocomplete
  matching in the middle.
 
  So in the case of electric guitar, keywordtokenizer would
  create one token - electric guitar
 
  NGramFilterFactory would then take that one toke (electric
  guitar) and generate N-grams out of it. One of the ngrams
  would be guit because guit is a substring of electric
  guitar.
 

 Ups. You are correct, I am sorry. I mixed it with *Edge*NGramFilterFActory.






-- 
Robert Muir
rcm...@gmail.com

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-03 Thread Dennis Gearon

What's the difference between the filters/analyzers that have 'factory' in their 
name, and the ones that don't?


Dennis Gearon

Signature Warning

EARTH has a Right To Life,
  otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php


--- On Sun, 10/3/10, Ahmet Arslan iori...@yahoo.com wrote:

 From: Ahmet Arslan iori...@yahoo.com
 Subject: Re: NGramFilterFactory for auto-complete that matches the middle of 
 multi-lingual tags?
 To: solr-user@lucene.apache.org
 Date: Sunday, October 3, 2010, 3:26 AM
  But I thought NGramFilterFactory would generate substrings
  that start in the middle, hence ensuring autocomplete
  matching in the middle.
  
  So in the case of electric guitar, keywordtokenizer would
  create one token - electric guitar
  
  NGramFilterFactory would then take that one toke (electric
  guitar) and generate N-grams out of it. One of the ngrams
  would be guit because guit is a substring of electric
  guitar.
  
 
 Ups. You are correct, I am sorry. I mixed it with
 *Edge*NGramFilterFActory.

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-03 Thread Andy

Ah Thanks for clearing that up.

Does anyone know how to deal with these 2 issues when using NGramFilterFactory 
for autocomplete?

1) hyphens - if user types ema or e-ma I want to suggest email

2) accents - if user types herme I want to suggest Hermès

Thanks.

--- On Sun, 10/3/10, Ahmet Arslan iori...@yahoo.com wrote:

 From: Ahmet Arslan iori...@yahoo.com
 Subject: Re: NGramFilterFactory for auto-complete that matches the middle of 
 multi-lingual tags?
 To: solr-user@lucene.apache.org
 Date: Sunday, October 3, 2010, 6:26 AM
  But I thought NGramFilterFactory would generate substrings
  that start in the middle, hence ensuring autocomplete
  matching in the middle.
  
  So in the case of electric guitar, keywordtokenizer would
  create one token - electric guitar
  
  NGramFilterFactory would then take that one toke (electric
  guitar) and generate N-grams out of it. One of the ngrams
  would be guit because guit is a substring of electric
  guitar.
  
 
 Ups. You are correct, I am sorry. I mixed it with
 *Edge*NGramFilterFActory.

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-03 Thread Lance Norskog


Start a new thread.

Dennis Gearon wrote:

What's the difference between the filter/anayzers that have 'factory' in their 
name, and the ones that don't?


Dennis Gearon

Signature Warning

EARTH has a Right To Life,
   otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php


--- On Sun, 10/3/10, Ahmet Arslan iori...@yahoo.com wrote:

From: Ahmet Arslan iori...@yahoo.com
Subject: Re: NGramFilterFactory for auto-complete that matches the middle of 
multi-lingual tags?
To: solr-user@lucene.apache.org
Date: Sunday, October 3, 2010, 3:26 AM

But I thought NGramFilterFactory would generate substrings
that start in the middle, hence ensuring autocomplete
matching in the middle.

So in the case of electric guitar, keywordtokenizer would
create one token - electric guitar

NGramFilterFactory would then take that one toke (electric
guitar) and generate N-grams out of it. One of the ngrams
would be guit because guit is a substring of electric
guitar.

Ups. You are correct, I am sorry. I mixed it with
*Edge*NGramFilterFActory.

NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-02 Thread Andy

I'm working on a user-generated tagging feature. Some of the tags could be 
multi-lingual, mixing languages like English, Chinese, and Japanese.

I'd like to add auto-complete to help users enter the tags. And I'd want to 
match in the middle of the tags as well.

For example, if a user types guit I want to suggest:
guitar
electric guitar
电动guitar
guitar英雄

And if a user types 吉他 I want to suggest:
吉他Hero
electric吉他
古典吉他


I'm thinking about using:

<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
 <analyzer type="index">
   <tokenizer class="solr.KeywordTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15" />
 </analyzer>
 <analyzer type="query">
   <tokenizer class="solr.KeywordTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>

Would the above setup do what I want to do?

Also how would I deal with hyphens? For example I want an input of either 
wi-f or wif to match the tag wi-fi. 

Would adding WordDelimiterFilterFactory to both index and query accomplish 
that?


Thanks.
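
To reason about what this field type would match, here is a rough Python simulation of the two analyzers in the message above: at index time the whole tag is lowercased and expanded into n-grams of 1 to 15 characters, and at query time the typed text is only lowercased. It is a simplified model for illustration (the function names are invented), not a reproduction of Solr's analysis.

```python
def index_terms(tag, min_gram=1, max_gram=15):
    """Index side: keyword token, lowercased, then all n-grams of 1..15 chars."""
    text = tag.lower()
    return {
        text[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(text) - n + 1)
    }


def query_term(typed):
    """Query side: keyword token, lowercased, no n-gramming."""
    return typed.lower()


def suggest(typed, tags):
    """Return tags whose indexed n-grams contain the query term."""
    term = query_term(typed)
    return [tag for tag in tags if term in index_terms(tag)]


TAGS = ["guitar", "electric guitar", "电动guitar", "guitar英雄",
        "吉他Hero", "electric吉他", "古典吉他"]

print(suggest("guit", TAGS))       # the four tags containing "guitar"
print(suggest("吉他", TAGS))       # the three tags containing 吉他
print(suggest("wi-f", ["wi-fi"]))  # ['wi-fi']
print(suggest("wif", ["wi-fi"]))   # []
```

Under this model the middle-of-tag matching works for both scripts, the hyphen case needs extra normalisation on both sides, and typed text longer than maxGramSize can never match because no indexed gram is that long.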

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-02 Thread Ahmet Arslan

 For example, if a user types guit I want to suggest:
 guitar
 electric guitar
 电动guitar
 guitar英雄
 
 And if a user types 吉他 I want to suggest:
 吉他Hero
 electric吉他
 古典吉他
 
 
 I'm thinking about using:
 
 <fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
 </fieldType>
 
 Would the above setup do what I want to do?

fieldType autocomplete will bring you only startsWith tags since it uses 
KeywordTokenizerFactory. You need WhitespaceTokenizer for your use case. 

Or you can use two different fields and types (one using keywordtokenizer and 
one using whitespacetokenizer), so that beginsWith matches come first.

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-02 Thread Andy



--- On Sat, 10/2/10, Ahmet Arslan iori...@yahoo.com wrote:

 From: Ahmet Arslan iori...@yahoo.com

  For example, if a user types
 guit I want to suggest:
  guitar
  electric guitar
  电动guitar
  guitar英雄
  
  And if a user types 吉他 I want to suggest:
  吉他Hero
  electric吉他
  古典吉他
  
  
  I'm thinking about using:
  
  fieldType name=autocomplete
 class=solr.TextField
  positionIncrementGap=100
   analyzer type=index
     tokenizer
  class=solr.KeywordTokenizerFactory/
     filter
  class=solr.LowerCaseFilterFactory/
     filter
  class=solr.NGramFilterFactory minGramSize=1
  maxGramSize=15 /
   /analyzer
   analyzer type=query
     tokenizer
  class=solr.KeywordTokenizerFactory/
     filter
  class=solr.LowerCaseFilterFactory/
   /analyzer
  /fieldType
  
  Would the above setup do what I want to do?
 
 fieldType autocomplete will bring you only startsWith tags
 since it uses KeywordTokenizerFactory. You need
 WhitespaceTokenizer for your use case. 
 
 Or you can use two different fields and types (using
 keywordtokenizer and whitespacetokenizer). So that
 beginsWith matches comes first.
 

I don't understand. Many tags like electric吉他 or 古典吉他 have no whitespace at 
all, so how does WhitespaceTokenizer help?

Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-02 Thread Ahmet Arslan


 I don't understand. Many tags like electric吉他 or
 古典吉他 have no whitespace at all, so how does
 WhitespaceTokenizer help?

It makes sense for tags that have more than one word, e.g. electric guitar.

If you tokenize this using WhitespaceTokenizer, you obtain two tokens.
If you use KeywordTokenizer, you always obtain only one token.

In other words, if you want the query gui to return electric guitar, you need 
WhitespaceTokenizer.

analysis.jsp visualizes the analysis process step by step, so you can observe it there.
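
The one-token versus two-token point can be sketched in plain Python (not Solr code). The sketch assumes front-edge n-grams, so that only the start of each token is indexed. That is an assumption chosen to make the tokenizer's effect visible, and all names are made up for illustration.

```python
# Compare keyword-style and whitespace-style tokenization, each followed by
# lowercasing and front-edge n-grams (prefixes of every token). Illustration only.


def keyword_tokens(text):
    """Keyword-style: the whole input is a single token."""
    return [text.lower()]


def whitespace_tokens(text):
    """Whitespace-style: one token per whitespace-separated word."""
    return text.lower().split()


def edge_ngrams(token, min_size=1, max_size=15):
    """Return the prefixes of token with length min_size..max_size."""
    return {token[:n] for n in range(min_size, min(max_size, len(token)) + 1)}


def indexed_terms(text, tokenizer):
    """Collect the edge n-grams of every token the tokenizer produces."""
    terms = set()
    for token in tokenizer(text):
        terms |= edge_ngrams(token)
    return terms


def matches(query, text, tokenizer):
    """True if the lowercased query equals one of the indexed terms."""
    return query.lower() in indexed_terms(text, tokenizer)
```

With these definitions, matches("gui", "electric guitar", keyword_tokens) is False and matches("gui", "electric guitar", whitespace_tokens) is True. For a tag with no whitespace, such as electric吉他, both tokenizers yield a single token, so matches("吉他", "electric吉他", whitespace_tokens) is still False in this sketch.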

Multi-lingual auto-complete?

2010-09-27 Thread Andy

I want to provide auto-complete to users when they're inputting tags. The 
auto-complete tag suggestions would be based on tags that are already in the 
system.

Multiple tags are separated by commas. A single tag could contain multiple 
words such as Apple computer.

One issue is that a tag could be in multiple languages, including both 
languages (e.g. English, French) that use whitespace as word separator and 
languages that don't (e.g. CJK)

An example of such a multi-lingual tag is Apple 电脑.

If a user types apple, I'd like the autocomplete suggestions to include both 
Apple computer (ie. matches are case insensitive) and green apple (ie. 
matches aren't restricted to prefixes). And a user typing 电脑 should match 
Apple 电脑.

Is it possible to do that? I read the article:
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

In that article KeywordTokenizerFactory is used. If I changed it to CJKTokenizer 
would that work? 

With an input of Apple 电脑, what would CJKTokenizer produce?

-is it Apple, 电, 脑 ?
or
- is it A, p, p, l, e, 电, 脑 ?

Any help would be greatly appreciated.

Andy
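
As a rough illustration of the tokenization question above: as far as I understand, the classic Lucene CJKTokenizer keeps a run of Latin letters or digits as one lowercased token and emits overlapping two-character tokens for a run of CJK characters. The plain-Python sketch below only approximates that description. The function name and the character ranges (ASCII letters and digits, CJK Unified Ideographs) are assumptions, and the analysis page remains the authoritative check.

```python
import re

# Approximation of "Latin runs as whole lowercased words, CJK runs as overlapping bigrams".
# Only ASCII letters/digits and the CJK Unified Ideographs block are recognised here.
RUN_PATTERN = re.compile(r"[\u4e00-\u9fff]+|[A-Za-z0-9]+")


def cjk_like_tokens(text):
    """Split text into lowercased Latin words and overlapping CJK bigrams."""
    tokens = []
    for run in RUN_PATTERN.findall(text):
        if "\u4e00" <= run[0] <= "\u9fff":
            if len(run) == 1:
                tokens.append(run)  # a lone CJK character stays as it is
            else:
                tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        else:
            tokens.append(run.lower())
    return tokens
```

Under this approximation, cjk_like_tokens("Apple 电脑") gives ["apple", "电脑"], which is neither of the two guesses above, and cjk_like_tokens("古典吉他") gives ["古典", "典吉", "吉他"].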

Re: enhancing auto complete

2010-08-09 Thread Bhavnik Gajjar

Thanks Avlesh for sharing the info. Will try it!

In the meantime, I also found another solution:
http://metaoptimize.com/qa/questions/17/stemming-problems-when-writing-search-auto-complete

Kind regards.

On 8/4/2010 9:13 PM, Avlesh Singh wrote:
I preferred to answer this question privately earlier. But I have received
innumerable requests to unveil the architecture. For the benefit of all, I
am posting it here (after hiding as much info as I should, in my company's
interest).

The context: Auto-suggest feature on http://askme.in

*Solr setup*: Underneath are some of the salient features -

1. TermsComponent is NOT used.
2. The index is made up of 4 fields of the following types -
   autocomplete_full, autocomplete_token, string and text.
3. autocomplete_full uses KeywordTokenizerFactory and
   EdgeNGramFilterFactory. autocomplete_token uses WhitespaceTokenizerFactory
   and EdgeNGramFilterFactory. Both of these are Solr text fields with standard
   filters like LowerCaseFilterFactory etc applied during querying and
   indexing.
4. Standard DataImportHandler and a bunch of sql procedures are used to
   derive all suggestable phrases from the system and index them in the above
   mentioned fields.

*Controller setup*: The controller (to handle suggest queries) is a typical
JAVA servlet using Solr as its backend (connecting via solrj). Based on the
incoming query string, a lucene query is created. It is BooleanQuery
comprising of TermQuery across all the above mentioned fields. The boost
factor to each of these term queries would determine (to an extent) what
kind of matches do you prefer to show up first. JSON is used as the data
exchange format.

*Frontend setup*: It is a home grown JS to address some specific use cases
of the project in question. One simple exercise with Firebug will spill all
the beans. However, I strongly recommend using jQuery to build (and extend)
the UI component.

Any help beyond this is available, but off the list.

Cheers
Avlesh
@avlesh http://twitter.com/avlesh | http://webklipper.com

On Tue, Aug 3, 2010 at 10:04 AM, Bhavnik Gajjar
bhavnik.gaj...@gatewaynintec.com wrote:

Whoops!

table still not looks ok :(

trying to send once again

lorem        Lorem ipsum dolor sit amet
             Hieyed ddi lorem ipsum dolor
             test lorem ipsume
             test xyz lorem ipslili

lorem ip     Lorem ipsum dolor sit amet
             Hieyed ddi lorem ipsum dolor
             test lorem ipsume
             test xyz lorem ipslili

lorem ipsl   test xyz lorem ipslili

On 8/3/2010 10:00 AM, Bhavnik Gajjar wrote:

Avlesh,

Thanks for responding

The table mentioned below looks like,

lorem        Lorem ipsum dolor sit amet
             Hieyed ddi lorem ipsum dolor
             test lorem ipsume
             test xyz lorem ipslili

lorem ip     Lorem ipsum dolor sit amet
             Hieyed ddi lorem ipsum dolor
             test lorem ipsume
             test xyz lorem ipslili

lorem ipsl   test xyz lorem ipslili

Yes, [http://askme.in] looks good!

I would like to know its designs/solr configurations etc.. Can you
please provide me detailed views of it?

In [http://askme.in], there is one thing to be noted. Search text like,
[business c] populates [Business Centre] which looks OK but, [Consultant
Business] looks bit odd. But, in general the pointer you suggested is
great to start with.

On 8/2/2010 8:39 PM, Avlesh Singh wrote:

From whatever I could read in your broken table of sample use cases, I
think

you are looking for something similar to what has been done here
-http://askme.in; if this is what you are looking do let me know.

Cheers
Avlesh
@avlesh http://twitter.com/avlesh | http://webklipper.com

On Mon, Aug 2, 2010 at 8:09 PM, Bhavnik
Gajjarbhavnik.gaj...@gatewaynintec.com wrote:

Hi,

I'm looking for a solution related to auto complete feature for one
application.

Below is a list of texts from which auto complete results would be
populated.

Lorem ipsum dolor sit amet
tincidunt ut laoreet
dolore eu feugiat nulla facilisis at vero eros et
te feugait nulla facilisi
Claritas est etiam processus
anteposuerit litterarum formas humanitatis
fiant sollemnes in futurum
Hieyed ddi lorem ipsum dolor
test lorem ipsume
test xyz lorem ipslili

Consider below table. First column describes user entered value and
second column describes expected result (list of auto complete terms
that should be populated from Solr)

lorem
*Lorem

Re: enhancing auto complete

2010-08-04 Thread Avlesh Singh

I preferred to answer this question privately earlier. But I have received
innumerable requests to unveil the architecture. For the benefit of all, I
am posting it here (after hiding as much info as I should, in my company's
interest).

The context: Auto-suggest feature on http://askme.in

*Solr setup*: Underneath are some of the salient features -

   1. TermsComponent is NOT used.
   2. The index is made up of 4 fields of the following types -
   autocomplete_full, autocomplete_token, string and text.
   3. autocomplete_full uses KeywordTokenizerFactory and
   EdgeNGramFilterFactory. autocomplete_token uses WhitespaceTokenizerFactory
   and EdgeNGramFilterFactory. Both of these are Solr text fields with standard
   filters like LowerCaseFilterFactory etc applied during querying and
   indexing.
   4. Standard DataImportHandler and a bunch of sql procedures are used to
   derive all suggestable phrases from the system and index them in the above
   mentioned fields.

*Controller setup*: The controller (to handle suggest queries) is a typical
Java servlet using Solr as its backend (connecting via SolrJ). Based on the
incoming query string, a Lucene query is created. It is a BooleanQuery
made up of TermQuery clauses across all the above-mentioned fields. The boost
factor on each of these term queries determines (to an extent) which
kinds of matches show up first. JSON is used as the data
exchange format.
*Frontend setup*: It is a home grown JS to address some specific use cases
of the project in question. One simple exercise with Firebug will spill all
the beans. However, I strongly recommend using jQuery to build (and extend)
the UI component.

Any help beyond this is available, but off the list.

Cheers
Avlesh
@avlesh http://twitter.com/avlesh | http://webklipper.com

On Tue, Aug 3, 2010 at 10:04 AM, Bhavnik Gajjar 
bhavnik.gaj...@gatewaynintec.com wrote:

  Whoops!

 table still not looks ok :(

 trying to send once again


 lorem        Lorem ipsum dolor sit amet
              Hieyed ddi lorem ipsum dolor
              test lorem ipsume
              test xyz lorem ipslili

 lorem ip     Lorem ipsum dolor sit amet
              Hieyed ddi lorem ipsum dolor
              test lorem ipsume
              test xyz lorem ipslili

 lorem ipsl   test xyz lorem ipslili

 On 8/3/2010 10:00 AM, Bhavnik Gajjar wrote:

 Avlesh,

 Thanks for responding

 The table mentioned below looks like,

 lorem        Lorem ipsum dolor sit amet
              Hieyed ddi lorem ipsum dolor
              test lorem ipsume
              test xyz lorem ipslili

 lorem ip     Lorem ipsum dolor sit amet
              Hieyed ddi lorem ipsum dolor
              test lorem ipsume
              test xyz lorem ipslili

 lorem ipsl   test xyz lorem ipslili


 Yes, [http://askme.in] looks good!

 I would like to know its designs/solr configurations etc.. Can you
 please provide me detailed views of it?

 In [http://askme.in], there is one thing to be noted. Search text like,
 [business c] populates [Business Centre] which looks OK but, [Consultant
 Business] looks bit odd. But, in general the pointer you suggested is
 great to start with.

 On 8/2/2010 8:39 PM, Avlesh Singh wrote:


  From whatever I could read in your broken table of sample use cases, I think


  you are looking for something similar to what has been done here 
 -http://askme.in; if this is what you are looking do let me know.

 Cheers
 Avlesh
 @avlesh http://twitter.com/avlesh | http://webklipper.com

 On Mon, Aug 2, 2010 at 8:09 PM, Bhavnik 
 Gajjarbhavnik.gaj...@gatewaynintec.com  wrote:




  Hi,

 I'm looking for a solution related to auto complete feature for one
 application.

 Below is a list of texts from which auto complete results would be
 populated.

 Lorem ipsum dolor sit amet
 tincidunt ut laoreet
 dolore eu feugiat nulla facilisis at vero eros et
 te feugait nulla facilisi
 Claritas est etiam processus
 anteposuerit litterarum formas humanitatis
 fiant sollemnes in futurum
 Hieyed ddi lorem ipsum dolor
 test lorem ipsume
 test xyz lorem ipslili

 Consider below table. First column describes user entered value and
 second column describes expected result (list of auto complete terms
 that should be populated from Solr)

 lorem
 *Lorem* ipsum dolor sit amet
 Hieyed ddi *lorem* ipsum dolor
 test *lorem *ipsume
 test xyz *lorem *ipslili
 lorem ip
 *Lorem ip*sum dolor sit amet
 Hieyed ddi *lorem ip*sum dolor
 test *lorem ip*sume
 test xyz *lorem ip*slili
 lorem ipsl
 test xyz *lorem ipsl*ili



 Can anyone share ideas of how this can be achieved

enhancing auto complete

2010-08-02 Thread Bhavnik Gajjar

Hi,

I'm looking for a solution related to auto complete feature for one 
application.

Below is a list of texts from which auto complete results would be 
populated.

Lorem ipsum dolor sit amet
tincidunt ut laoreet
dolore eu feugiat nulla facilisis at vero eros et
te feugait nulla facilisi
Claritas est etiam processus
anteposuerit litterarum formas humanitatis
fiant sollemnes in futurum
Hieyed ddi lorem ipsum dolor
test lorem ipsume
test xyz lorem ipslili

Consider below table. First column describes user entered value and 
second column describes expected result (list of auto complete terms 
that should be populated from Solr)

lorem
    *Lorem* ipsum dolor sit amet
    Hieyed ddi *lorem* ipsum dolor
    test *lorem* ipsume
    test xyz *lorem* ipslili
lorem ip
    *Lorem ip*sum dolor sit amet
    Hieyed ddi *lorem ip*sum dolor
    test *lorem ip*sume
    test xyz *lorem ip*slili
lorem ipsl
    test xyz *lorem ipsl*ili



Can anyone share ideas of how this can be achieved with Solr? Already 
tried with various tokenizers and filter factories like, 
WhiteSpaceTokenizer, KeywordTokenizer, EdgeNGramFilterFactory, 
ShingleFilterFactory etc. but no luck so far..

Note that, It would be excellent if terms populated from Solr can be 
highlighted by using Highlighting or any other component/mechanism of Solr.
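
The matching rule implied by the table can be written down in plain Python. This is only a statement of the expected behaviour, not a Solr configuration: the typed text must appear case-insensitively, starting at the beginning of some word in the phrase, and the matched part is wrapped in asterisks. The function name and the use of a regular expression are assumptions for illustration.

```python
import re

PHRASES = [
    "Lorem ipsum dolor sit amet",
    "tincidunt ut laoreet",
    "dolore eu feugiat nulla facilisis at vero eros et",
    "te feugait nulla facilisi",
    "Claritas est etiam processus",
    "anteposuerit litterarum formas humanitatis",
    "fiant sollemnes in futurum",
    "Hieyed ddi lorem ipsum dolor",
    "test lorem ipsume",
    "test xyz lorem ipslili",
]


def suggest(typed, phrases):
    """Return phrases where typed starts at a word boundary, with the match highlighted."""
    # (?<!\S) means: the match is at the start of the phrase or right after whitespace.
    pattern = re.compile(r"(?<!\S)" + re.escape(typed), re.IGNORECASE)
    results = []
    for phrase in phrases:
        found = pattern.search(phrase)
        if found:
            results.append(
                phrase[:found.start()] + "*" + found.group(0) + "*" + phrase[found.end():]
            )
    return results
```

For the sample list, suggest("lorem ipsl", PHRASES) returns only "test xyz *lorem ipsl*ili", while "lorem" and "lorem ip" each return the four phrases shown in the table.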

*Note:* Standard autocomplete (like 
facet.field=AutoComplete&f.AutoComplete.facet.prefix=user entered term&f.AutoComplete.facet.limit=10&facet.sort&rows=0) 
is already working fine with the application, but we are now looking to enhance 
the existing auto complete with the above requirement.
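
For reference, a facet-prefix request of the kind described in the note can be assembled as below. This is a minimal sketch: the base URL is hypothetical, q=*:* and facet=true are added on the assumption that they are needed for a facet-only request, and facet.sort is left out because its value is not recoverable from the note.

```python
from urllib.parse import urlencode


def facet_prefix_url(base_url, field, typed, limit=10):
    """Build a Solr select URL that returns only facet values starting with typed."""
    params = [
        ("q", "*:*"),          # assumption: match all documents
        ("rows", 0),           # no documents, only facet counts
        ("facet", "true"),     # assumption: faceting has to be switched on
        ("facet.field", field),
        (f"f.{field}.facet.prefix", typed),
        (f"f.{field}.facet.limit", limit),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical host and handler path, for illustration only.
EXAMPLE_URL = facet_prefix_url("http://localhost:8983/solr/select", "AutoComplete", "lorem ip")
```

Facet prefixes match the indexed terms of the field as they are, which is consistent with this approach covering plain starts-with suggestions but not the mid-phrase matches asked for above.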

Any thoughts?

Thanks in advance




The contents of this eMail including the contents of attachment(s) are 
privileged and confidential material of Gateway NINtec Pvt. Ltd. (GNPL) and 
should not be disclosed to, used by or copied in any manner by anyone other 
than the intended addressee(s). If this eMail has been received by error, 
please advise the sender immediately and delete it from your system. The views 
expressed in this eMail message are those of the individual sender, except 
where the sender expressly, and with authority, states them to be the views of 
GNPL. Any unauthorized review, use, disclosure, dissemination, forwarding, 
printing or copying of this eMail or any action taken in reliance on this eMail 
is strictly prohibited and may be unlawful. This eMail may contain viruses. 
GNPL has taken every reasonable precaution to minimize this risk, but is not 
liable for any damage you may sustain as a result of any virus in this eMail. 
You should carry out your own virus checks before opening the eMail or 
attachment(s). GNPL is neither liable for the proper and complete transmission 
of the information contained in this communication nor for any delay in its 
receipt. GNPL reserves the right to monitor and review the content of all 
messages sent to or from this eMail address and may be stored on the GNPL eMail 
system. In case this eMail has reached you in error, and you  would no longer 
like to receive eMails from us, then please send an eMail to 
d...@gatewaynintec.com

Re: enhancing auto complete

2010-08-02 Thread Avlesh Singh

From whatever I could read in your broken table of sample use cases, I think
you are looking for something similar to what has been done here -
http://askme.in; if this is what you are looking do let me know.

Cheers
Avlesh
@avlesh http://twitter.com/avlesh | http://webklipper.com

On Mon, Aug 2, 2010 at 8:09 PM, Bhavnik Gajjar 
bhavnik.gaj...@gatewaynintec.com wrote:

 Hi,

 I'm looking for a solution related to auto complete feature for one
 application.

 Below is a list of texts from which auto complete results would be
 populated.

 Lorem ipsum dolor sit amet
 tincidunt ut laoreet
 dolore eu feugiat nulla facilisis at vero eros et
 te feugait nulla facilisi
 Claritas est etiam processus
 anteposuerit litterarum formas humanitatis
 fiant sollemnes in futurum
 Hieyed ddi lorem ipsum dolor
 test lorem ipsume
 test xyz lorem ipslili

 Consider below table. First column describes user entered value and
 second column describes expected result (list of auto complete terms
 that should be populated from Solr)

 lorem
*Lorem* ipsum dolor sit amet
 Hieyed ddi *lorem* ipsum dolor
 test *lorem *ipsume
 test xyz *lorem *ipslili
 lorem ip
*Lorem ip*sum dolor sit amet
 Hieyed ddi *lorem ip*sum dolor
 test *lorem ip*sume
 test xyz *lorem ip*slili
 lorem ipsl
test xyz *lorem ipsl*ili



 Can anyone share ideas of how this can be achieved with Solr? Already
 tried with various tokenizers and filter factories like,
 WhiteSpaceTokenizer, KeywordTokenizer, EdgeNGramFilterFactory,
 ShingleFilterFactory etc. but no luck so far..

 Note that, It would be excellent if terms populated from Solr can be
 highlighted by using Highlighting or any other component/mechanism of Solr.

 *Note :* Standard autocomplete (like,
 facet.field=AutoCompletef.AutoComplete.facet.prefix=user entered
 termf.AutoComplete.facet.limit=10facet.sortrows=0) are already
 working fine with the application. but, nowadays, looking for enhancing
 the existing auto complete stuff with the above requirement.

 Any thoughts?

 Thanks in advance





Re: enhancing auto complete

2010-08-02 Thread scrapy


 Hi, I'm also interested in this feature... is it open source?

 


 

 

-Original Message-
From: Avlesh Singh avl...@gmail.com
To: solr-user@lucene.apache.org
Sent: Mon, Aug 2, 2010 5:09 pm
Subject: Re: enhancing auto complete


From whatever I could read in your broken table of sample use cases, I think

you are looking for something similar to what has been done here -

http://askme.in; if this is what you are looking do let me know.



Cheers

Avlesh

@avlesh http://twitter.com/avlesh | http://webklipper.com



On Mon, Aug 2, 2010 at 8:09 PM, Bhavnik Gajjar 

bhavnik.gaj...@gatewaynintec.com wrote:



 Hi,



 I'm looking for a solution related to auto complete feature for one

 application.



 Below is a list of texts from which auto complete results would be

 populated.



 Lorem ipsum dolor sit amet

 tincidunt ut laoreet

 dolore eu feugiat nulla facilisis at vero eros et

 te feugait nulla facilisi

 Claritas est etiam processus

 anteposuerit litterarum formas humanitatis

 fiant sollemnes in futurum

 Hieyed ddi lorem ipsum dolor

 test lorem ipsume

 test xyz lorem ipslili



 Consider below table. First column describes user entered value and

 second column describes expected result (list of auto complete terms

 that should be populated from Solr)



 lorem

*Lorem* ipsum dolor sit amet

 Hieyed ddi *lorem* ipsum dolor

 test *lorem *ipsume

 test xyz *lorem *ipslili

 lorem ip

*Lorem ip*sum dolor sit amet

 Hieyed ddi *lorem ip*sum dolor

 test *lorem ip*sume

 test xyz *lorem ip*slili

 lorem ipsl

test xyz *lorem ipsl*ili







 Can anyone share ideas of how this can be achieved with Solr? Already

 tried with various tokenizers and filter factories like,

 WhiteSpaceTokenizer, KeywordTokenizer, EdgeNGramFilterFactory,

 ShingleFilterFactory etc. but no luck so far..



 Note that, It would be excellent if terms populated from Solr can be

 highlighted by using Highlighting or any other component/mechanism of Solr.



 *Note :* Standard autocomplete (like,

 facet.field=AutoCompletef.AutoComplete.facet.prefix=user entered

 termf.AutoComplete.facet.limit=10facet.sortrows=0) are already

 working fine with the application. but, nowadays, looking for enhancing

 the existing auto complete stuff with the above requirement.



 Any thoughts?



 Thanks in advance










Re: enhancing auto complete

2010-08-02 Thread Avlesh Singh

Hahaha ... sorry, it's not. And there is no ready-made code that I can give you
either. But yes, if you like it, I can share the design of this feature
(Solr, backend and frontend).

Cheers
Avlesh
@avlesh http://twitter.com/avlesh | http://webklipper.com

On Mon, Aug 2, 2010 at 8:47 PM, scr...@asia.com wrote:


  Hi, I'm also interested of this feature... is it open source?








 -Original Message-
 From: Avlesh Singh avl...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Mon, Aug 2, 2010 5:09 pm
 Subject: Re: enhancing auto complete


 From whatever I could read in your broken table of sample use cases, I
 think

 you are looking for something similar to what has been done here -

 http://askme.in; if this is what you are looking do let me know.



 Cheers

 Avlesh

 @avlesh http://twitter.com/avlesh | http://webklipper.com



 On Mon, Aug 2, 2010 at 8:09 PM, Bhavnik Gajjar 

 bhavnik.gaj...@gatewaynintec.com wrote:



  Hi,

 

  I'm looking for a solution related to auto complete feature for one

  application.

 

  Below is a list of texts from which auto complete results would be

  populated.

 

  Lorem ipsum dolor sit amet

  tincidunt ut laoreet

  dolore eu feugiat nulla facilisis at vero eros et

  te feugait nulla facilisi

  Claritas est etiam processus

  anteposuerit litterarum formas humanitatis

  fiant sollemnes in futurum

  Hieyed ddi lorem ipsum dolor

  test lorem ipsume

  test xyz lorem ipslili

 

  Consider below table. First column describes user entered value and

  second column describes expected result (list of auto complete terms

  that should be populated from Solr)

 

  lorem

 *Lorem* ipsum dolor sit amet

  Hieyed ddi *lorem* ipsum dolor

  test *lorem *ipsume

  test xyz *lorem *ipslili

  lorem ip

 *Lorem ip*sum dolor sit amet

  Hieyed ddi *lorem ip*sum dolor

  test *lorem ip*sume

  test xyz *lorem ip*slili

  lorem ipsl

 test xyz *lorem ipsl*ili

 

 

 

  Can anyone share ideas of how this can be achieved with Solr? Already

  tried with various tokenizers and filter factories like,

  WhiteSpaceTokenizer, KeywordTokenizer, EdgeNGramFilterFactory,

  ShingleFilterFactory etc. but no luck so far..

 

  Note that, It would be excellent if terms populated from Solr can be

  highlighted by using Highlighting or any other component/mechanism of
 Solr.

 

  *Note :* Standard autocomplete (like,

  facet.field=AutoCompletef.AutoComplete.facet.prefix=user entered

  termf.AutoComplete.facet.limit=10facet.sortrows=0) are already

  working fine with the application. but, nowadays, looking for enhancing

  the existing auto complete stuff with the above requirement.

 

  Any thoughts?

 

  Thanks in advance

 

 

 

 


Re: enhancing auto complete

2010-08-02 Thread scrapy

OK, I'm still interested in the design.
 

 


 

 

-Original Message-
From: Avlesh Singh avl...@gmail.com
To: solr-user@lucene.apache.org
Sent: Mon, Aug 2, 2010 5:20 pm
Subject: Re: enhancing auto complete


Hahaha ... sorry its not. And there is no readymade code that I can give you

either. But yes, if you liked it, I can share the design of this feature

(solr, backend and frontend).



Cheers

Avlesh

@avlesh http://twitter.com/avlesh | http://webklipper.com



On Mon, Aug 2, 2010 at 8:47 PM, scr...@asia.com wrote:





  Hi, I'm also interested of this feature... is it open source?

















 -Original Message-

 From: Avlesh Singh avl...@gmail.com

 To: solr-user@lucene.apache.org

 Sent: Mon, Aug 2, 2010 5:09 pm

 Subject: Re: enhancing auto complete





 From whatever I could read in your broken table of sample use cases, I

 think



 you are looking for something similar to what has been done here -



 http://askme.in; if this is what you are looking do let me know.







 Cheers



 Avlesh



 @avlesh http://twitter.com/avlesh | http://webklipper.com







 On Mon, Aug 2, 2010 at 8:09 PM, Bhavnik Gajjar 



 bhavnik.gaj...@gatewaynintec.com wrote:







  Hi,



 



  I'm looking for a solution related to auto complete feature for one



  application.



 



  Below is a list of texts from which auto complete results would be



  populated.



 



  Lorem ipsum dolor sit amet



  tincidunt ut laoreet



  dolore eu feugiat nulla facilisis at vero eros et



  te feugait nulla facilisi



  Claritas est etiam processus



  anteposuerit litterarum formas humanitatis



  fiant sollemnes in futurum



  Hieyed ddi lorem ipsum dolor



  test lorem ipsume



  test xyz lorem ipslili



 



  Consider below table. First column describes user entered value and



  second column describes expected result (list of auto complete terms



  that should be populated from Solr)



 



  lorem



 *Lorem* ipsum dolor sit amet



  Hieyed ddi *lorem* ipsum dolor



  test *lorem *ipsume



  test xyz *lorem *ipslili



  lorem ip



 *Lorem ip*sum dolor sit amet



  Hieyed ddi *lorem ip*sum dolor



  test *lorem ip*sume



  test xyz *lorem ip*slili



  lorem ipsl



 test xyz *lorem ipsl*ili



 



 



 



  Can anyone share ideas of how this can be achieved with Solr? Already



  tried with various tokenizers and filter factories like,



  WhiteSpaceTokenizer, KeywordTokenizer, EdgeNGramFilterFactory,



  ShingleFilterFactory etc. but no luck so far..



 



  Note that, It would be excellent if terms populated from Solr can be



  highlighted by using Highlighting or any other component/mechanism of

 Solr.



 



  *Note :* Standard autocomplete (like,



  facet.field=AutoCompletef.AutoComplete.facet.prefix=user entered



  termf.AutoComplete.facet.limit=10facet.sortrows=0) are already



  working fine with the application. but, nowadays, looking for enhancing



  the existing auto complete stuff with the above requirement.



 



  Any thoughts?



 



  Thanks in advance



 



 



 



 




Re: enhancing auto complete

2010-08-02 Thread Bhavnik Gajjar

Avlesh,

Thanks for responding

The table mentioned below looks like,

lorem        Lorem ipsum dolor sit amet
             Hieyed ddi lorem ipsum dolor
             test lorem ipsume
             test xyz lorem ipslili

lorem ip     Lorem ipsum dolor sit amet
             Hieyed ddi lorem ipsum dolor
             test lorem ipsume
             test xyz lorem ipslili

lorem ipsl   test xyz lorem ipslili


Yes, [http://askme.in] looks good!

I would like to know its designs/solr configurations etc.. Can you 
please provide me detailed views of it?

In [http://askme.in], there is one thing to be noted. Search text like, 
[business c] populates [Business Centre] which looks OK but, [Consultant 
Business] looks bit odd. But, in general the pointer you suggested is 
great to start with.
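
For readers following the thread, the behaviour the table describes can be sketched in plain Python. This is only an illustration of the expected matching and highlighting, not a Solr configuration: the function name `suggest` and the asterisk highlighting are assumptions made for the example, and the texts are the ones listed in the original question.

```python
import re

# The candidate texts from the original question.
TEXTS = [
    "Lorem ipsum dolor sit amet",
    "tincidunt ut laoreet",
    "dolore eu feugiat nulla facilisis at vero eros et",
    "te feugait nulla facilisi",
    "Claritas est etiam processus",
    "anteposuerit litterarum formas humanitatis",
    "fiant sollemnes in futurum",
    "Hieyed ddi lorem ipsum dolor",
    "test lorem ipsume",
    "test xyz lorem ipslili",
]


def suggest(user_input, texts=TEXTS):
    """Return texts where user_input starts at a word boundary.

    Matching is case-insensitive and the matched span is wrapped in
    asterisks, mirroring the highlighting shown in the table.
    """
    if not user_input.strip():
        return []
    pattern = re.compile(r"\b" + re.escape(user_input), re.IGNORECASE)
    results = []
    for text in texts:
        match = pattern.search(text)
        if match:
            start, end = match.span()
            results.append(text[:start] + "*" + text[start:end] + "*" + text[end:])
    return results
```

The point of the sketch is that the match is a phrase prefix anchored at any word boundary inside the text, which is why a plain prefix match from the start of the field is not enough for this requirement.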

On 8/2/2010 8:39 PM, Avlesh Singh wrote:
  From whatever I could read in your broken table of sample use cases, I think
 you are looking for something similar to what has been done here -
 http://askme.in; if this is what you are looking for, do let me know.

 Cheers
 Avlesh
 @avlesh http://twitter.com/avlesh | http://webklipper.com

 On Mon, Aug 2, 2010 at 8:09 PM, Bhavnik Gajjar
 bhavnik.gaj...@gatewaynintec.com  wrote:


 Hi,

 I'm looking for a solution related to auto complete feature for one
 application.

 Below is a list of texts from which auto complete results would be
 populated.

 Lorem ipsum dolor sit amet
 tincidunt ut laoreet
 dolore eu feugiat nulla facilisis at vero eros et
 te feugait nulla facilisi
 Claritas est etiam processus
 anteposuerit litterarum formas humanitatis
 fiant sollemnes in futurum
 Hieyed ddi lorem ipsum dolor
 test lorem ipsume
 test xyz lorem ipslili

 Consider below table. First column describes user entered value and
 second column describes expected result (list of auto complete terms
 that should be populated from Solr)

 lorem        *Lorem* ipsum dolor sit amet
              Hieyed ddi *lorem* ipsum dolor
              test *lorem* ipsume
              test xyz *lorem* ipslili

 lorem ip     *Lorem ip*sum dolor sit amet
              Hieyed ddi *lorem ip*sum dolor
              test *lorem ip*sume
              test xyz *lorem ip*slili

 lorem ipsl   test xyz *lorem ipsl*ili



 Can anyone share ideas on how this can be achieved with Solr? I have already
 tried various tokenizers and filter factories, such as
 WhitespaceTokenizer, KeywordTokenizer, EdgeNGramFilterFactory and
 ShingleFilterFactory, but no luck so far.

 Note that it would be excellent if the terms returned from Solr could be
 highlighted using Highlighting or any other Solr component/mechanism.

 *Note:* Standard autocomplete (e.g.
 facet.field=AutoComplete&f.AutoComplete.facet.prefix=user entered term&f.AutoComplete.facet.limit=10&facet.sort&rows=0)
 is already working fine in the application. I am now looking to enhance
 the existing auto complete with the above requirement.

 Any thoughts?

 Thanks in advance
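
The facet-prefix request mentioned in the note can be assembled as in the sketch below. The parameter names (`facet`, `facet.field`, `f.<field>.facet.prefix`, `f.<field>.facet.limit`, `rows`) are standard Solr faceting parameters; the base URL, the `q=*:*` match-all query and the helper name `facet_prefix_url` are assumptions added for the example and do not come from the original message.

```python
from urllib.parse import urlencode


def facet_prefix_url(base_url, field, prefix, limit=10):
    """Build a facet-prefix autocomplete request for one field.

    base_url is a hypothetical select handler URL; q=*:* is an
    assumption so that facets are computed over all documents.
    """
    params = [
        ("q", "*:*"),
        ("rows", 0),  # only the facet counts are needed, not documents
        ("facet", "true"),
        ("facet.field", field),
        (f"f.{field}.facet.prefix", prefix),
        (f"f.{field}.facet.limit", limit),
    ]
    return base_url + "?" + urlencode(params)
```

Letting `urlencode` do the escaping avoids broken requests when the typed prefix contains spaces or other reserved characters.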
  




The contents of this eMail including the contents of attachment(s) are 
privileged and confidential material of Gateway NINtec Pvt. Ltd. (GNPL) and 
should not be disclosed to, used by or copied in any manner by anyone other 
than the intended addressee(s). If this eMail has been received by error, 
please advise the sender immediately and delete it from your system. The views 
expressed in this eMail message are those of the individual sender, except 
where the sender expressly, and with authority, states them to be the views of 
GNPL. Any unauthorized review, use, disclosure, dissemination, forwarding, 
printing or copying of this eMail or any action taken in reliance on this eMail 
is strictly prohibited and may be unlawful. This eMail may contain viruses. 
GNPL has taken every reasonable precaution to minimize this risk, but is not 
liable for any damage you may sustain as a result of any virus in this eMail. 
You should carry out your own virus checks before opening the eMail or 
attachment(s). GNPL is neither liable for the proper and complete transmission 
of the information contained in this communication nor for any delay in its 
receipt. GNPL reserves the right to monitor and review the content of all 
messages sent to or from this eMail address and may be stored on the GNPL eMail 
system. In case this eMail has reached you in error, and you  would no longer 
like to receive eMails from us, then please send an eMail to 
d...@gatewaynintec.com

Re: enhancing auto complete

2010-08-02 Thread Bhavnik Gajjar

Whoops!

The table still doesn't look OK :(

Trying to send it once again:

lorem        Lorem ipsum dolor sit amet
             Hieyed ddi lorem ipsum dolor
             test lorem ipsume
             test xyz lorem ipslili

lorem ip     Lorem ipsum dolor sit amet
             Hieyed ddi lorem ipsum dolor
             test lorem ipsume
             test xyz lorem ipslili

lorem ipsl   test xyz lorem ipslili









Stemming issue on spell correction and auto complete

2010-04-21 Thread Dhanushka Samarakoon

Hi,
I have some issues with stemming on spell corrections and auto completes.
Given below is a sample record from my docs.

<doc>
  <field name="id">K-82</field>
  <field name="GrantNumber">22570</field>
  <field name="GrantTitle">Extension IPM Coordination Program</field>
  <field name="FundingSponsor">US Department of Agriculture</field>
  <field name="DateAwarded">2009-06-11T00:00:00Z</field>
  <field name="AwardTotal">180</field>
  <field name="ProgramName">Extension Integrated Management Coordination Program</field>
  <field name="DepartmentName">Agronomy</field>
  <field name="DepartmentName">Entomology</field>
  <field name="DepartmentName">Plant Pathology</field>
  <field name="CollegeName">Agriculture</field>
  <field name="CollegeName">Cooperative Extension</field>
  <field name="InvestigatorName">Phillips, Marcelo</field>
  <field name="InvestigatorName">Kennelly, James</field>
  <field name="InvestigatorName">Michaud, Alberto</field>
</doc>

This is the schema.xml

   <field name="id" type="string" indexed="true" stored="true" required="true" />
   <field name="GrantNumber" type="string" indexed="true" stored="true" required="true" />
   <field name="GrantTitle" type="text" indexed="true" stored="true"/>

   <field name="FundingSponsor" type="text" indexed="true" stored="true" />
   <field name="fFundingSponsor" type="string" indexed="true" stored="false" />
   <copyField source="FundingSponsor" dest="fFundingSponsor"/>

   <field name="AwardTotal" type="float" indexed="true" stored="true" />

   <field name="ThroughSponsor" type="text" indexed="true" stored="true" multiValued="true" />
   <field name="fThroughSponsor" type="string" indexed="true" stored="false" multiValued="true" />
   <copyField source="ThroughSponsor" dest="fThroughSponsor"/>

   <field name="fSponsor" type="string" indexed="true" stored="false" multiValued="true" />
   <copyField source="FundingSponsor" dest="fSponsor"/>
   <copyField source="ThroughSponsor" dest="fSponsor"/>

   <field name="ProgramName" type="text" indexed="text" stored="true" />
   <field name="DateAwarded" type="date" indexed="true" stored="true" />

   <field name="DepartmentName" type="text" indexed="true" stored="true" multiValued="true" />
   <field name="fDepartmentName" type="string" indexed="true" stored="false" multiValued="true" />
   <copyField source="DepartmentName" dest="fDepartmentName"/>

   <field name="CollegeName" type="text" indexed="true" stored="true" multiValued="true" />
   <field name="fCollegeName" type="string" indexed="true" stored="false" multiValued="true" />
   <copyField source="CollegeName" dest="fCollegeName"/>

   <field name="InvestigatorName" type="text" indexed="true" stored="true" multiValued="true" protected="true" />
   <field name="fInvestigatorName" type="string" indexed="true" stored="false" multiValued="true" />
   <copyField source="InvestigatorName" dest="fInvestigatorName"/>

   <field name="fSpell" type="text" indexed="true" stored="false" multiValued="true" />
   <copyField source="FundingSponsor" dest="fSpell"/>
   <copyField source="ThroughSponsor" dest="fSpell"/>
   <copyField source="ProgramName" dest="fSpell"/>
   <copyField source="InvestigatorName" dest="fSpell"/>
   <copyField source="DepartmentName" dest="fSpell"/>

solrconfig.xml

  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

    <str name="queryAnalyzerFieldType">textSpell</str>

    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="classname">solr.IndexBasedSpellChecker</str>
      <str name="field">fSpell</str>
      <str name="spellcheckIndexDir">./spellchecker</str>
      <str name="buildOnCommit">true</str>
      <str name="accuracy">0.5</str>
    </lst>
  </searchComponent>

1) Spelling issue
If I submit the following URL to get spelling suggestions for
'humanites', I get 'human' instead of 'humanities'.
http://localhost:8983/solr/spell/?q=humanites&spellcheck=true&spellcheck.collate=true

It seems that if I change the type of 'fSpell' to 'string', the query would
not work. Any suggestions?

2) Auto complete issue

Currently I'm using the following URL for auto complete.
http://lib-dev-web1.lib.campus:8983/solr/terms?terms.fl=fSpell&terms.sort=index&terms.prefix=huma&indent=true&wt=php&omitHeader=true

Then I get 'human' as the only suggestion, but I would rather get a
few suggestions like 'humanities' and 'human ecology'.

I also tried the following URL.
http://lib-dev-web1.lib.campus:8983/solr/select/?incident=on&qt=dismax&facet=on&rows=0&facet.limit=10&facet.mincount=1&facet.field=GrantTitle&q=kansas&facet.prefix=st

For auto complete I would like to be able to create a field with
investigator names, grant titles, program names, department names, etc.,
and to match strings against it. For example, if I type 'hum' I would like
to get suggestions such as the ones given below.

- Hummer, David (a name)
- Evaluation of Live Attenuated B. Melitensis Vaccines in Non-Human Primates
(a grant title)
- Program of global humanities (a program name)
- Humanities and Cultural bonding (a program name)
- Department of Human Nutrition (a department name)
- College of Human Ecology (a college name)
- Human-Robot Teams Informed By Human Performance Moderator Functions (a
grant title)

Any ideas on how to achieve this?
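
To make the requirement concrete, here is a minimal Python sketch of the matching described above: values from several fields are pooled into one list, and an entry is suggested when any of its words starts with the typed prefix. It is not Solr configuration, and the names `SOURCES` and `suggest` are invented for the illustration. The entries are the examples from the message, plus "Agronomy" from the sample record as a non-matching case.

```python
import re

# (value, kind) pairs pooled from the different source fields.
SOURCES = [
    ("Hummer, David", "a name"),
    ("Evaluation of Live Attenuated B. Melitensis Vaccines in Non-Human Primates",
     "a grant title"),
    ("Program of global humanities", "a program name"),
    ("Humanities and Cultural bonding", "a program name"),
    ("Department of Human Nutrition", "a department name"),
    ("College of Human Ecology", "a college name"),
    ("Human-Robot Teams Informed By Human Performance Moderator Functions",
     "a grant title"),
    ("Agronomy", "a department name"),
]


def suggest(prefix, sources=SOURCES, limit=10):
    """Suggest whole entries in which any word starts with prefix."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    hits = []
    for value, kind in sources:
        # Split on anything that is not a letter or digit, so that
        # "Non-Human" yields the separate words "non" and "human".
        words = re.findall(r"[a-z0-9]+", value.lower())
        if any(word.startswith(prefix) for word in words):
            hits.append(f"{value} ({kind})")
    return hits[:limit]
```

The sketch returns the unstemmed original strings, which is the key difference from reading terms out of a stemmed text field.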

Thanks,
Dhanushka.

Re: Return one word - Auto Complete Request Handler

2009-09-15 Thread Grant Ingersoll



On Sep 14, 2009, at 2:06 PM, Mohamed Parvez wrote:


> I am trying to configure a request handler that will be used in the Auto
> Complete Query.
>
> I am limiting the result to one field by using the fl parameter, which can
> be used to specify the field to return.
>
> How can I make the field return only one word, not full sentences?




Is http://wiki.apache.org/solr/TermsComponent helpful?
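
As a rough illustration of why the TermsComponent fits this question: it works on the indexed terms of a field rather than on stored field values, so what comes back is single terms with counts instead of full sentences. The Python sketch below only imitates that idea with a simple lowercase word split; the real terms depend on the field's analyzer, and the function name `terms_with_prefix` is made up for the example.

```python
import re
from collections import Counter


def terms_with_prefix(field_values, prefix, limit=10):
    """Return (term, document_count) pairs for terms starting with prefix.

    Each field value stands for one document; a term is counted once
    per document, and results are ordered by count, then alphabetically.
    """
    doc_freq = Counter()
    for value in field_values:
        doc_freq.update(set(re.findall(r"[a-z0-9]+", value.lower())))
    wanted = prefix.lower()
    matches = [(term, count) for term, count in doc_freq.items()
               if term.startswith(wanted)]
    matches.sort(key=lambda item: (-item[1], item[0]))
    return matches[:limit]
```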


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search

Return one word - Auto Complete Request Handler

2009-09-14 Thread Mohamed Parvez

I am trying to configure a request handler that will be used in the Auto
Complete Query.

I am limiting the result to one field by using the fl parameter, which can
be used to specify the field to return.

How can I make the field return only one word, not full sentences?



Thanks/Regards,
Parvez

RE: Auto complete

2008-07-10 Thread sundar shankar

Daniel,
  I have tested your autocomplete config. It works perfectly.
THANKS A LOT FOR THAT. I truly appreciate your help.
 
All,
  I am not able to find wiki pages for a lot of the 1.3 filters and
analysers. Is there somewhere I can get documentation on them?
 
-S




Re: Auto complete

2008-07-08 Thread daniel rosher

Hi,

This is how we implement our autocomplete feature, excerpt from
schema.xml

- First, accept the input as is, without alteration.
- Lowercase the input and eliminate all non a-z0-9 chars to normalize
  the input.
- Split into multiple tokens with EdgeNGramFilterFactory, up to a max of
  100 chars, all starting from the beginning of the input, e.g. hello
  becomes h,he,hel,hell,hello.
- For queries we accept the first 20 chars.

Hope this helps.


<fieldType name="autocomplete" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([^a-z0-9])" replacement="" replace="all" />
    <filter class="solr.EdgeNGramFilterFactory"
            maxGramSize="100" minGramSize="1" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([^a-z0-9])" replacement="" replace="all" />
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^(.{20})(.*)?" replacement="$1" replace="all" />
  </analyzer>
</fieldType>
...
<field name="ac" type="autocomplete" indexed="true" stored="true"
       required="false" />
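
For anyone who wants to see what these two analyzer chains do to a value, the following Python sketch imitates them step by step. It is a simulation for illustration only, not Solr code; the function names are invented, while the regular expression, the gram sizes and the 20-character cut-off are the ones from the config above.

```python
import re


def normalize(text):
    """Lowercase, then drop every character that is not a-z or 0-9."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def index_tokens(text, min_gram=1, max_gram=100):
    """Index side: the whole value is one token, expanded into
    edge n-grams that all start at the beginning of the value."""
    token = normalize(text)
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def query_token(text, max_chars=20):
    """Query side: same normalization, truncated to 20 characters."""
    return normalize(text)[:max_chars]


def matches(indexed_value, user_input):
    """True if the normalized input equals one of the indexed grams."""
    query = query_token(user_input)
    return bool(query) and query in index_tokens(indexed_value)
```

With this chain, typing "hello w" matches a value "Hello World" because spaces are removed and the grams all begin at the start of the value. Typing "world" does not match, since no gram starts in the middle of the value.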

Regards,
Dan




On Mon, 2008-07-07 at 17:12 +, sundar shankar wrote:
 Hi All,
 I have been using Solr for some time and am having trouble with an auto
 complete feature that I have been trying to incorporate. I am indexing with
 a database column to Solr field mapping. I have tried various configs that
 were mentioned in the solr-user community suggestions and have tried a few
 options of my own too. Each of them seems either not to bring me the exact
 data I want or to return excess data.
 
 I have tried:
 text_ws,
 text,
 string,
 EdgeNGramTokenizerFactory,
 the subword example,
 textTight,
 and juggling around some of the filters and analysers together.
 
 I couldn't get dismax to work, as somehow it wasn't able to connect my field
 defined in the schema to the qf param that I was passing in the request.
 
 textTight gave the best results I had, but the problem there was that it was
 searching for whole words and not partial words.
 For example:
 
 if my query string was field1:Word1 word2* I was getting back results, but if
 my query string was field1: Word1 wor* I didn't get a result back.
 
 I am a little perplexed about how to implement this. I don't know what has to
 be done.
 
 The schema
 
 
 <field name="institution.name" type="text_ws" indexed="true" stored="true" termVectors="true"/>
 <!--Sundar changed city to subword so that spaces are ignored-->
 
 <field name="instAlphaSort" type="alphaOnlySort" indexed="true" stored="false" multiValued="true"/>
 <!-- Tight text cos we want results to be much the same for this-->
 <field name="instText" type="text" indexed="true" stored="true" termVectors="true" multiValued="true"/>
 <field name="instString" type="autosuggest" indexed="true" stored="true" termVectors="true" multiValued="true"/>
 
 <field name="instSubword" type="subword" indexed="true" stored="true" multiValued="true" termVectors="true"/>
 <field name="instTight" type="textTight" indexed="true" stored="true" multiValued="true" termVectors="true"/>
 
 
 
 I Index institution.name only, the rest are copy fields of the same.
 
 
 Any help is appreciated.
 
 Thanks
 Sundar
 
Daniel Rosher
Developer
www.thehotonlinenetwork.com
d: 0207 3489 912
t: 0845 4680 568
f: 0845 4680 868
m:
Beaumont House, Kensington Village, Avonmore Road, London, W14 8TS

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

This message is sent in confidence for the addressee only. It may contain
privileged information. The contents are not to be disclosed to anyone other
than the addressee. Unauthorised recipients are requested to preserve this
confidentiality and to advise us of any errors in transmission. Thank you.

hotonline ltd is registered in England & Wales. Registered office: One
Canada Square, Canary Wharf, London E14 5AP. Registered No: 1904765.

RE: Auto complete

2008-07-08 Thread sundar shankar

Hi Daniel,
 Thanks for the code. I noticed that you use
EdgeNGramFilterFactory. I didn't find it in the 1.2 Solr version. Which version
are you using for this? 1.3 isn't out yet, right? Is there any other production
version of Solr available that I can use?
 
Regards
Sundar




Re: Auto complete

2008-07-08 Thread Shalin Shekhar Mangar

He must be using a nightly build of Solr 1.3 -- I think you can consider
using it as it is quite stable and close to release.

On Tue, Jul 8, 2008 at 10:38 PM, sundar shankar [EMAIL PROTECTED]
wrote:

 Hi Daniel,
 Thanks for the code. I noticed that you use
 EdgeNGramFilterFactory. I didn't find it in the 1.2 Solr version. Which
 version are you using for this? 1.3 isn't out yet, right? Is there any other
 production version of Solr available that I can use?

 Regards
 Sundar




Auto complete

2008-07-07 Thread sundar shankar


Hi All,
   I have been using Solr for some time and am having trouble with an auto 
complete feature that I have been trying to incorporate. I am indexing into 
Solr with each database column mapped to a Solr field. I have tried various 
configs that were mentioned in the solr user community suggestions and have 
tried a few options of my own too. Each of them either fails to bring back the 
exact data I want or brings back excess data.

I have tried:
text_ws,
text,
string,
EdgeNGramTokenizerFactory,
the subword example,
textTight,
and juggling around some of the filters and analysers together.

I couldn't get dismax to work, as somehow it wasn't able to connect the field 
defined in my schema to the qf param that I was passing in the request.

textTight gave the best results, but the problem there was that it matched 
whole words and not partial words. For example, if my query string was 
field1:Word1 word2* I got results back, but if my query string was 
field1: Word1 wor* I didn't get a result back.

I am a little perplexed on how to implement this. I don't know what has to be done.

The schema:

   <field name="institution.name" type="text_ws" indexed="true" stored="true" termVectors="true"/>
   <!--Sundar changed city to subword so that spaces are ignored-->

   <field name="instAlphaSort" type="alphaOnlySort" indexed="true" stored="false" multiValued="true"/>
   <!-- Tight text cos we want results to be much the same for this-->
   <field name="instText" type="text" indexed="true" stored="true" termVectors="true" multiValued="true"/>
   <field name="instString" type="autosuggest" indexed="true" stored="true" termVectors="true" multiValued="true"/>

   <field name="instSubword" type="subword" indexed="true" stored="true" multiValued="true" termVectors="true"/>
   <field name="instTight" type="textTight" indexed="true" stored="true" multiValued="true" termVectors="true"/>



I index institution.name only; the rest are copy fields of the same.


Any help is appreciated.

Thanks
Sundar
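
For readers following the thread, the partial-word matching asked about above (a query such as "Word1 wor" finding "Word1 word2") is the general idea behind edge n-grams, which the post mentions trying via EdgeNGramTokenizerFactory. The sketch below is plain Python and only illustrates the idea; it is not Solr's analyzer, and the function names edge_ngrams, index_terms and matches are hypothetical, as are the min_gram and max_gram defaults.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the leading-edge n-grams of a token: 'word' -> w, wo, wor, word."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text, min_gram=1, max_gram=15):
    """Lowercase the text, split on whitespace, and collect the edge n-grams
    of every word. This stands in for index-time analysis (an assumption)."""
    terms = set()
    for word in text.lower().split():
        terms.update(edge_ngrams(word, min_gram, max_gram))
    return terms


def matches(query, text):
    """True if every lowercased query word equals one of the indexed n-grams.
    No wildcard is needed, because the prefixes were stored at index time."""
    terms = index_terms(text)
    return all(word in terms for word in query.lower().split())


if __name__ == "__main__":
    print(edge_ngrams("word"))
    print(matches("Word1 wor", "Word1 word2"))
```

Because every prefix of each word is stored when the text is indexed, the partial word "wor" is matched as an ordinary term rather than through a wildcard query. The trade-off is a larger index, since each word contributes one term per prefix length.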
