Re: Multi-language solr1.3 what would you reckon?

sunnyfr Wed, 15 Oct 2008 06:04:27 -0700

But regarding stopwords and stemming: is it a real problem if a single core
has several stemmers and stopword lists (each under a different name)?
Should that work?




Hannes Carl Meyer-2 wrote:
> 
> Hi,
> 
> yes, if you don't handle a specific language (stopwords, stemming, etc.),
> you should create a general core.
> 
> In my project I support 10 languages; if I get content in an unsupported
> language, it is logged and discarded right away!
> 
> Boosting across multiple cores is indeed a problem. One idea would be to
> merge the result sets from core0 and core1 and sort by score?
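A minimal sketch of that merge idea, in Python. It assumes each core's response has already been parsed into a list of dicts carrying at least `id` and `score` (Solr returns the score when `fl` includes `score`). The function name `merge_by_score` and the per-core `boosts` multiplier are hypothetical. Keep in mind that each core computes scores from its own term statistics, so scores from different cores are only roughly comparable; the multiplier is one crude way to favour a language.

```python
def merge_by_score(results_by_core, boosts=None, limit=10):
    """Merge per-core result lists into one list sorted by (boosted) score.

    results_by_core: dict mapping core name -> list of docs, each a dict
                     with at least "id" and "score" (assumed shape).
    boosts:          optional dict mapping core name -> multiplier
                     (hypothetical knob for preferring one language).
    """
    boosts = boosts or {}
    merged = []
    for core, docs in results_by_core.items():
        factor = boosts.get(core, 1.0)
        for doc in docs:
            # Copy so the caller's dicts are not modified.
            entry = dict(doc)
            entry["core"] = core
            entry["score"] = doc["score"] * factor
            merged.append(entry)
    # Highest score first; the sort is stable, so ties keep input order.
    merged.sort(key=lambda d: d["score"], reverse=True)
    return merged[:limit]


if __name__ == "__main__":
    hits = merge_by_score(
        {
            "core0": [{"id": "fr-1", "score": 1.2}, {"id": "fr-2", "score": 0.4}],
            "core1": [{"id": "en-1", "score": 0.9}],
        },
        boosts={"core1": 1.5},
    )
    print([(h["core"], h["id"]) for h in hits])
```

Whether a fixed multiplier is good enough depends on the data; normalising each core's scores before merging would be another option.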
> 
> Regards
> 
> Hannes
> 
> On Wed, Oct 15, 2008 at 1:50 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
> 
>>
>>
>> OK, MultiCore is indeed handy for avoiding one big index that handles
>> every language, but when you have one modification to make, you have to
>> make it on all of the cores.
>>
>> Another point is that it's complicated to boost one language more than
>> another. For example, for an Italian video search, if we don't have many
>> Italian videos, it might be more interesting to bring back English ones.
>>
>> And there are languages like Slovak that the website doesn't manage, but
>> people can still come from there ... so those videos would be stored in
>> core0, which would hold every language other than English, Spanish,
>> German ... French.
>> So it becomes a kind of garbage core for every language that isn't
>> managed ... and I think it might be hard to manage.
>>
>> What do you think?
>>
>>
>>
>> Hannes Carl Meyer-2 wrote:
>> >
>> > I attached an example for you.
>> >
>> > The challenge with MultiCore is in the client's search logic. It would
>> > help if you knew which language the person wants to search in. If not,
>> > you would have to send multiple requests to the multiple cores. The
>> > ordinary logic would be:
>> >
>> > 1. search "chien" in core0 (english)
>> > 2. if #1 returned zero results search for "chien" in core1 (french)
>> >
>> > ---
>> >
>> > In your client you could even parallelize the requests to minimize
>> > waiting time.
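A rough Python sketch of the two client strategies described above: sequential fallback, and querying all cores in parallel. The base URL, the core names and the helper names are assumptions for illustration, and the JSON response shape (`response.docs`) is what Solr's `wt=json` writer returns. The `fetch` argument is injectable so the logic can be tried without a running Solr.

```python
from concurrent.futures import ThreadPoolExecutor
import json
import urllib.parse
import urllib.request

# Hypothetical base URL; adjust to the real deployment.
SOLR_BASE = "http://localhost:8080/solr"


def fetch_docs(core, query, rows=10):
    """Query one core's /select handler and return its list of documents."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    url = "%s/%s/select?%s" % (SOLR_BASE, core, params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["response"]["docs"]


def search_with_fallback(query, cores, fetch=fetch_docs):
    """Try cores in order; return (core, docs) for the first non-empty hit."""
    for core in cores:
        docs = fetch(core, query)
        if docs:
            return core, docs
    return None, []


def search_parallel(query, cores, fetch=fetch_docs):
    """Query all cores at once, then pick the first non-empty one in order."""
    with ThreadPoolExecutor(max_workers=max(1, len(cores))) as pool:
        # pool.map preserves the order of `cores` in its results.
        all_docs = list(pool.map(lambda core: fetch(core, query), cores))
    for core, docs in zip(cores, all_docs):
        if docs:
            return core, docs
    return None, []
```

Something like `search_parallel("chien", ["core0", "core1"])` costs one round trip in wall-clock time instead of two, at the price of always hitting every core.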
>> >
>> > *One feature I haven't tried yet is DistributedSearch (and how it will
>> > help with multiple cores)*; you can find it here:
>> > http://wiki.apache.org/solr/DistributedSearch
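For what it's worth, a distributed request is an ordinary `/select` call with an extra `shards` parameter listing the indexes to fan out to. The Python sketch below only builds such a URL. The host, the core names and the `/solr/<core>` path layout are assumptions about the deployment, and `distributed_select_url` is a made-up helper name. As far as I know, distributed search expects the unique key to be unique across all shards, so ids must not collide between the language cores.

```python
import urllib.parse


def distributed_select_url(host, cores, query, rows=10):
    """Build a /select URL that fans one query out to several cores.

    host is "hostname:port" (no scheme); cores is a list of core names.
    """
    # Each shard is given as host:port/path, without the "http://" prefix.
    shards = ",".join("%s/solr/%s" % (host, core) for core in cores)
    params = urllib.parse.urlencode(
        {"q": query, "rows": rows, "shards": shards, "fl": "*,score"}
    )
    # The request itself goes to any one core, which coordinates the others.
    return "http://%s/solr/%s/select?%s" % (host, cores[0], params)


if __name__ == "__main__":
    print(distributed_select_url("localhost:8080", ["core0", "core1"], "chien"))
```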
>> >
>> > Regards,
>> >
>> > Hannes
>> >
>> > On Tue, Oct 14, 2008 at 4:26 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
>> >
>> >>
>> >> Thanks for this explanation, but just to make sure I understand it
>> >> properly:
>> >>
>> >> One core per language, with the same fields and schema, where only the
>> >> language-specific part and its handling differ?
>> >> And one core that holds every language not managed by Solr,
>> >> like Russian or ???
>> >> So different requests to the database...
>> >> OK.
>> >>
>> >> What I don't really get: when you look for the word 'chien' on the
>> >> English website, I want to get back results from French videos, because
>> >> 'chien' is French. So if it doesn't find any English video with 'chien',
>> >> I need my French videos then.
>> >>
>> >> Exactly the same for the users' core: if somebody looks for 'chien' and
>> >> there is a user with exactly that username, I would like to show it.
>> >>
>> >> thanks for your time, really,
>> >>
>> >>
>> >>
>> >> John E. McBride wrote:
>> >> >
>> >> > Fairly nebulous requirements, but I recently was involved in a
>> >> > multilingual search platform.
>> >> >
>> >> > The approach, translated to Solr 1.3, would be to use multicore: one
>> >> > core per geography. Then a schema.xml per core, each with a different
>> >> > language in the porter algorithm, stopwords etc., taken from Snowball.
>> >> >
>> >> > Then on the German front end you make requests to the de core, and on
>> >> > the English front end you make requests to the English core.
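A small Python sketch of that routing, assuming the front end knows its language code. The core names and the catch-all core for unsupported languages are hypothetical.

```python
# Hypothetical mapping from front-end language code to core name.
CORE_BY_LANG = {"en": "core_en", "de": "core_de", "fr": "core_fr"}
GENERAL_CORE = "core_general"  # catch-all for languages without their own core


def core_for_language(lang):
    """Return the core for a language code such as "de" or "de-AT"."""
    # Only the primary subtag matters here: "de-AT" is routed like "de".
    primary = lang.lower().split("-")[0]
    return CORE_BY_LANG.get(primary, GENERAL_CORE)


if __name__ == "__main__":
    for code in ("de", "de-AT", "sk"):
        print(code, "->", core_for_language(code))
```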
>> >> >
>> >> > This is much simpler than handling every language in the one index;
>> >> > for example, German queries will need to be run through the German
>> >> > query filters etc. If you have all languages in one schema, then you
>> >> > will have to do some front-end logic to map the query to the correct
>> >> > field.
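For comparison, the front-end mapping needed in the single-index case could look roughly like this in Python. The per-language field names (`title_de` and so on) are invented for illustration and are not taken from the attached schema.xml. The sketch also simplifies by treating the user's input as one quoted phrase.

```python
# Hypothetical per-language field names in a single shared schema.
FIELDS_BY_LANG = {
    "en": ["title_en", "description_en"],
    "de": ["title_de", "description_de"],
    "fr": ["title_fr", "description_fr"],
}


def build_query(user_query, lang):
    """Build a Lucene-syntax query restricted to one language's fields."""
    fields = FIELDS_BY_LANG[lang]
    # Quote the input as a phrase and escape embedded backslashes and quotes.
    escaped = user_query.replace("\\", "\\\\").replace('"', '\\"')
    return " OR ".join('%s:"%s"' % (field, escaped) for field in fields)


if __name__ == "__main__":
    print(build_query("hund", "de"))
```

Each of those fields would need its own field type with matching index-time and query-time analysis, which is the extra bookkeeping the per-core layout avoids.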
>> >> >
>> >> > You have failed to consider internationalisation of the query side of
>> >> > the process: your field types merely have analysis filters.
>> >> >
>> >> > Additionally, if the data source for each geography is different, it
>> >> > makes sense to separate the indexes and, subsequently, the ingestion
>> >> > mechanisms and schedules.
>> >> >
>> >> > Just a few thoughts.
>> >> >
>> >> > John
>> >> >
>> >> > sunnyfr wrote:
>> >> >> Hi,
>> >> >>
>> >> >> I would like to set up a multi-language search engine properly,
>> >> >> and I would like your advice on what I have done.
>> >> >>
>> >> >> Solr1.3
>> >> >> tomcat55
>> >> >>
>> >> >> http://www.nabble.com/file/p19954805/schema.xml schema.xml
>> >> >>
>> >> >> Thanks a lot,
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >> >
>> >>
>> >> --
>> >> View this message in context:
>> >> http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19974618.html
>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>> >>
>> >>
>> >
>> > Solr1.3 MultiCore Scenario
>> >
>> > core0 (french)        core1 (english)       ...   core8 (russian)
>> > |schema.xml           schema.xml                  schema.xml
>> > |- analyzers          |- analyzers                |- analyzers
>> > |-- FrenchAnalyzer    |-- EnglishAnalyzer         |-- RussianAnalyzer
>> > |-- FrenchStops       |-- EnglishStops            |-- RussianStops
>> > |- fields             |- fields                   |- fields
>> > |-- title             |-- title                   |-- title
>> > |-- description       |-- description             |-- description
>> > |-- id                |-- id                      |-- id
>> >
>>
>>
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19993036.html
Sent from the Solr - User mailing list archive at Nabble.com.
