But about stopwords and stemming: is it a real problem if a single core has several stemmers and stopword lists (each under a different name)? Should that work?
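To make the question concrete, here is a rough sketch of what I mean. It is not my actual schema: the type names (text_en, text_fr), the field names (title_en, title_fr) and the stopword file names are made up for illustration, and it leaves out everything else a complete schema.xml needs (id field, uniqueKey, and so on).

```xml
<!-- Sketch only: all type, field and file names here are hypothetical. -->
<schema name="multilang-sketch" version="1.1">
  <types>
    <!-- English: its own stopword list and its own Snowball stemmer -->
    <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English"/>
      </analyzer>
    </fieldType>

    <!-- French: same filter chain, different stopword list and stemmer language -->
    <fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_fr.txt"/>
        <filter class="solr.SnowballPorterFilterFactory" language="French"/>
      </analyzer>
    </fieldType>
  </types>

  <fields>
    <!-- One field per language, each bound to its own analysis chain -->
    <field name="title_en" type="text_en" indexed="true" stored="true"/>
    <field name="title_fr" type="text_fr" indexed="true" stored="true"/>
  </fields>
</schema>
```

With a layout like this, a single analyzer element is used at both index and query time, and the client would still have to decide which of title_en or title_fr to search.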

Hannes Carl Meyer-2 wrote:
>
> Hi,
>
> yes, if you don't handle (stopwords, stemming etc.) a specific language you
> should create a general core.
>
> In my project I'm supporting 10 languages and if I get unsupported languages
> they are logged and discarded right away!
>
> Boosting on multiple cores is indeed a problem. An idea would be to merge
> the result sets from core0 and core1 and sort by score?
>
> Regards
>
> Hannes
>
> On Wed, Oct 15, 2008 at 1:50 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
>
>> ok, MultiCore is handy indeed to avoid having one big index which manages
>> every language,
>> but when you have one modification to make you have to make it on all of
>> them.
>>
>> And the other point is that it's complicated to boost one language more
>> than another,
>> i.e. with an Italian video search, if we don't have that many videos then
>> it might be more interesting to bring back English ones.
>>
>> And if there is some language like Slovak which is not managed by the
>> website but people can come from there ... so the video will be stored in
>> core0, which will be every language which is not english, spanish, german
>> .. french.
>> So this is a kind of garbage core for every language which is not managed
>> ... and I think it might be hard to manage.
>>
>> What do you think?
>>
>>
>> Hannes Carl Meyer-2 wrote:
>> >
>> > I attached an example for you.
>> >
>> > The challenge with MultiCore is on the client's search logic. It would
>> > help if you know which language the person wants to search through. If
>> > not, you would have to perform multiple requests to the multiple cores.
>> > Ordinary logic would be:
>> >
>> > 1. search "chien" in core0 (english)
>> > 2. if #1 returned zero results search for "chien" in core1 (french)
>> >
>> > ---
>> >
>> > In your client you could even parallelize the requests to minimize
>> > waiting time.
>> >
>> > *One feature I didn't try yet is the DistributedSearch (and how it will
>> > help with multiple cores)*, find it here:
>> > http://wiki.apache.org/solr/DistributedSearch
>> >
>> > Regards,
>> >
>> > Hannes
>> >
>> > On Tue, Oct 14, 2008 at 4:26 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
>> >
>> >> Thanks for this explanation, but just to get it right:
>> >>
>> >> One core per language, so with the same fields and schema, and only the
>> >> language part and its management is different?
>> >> And one core which takes every language which is not managed by solr,
>> >> like russian or ???
>> >> So different requests to the database....
>> >> ok
>> >>
>> >> I just don't really get this: when you look for the word 'chien' on the
>> >> english website I want to get back results from french videos, because
>> >> chien is french, so if it doesn't find any english video with chien I
>> >> need my french videos then.
>> >>
>> >> Exactly the same for the users' core: if somebody looks for 'chien' and
>> >> there is one user with exactly the same username I would like to show
>> >> it.
>> >>
>> >> thanks for your time, really,
>> >>
>> >>
>> >> John E. McBride wrote:
>> >> >
>> >> > Fairly nebulous requirements, but I recently was involved in a
>> >> > multilingual search platform.
>> >> >
>> >> > The approach, translated to solr 1.3 would be to use multicore - one
>> >> > core per geography. Then a schema.xml per core, each with a different
>> >> > language in the porter algorithm, stopwords etc - taken from snowball.
>> >> >
>> >> > Then on the german front end you make requests to the de core, on the
>> >> > english front end make requests to the english core.
>> >> >
>> >> > This is much simpler than sorting every language in the one index, for
>> >> > example german queries will need to be run through the german query
>> >> > filters etc.
>> >> > If you have all languages in one schema, then you will
>> >> > have to do some front end logic to map the query to the correct field.
>> >> >
>> >> > You have failed to consider internationalisation of the query side of
>> >> > the process - your field type merely have analysis filters.
>> >> >
>> >> > Additionally, if the data source for each different geography is
>> >> > different it makes sense to separate the indexes and subsequently the
>> >> > ingestion mechanisms and schedules.
>> >> >
>> >> > Just a few thoughts.
>> >> >
>> >> > John
>> >> >
>> >> > sunnyfr wrote:
>> >> >> Hi,
>> >> >>
>> >> >> I would like to manage properly multi language search motor,
>> >> >> I would like your advice about what have I done.
>> >> >>
>> >> >> Solr1.3
>> >> >> tomcat55
>> >> >>
>> >> >> http://www.nabble.com/file/p19954805/schema.xml schema.xml
>> >> >>
>> >> >> Thanks a lot,
>> >> >>
>> >>
>> >> --
>> >> View this message in context:
>> >> http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19974618.html
>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>> >>
>> >
>> > Solr1.3 MultiCore Scenario
>> >
>> > core0 (french)        core1 (english)       ...   core8 (russian)
>> > |schema.xml           schema.xml                  schema.xml
>> > |- analyzers          |- analyzers                |- analyzers
>> > |-- FrenchAnalyzer    |-- EnglishAnalyzer         |-- RussianAnalyzer
>> > |-- FrenchStops       |-- EnglishStops            |-- RussianStops
>> > |- fields             |- fields                   |- fields
>> > |-- title             |-- title                   |-- title
>> > |-- description       |-- description             |-- description
>> > |-- id                |-- id                      |-- id
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19991949.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>

--
View this message in context: http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19993036.html
Sent from the Solr - User mailing list archive at Nabble.com.