Re: Multi-language solr1.3 what would you reckon?

2008-10-15 Thread sunnyfr


OK, MultiCore is indeed handy for avoiding one big index that handles every
language,
but when you need to make one modification you have to make it on all of the
cores.

Another point is that it's complicated to boost one language over
another.
For example, with an Italian video search, if we don't have many Italian
videos it might be more useful to bring back English ones.

And some languages, like Slovak, are not supported by the website, but
people can still come from there ... so those videos would be stored in
core0, which would hold every language other than English, Spanish, German ..
French.
That makes it a kind of garbage core for every unsupported language ...
and I think it might be hard to manage.

What do you think?




-- 
View this message in context: 
http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19991949.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Multi-language solr1.3 what would you reckon?

2008-10-15 Thread sunnyfr

Hi,

Sorry, I didn't get your example. Can you send it to me again?
Thanks,



-- 
View this message in context: 
http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19990348.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Multi-language solr1.3 what would you reckon?

2008-10-15 Thread Hannes Carl Meyer
It should work, but if you want to handle multiple languages in ONE index
you end up with a lot of filters and fields handled with different analyzers
in a SINGLE configuration.


Re: Multi-language solr1.3 what would you reckon?

2008-10-15 Thread sunnyfr

But about stopwords and stemming: is it a real issue if one core has several
stemmers and stopword lists (each with a different name)? Should it work?




Re: Multi-language solr1.3 what would you reckon?

2008-10-15 Thread Hannes Carl Meyer
Hi,

yes, if you don't handle a specific language (stopwords, stemming etc.) you
should create a general core.

In my project I'm supporting 10 languages, and if I get content in an
unsupported language it is logged and discarded right away!
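
A minimal sketch of that routing decision at indexing time, assuming each document already carries a language code. The core names, the `language` field and the `general_core` option are hypothetical and only illustrate the two choices discussed here (a general core versus log-and-discard):

```python
import logging

logger = logging.getLogger("indexer")

# Hypothetical mapping from language code to core name (not from the thread).
CORE_BY_LANGUAGE = {"en": "core_en", "fr": "core_fr", "de": "core_de"}


def route_document(doc, general_core=None):
    """Return the core a document should be indexed into, or None to discard it.

    If general_core is given, unsupported languages go there; otherwise the
    document is logged and dropped.
    """
    language = doc.get("language")
    core = CORE_BY_LANGUAGE.get(language, general_core)
    if core is None:
        logger.warning(
            "discarding document %s: unsupported language %r",
            doc.get("id"),
            language,
        )
    return core
```

Passing `general_core="core_general"` gives the "general core" behaviour; leaving it out gives the log-and-discard behaviour.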

Boosting on multiple cores is indeed a problem. An idea would be to merge
the result sets from core0 and core1 and sort by score?
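
A rough sketch of that merge on the client side, assuming each core was queried with `fl=*,score` so that every document carries a `score` value. The per-core `boosts` are a hypothetical knob for favouring one language. Note that scores come from separate indexes with their own term statistics, so they are only loosely comparable:

```python
def merge_by_score(results_by_core, boosts=None):
    """Merge per-core result lists into one list sorted by (boosted) score.

    results_by_core maps a core name to a list of docs, each with a "score".
    boosts optionally maps a core name to a multiplier (default 1.0).
    """
    boosts = boosts or {}
    merged = []
    for core, docs in results_by_core.items():
        weight = boosts.get(core, 1.0)
        for doc in docs:
            # Copy the doc so the caller's data is not modified.
            merged.append({**doc, "core": core, "score": doc["score"] * weight})
    merged.sort(key=lambda d: d["score"], reverse=True)
    return merged
```

For the Italian example, something like `merge_by_score(results, {"core_it": 2.0})` would push Italian hits up while still letting English videos fill the list.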

Regards

Hannes


Re: Multi-language solr1.3 what would you reckon?

2008-10-14 Thread sunnyfr

is it ??? 


sunnyfr wrote:
 
 Ok so actually multi-core is multi-index?
 Cheers for these links
 
 
 Hannes Carl Meyer-2 wrote:
 
 Nope, your schema defines a single index with all languages being
 stored.
 The other way would be MultiCore/MultipleIndexes as described here:
 http://wiki.apache.org/solr/CoreAdmin and
 http://wiki.apache.org/solr/MultipleIndexes#head-e517417ef9b96e32168b2cf35ab6ff393f360d59
 
 On Mon, Oct 13, 2008 at 5:05 PM, sunnyfr [EMAIL PROTECTED] wrote:
 

 But I don't get it: if you look at my schema.xml, isn't that what I've
 done, multi-index?
 So I was right?


 Hannes Carl Meyer-2 wrote:
 
  Hi Ralf,
 
  you should also check the example inside the Solr 1.3 download package!

  Managing multiple languages in multiple indexes really makes sense in
  terms of configuration effort (look at your big kahuna configuration
  file!) and performance, and it gives you additional scalability (since
  you index/search in multiple cores, which could theoretically be placed
  on different machines).

  But, from the perspective of the search client, you will have to execute
  searches on multiple cores simultaneously. If this is feasible, you
  should really think about using multiple indexes.
 
  Regards,
 
  Hannes
 
  On Mon, Oct 13, 2008 at 4:14 PM, Kraus, Ralf | pixelhouse GmbH 
  [EMAIL PROTECTED] wrote:
 
  Hannes Carl Meyer wrote:

  Hi,

  is it really necessary to put it all into one index? You could also use
  the Solr MultiCore/MultipleIndexes feature and separate by language.


  Is there a good webpage with information about the multi-index feature?
  I know http://wiki.apache.org/solr/MultipleIndexes but there is not
  enough info :-(


  Greets -Ralf-
 
 
 
 

 --
 View this message in context:
 http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19956421.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19970307.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Multi-language solr1.3 what would you reckon?

2008-10-14 Thread sunnyfr

Thanks for this explanation, but just to make sure I understand it properly:

One core per language, so with the same fields and schema, and only the
language-specific part and its management differ?
And one core that takes every language not managed by Solr,
like Russian or ???
So different requests to the database.
OK.

What I don't really get: when you search for the word 'chien' on the English
website, I want to get back results from French videos, because 'chien' is
French. So if it doesn't find any English video with 'chien', I need my
French videos.

Exactly the same for the users core: if somebody searches for 'chien' and
there is a user with exactly that username, I would like to show it.

Thanks for your time, really,



John E. McBride wrote:
 
 Fairly nebulous requirements, but I recently was involved in a 
 multilingual search platform.
 
 The approach, translated to solr 1.3 would be to use multicore - one 
 core per geography.  Then a schema.xml per core, each with a different 
 language in the porter algorithm, stopwords etc - taken from snowball.
 
 Then on the german front end you make requests to the de core, on the 
 english front end make requests to the english core.
 
 This is much simpler than storing every language in the one index; for
 example, German queries will need to be run through the German query
 filters etc.  If you have all languages in one schema, then you will
 have to do some front end logic to map the query to the correct field.
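
A minimal sketch of that front end logic for the single-index case, assuming the schema has one analyzed field per language named `title_<lang>`. The field naming and the language list are hypothetical; `df` is Solr's default-field request parameter:

```python
# Hypothetical set of languages that have their own analyzed field.
SUPPORTED_LANGUAGES = {"en", "fr", "de"}


def field_for(base_field, language, default_language="en"):
    """Pick the language-specific field, falling back to a default language."""
    lang = language if language in SUPPORTED_LANGUAGES else default_language
    return f"{base_field}_{lang}"


def build_query_params(user_query, language):
    """Build request parameters that send the query to the right field."""
    return {"q": user_query, "df": field_for("title", language)}
```

With one core per language this mapping disappears, because the front end only has to choose which core to call.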
 
 You have failed to consider internationalisation of the query side of
 the process - your field types merely have analysis filters.
 
 Additionally, if the data source for each different geography is 
 different it makes sense to separate the indexes and subsequently the 
 ingestion mechanisms and schedules.
 
 Just a few thoughts.
 
 John
 
 sunnyfr wrote:
 Hi,

 I would like to set up a proper multi-language search engine,
 and I would like your advice on what I have done.

 Solr1.3
 tomcat55

 http://www.nabble.com/file/p19954805/schema.xml schema.xml 

 Thanks a lot,

   
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19974618.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Multi-language solr1.3 what would you reckon?

2008-10-14 Thread Hannes Carl Meyer
I attached an example for you.

The challenge with MultiCore is on the client's search logic. It would help
if you know which language the person wants to search through. If not you
would have to perform multiple requests to the multiple cores. Ordinary
logic would be:

1. search chien in core0 (english)
2. if #1 returned zero results search for chien in core1 (french)

---

In your client you could even parallelize the requests to minimize waiting
time.
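
The client logic above could be sketched roughly like this. It assumes one core per language under a single Solr webapp, reachable at `/solr/<core>/select`, and the JSON response writer (`wt=json`). The host, port and core names are placeholders, and `fetch` is injectable so the logic can be tried without a running server:

```python
import json
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Placeholder base URL; adjust to your own installation.
SOLR_BASE = "http://localhost:8983/solr"


def fetch_docs(core, query):
    """Query one core's /select handler and return its list of documents."""
    params = urllib.parse.urlencode({"q": query, "wt": "json"})
    url = f"{SOLR_BASE}/{core}/select?{params}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)["response"]["docs"]


def search_with_fallback(query, cores, fetch=fetch_docs):
    """Try the cores in order and stop at the first one that returns hits."""
    for core in cores:
        docs = fetch(core, query)
        if docs:
            return core, docs
    return None, []


def search_parallel(query, cores, fetch=fetch_docs):
    """Query all cores at once, then keep the first non-empty result in priority order."""
    with ThreadPoolExecutor(max_workers=max(1, len(cores))) as pool:
        results = list(pool.map(lambda core: fetch(core, query), cores))
    for core, docs in zip(cores, results):
        if docs:
            return core, docs
    return None, []
```

The sequential version sends fewer requests; the parallel version trades extra load on the cores for lower waiting time when the first core has no hits.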

*One feature I didn't try yet is the DistributedSearch (and how it will help
with multiple cores)*, find it here:
http://wiki.apache.org/solr/DistributedSearch
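
For reference, a distributed request is an ordinary query with a `shards` parameter listing the cores to search, given as `host:port/path` without the scheme. The sketch below only builds such a URL; the host and core names are placeholders, and how well this behaves when every core has a different analyzer is not confirmed anywhere in this thread:

```python
import urllib.parse


def distributed_select_url(entry_core_url, shard_core_urls, query):
    """Build a /select URL that asks one core to search across several cores."""
    # The shards parameter expects host:port/path entries, so drop "http://".
    shards = ",".join(url.split("://", 1)[-1] for url in shard_core_urls)
    params = urllib.parse.urlencode({"q": query, "shards": shards})
    return f"{entry_core_url}/select?{params}"
```

Distributed search needs a unique key field shared by all cores, which matches the `id` field in the scenario below.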

Regards,

Hannes



Solr1.3 MultiCore Scenario

core0 (french)        core1 (english)       ...   core8 (russian)
|schema.xml           schema.xml                  schema.xml
|- analyzers          |- analyzers                |- analyzers
|-- FrenchAnalyzer    |-- EnglishAnalyzer         |-- RussianAnalyzer
|-- FrenchStops       |-- EnglishStops            |-- RussianStops
|- fields             |- fields                   |- fields
|-- title             |-- title                   |-- title
|-- description       |-- description             |-- description
|-- id                |-- id                      |-- id

Re: Multi-language solr1.3 what would you reckon?

2008-10-14 Thread Hannes Carl Meyer
Sorry, yes MultiCore means multiple indexes!

Regards,

Hannes

On Tue, Oct 14, 2008 at 11:53 AM, sunnyfr [EMAIL PROTECTED] wrote:


 is it ???


 sunnyfr wrote:
 
  Ok so actually multi-core is multi-index?
  Cheers for this links
 
 
  Hannes Carl Meyer-2 wrote:
 
  Nope, your schema defines a single index with alle languages being
  stored.
  The other way would be MultiCore/MultipleIndexes as described here:
  http://wiki.apache.org/solr/CoreAdmin and
 
 http://wiki.apache.org/solr/MultipleIndexes#head-e517417ef9b96e32168b2cf35ab6ff393f360d59
 
  On Mon, Oct 13, 2008 at 5:05 PM, sunnyfr [EMAIL PROTECTED] wrote:
 
 
  But I don't get, if you look in my schema.xml it's what I've done,
 multi
  index?
  So I was right ?
 
 
  Hannes Carl Meyer-2 wrote:
  
   Hi Ralf,
  
   you should also check on the example inside the Solr 1.3 download
  package!
  
   The management of multiple languages inside multiple indexes really
  makes
   sense in terms of configuration efforts (look at your big kahuna
   configuration file!), performance and gives an additional
  scalibility
   feature (in fact that you index/search in multiple cores which could
  be
   theoretically placed on different machines).
  
   But, from the perspecitve of the search client you will have to
  execute
   search processes on multiple cores simultaneously. If this is
 feasible
  you
   should really think about using multiple indexes.
  
   Regards,
  
   Hannes
  
   On Mon, Oct 13, 2008 at 4:14 PM, Kraus, Ralf | pixelhouse GmbH 
   [EMAIL PROTECTED] wrote:
  
   Hannes Carl Meyer schrieb:
  
   Hi,
  
   is it really neccessary to put it all into one index? You could
 also
  use
   the
   Solr MultiCore/MultipleIndexes feature and seperate by language.
  
  
   Is there a good webpage with infos about the multiindex-feature ?
   I know http://wiki.apache.org/solr/MultipleIndexes but there is not
   enough
   info :-(
  
  
   Greets -Ralf-
  
  
  
  
 
 
 
 
 
 
 

 --
 View this message in context:
 http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19970307.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread Hannes Carl Meyer
Hi,

is it really necessary to put it all into one index? You could also use the
Solr MultiCore/MultipleIndexes feature and separate by language.

Regards,

Hannes

On Mon, Oct 13, 2008 at 3:20 PM, sunnyfr [EMAIL PROTECTED] wrote:


 Hi,

 I would like to manage properly multi language search motor,
 I would like your advice about what have I done.

 Solr1.3
 tomcat55

 http://www.nabble.com/file/p19954805/schema.xml schema.xml

 Thanks a lot,

 --
 View this message in context:
 http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19954805.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread John E. McBride
Fairly nebulous requirements, but I recently was involved in a 
multilingual search platform.


The approach, translated to Solr 1.3, would be to use multicore: one 
core per geography, with a schema.xml per core. Each schema configures the 
stemmer (Porter algorithm), stopwords etc. for its language, taken from Snowball.


Then on the german front end you make requests to the de core, on the 
english front end make requests to the english core.


This is much simpler than storing every language in one index. For 
example, German queries need to be run through the German query 
filters etc. If you have all languages in one schema, then you will 
have to do some front-end logic to map the query to the correct field.
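
The per-front-end routing described above could be sketched roughly as follows. This is only an illustration: the core names ("de", "en"), the base URL and the helper name select_url are assumptions, not something taken from an actual setup.

```python
from urllib.parse import urlencode

# Hypothetical mapping from front-end locale to Solr core name.
CORE_BY_LOCALE = {"de": "de", "en": "en"}


def select_url(base_url, locale, query, rows=10):
    """Build the /select URL of the core that serves the given front end."""
    core = CORE_BY_LOCALE[locale]
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/{core}/select?{params}"


# The German front end only ever talks to the "de" core.
print(select_url("http://localhost:8080/solr", "de", "hund"))
```

With one core per language, the front end only has to pick the core; each core's own schema.xml takes care of the language-specific query analysis.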


You have not considered internationalisation of the query side of 
the process: your field types merely have analysis filters. 

Additionally, if the data source for each different geography is 
different it makes sense to separate the indexes and subsequently the 
ingestion mechanisms and schedules.


Just a few thoughts.

John

sunnyfr wrote:

Hi,

I would like to manage properly multi language search motor,
I would like your advice about what have I done.

Solr1.3
tomcat55

http://www.nabble.com/file/p19954805/schema.xml schema.xml 


Thanks a lot,

  




Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread sunnyfr

Hi,

Thanks guys for your answers, but I don't think I can use one core for each
language. For example, if somebody connects from Italy and there are not
many Italian books, then by default I want to show the few Italian books
but all the English ones as well.

Do you have an example?
I'm quite lost about it.


John E. McBride wrote:
 
 Fairly nebulous requirements, but I recently was involved in a 
 multilingual search platform.
 
 The approach, translated to solr 1.3 would be to use multicore - one 
 core per geography.  Then a schema.xml per core, each with a different 
 language in the porter algorithm, stopwords etc - taken from snowball.
 
 Then on the german front end you make requests to the de core, on the 
 english front end make requests to the english core.
 
 This is much simpler than sorting every language in the one index, for 
 example german queries will need to be run through the german query 
 filters etc.  If you have all languages in one schema, then you will 
 have to do some front end logic to map the query to the correct field.
 
 You have failed to consider internationalisation of the query side of 
 the process - your field type merely have analysis filters. 
 
 Additionally, if the data source for each different geography is 
 different it makes sense to separate the indexes and subsequently the 
 ingestion mechanisms and schedules.
 
 Just a few thoughts.
 
 John
 
 sunnyfr wrote:
 Hi,

 I would like to manage properly multi language search motor,
 I would like your advice about what have I done.

 Solr1.3
 tomcat55

 http://www.nabble.com/file/p19954805/schema.xml schema.xml 

 Thanks a lot,

   
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19955092.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread John E. McBride
Well, it's this section shown below, which would change from geography 
to geography.

Parameterise the EnglishPorterFilterFactory and protwords.

You could introduce logic in the front end which checks whether the number 
of results is zero and, if so, makes a call to the English core. But does 
that make logical sense? Why would a search in the Italian language bring up 
anything in the English index?
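
For what it's worth, the zero-results fallback mentioned above might look something like this on the client side. It is a minimal sketch: the search callable is a stand-in for whatever actually queries a core, and the name search_with_fallback is made up for the example.

```python
def search_with_fallback(query, cores, search):
    """Try each core in order and return the first non-empty result.

    `search` is any callable taking (core, query) and returning a list of
    hits; it is a placeholder for the real HTTP call to Solr.
    """
    for core in cores:
        results = search(core, query)
        if results:
            return core, results
    # No core returned anything.
    return None, []


if __name__ == "__main__":
    # Toy data standing in for two cores.
    fake_index = {"it": [], "en": ["video-42"]}
    core, hits = search_with_fallback(
        "tennis", ["it", "en"], lambda c, q: fake_index[c]
    )
    print(core, hits)
```

The order of the cores list encodes the preference, e.g. the visitor's own language first and English last.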


I think you need to explain your application in a little more detail.


<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         enablePositionIncrements=true ensures that a 'gap' is left to
         allow for accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

sunnyfr wrote:

Hi,

Thanks guys for your answer, but I don't think I can use multi-core for each
language, 
because for exemple if somebody is connected from Italia and if there is not

that much Italian's book,
so by default I will show up few italian books but all the english one as
well.

Do you have an example ? 
I'm quite lost about it,



John E. McBride wrote:
  
Fairly nebulous requirements, but I recently was involved in a 
multilingual search platform.


The approach, translated to solr 1.3 would be to use multicore - one 
core per geography.  Then a schema.xml per core, each with a different 
language in the porter algorithm, stopwords etc - taken from snowball.


Then on the german front end you make requests to the de core, on the 
english front end make requests to the english core.


This is much simpler than sorting every language in the one index, for 
example german queries will need to be run through the german query 
filters etc.  If you have all languages in one schema, then you will 
have to do some front end logic to map the query to the correct field.


You have failed to consider internationalisation of the query side of 
the process - your field type merely have analysis filters. 

Additionally, if the data source for each different geography is 
different it makes sense to separate the indexes and subsequently the 
ingestion mechanisms and schedules.


Just a few thoughts.

John

sunnyfr wrote:


Hi,

I would like to manage properly multi language search motor,
I would like your advice about what have I done.

Solr1.3
tomcat55

http://www.nabble.com/file/p19954805/schema.xml schema.xml 


Thanks a lot,

  
  





  




Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread sunnyfr

What is the problem with the way I've done it?
Does that mean some content is tied to languages that the search
won't manage?
There are too many languages. The application is for video:
we will manage around 10 languages, but in our database we have around 25
languages.
Should I create a core text and others like text_en, text_fr, text_es, so
that all the videos which are not in a language managed by the search engine
are stored in text?

Because even on the English website, users who enter a French word such as
chien (dog) should be able to find French videos.
I don't know if I'm clear?

And would text then handle all the other languages which are not managed
in the other cores?
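
To make the idea concrete, here is a rough sketch of how documents could be routed to a per-language core with a catch-all for everything else. The core names come from the question above; the language codes and the function name core_for_language are assumptions for illustration only.

```python
# Languages the search engine manages, mapped to their cores.
# Only three of the roughly ten managed languages are listed here.
MANAGED_CORES = {"en": "text_en", "fr": "text_fr", "es": "text_es"}

# Catch-all core for every language that is not managed.
CATCH_ALL_CORE = "text"


def core_for_language(lang):
    """Return the core a video in the given language would be indexed into."""
    return MANAGED_CORES.get(lang.lower(), CATCH_ALL_CORE)


print(core_for_language("fr"))  # managed language
print(core_for_language("sk"))  # unmanaged language, goes to the catch-all
```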

thanks


John E. McBride wrote:
 
 Well, it's this section shown below, which would change from geography 
 to geography.
 Parameterise the EnglishPorterFilterFactory and protwords.
 
 You could introduce logic in the front end which asks if num results is 
 zero then makes a call to the english language, but it doesn't make 
 logical sense?  why would a search in the italian language bring up 
 anything in the english index?
 
 I think you need to explain your application in a little more detail.
 
 
 <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <!-- in this example, we will only use synonyms at query time
     <filter class="solr.SynonymFilterFactory"
             synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
     -->
     <!-- Case insensitive stop word removal.
          enablePositionIncrements=true ensures that a 'gap' is left to
          allow for accurate phrase queries.
     -->
     <filter class="solr.StopFilterFactory" ignoreCase="true"
             words="stopwords.txt" enablePositionIncrements="true"/>
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
             catenateAll="0" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
             ignoreCase="true" expand="true"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true"
             words="stopwords.txt"/>
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
             generateNumberParts="1" catenateWords="0" catenateNumbers="0"
             catenateAll="0" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
 </fieldType>
 
 sunnyfr wrote:
 Hi,

 Thanks guys for your answer, but I don't think I can use multi-core for
 each
 language, 
 because for exemple if somebody is connected from Italia and if there is
 not
 that much Italian's book,
 so by default I will show up few italian books but all the english one as
 well.

 Do you have an example ? 
 I'm quite lost about it,


 John E. McBride wrote:
   
 Fairly nebulous requirements, but I recently was involved in a 
 multilingual search platform.

 The approach, translated to solr 1.3 would be to use multicore - one 
 core per geography.  Then a schema.xml per core, each with a different 
 language in the porter algorithm, stopwords etc - taken from snowball.

 Then on the german front end you make requests to the de core, on the 
 english front end make requests to the english core.

 This is much simpler than sorting every language in the one index, for 
 example german queries will need to be run through the german query 
 filters etc.  If you have all languages in one schema, then you will 
 have to do some front end logic to map the query to the correct field.

 You have failed to consider internationalisation of the query side of 
 the process - your field type merely have analysis filters. 

 Additionally, if the data source for each different geography is 
 different it makes sense to separate the indexes and subsequently the 
 ingestion mechanisms and schedules.

 Just a few thoughts.

 John

 sunnyfr wrote:
 
 Hi,

 I would like to manage properly multi language search motor,
 I would like your advice about what have I done.

 Solr1.3
 tomcat55

 http://www.nabble.com/file/p19954805/schema.xml schema.xml 

 Thanks a lot,

   
   

 

   
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19955411.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread Kraus, Ralf | pixelhouse GmbH

Hannes Carl Meyer wrote:

Hi,

is it really necessary to put it all into one index? You could also use the
Solr MultiCore/MultipleIndexes feature and separate by language.
  

Is there a good web page with information about the multi-index feature?
I know http://wiki.apache.org/solr/MultipleIndexes but there is not 
enough info :-(



Greets -Ralf-



Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread John E. McBride

In your schema you define each field as follows:

<fieldtype name="text_it" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Italian"/>
  </analyzer>
</fieldtype>

etc

However, you have not defined the query filters. If you do not do this 
then you will not get any matches for searches in different languages.


For example, in English, if you index the sentence "the joyful boy played 
tennis", this would typically get stored as "joy boy play tennis" due to 
the analysis filters. If you then made a query for "joyful" without 
applying the same filters on the query side, you would get no matches.
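
The effect described above can be reproduced with a toy analyzer. This is only a sketch: the suffix stripping below is a crude stand-in for a real Porter/Snowball stemmer, and the stop word list and function names are invented for the example. (As far as I know, a single analyzer element without a type attribute is applied by Solr at both index and query time, so the mismatch mainly arises when the two sides are configured differently.)

```python
STOPWORDS = {"the"}
# Toy suffix list, NOT the real Porter algorithm.
SUFFIXES = ("ful", "ed")


def analyze(text):
    """Lowercase, drop stop words and strip a known suffix from each token."""
    tokens = []
    for word in text.lower().split():
        if word in STOPWORDS:
            continue
        for suffix in SUFFIXES:
            # Keep at least three characters of the stem.
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                word = word[: -len(suffix)]
                break
        tokens.append(word)
    return tokens


indexed = set(analyze("the joyful boy played tennis"))
print(sorted(indexed))

# Raw query term: not analyzed, so it does not match the stored token.
print("joyful" in indexed)
# Query term run through the same analysis: matches.
print(analyze("joyful")[0] in indexed)
```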


You will also want to get stop word lists for each language from the 
Snowball website, e.g. http://snowball.tartarus.org/algorithms/german/stop.txt.


sunnyfr wrote:
What is the problem with the way that I've done, 
Does that's means that there is some which are linked with language that we

won't manage by search,
there is too many language, the application will be for video,
we will manage around 10 language, but in our database we have around  25
language, 
Should i create a core text and others like text_en, text_fr, text_es, and

all the video which are not in this language manage by the search engine
should be stored in text ?

Because even if they are on the english website they should be able if they
enter a french word chien for dog
to find french videos.
I don't know if I'm clear??

and even so text should manage all the other language which are not managed
in the other cores ?? 


thanks


John E. McBride wrote:
  
Well, it's this section shown below, which would change from geography 
to geography.

Parameterise the EnglishPorterFilterFactory and protwords.

You could introduce logic in the front end which asks if num results is 
zero then makes a call to the english language, but it doesn't make 
logical sense?  why would a search in the italian language bring up 
anything in the english index?


I think you need to explain your application in a little more detail.


<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         enablePositionIncrements=true ensures that a 'gap' is left to
         allow for accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

sunnyfr wrote:


Hi,

Thanks guys for your answer, but I don't think I can use multi-core for
each
language, 
because for exemple if somebody is connected from Italia and if there is

not
that much Italian's book,
so by default I will show up few italian books but all the english one as
well.

Do you have an example ? 
I'm quite lost about it,



John E. McBride wrote:
  
  
Fairly nebulous requirements, but I recently was involved in a 
multilingual search platform.


The approach, translated to solr 1.3 would be to use multicore - one 
core per geography.  Then a schema.xml per core, each with a different 
language in the porter algorithm, stopwords etc - taken from snowball.


Then on the german front end you make requests to the de core, on the 
english front end make requests to the english core.


This is much simpler than sorting every language in the one index, for 
example german queries will need to be run through the german query 
filters etc.  If you have all languages in one schema, then you will 
have to do some front end logic to map the query to the correct field.


You have failed to consider internationalisation of the query side of 
the process - your field type merely have analysis filters. 

Additionally, if the data source for each 

Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread Hannes Carl Meyer
Hi Ralf,

you should also check on the example inside the Solr 1.3 download package!

Managing multiple languages in multiple indexes really makes sense in
terms of configuration effort (look at your big kahuna configuration
file!) and performance, and it gives you an additional scalability
feature (you index/search in multiple cores, which could
theoretically be placed on different machines).

But, from the perspective of the search client, you will have to execute
search processes on multiple cores simultaneously. If this is feasible you
should really think about using multiple indexes.
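
On the client side, querying several cores at once could be sketched like this. It assumes a search callable that performs the actual request to one core; that callable and the name search_all_cores are placeholders, not part of Solr or any client library.

```python
from concurrent.futures import ThreadPoolExecutor


def search_all_cores(query, cores, search):
    """Run `search(core, query)` for every core in parallel.

    Returns a dict mapping each core name to its list of hits.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(cores))) as pool:
        futures = {core: pool.submit(search, core, query) for core in cores}
        # result() blocks until the corresponding request has finished.
        return {core: future.result() for core, future in futures.items()}


if __name__ == "__main__":
    # Stand-in for a real HTTP request to a core.
    def fake_search(core, query):
        return [f"{core}:{query}"]

    print(search_all_cores("chien", ["core0", "core1"], fake_search))
```

The total waiting time is then roughly that of the slowest core rather than the sum over all cores; merging and ranking the per-core results is left to the client.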

Regards,

Hannes

On Mon, Oct 13, 2008 at 4:14 PM, Kraus, Ralf | pixelhouse GmbH 
[EMAIL PROTECTED] wrote:

 Hannes Carl Meyer schrieb:

 Hi,

 is it really neccessary to put it all into one index? You could also use
 the
 Solr MultiCore/MultipleIndexes feature and seperate by language.


 Is there a good webpage with infos about the multiindex-feature ?
 I know http://wiki.apache.org/solr/MultipleIndexes but there is not enough
 info :-(


 Greets -Ralf-




Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread sunnyfr

But I don't get it: if you look at my schema.xml, isn't that what I've
done, multiple indexes?
So I was right?


Hannes Carl Meyer-2 wrote:
 
 Hi Ralf,
 
 you should also check on the example inside the Solr 1.3 download package!
 
 The management of multiple languages inside multiple indexes really makes
 sense in terms of configuration efforts (look at your big kahuna
 configuration file!), performance and gives an additional scalibility
 feature (in fact that you index/search in multiple cores which could be
 theoretically placed on different machines).
 
 But, from the perspecitve of the search client you will have to execute
 search processes on multiple cores simultaneously. If this is feasible you
 should really think about using multiple indexes.
 
 Regards,
 
 Hannes
 
 On Mon, Oct 13, 2008 at 4:14 PM, Kraus, Ralf | pixelhouse GmbH 
 [EMAIL PROTECTED] wrote:
 
 Hannes Carl Meyer schrieb:

 Hi,

 is it really neccessary to put it all into one index? You could also use
 the
 Solr MultiCore/MultipleIndexes feature and seperate by language.


 Is there a good webpage with infos about the multiindex-feature ?
 I know http://wiki.apache.org/solr/MultipleIndexes but there is not
 enough
 info :-(


 Greets -Ralf-


 
 

-- 
View this message in context: 
http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19956421.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread Hannes Carl Meyer
Nope, your schema defines a single index with all languages being stored.
The other way would be MultiCore/MultipleIndexes as described here:
http://wiki.apache.org/solr/CoreAdmin and
http://wiki.apache.org/solr/MultipleIndexes#head-e517417ef9b96e32168b2cf35ab6ff393f360d59
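
Once cores are set up, the CoreAdmin handler described on the first wiki page can be asked for their status. A small sketch for building that request URL, assuming the admin path is /admin/cores (it is configurable in solr.xml) and using made-up host and core names:

```python
from urllib.parse import urlencode


def core_status_url(base_url, core=None, admin_path="/admin/cores"):
    """Build a CoreAdmin STATUS URL, optionally limited to a single core."""
    params = {"action": "STATUS"}
    if core:
        params["core"] = core
    return f"{base_url}{admin_path}?{urlencode(params)}"


print(core_status_url("http://localhost:8080/solr", "core0"))
```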

On Mon, Oct 13, 2008 at 5:05 PM, sunnyfr [EMAIL PROTECTED] wrote:


 But I don't get, if you look in my schema.xml it's what I've done, multi
 index?
 So I was right ?


 Hannes Carl Meyer-2 wrote:
 
  Hi Ralf,
 
  you should also check on the example inside the Solr 1.3 download
 package!
 
  The management of multiple languages inside multiple indexes really makes
  sense in terms of configuration efforts (look at your big kahuna
  configuration file!), performance and gives an additional scalibility
  feature (in fact that you index/search in multiple cores which could be
  theoretically placed on different machines).
 
  But, from the perspecitve of the search client you will have to execute
  search processes on multiple cores simultaneously. If this is feasible
 you
  should really think about using multiple indexes.
 
  Regards,
 
  Hannes
 
  On Mon, Oct 13, 2008 at 4:14 PM, Kraus, Ralf | pixelhouse GmbH 
  [EMAIL PROTECTED] wrote:
 
  Hannes Carl Meyer schrieb:
 
  Hi,
 
  is it really neccessary to put it all into one index? You could also
 use
  the
  Solr MultiCore/MultipleIndexes feature and seperate by language.
 
 
  Is there a good webpage with infos about the multiindex-feature ?
  I know http://wiki.apache.org/solr/MultipleIndexes but there is not
  enough
  info :-(
 
 
  Greets -Ralf-
 
 
 
 





Re: Multi-language solr1.3 what would you reckon?

2008-10-13 Thread sunnyfr

OK, so multi-core actually means multi-index?
Cheers for these links.


Hannes Carl Meyer-2 wrote:
 
 Nope, your schema defines a single index with alle languages being stored.
 The other way would be MultiCore/MultipleIndexes as described here:
 http://wiki.apache.org/solr/CoreAdmin and
 http://wiki.apache.org/solr/MultipleIndexes#head-e517417ef9b96e32168b2cf35ab6ff393f360d59
 
 On Mon, Oct 13, 2008 at 5:05 PM, sunnyfr [EMAIL PROTECTED] wrote:
 

 But I don't get, if you look in my schema.xml it's what I've done, multi
 index?
 So I was right ?


 Hannes Carl Meyer-2 wrote:
 
  Hi Ralf,
 
  you should also check on the example inside the Solr 1.3 download
 package!
 
  The management of multiple languages inside multiple indexes really
 makes
  sense in terms of configuration efforts (look at your big kahuna
  configuration file!), performance and gives an additional scalibility
  feature (in fact that you index/search in multiple cores which could be
  theoretically placed on different machines).
 
  But, from the perspecitve of the search client you will have to execute
  search processes on multiple cores simultaneously. If this is feasible
 you
  should really think about using multiple indexes.
 
  Regards,
 
  Hannes
 
  On Mon, Oct 13, 2008 at 4:14 PM, Kraus, Ralf | pixelhouse GmbH 
  [EMAIL PROTECTED] wrote:
 
  Hannes Carl Meyer schrieb:
 
  Hi,
 
  is it really neccessary to put it all into one index? You could also
 use
  the
  Solr MultiCore/MultipleIndexes feature and seperate by language.
 
 
  Is there a good webpage with infos about the multiindex-feature ?
  I know http://wiki.apache.org/solr/MultipleIndexes but there is not
  enough
  info :-(
 
 
  Greets -Ralf-
 
 
 
 



 
 

-- 
View this message in context: 
http://www.nabble.com/Multi-language-solr1.3-what-would-you-reckon--tp19954805p19957842.html
Sent from the Solr - User mailing list archive at Nabble.com.