Thanks Alex,

So does BasisTech work with the latest version of Solr?

Sincerely,
Mahmoud

On Tue, Nov 10, 2015 at 5:28 PM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:

> If this is for a significant project and you are ready to pay for it,
> BasisTech has commercial solutions in this area I believe.
>
> Regards,
>    Alex.
> ----
> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> http://www.solr-start.com/
>
>
> On 10 November 2015 at 08:46, Mahmoud Almokadem <prog.mahm...@gmail.com> wrote:
> > Thanks Paul,
> >
> > The Arabic analyser applies its normalisation and stemming filters only
> > to the single terms that come out of the standard tokenizer.
> > Gathering all synonyms will be hard work. Should I customise my Tokenizer
> > to handle this case?
> >
> > Sincerely,
> > Mahmoud
> >
> >
> > On Tue, Nov 10, 2015 at 3:06 PM, Paul Libbrecht <p...@hoplahup.net> wrote:
> >
> >> Mahmoud,
> >>
> >> there is an Arabic analyzer:
> >> https://wiki.apache.org/solr/LanguageAnalysis#Arabic
> >> doesn't it do what you describe?
> >> Synonyms probably work there too.
> >>
> >> Paul
> >>
> >> > Mahmoud Almokadem <mailto:prog.mahm...@gmail.com>
> >> > 9 November 2015 17:47
> >> > Thanks Jack,
> >> >
> >> > This is a good solution, but we have more combinations that I think
> >> > can’t be handled as synonyms, like every word that starts with ‘عبد’ ‘Abd’
> >> > or ‘أبو’ ‘Abo’. When using the Standard tokenizer on ‘أبو بكر’ ‘Abo
> >> > Bakr’, it’ll be tokenised to ‘أبو’ and ‘بكر’, and the filters will be
> >> > applied to each separate term.
> >> >
> >> > Is there a tokeniser available that tokenises ‘أبو *’ or ‘عبد *’ as a
> >> > single term?
> >> >
> >> > Thanks,
> >> > Mahmoud
> >> >
> >> >
> >> >
> >> > Jack Krupansky <mailto:jack.krupan...@gmail.com>
> >> > 9 November 2015 16:47
> >> > Use an index-time (but not query-time) synonym filter with a rule like:
> >> >
> >> > Abd Allah,Abdallah
> >> >
> >> > This will index the combined word in addition to the separate words.
> >> >
> >> > -- Jack Krupansky
> >> >
> >> > On Mon, Nov 9, 2015 at 4:48 AM, Mahmoud Almokadem <prog.mahm...@gmail.com>
> >> >
> >> > Mahmoud Almokadem <mailto:prog.mahm...@gmail.com>
> >> > 9 November 2015 10:48
> >> > Hello,
> >> >
> >> > We are indexing Arabic content and facing a problem tokenizing
> >> > multi-term phrases like 'عبد الله' 'Abd Allah': users search for
> >> > 'عبدالله' 'Abdallah' without a space and need to get the results for
> >> > 'عبد الله' with a space. We are using StandardTokenizer.
> >> >
> >> >
> >> > Is there any configuration to handle this case?
> >> >
> >> > Thank you,
> >> > Mahmoud
> >> >
> >>
> >>
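
On the open question in the thread (treating ‘عبد *’ or ‘أبو *’ as a single term without listing every name as a synonym), the merging step that a custom tokenizer or token filter would have to perform can be sketched in plain Python. This is only an illustration of the logic under stated assumptions, not Solr or Lucene API: the function name `merge_prefix_tokens`, the `PREFIXES` set and the `keep_parts` flag are hypothetical, and the input is assumed to be the list of terms a standard tokenizer would already have produced.

```python
# Hypothetical sketch: join a known prefix word with the token that follows it.
# Not Solr/Lucene code; it only illustrates the logic a custom filter would need.

# Prefix words taken from the examples in the thread ('Abd' and 'Abo').
PREFIXES = {"عبد", "أبو"}


def merge_prefix_tokens(tokens, prefixes=PREFIXES, keep_parts=True):
    """Return a new token list in which each prefix word is joined to the
    token after it (no space). With keep_parts=True the two original
    tokens are emitted as well, right after the joined form."""
    merged = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        has_next = i + 1 < len(tokens)
        if token in prefixes and has_next:
            following = tokens[i + 1]
            merged.append(token + following)  # joined form, e.g. 'Abdallah'
            if keep_parts:
                merged.extend([token, following])
            i += 2  # the following token has been consumed
        else:
            merged.append(token)  # ordinary token, or a prefix at the very end
            i += 1
    return merged


if __name__ == "__main__":
    # Simulate the standard tokenizer by splitting on whitespace.
    print(merge_prefix_tokens("عبد الله".split()))
    print(merge_prefix_tokens("أبو بكر".split(), keep_parts=False))
```

Emitting the joined form together with the separate parts mirrors what Jack's index-time synonym rule does ("index the combined word in addition to the separate words"), so a query for either spelling could match. In a real analysis chain this step would presumably have to run before stemming, and the prefix list would need to be in the same normalised form as the tokens it sees.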