Re: language

2002-06-04 Thread Alexander Barkov
Hi! Kreso wrote: > Hello all, > > what would be the recommended way of specifying the language in > which the indexed documents are written? I have noticed in indexer.c > that "Content-Language:" header is examined, however I would prefer > specifying the langua

language

2002-06-03 Thread Kreso
Hello all, what would be the recommended way of specifying the language in which the indexed documents are written? I have noticed in indexer.c that "Content-Language:" header is examined, however I would prefer specifying the language somewhere in the document itself. Is this pos

Webboard: No Chinese Language Support..

2002-03-11 Thread Alex Barkov
Author: Alex Barkov Email: [EMAIL PROTECTED] Message: > Is it really no Chinese Language support? > If not at this moment, when will it support? There is indeed no Chinese support in releases before 3.2.0. Since mnogosearch-3.2.0, the Big5 and GB2312 Chinese character sets are supported.

Webboard: Again on language guessing

2002-03-10 Thread maxime
Author: maxime Email: Message: No. Expectation and dispersion were used to avoid sorting, I guess (I don't know exactly, as it was not my idea). Indexes were used to limit memory usage. Yes, it gives slightly worse results than using all n-grams, but the guesser works well and fast and does not consume much mem

Webboard: No Chinese Language Support..

2002-03-09 Thread D. I.
Author: D. I. Email: [EMAIL PROTECTED] Message: Where can I find any information about Chinese dialects on the net or on your site, if you please? Reply: ___ If you want to unsubscribe send "unsubscribe gene

Webboard: Again on language guessing

2002-03-08 Thread Gialuca
Author: Gialuca Email: [EMAIL PROTECTED] Message: Hi, I'll try the new version. About the substitution: I mean that 'e' and 'wil' have the same index (as do 'g' and ' I ') and, since there is no collision handling, those keys share the same value. So if your text is 'Since I think I will be alive. Eg
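
The collision problem described in this message can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `toy_index` is a deliberately weak, hypothetical hash and is not the function mguesser uses, and the 16-slot table size is arbitrary. The point is only that a table which stores counters without remembering their keys lets two different n-grams share one value.

```python
def toy_index(ngram: str, table_size: int = 16) -> int:
    """Hypothetical hash: sum of character codes modulo the table size."""
    return sum(ord(ch) for ch in ngram) % table_size


def count_without_collision_handling(ngrams, table_size=16):
    """Count n-grams in a fixed-size table that stores only counters.

    A slot does not remember which key it belongs to, so two different
    n-grams with the same index end up sharing one counter.
    """
    table = [0] * table_size
    for ngram in ngrams:
        table[toy_index(ngram, table_size)] += 1
    return table


def count_with_collision_handling(ngrams):
    """Same counting, keyed by the n-gram itself (the dict resolves collisions)."""
    counts = {}
    for ngram in ngrams:
        counts[ngram] = counts.get(ngram, 0) + 1
    return counts


# 'ab' and 'ba' have the same character sum, so they map to the same slot.
sample = ["ab", "ba", "ab", "cd"]
shared = count_without_collision_handling(sample)
separate = count_with_collision_handling(sample)
```

In the first table the counter for 'ab' also contains the occurrence of 'ba', which is the kind of "substitution" reported above; the keyed version keeps the two apart at the cost of storing the keys.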

Webboard: Again on language guessing

2002-03-07 Thread maxime
Author: maxime Email: Message: Since version 3.2.4 we use a different measure, based on an information gain function. You can build the new mguesser from the current CVS sources. What do you mean by "'g' is substituted by ' I ' or where 'wil' by 'e'"?

Webboard: Again on language guessing

2002-03-07 Thread Gialuca
Author: Gialuca Email: [EMAIL PROTECTED] Message: Hi all, we did further research on language guessing and during it compared mguesser to text_cat. It appears that mguesser doesn't handle collisions, accepting maps in which 'g' is substituted by ' I ' or where 'wil

Webboard: Research on language guessing

2002-02-07 Thread maxime
Author: maxime Email: Message: Thanks for the link. But it seems to me that this method requires much more computational power than the method used in mnogosearch.

Webboard: Research on language guessing

2002-02-07 Thread maxime
Author: maxime Email: Message: Maybe not. Compare the maps for various languages: identical 1-grams have different frequencies in different languages.

Webboard: Research on language guessing

2002-02-07 Thread Gialuca
Author: Gialuca Email: [EMAIL PROTECTED] Message: Yes, you're right, but I saw, and Cavnar and Trenkle say so too, that the very first entries are just single letters, so you're just getting letter frequencies, and that's the reason to believe a band-pass filter could be useful. Thanks anyway for your answers

Webboard: Research on language guessing

2002-02-06 Thread kentsin
Author: kentsin Email: [EMAIL PROTECTED] Message: FYI, Wired.com just published an article about using gzip to do language guessing. http://wired.com/news/technology/0,1282,50192,00.html Reply: <http://www.mnogosearch.org/board/message.php?id=4095>
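
For readers unfamiliar with the idea, here is a rough Python sketch of compression-based language guessing. It is an illustration of the general technique only and is not claimed to be the method from the linked article: the function names, the scoring rule (extra compressed bytes when the sample is appended to a reference text) and the tiny reference texts are all assumptions. It uses `zlib`, which implements the same DEFLATE compression that gzip uses.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the DEFLATE-compressed UTF-8 text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def guess_by_compression(sample: str, references: dict) -> str:
    """Return the language whose reference text 'explains' the sample best.

    The score is the number of extra compressed bytes needed when the
    sample is appended to the reference; smaller means more similar.
    """
    def extra_bytes(reference: str) -> int:
        return compressed_size(reference + " " + sample) - compressed_size(reference)

    return min(references, key=lambda lang: extra_bytes(references[lang]))


# Toy reference texts (assumed for illustration); real use needs far more text.
references = {
    "en": "The quick brown fox jumps over the lazy dog. She sells sea shells "
          "by the sea shore. A journey of a thousand miles begins with a single step.",
    "it": "La volpe veloce salta sopra il cane pigro. Chi dorme non piglia pesci. "
          "Tanto va la gatta al lardo che ci lascia lo zampino.",
}
sample = "A journey of a thousand miles begins with a single step."
guess = guess_by_compression(sample, references)
```

Each guess compresses every reference text again, which fits the remark elsewhere in this thread that the approach needs more computation than a precomputed n-gram map.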

Webboard: Research on language guessing

2002-02-06 Thread maxime
Author: maxime Email: Message: Because the _top_ n-grams are highly language-specific, while the middle n-grams may be the same for related languages (e.g. Russian, Ukrainian, Byelorussian). N.B. our guesser is based on this paper: http://sochi.net.ru/~maxime/doc/cavnar_trenkle_ngram.ps.gz Reply: <h
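
The ranked n-gram approach from the Cavnar and Trenkle paper can be sketched roughly as follows. This is a simplified Python illustration, not mguesser's code: the function names, the limit of 3-character n-grams, the cut-off of 300 top entries and the toy training texts are assumptions chosen to keep the example short. Each text is reduced to a list of its most frequent character n-grams, and two profiles are compared with the "out-of-place" rank distance.

```python
from collections import Counter


def ngram_profile(text: str, max_n: int = 3, top: int = 300) -> list:
    """Return the most frequent character n-grams (1..max_n), most frequent first."""
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(
        padded[i:i + n]
        for n in range(1, max_n + 1)
        for i in range(len(padded) - n + 1)
    )
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [gram for gram, _ in ranked[:top]]


def out_of_place(doc_profile: list, lang_profile: list) -> int:
    """Sum of rank differences; n-grams unknown to the language get a fixed penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else penalty
        for i, gram in enumerate(doc_profile)
    )


def guess_by_ngrams(text: str, training: dict) -> str:
    """Pick the training language whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    profiles = {lang: ngram_profile(sample) for lang, sample in training.items()}
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))


# Toy training texts (assumed); a usable language map needs much larger samples.
training = {
    "en": "the quick brown fox jumps over the lazy dog and runs away",
    "it": "la volpe veloce salta sopra il cane pigro e scappa via",
}
```

Calling `guess_by_ngrams("the dog runs over the fox", training)` compares the text against both profiles. Note that this sketch sorts the n-grams, whereas the messages in this thread mention that the mnogosearch guesser avoided sorting and, from 3.2.4 on, uses a different measure.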

Webboard: Research on language guessing

2002-02-06 Thread Gialuca
Author: Gialuca Email: [EMAIL PROTECTED] Message: Hi, my company and I are doing some research on language guessing, and we are using mnogosearch at some levels, including its guesser. I have a question about the construction of the language maps: why did you use a filter cutting only the

Re: LAnguage support

2001-12-04 Thread Alexander Barkov
Hi! costas wrote: > HI, > > I was wondering if you have yet begun work on > > Make it possible to use several "LocalCharset" indexer.conf commands. > > It should help to index multi-language servers such as www.debian.org > <http://www.debian.org>.

LAnguage support

2001-12-04 Thread costas
Hi, I was wondering if you have yet begun work on: Make it possible to use several "LocalCharset" indexer.conf commands. It should help to index multi-language servers such as www.debian.org. This is the most important feature for my work since I am constantly indexing mult

Webboard: Has system built-in language or keywords?

2001-11-22 Thread Alex Barkov
Author: Alex Barkov Email: [EMAIL PROTECTED] Message: The multi and single modes support substring searches. The default template contains a SELECT with OPTIONs to choose the word match type: full, beginning, ending, substring. > Hello, all, > > Sample, can I searching string admin*, > and result will page
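
What the four word match types mean can be shown with a small Python sketch. It only illustrates the matching semantics as they are usually understood; the function name and the mode strings are assumptions and say nothing about how mnoGoSearch implements the search internally.

```python
def word_matches(word: str, query: str, mode: str = "full") -> bool:
    """Return True if an indexed word matches the query under the given match type."""
    if mode == "full":
        return word == query
    if mode == "beginning":
        return word.startswith(query)
    if mode == "ending":
        return word.endswith(query)
    if mode == "substring":
        return query in word
    raise ValueError(f"unknown match mode: {mode}")


# The admin* case from the question corresponds to the "beginning" match type.
words = ["administrator", "administration", "sysadmin", "admin"]
beginning = [w for w in words if word_matches(w, "admin", "beginning")]
substring = [w for w in words if word_matches(w, "admin", "substring")]
```

Here `beginning` keeps administrator, administration and admin, while `substring` additionally keeps sysadmin.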

Webboard: Has system built-in language or keywords?

2001-11-22 Thread Anton
Author: Anton Email: [EMAIL PROTECTED] Message: Hello, all. For example, can I search for the string admin* and get pages with the words administrator, administration, etc.? Thank you

Webboard: Language not understood

2001-10-26 Thread Alex Barkov
Author: Alex Barkov Email: [EMAIL PROTECTED] Message: > Why though does the indexer not recognise the language even though it recognises the >charset. I have read the documentation and found nothing which tells me how to >switch language detection on. > > I assume it is automa

Webboard: Language not understood

2001-10-26 Thread mike jaffa
Author: mike jaffa Email: [EMAIL PROTECTED] Message: Why though does the indexer not recognise the language even though it recognises the charset. I have read the documentation and found nothing which tells me how to switch language detection on. I assume it is automatic but it does not work

Webboard: Language not understood

2001-10-25 Thread Alex Barkov
Author: Alex Barkov Email: [EMAIL PROTECTED] Message: Take a look here: http://www.mnogosearch.org/doc/msearch-ch06.html This page explains how mnogosearch processes charsets and languages. Please note this is the 3.2.x documentation; 3.1.x works differently. > I have read your manuals and ask

Webboard: Language not understood

2001-10-25 Thread mike jaffa
Author: mike jaffa Email: [EMAIL PROTECTED] Message: I have read your manuals and asked a few people on the board but I can't quite understand the logic. I would appreciate it if you could answer my queries. 1) Mnogosearch seems to recognise the charset of indexed pages without using the 'Local

Re: stop-list for catalan language

2001-10-24 Thread Maxime Zakharov
Hi, I added this stopword list to 3.2 CVS. Jordi Gay Sensat wrote: > We are sending you the stopwords list for the Catalan language. We hope that > it will be included in the next distribution and it will be useful for > Catalan people. > We are using mnoGoSearch for indexing a city

Re: stop-list for catalan language

2001-10-24 Thread Maxime Zakharov
Hi, Thank you. But which charset is used for this stopword list? Jordi Gay Sensat wrote: > We are sending you the stopwords list for the Catalan language. We hope that > it will be included in the next distribution and it will be useful for > Catalan people. > We are using the mn

stop-list for catalan language

2001-10-19 Thread Jordi Gay Sensat
We are sending you the stopwords list for the Catalan language. We hope that it will be included in the next distribution and that it will be useful for Catalan people. We are using mnoGoSearch for indexing a city council web site in Catalonia. Congratulations on your fantastic work!!! The Cthulhu

Webboard: Problem with czech language

2001-09-01 Thread loverman
Author: loverman Email: [EMAIL PROTECTED] Message: The best solution to your problem is to translate your web project into different languages and let the visitor choose the language. Reply: <http://www.mnogosearch.org/board/message.php?id=2

Re: Webboard: Language Autodetection

2001-08-31 Thread Maxime Zakharov
John Fax wrote: > > Is there a way to let mnoGoSearch guess what is the language > of the document ? > If not, does anybody know a program that is able to perform > such a task ? For mnogosearch 3.1.x, you can add attribute lang to tag (ex.: ), or setup your web-server to add C
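
The two options mentioned in this reply (a `lang` attribute in the page, or a language header added by the web server) can be sketched in Python as follows. The HTML example in the archived message was lost, so this sketch assumes the standard HTML conventions: a `lang` attribute on the `<html>` element and a `<meta http-equiv="Content-Language">` tag. The function name, the class name and the precedence of the HTTP header over the document are likewise assumptions and do not describe what indexer.c actually does.

```python
from html.parser import HTMLParser


class LangFinder(HTMLParser):
    """Remember the first language declaration found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if self.lang is not None:
            return
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            self.lang = attrs["lang"]
        elif (tag == "meta"
              and (attrs.get("http-equiv") or "").lower() == "content-language"
              and attrs.get("content")):
            self.lang = attrs["content"]


def document_language(headers: dict, html: str):
    """Return a lower-cased language code, or None if nothing is declared."""
    # Assumed precedence: the HTTP Content-Language header wins over the document.
    for name, value in headers.items():
        if name.lower() == "content-language" and value.strip():
            return value.split(",")[0].strip().lower()
    finder = LangFinder()
    finder.feed(html)
    return finder.lang.lower() if finder.lang else None
```

For example, `document_language({}, '<html lang="ca"><body></body></html>')` returns `"ca"`, while a response carrying `Content-Language: hr` yields `"hr"` regardless of the markup.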

Webboard: Language Autodetection

2001-08-31 Thread John Fax
Author: John Fax Email: [EMAIL PROTECTED] Message: Excellent. Thanks a lot! So starting with 3.2.0, you can leave the "lang" field blank in the url table, and mnoGoSearch puts the right value in for you?

Webboard: Language Autodetection

2001-08-31 Thread Alexander Barkov
Author: Alexander Barkov Email: [EMAIL PROTECTED] Message: > > Hi, > > Is there a way to let mnoGoSearch guess what is the language > of the document ? > If not, does anybody know a program that is able to perform > such a task ? > > Thanks a lot ! There is also m

Webboard: Language Autodetection

2001-08-31 Thread Alexander Barkov
Author: Alexander Barkov Email: [EMAIL PROTECTED] Message: > > Hi, > > Is there a way to let mnoGoSearch guess what is the language > of the document ? > If not, does anybody know a program that is able to perform > such a task ? > > Thanks a lot ! > Hi

Webboard: Language Autodetection

2001-08-31 Thread John Fax
Author: John Fax Email: [EMAIL PROTECTED] Message: Hi, Is there a way to let mnoGoSearch guess what is the language of the document ? If not, does anybody know a program that is able to perform such a task ? Thanks a lot ! John Reply: <http://www.mnogosearch.org/board/message.php?id=2

Webboard: indexing a multi-language site

2001-08-26 Thread Alexander Barkov
Author: Alexander Barkov Email: [EMAIL PROTECTED] Message: > Hi, I am trying to index a site which is in 4 different languages. The user chooses the language on the splash page, then a cookie is set, and every page is shown in the corresponding language according to the cookie... >

Webboard: indexing a multi-language site

2001-08-26 Thread Sergio
Author: Sergio Email: [EMAIL PROTECTED] Message: Hi, I am trying to index a site which is in 4 different languages. The user chooses the language on the splash page, then a cookie is set, and every page is shown in the corresponding language according to the cookie... I would like to index

Webboard: How to set the language to English?

2001-07-19 Thread gluke
Author: gluke Email: [EMAIL PROTECTED] Message: > DBAddr xxx > Server xxx > Localcharset koi8-r > > Am I right? > Tanx in advance LocalCharset should be set in indexer.conf before all Server commands. And to specify the remote server charset you should use the Charset indexer command before Server als
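
The ordering rule in this reply can be checked mechanically. The following Python sketch is a hypothetical helper, not part of mnoGoSearch: it assumes one command per line, that lines starting with `#` are comments, and that command names are compared case-insensitively. It reports every LocalCharset command that appears after a Server command.

```python
def check_directive_order(conf_text: str) -> list:
    """Report LocalCharset commands that come after a Server command."""
    problems = []
    seen_server = False
    for line_no, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        command = line.split()[0].lower()
        if command == "server":
            seen_server = True
        elif command == "localcharset" and seen_server:
            problems.append(
                f"line {line_no}: LocalCharset appears after a Server command"
            )
    return problems


# The configuration quoted in the question, with its placeholder values.
quoted_conf = "DBAddr xxx\nServer xxx\nLocalcharset koi8-r\n"
# The same commands reordered as the reply recommends.
fixed_conf = "DBAddr xxx\nLocalcharset koi8-r\nServer xxx\n"
```

`check_directive_order(quoted_conf)` flags line 3, and `check_directive_order(fixed_conf)` returns an empty list.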

Webboard: How to set the language to English?

2001-07-19 Thread E-Peng
Author: E-Peng Email: [EMAIL PROTECTED] Message: Thanks for your reply. Let's forget about Unicode. Now, I am trying to do a simple search. It works well when the language is set to "Any". But it finds nothing when I set the language to "English". Same thing happen

Webboard: What should i do to do language based search?

2001-07-18 Thread gluke
Author: gluke Email: [EMAIL PROTECTED] Message: The mnogosearch PHP extension currently does not support the 3.2.* branch of mnogosearch.

Webboard: What should i do to do language based search?

2001-07-18 Thread E-Peng
Author: E-Peng Email: [EMAIL PROTECTED] Message: I have just upgraded to mnogosearch version 3.2.0.b.0. Now mnoGoSearch supports almost all widely used charsets, so I am trying to use UTF8 as the LocalCharset to build a multi-lingual search engine. My questions are: 1. What should

Webboard: Page language recognition

2001-03-22 Thread Volker Wysk
Author: Volker Wysk Email: post @volker-wysk.de Message: Hi If you use Apache, you could use its content negotiation features. See the manual. bye

Re: Webboard: Page language recognition

2001-03-17 Thread Maxime Zakharov
Molara Federico wrote: > How can I set the language for an HTML page? > I'm indexing a multi-language site of dynamically > generated pages (I'm using ASP). > > I've tried to insert a in my > pages, but it doesn't seem to work. > > What's wrong??

Webboard: Page language recognition

2001-03-17 Thread Molara Federico
Author: Molara Federico Email: [EMAIL PROTECTED] Message: How can I set the language for an HTML page? I'm indexing a multi-language site of dynamically generated pages (I'm using ASP). I've tried to insert a in my pages, but it doesn't seem to work. What's wrong??? Tha

Webboard: Language - spell question

2001-03-14 Thread Nir Shahaf
Author: Nir Shahaf Email: [EMAIL PROTECTED] Message: Hi, I've just finished installing mnogosearch-3.1.12 and I have a few questions regarding the &L parameter in the search.htm template. If I understand correctly then if I add another $L entry for some_language it will search the database u

Webboard: Language based search with PHP front end

2001-03-12 Thread csraje
/ispell.inc on line 146 Warning: REG_EMPTY in /usr/local/apache_1.3.17/htdocs/mnofrontend/ispell.inc on line 146 Warning: REG_EMPTY in /usr/local/apache_1.3.17/htdocs/mnofrontend/ispell.inc on line 146 How to avoid this? 2) What should I do to do language based search? Even though I indexed a

Webboard: multiple dictionaries for the same language?

2001-03-07 Thread Alexander Barkov
pell wordlists and/or affix rules > for the same language? What happens if you import several? You have to use only one affix file per language. But it is possible to use several wordlists with this affix file. Reply: <http://search.mnogo.r

Webboard: multiple dictionaries for the same language?

2001-03-06 Thread Volker Wysk
same language? What happens if you import several? bye Reply: <http://search.mnogo.ru/board/message.php?id=1636>