Webboard: Newsgroup Indexing

2001-08-13 Thread Alexey Botchkov

Author: Alexey Botchkov
Email: [EMAIL PROTECTED]
Message:
Unfortunately, the current version of mnoGoSearch doesn't work with newsgroups or FTP.

Reply: http://www.mnogosearch.org/board/message.php?id=2824

___
If you want to unsubscribe send unsubscribe general
to [EMAIL PROTECTED]




Re: PDF indexing not working in Pro Trial for Windows

2001-08-13 Thread Ramil Kalimullin

Hello!

Please send me your *.conf file directly.

BR, Ramil.

- Original Message -
From: Joe Frost [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, August 13, 2001 12:45 AM
Subject: PDF indexing not working in Pro Trial for Windows


 Hi,

 I've been using Mnogosearch under Linux for ages and love it to bits but a
 client of mine who hosts exclusively on NT wants to set it up as a search
 engine for a new project they have. Much of the content will be in PDFs so
 this feature is vital.

 I have set up my own test system using the current Pro Trial version on
 Windows 2000 with MySQL. Indexing of normal html URLs works fine but
 indexing of PDFs does not. I'm using pdftotext.exe with the settings
 suggested in the help file including using "/" instead of the normal
 Windows "\". The PDF is fetched by the indexer and seems to be briefly
 parsed but the URL is not included in any subsequent searches for terms
 that it is known to include.

 Is this a restriction of the trial version or am I doing something wrong?

 Thanks and best regards,

 Joe







Webboard: Depth

2001-08-13 Thread Alexey Botchkov

Author: Alexey Botchkov
Email: [EMAIL PROTECTED]
Message:
You're right.

Reply: http://www.mnogosearch.org/board/message.php?id=2825





Re: Duplicate URLs

2001-08-13 Thread Alexander Barkov

Dominique Asselineau wrote:
 
 Hello,
 
 Is there a way to avoid duplicate URLs?
 E.g.  path/file.html = path//file.html = path/./file.html
 or    path/index.html = path/
 


The indexer should detect duplicate documents.
Use the "Clones yes" command in both search.htm and
indexer.conf.
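
The equivalences in the question (path//file.html, path/./file.html, path/index.html versus path/) can also be illustrated outside mnoGoSearch. The following Python sketch is not how the indexer's clone detection works; it only shows one way to canonicalize such URLs before comparing them. The function name normalize_url and the assumption that index.html is the server's directory index are illustrative.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url, index_names=("index.html",)):
    """Return a canonical form of url so that trivially different spellings compare equal.

    Assumption: the names in index_names are served for a directory request.
    """
    parts = urlsplit(url)
    # Drop empty segments (from "//") and "." segments (from "/./").
    segments = [seg for seg in parts.path.split("/") if seg not in ("", ".")]
    is_dir = (not parts.path) or parts.path.endswith("/") or parts.path.endswith("/.")
    # Treat path/index.html as the directory path/ itself.
    if segments and segments[-1] in index_names:
        segments.pop()
        is_dir = True
    path = "/" + "/".join(segments)
    if is_dir and path != "/":
        path += "/"
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))
```

Two URLs can then be treated as the same document candidate when normalize_url returns the same string for both.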




Re: User rights on mnoGoSearch Database?

2001-08-13 Thread Alexander Barkov

Andre Pfeiler wrote:
 Hello,
 
 #1045: Access denied for user: 'foo@localhost' (Using password: YES)
 
 I solved the problem. It works now! But which user rights (permissions) does the
 mnoGoSearch MySQL database need for the MySQL user foo?
 


INSERTs, UPDATEs, DELETEs, SELECTs.




Webboard: follow links to remote pages?

2001-08-13 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Hello, one more question:
 is it possible to follow links
 to remote pages and on remote sites to index them?
 Greets
 Andre
 

Yes. Use the "Follow world" command in indexer.conf.

Reply: http://www.mnogosearch.org/board/message.php?id=2828





Webboard: how to add charset into mnogosearch

2001-08-13 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 I'm Thai, and Thailand uses the charset tis-620 or iso8859-11, but mnogosearch v. 3.2.0b1
 does not have the tis-620 and iso8859-11 charsets.
 Where can I find documentation about adding a charset to mnogosearch?
 Help me please :^)


I'll add Thai support today. As I read 
tis-620 and iso-8859-11 are the same charsets,
just aliases. Is that true?


Reply: http://www.mnogosearch.org/board/message.php?id=2829





Webboard: Solaris/Oracle

2001-08-13 Thread Xander Buys

Author: Xander Buys
Email: [EMAIL PROTECTED]
Message:
Hi all,  running Solaris 8, Oracle 8i and mnoGoSearch 3114.

Indexer runs on Server A and pumps data into the Oracle DB on the same server - No problems.

Search.cgi runs on Server B and queries data from Server A - PROBLEM!

Returns error Oracle - Error while trying to retrieve text for error ORA-12154!

Indexer -S with a valid conf file works fine.

What is wrong?

TIA

Reply: http://www.mnogosearch.org/board/message.php?id=2830





Webboard: how to add charset into mnogosearch

2001-08-13 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 I'm Thai, and Thailand uses the charset tis-620 or iso8859-11, but mnogosearch v. 3.2.0b1
 does not have the tis-620 and iso8859-11 charsets.
 Where can I find documentation about adding a charset to mnogosearch?
 Help me please :^)


OK. Now mnoGoSearch supports TIS-620 with the following
aliases: TIS620, TACTIS, ISO-8859-11

Get the updated unicode.c and udm_unicode.h here:

http://gw.udmsearch.izhnet.ru/~bar/udm_unicode.h
http://gw.udmsearch.izhnet.ru/~bar/unicode.c

Then recompile the package.



Reply: http://www.mnogosearch.org/board/message.php?id=2831





RE: PDF indexing not working in Pro Trial for Windows

2001-08-13 Thread Holmes, Gregory





Joe:


I've successfully set up mnogo on NT with MySQL. However, I compiled the unix source version with Cygwin (http://sources.redhat.com/cygwin/).

Here are the lines from indexer.conf that I use:


Mime "application/pdf; charset=iso-8859-1" "text/html" "/usr/local/bin/pdf2html.pl $1 application/pdf"

Mime application/pdf "text/html" "/usr/local/bin/pdf2html.pl $1 application/pdf"


pdf2html.pl is a contributed script from htdig (www.htdig.org) that uses pdfinfo and pdftotext to construct a web page and feed it back to the indexer. You get meta data this way as well as the text indexed. Obviously, you need perl for this to work.
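
For readers without the htdig script at hand, the general idea (run pdfinfo and pdftotext, then wrap the results in an HTML page with meta tags) can be sketched in Python. This is not pdf2html.pl itself. The function names, the choice of metadata fields, and the HTML layout are assumptions made for illustration. The only external commands used are pdfinfo, and pdftotext with "-" as the output file so that the text goes to standard output.

```python
import html
import subprocess


def parse_pdfinfo(info_text):
    """Turn the 'Key:   value' lines printed by pdfinfo into a dict."""
    fields = {}
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def build_html(info_text, body_text):
    """Wrap extracted PDF text and metadata in a small HTML page."""
    fields = parse_pdfinfo(info_text)
    title = html.escape(fields.get("Title", ""))
    # Hypothetical mapping of pdfinfo keys to meta tag names.
    mapping = (("Author", "author"), ("Keywords", "keywords"), ("Subject", "description"))
    meta = "".join(
        '<meta name="%s" content="%s">\n' % (name, html.escape(fields[key]))
        for key, name in mapping
        if key in fields
    )
    return (
        "<html><head>\n<title>%s</title>\n%s</head>\n"
        "<body><pre>\n%s\n</pre></body></html>\n" % (title, meta, html.escape(body_text))
    )


def pdf_to_html(pdf_path):
    """Run pdfinfo and pdftotext on pdf_path and return the HTML page as a string."""
    info = subprocess.run(
        ["pdfinfo", pdf_path], capture_output=True, text=True, check=True
    ).stdout
    text = subprocess.run(
        ["pdftotext", pdf_path, "-"], capture_output=True, text=True, check=True
    ).stdout
    return build_html(info, text)
```

A wrapper script built on this would take the file name that the Mime line passes as $1, call pdf_to_html on it, and print the result to standard output for the indexer to read.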

Also, there might have been a default line in indexer.conf excluding PDFs from indexing; if so, you'll have to remove it or comment it out.

Works for me, hope it helps.


Greg Holmes


-Original Message-
From: Joe Frost [mailto:[EMAIL PROTECTED]]
Subject: PDF indexing not working in Pro Trial for Windows


...
a client of mine who hosts exclusively on NT wants to set it up as a search
engine for a new project they have. Much of the content will be in PDFs so
this feature is vital.


I have set up my own test system using the current Pro Trial version on
Windows 2000 with MySQL. Indexing of normal html URLs works fine but
indexing of PDFs does not. I'm using pdftotext.exe with the settings
suggested in the help file including using "/" instead of the normal Windows
"\". The PDF is fetched by the indexer and seems to be briefly parsed but
the URL is not included in any subsequent searches for terms that it is
known to include.


Is this a restriction of the trial version or am I doing something wrong?





Webboard: how to add charset into mnogosearch

2001-08-13 Thread apples

Author: apples
Email: [EMAIL PROTECTED]
Message:
Thank you for supporting the tis-620 charset, but search is not correct.

E.g. search word: เมื่อ
The search result splits the word: Search results: เม : 22, อ : 1082
mnoGoSearch splits the word when it finds an upper case character.
mnoGoSearch also has a problem with the mod_gzip Apache module.
I fixed it by disabling the mod_gzip module in order to use mnoGoSearch.
Thanks again :-)



Reply: http://www.mnogosearch.org/board/message.php?id=2832





Re: NEW PHP Interface Problem

2001-08-13 Thread Sergey Kartashoff

Hi!

Tuesday, July 31, 2001, 5:36:05 AM, you wrote:


LB Has anyone seen these errors, I have checked, and can connect to the Oracle
LB server without a problem from the machine..

LB Warning: Supplied argument is not a valid mnoGoSearch-Result resource in
LB /usr/local/searchtest/htdocs/search.php on line 50

LB Warning: Supplied argument is not a valid mnoGoSearch-Result resource in
LB /usr/local/searchtest/htdocs/search.php on line 51

Please describe your configuration:
mnogosearch version, mnogosearch-php version, PHP version, and
search template. Have you tried search.cgi with the same template?
Does it work?

-- 
Regards, Sergey aka gluke.






Re: Webboard: how to add charset into mnogosearch

2001-08-13 Thread Maxime Zakharov

Hi,

Can you give me the URL of a server with mod_gzip?

apples wrote:
 
 and mongosearch have problem with mod_gzip apache module
 I fix by disable mod_gzip module for use mnogosearch







RE: PDF indexing not working in Pro Trial for Windows

2001-08-13 Thread Joe Frost



I was rather hoping to get the Windows binary version working, as I don't think the
client will want to have to compile the Linux version, and it'll be one of their
people that installs it on their live web servers.

Does anyone know if the 1k limitation on the trial version causes PDFs not to get
properly indexed?

BR,
Joe

  -----Original Message-----
  From: Holmes, Gregory [mailto:[EMAIL PROTECTED]]
  Sent: 13 August 2001 13:29
  To: '[EMAIL PROTECTED]'; 'Joe Frost'
  Subject: RE: PDF indexing not working in Pro Trial for Windows

  Joe:

  I've successfully set up mnogo on NT with MySQL.
  However, I compiled the unix source version with Cygwin (http://sources.redhat.com/cygwin/).

  Here are the lines from indexer.conf that I use:

  Mime "application/pdf; charset=iso-8859-1" "text/html" "/usr/local/bin/pdf2html.pl $1 application/pdf"
  Mime application/pdf "text/html" "/usr/local/bin/pdf2html.pl $1 application/pdf"

  pdf2html.pl is a contributed script from htdig (www.htdig.org)
  that uses pdfinfo and pdftotext to construct a web page and feed it back to
  the indexer. You get meta data this way as well as the text
  indexed. Obviously, you need perl for this to work.

  Also, there might have been a default line in indexer.conf
  excluding PDFs from indexing, if so, you'll have to remove it or comment it
  out.

  Works for me, hope it helps.

  Greg Holmes

  -----Original Message-----
  From: Joe Frost [mailto:[EMAIL PROTECTED]]
  Subject: PDF indexing not working in Pro Trial for Windows

  ... a client of mine who hosts exclusively on NT wants to set it up as a
  search engine for a new project they have. Much of the content will be in
  PDFs so this feature is vital.

  I have set up my own test system using the current Pro Trial
  version on Windows 2000 with MySQL. Indexing of normal
  html URLs works fine but indexing of PDFs does not.
  I'm using pdftotext.exe with the settings suggested in
  the help file including using "/" instead of the normal Windows
  "\". The PDF is fetched by the indexer and seems to be
  briefly parsed but the URL is not included in any
  subsequent searches for terms that it is known to
  include.

  Is this a restriction of the trial version or am I doing
  something wrong?


Webboard: how to add charset into mnogosearch

2001-08-13 Thread apples

Author: apples
Email: [EMAIL PROTECTED]
Message:
In indexer.conf I use LocalCharset tis-620,
and in search.htm I use LocalCharset tis-620 and BrowserCharset tis-620.
This is what I see in the MySQL database:
mysql> select * from dict limit 0,3;
+--------+-------+---------+
| url_id | word  | intag   |
+--------+-------+---------+
|      1 | สำน   |   66560 |
|      1 | กส    |  132096 |
|      1 | งเสเ  |  197632 |

The word has lost 3 characters; the true word is สำนักส่งเสเ
mnoGoSearch splits the word when it finds one of these characters:
0xD1  0x0E31  # THAI CHARACTER MAI HAN-AKAT
0xD4  0x0E34  # THAI CHARACTER SARA I
0xD5  0x0E35  # THAI CHARACTER SARA II
0xD6  0x0E36  # THAI CHARACTER SARA UE
0xD7  0x0E37  # THAI CHARACTER SARA UEE
0xD8  0x0E38  # THAI CHARACTER SARA U
0xD9  0x0E39  # THAI CHARACTER SARA UU
0xDA  0x0E3A  # THAI CHARACTER PHINTHU
0xE7  0x0E47  # THAI CHARACTER MAITAIKHU
0xE8  0x0E48  # THAI CHARACTER MAI EK
0xE9  0x0E49  # THAI CHARACTER MAI THO
0xEA  0x0E4A  # THAI CHARACTER MAI TRI
0xEB  0x0E4B  # THAI CHARACTER MAI CHATTAWA
0xEC  0x0E4C  # THAI CHARACTER THANTHAKHAT
0xED  0x0E4D  # THAI CHARACTER NIKHAHIT
0xEE  0x0E4E  # THAI CHARACTER YAMAKKAN
0xEF  0x0E4F  # THAI CHARACTER FONGMAN

search.cgi has the same problem as the indexer.
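
One possible explanation, offered here only as an assumption and not confirmed anywhere in this thread, is that the word splitter accepts letters only, while most of the bytes listed above decode to Unicode non-spacing marks rather than letters. The Python sketch below uses the standard tis-620 codec and unicodedata to show the difference; naive_tokens and mark_aware_tokens are hypothetical helpers and do not reproduce mnoGoSearch code.

```python
import unicodedata

# The TIS-620 byte values reported above as split points.
SPLIT_BYTES = [0xD1, 0xD4, 0xD5, 0xD6, 0xD7, 0xD8, 0xD9, 0xDA,
               0xE7, 0xE8, 0xE9, 0xEA, 0xEB, 0xEC, 0xED, 0xEE, 0xEF]


def classify(byte_value):
    """Decode one TIS-620 byte and return (character, Unicode general category)."""
    ch = bytes([byte_value]).decode("tis-620")
    return ch, unicodedata.category(ch)


def _tokens(text, is_word_char):
    """Split text into runs of characters accepted by is_word_char."""
    tokens, current = [], []
    for ch in text:
        if is_word_char(ch):
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


def naive_tokens(text):
    """Letters only (category L*): combining marks act as separators."""
    return _tokens(text, lambda ch: unicodedata.category(ch).startswith("L"))


def mark_aware_tokens(text):
    """Letters and combining marks (categories L* and M*) both belong to a word."""
    return _tokens(text, lambda ch: unicodedata.category(ch)[0] in "LM")


# The five-byte search word from the earlier report in this thread.
WORD = bytes([0xE0, 0xC1, 0xD7, 0xE8, 0xCD]).decode("tis-620")
```

With the search word from the earlier message, naive_tokens(WORD) yields the same two fragments that the search results showed, while mark_aware_tokens(WORD) keeps the word whole.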

Reply: http://www.mnogosearch.org/board/message.php?id=2834





Webboard: Newsgroup Indexing

2001-08-13 Thread Weeddude

Author: Weeddude
Email: [EMAIL PROTECTED]
Message:
Do you know when it's planned to work with the Windows Version?

Reply: http://www.mnogosearch.org/board/message.php?id=2835





Re: Webboard: Depth

2001-08-13 Thread Andre Pfeiler

On Monday 13 August 2001 10:40, you wrote:
 Author: Alexander Barkov
 Email: [EMAIL PROTECTED]

 Message:
  I am evaluating the NT trial version and have a question. I would like
  to set the search depth for each server. Am I correct in assuming the
  hops limit performs that function? If that is not the setting for search
  depth, how do I configure search depth on the NT version?
 
  thanks

 If I understood the task correctly, you may also use
 Allow/Disallow rules to do this.

 Use for example something like this

 Diallow http://*/*/*

  to avoid indexing of pages from 2nd level and deeper.


 Reply: http://www.mnogosearch.org/board/message.php?id=2827

Hello,
Disallow http://*/*/*
I want to index only 4 levels. Is it like this: Disallow
http://suse.de/*/*/*/* ? Or how must I configure it in indexer.conf?

greets
Andre








Webboard: Disallow http://*/*/* command

2001-08-13 Thread Andre Pfeiler

Author: Andre Pfeiler
Email: [EMAIL PROTECTED]
Message:
Hello,
where in indexer.conf must I specify the
Disallow http://*/*/* command? I tried it on line 320 and got an
error:
Error in config file '/usr/local/mnogosearch/etc/indexer.conf' line
320: Diallow http://*/*/*

Another question: is it possible to specify the depth for one
URL only? Example: http://suse.de/*/*/* ?

greets
Andre


Reply: http://www.mnogosearch.org/board/message.php?id=2836





Webboard: Weight commands

2001-08-13 Thread Andre Pfeiler

Author: Andre Pfeiler
Email: [EMAIL PROTECTED]
Message:
Hi,
what are the highest and lowest numbers for, for example,
the (Weight) KeywordWeight command? Higher value = better match?

greets
Andre
 


Reply: http://www.mnogosearch.org/board/message.php?id=2838





Webboard: Disallow http://*/*/* command

2001-08-13 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Hello,
 where in indexer.conf must I specify the
 Disallow http://*/*/* command? I tried it on line 320 and got an
 error:
 Error in config file '/usr/local/mnogosearch/etc/indexer.conf' line
 320: Diallow http://*/*/*

Check spelling:

Disallow, not Diallow 

 Another question: is it possible to specify the depth for one
 URL only? Example: http://suse.de/*/*/* ?
 

Yes.
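
To see which URLs a pattern of this shape would cover, a quick check with Python's fnmatch module can help. This rests on the assumption that "*" in a Disallow pattern matches any run of characters, including "/", the way fnmatch does. That assumption should be verified against the mnoGoSearch documentation, and the helper name is_disallowed is hypothetical.

```python
from fnmatch import fnmatchcase


def is_disallowed(url, pattern):
    """Return True if url matches the wildcard pattern (shell-style '*')."""
    return fnmatchcase(url, pattern)


# Generic rule from the thread: at least two "/" after the host part.
GENERIC = "http://*/*/*"
# Site-specific rule with one more level, limited to a single host.
SUSE_ONLY = "http://suse.de/*/*/*/*"

SAMPLES = [
    "http://suse.de/",
    "http://suse.de/a.html",
    "http://suse.de/a/b.html",
    "http://suse.de/a/b/c/d.html",
    "http://other.org/a/b/c/d.html",
]

if __name__ == "__main__":
    for url in SAMPLES:
        print(url, is_disallowed(url, GENERIC), is_disallowed(url, SUSE_ONLY))
```

Under that assumption, the site-specific pattern only affects URLs on suse.de and leaves other hosts to whatever other rules apply.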


Reply: http://www.mnogosearch.org/board/message.php?id=2839





Webboard: Weight commands

2001-08-13 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Hi,
 what are the highest and lowest numbers for, for example,
 the (Weight) "KeywordWeight" command? Higher value = better match?
 

In 3.1.x take a look into doc/search.txt, into this section:

  Changing different document parts weights at search time.


Reply: http://www.mnogosearch.org/board/message.php?id=2840





Webboard: Indexer loop in 3.2.0.b0

2001-08-13 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Thanks for reporting, we'll check it.

 
 I use version 3.2.0.b0 and the indexer gets the following results:
 
 http://www.working-retriever.com/
 http://www.working-retriever.com//
 http://www.working-retriever.com///
 http://www.working-retriever.com
 http://www.working-retriever.com/
 http://www.working-retriever.com//
 http://www.working-retriever.com///
 http://www.working-retriever.com
 and so on
 
 A double (triple...) slash after http:// is not regular. Can you delete the
 unnecessary slashes and then insert into the table? I think that would fix the bug.
 
 cu
 Aiko

Reply: http://www.mnogosearch.org/board/message.php?id=2841
