Webboard: multilanguages sites

2001-04-09 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Use different tags. The latest 3.1.x branch also supports limiting searches
by language.

 I have built a web site in 5 different languages. Each language has its own
pages, and each page exists in only one language.
 For example, a page can be on the French site but not on the English one.
 
 I've installed udmSearch on the French version and started indexing it. Everything
works fine. A search on the French version shows the results from the French
website.
 
 But now I'd like to index the English one as well, and I want the results displayed by
the search engine on the English version to be only the English pages, not
the French ones.
 
 How can I do this using only one indexer.conf file?


Reply: http://search.mnogo.ru/board/message.php?id=1921

___
If you want to unsubscribe send "unsubscribe general"
to [EMAIL PROTECTED]




Webboard: Indexing ok, command line show info, no web output

2001-04-09 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Do you use the built-in or the SQL version? In the latter case, which database
is being used? Some of them provide SQL query logging. What SQL
queries are sent at search time?

 Hi guys, I have not found any info on this subject: there are no errors,
but also no search results. I was able to index my site without
problems. I did have some trouble indexing the site at first, but after adding the
line in the config file it worked. Now I am at the point where I can use the search, but
I get no output at all. Here is what it says:
 
 Sorry, but search returned no results.
   
 Any suggestions?
 It works fine from the command line.
 Example:
 ./search.cgi remo
 
 I get at least 10 results.


Reply: http://search.mnogo.ru/board/message.php?id=1922





Webboard: Link length

2001-04-09 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
It is 128 characters.

 Can you please tell me the maximum link length that indexer can process? I
mean, how many characters can be in an HREF?
 
 Thanks, Alexander

Reply: http://search.mnogo.ru/board/message.php?id=1925





Webboard: How to index ftps with non-standard port numbers

2001-04-09 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Why did you decide it?


 You must change the function that connects to the MySQL database. See the code below:
 
 static int InitDB(DB*db)
 {
  mysql_init(&(db->mysql));
 
  if(!(mysql_real_connect(&(db->mysql),DBHost,DBUser,DBPass,DBName,DBPort?port:0,NULL,0)))
  {
   fprintf(stderr, "Failed to connect to database: Error: %s\n",mysql_error(&(db->mysql)));
   db->errcode=1;
   return(1);
  }
  db->connected=1;
  return(0);
 }

Reply: http://search.mnogo.ru/board/message.php?id=1924





Webboard: Search bugs for Windows

2001-04-09 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Do you use ispell in indexer?

 Hi,
 
 If I search for "image" I get 172 results, but if I search for "images" I
get no results. However, I can see the title "Macbeth, Images of blood
and water" when I search for "Image".
 
 
 
 Image
 
http://essay.studyarea.com/cgi-bin/essay/search.exe?q=Image&ps=20&o=0&m=any&wm=wrd&ul=&wf=10
 
 
 Images
 
http://essay.studyarea.com/cgi-bin/essay/search.exe?q=Images&ps=20&o=0&m=any&wm=wrd&ul=&wf=10

Reply: http://search.mnogo.ru/board/message.php?id=1928





Webboard: Search.cgi Problem

2001-04-10 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 When I run the search.cgi program,
 I get an Error 500 Internal Web server error.
 
 The Apache error log tells me that the
 library libmysqlclient.so.9 cannot be
 found.
 
 I have added this path to LD_LIBRARY_PATH
 and recompiled and installed again.
 
 Still no go... any ideas?

You have to set this variable at run time, not at compile time.

Also check our FAQ at http://search.mnogo.ru/faq.html

Reply: http://search.mnogo.ru/board/message.php?id=1935





Webboard: Search bugs for Windows

2001-04-11 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Yes, I use ispell.
 Joe

You have to add the corresponding ispell commands to your template.
The default one has some examples.

Reply: http://search.mnogo.ru/board/message.php?id=1955





Webboard: description and txt

2001-04-11 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
It is the $DE template variable. See doc/templates.txt
for all available variables.


 Hi
 
 Is there a feature so that when a site has a description, the description is shown
instead of the text,
 and if there is no description, the text is shown?
 Joe

Reply: http://search.mnogo.ru/board/message.php?id=1956





Webboard: Multiple databases

2001-04-11 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Why don't you use tags to separate the two sites?
 
 Joe
 

Which tags and sites do you mean?


Reply: http://search.mnogo.ru/board/message.php?id=1957





Webboard: Unable to configure mnogosearch...

2001-04-14 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
I have no idea :-(

 
 BTW, i'm using RedHat 7.0
 I know that there are some compatibility problems with the new GNU packages in RH 
7.0...
 
 I've tried to ./configure
 mnogosearch-3.1.12.tar.gz
 and
 udmsearch-3.0.23.tar.gz
 it was the same fault..
 

Reply: http://search.mnogo.ru/board/message.php?id=1980





Webboard: Same Trouble search displays blank pages

2001-04-15 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
What http server do you use?

 I had that problem on another server. It wasn't a hard fix; it depends on how you set
up the conf-dist and htm-dist files, which database you use, etc.
 The problem I am having now is that the search form itself does not show up when
run via the browser. If I remove the search.htm file I should get an
error (e.g. "template not found"), but I don't even get that. Run from telnet, it prints
the HTML of my search.htm, so the script does know where to find it via telnet. But
this browser thing is driving me crazy; it seems to me I should at least get an error page.

Reply: http://search.mnogo.ru/board/message.php?id=1987





Webboard: Same Trouble search displays blank pages

2001-04-15 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
It seems that search.cgi crashes during execution.
Try to test it from command line:

  ./search.cgi some_word

Does it crash?


 I have exactly the same problem!!!
 
  ./indexer -S
 
   Database statistics
 
    Status    Expired      Total
   -----------------------------------------
       200          0         33 OK
       302          0         13 Moved Temporarily
       404          0          1 Not found
   -----------------------------------------
     Total          0         47
 
 33 OK urls and http://search.easy-list.net/bin/search.cgi
 is ALWAYS blank!!

Reply: http://search.mnogo.ru/board/message.php?id=1988





Webboard: TXT over 255 characters

2001-04-15 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Is it possible to adjust the txt field to hold more than 255 characters?
 I have tried redefining the field as a [txt] field with a limit of 65535
characters. Not that I will use that much, but is it possible to define how much
text indexer stores?
 
 Best regards 
 
 Thomas Thygesen.
 

You are right, you have to change the txt field definition in the SQL table.
There is also a #define UDM_MAXTEXTSIZE 255 in udm_common.h which
is responsible for the text length. Change it and recompile.

Reply: http://search.mnogo.ru/board/message.php?id=1989





Webboard: Same Trouble search displays blank pages

2001-04-15 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
What is displayed in both cases:
with and without template? What can you see in "View HTML source"?


 I had that problem on another server. It wasn't a hard fix; it depends on how you set
up the conf-dist and htm-dist files, which database you use, etc.
 The problem I am having now is that the search form itself does not show up when
run via the browser. If I remove the search.htm file I should get an
error (e.g. "template not found"), but I don't even get that. Run from telnet, it prints
the HTML of my search.htm, so the script does know where to find it via telnet. But
this browser thing is driving me crazy; it seems to me I should at least get an error page.

Reply: http://search.mnogo.ru/board/message.php?id=1990





Webboard: Link not found

2001-04-15 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Run indexer -amv6 and check its output when it visits the topics list.

 I've got a little problem. I'm running an UltraBoard conference on my site and
it generates links to the topics like this:
 
 <a href="UltraBoard.pl?Action=ShowPost&Board=mssql&Post=779&Idle=365&Sort=0&Order=Descend&Page=0&Session="
 OnMouseOver="window.status='Read this (Encoding problem when using DTS) topic.';return true;"
 OnMouseOut="window.status=''">Encoding problem when using DTS</a>
 -
 
 After running indexer, it doesn't add this link:
"UltraBoard.pl?Action=ShowPost&Board=mssql&Post=779&Idle=365&Sort=0&Order=Descend&Page=0&Session="
to the database (url table), and so doesn't index it. Can you tell me what could cause
the problem?
 
 Many thanks, Alexander

Reply: http://search.mnogo.ru/board/message.php?id=1991





Re: UdmSearch: Webboard: ERROR: Cannot insert a duplicate key into unique index url_url

2001-04-15 Thread Alexander Barkov

I can't believe that. The only explanation I can think of is that
it is PostgreSQL that is sometimes "successful" in inserting
duplicate keys.


Peter Hanecak wrote:
 
 Hello,
 
 On Thu, 11 Jan 2001, Alexander Barkov wrote:
 
   Author: mocha
   Email: [EMAIL PROTECTED]
   Message:
   i see a lot of these errors in my postgres log:
  
   ERROR:  Cannot insert a duplicate key into unique index url_url
 ...
   ERROR:  Cannot insert a duplicate key into unique index url_url
  
   is that normal?
 
  This is normal. indexer is trying to add documents which are already in
  the database. It ignores "duplicate key" errors after an attempt to run
 
  "INSERT INTO url ...".
 
 Looks like indexer is sometimes "successful" in inserting a duplicate URL
 into the url table, because quite a few times I have dumped the DB
 for backup and, when trying to restore it, got the error:
 
 ERROR:  Cannot insert a duplicate key into unique index url_url
 
 It happens with older mnogosearch 3.1.x indexers and it also happens with
 mnogosearch 3.1.12 indexer and clean database at start. I'm using
 PostgreSQL 7.0.3 on Linux 2.4.3, glibc 2.2.1, threads enabled (and used 2
 threads when indexing).




Webboard: Core dump under Solaris

2001-04-15 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Seeing this report, I suspect that you compiled the threaded
version. Is that true? If yes, how did you do it? Threads are
supported only on FreeBSD and Linux.

 gdb "/usr/local/mnogosearch/sbin/indexer -a /usr/local/mnogosearch/etc/indexer_open.conf"  core
 GNU gdb 5.0
 Copyright 2000 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as "sparc-sun-solaris2.6"...unknown option `-a'
 
 Core was generated by `/usr/local/mnogosearch/sbin/indexer -a 
/usr/local/mnogosearch/etc/indexer_open.'.
 Program terminated with signal 11, Segmentation Fault.
 Reading symbols from /usr/lib/libsocket.so.1...done.
 Loaded symbols for /usr/lib/libsocket.so.1
 Reading symbols from /usr/lib/libxnet.so.1...done.
 Loaded symbols for /usr/lib/libxnet.so.1
 Reading symbols from /usr/local/lib/mysql/libmysqlclient.so.6...done.
 Loaded symbols for /usr/local/lib/mysql/libmysqlclient.so.6
 Reading symbols from /usr/lib/libm.so.1...done.
 Loaded symbols for /usr/lib/libm.so.1
 Reading symbols from /usr/lib/libc.so.1...done.
 Loaded symbols for /usr/lib/libc.so.1
 Reading symbols from /usr/lib/libnsl.so.1...done.
 Loaded symbols for /usr/lib/libnsl.so.1
 Reading symbols from /usr/lib/libdl.so.1...done.
 Loaded symbols for /usr/lib/libdl.so.1
 Reading symbols from /usr/lib/libmp.so.2...done.
 Loaded symbols for /usr/lib/libmp.so.2
 Reading symbols from /usr/platform/SUNW,Ultra-250/lib/libc_psr.so.1...done.
 Loaded symbols for /usr/platform/SUNW,Ultra-250/lib/libc_psr.so.1
 Reading symbols from /usr/lib/nss_files.so.1...done.
 Loaded symbols for /usr/lib/nss_files.so.1
 Reading symbols from /usr/lib/nss_dns.so.1...done.
 Loaded symbols for /usr/lib/nss_dns.so.1
 Reading symbols from /usr/lib/libresolv.so.2...done.
 Loaded symbols for /usr/lib/libresolv.so.2
 #0  0x0 in ?? ()
 (gdb) backtrace
 #0  0x0 in ?? ()
 #1  0xef4d7494 in res_init () from /usr/lib/libresolv.so.2
 #2  0xef4daeac in gethostbyname2 () from /usr/lib/libresolv.so.2
 #3  0xef5c0c28 in _gethostbyname () from /usr/lib/nss_dns.so.1
 #4  0xef5c0ce8 in getbyname () from /usr/lib/nss_dns.so.1
 #5  0xef655244 in nss_search () from /usr/lib/libc.so.1
 #6  0xef5197ac in _switch_gethostbyname_r () from /usr/lib/libnsl.so.1
 #7  0xef52fcb4 in _door_gethostbyname_r () from /usr/lib/libnsl.so.1
 #8  0xef5179cc in _get_hostserv_inetnetdir_byname () from /usr/lib/libnsl.so.1
 #9  0xef52f6d0 in gethostbyname_r () from /usr/lib/libnsl.so.1
 #10 0x2b81c in UdmHostLookup (Conf=0x552b0, connp=0x81048) at host.c:140
 #11 0x23528 in open_host (Indexer=0x7d8f0, hostname=0xefffcc4c "www.cimaglobal.com", port=80, timeout=30) at proto.c:259
 #12 0x23908 in UdmHTTPGet (Indexer=0x7d8f0, 
 header=0xefffe650 "GET /main/index.htm HTTP/1.0\r\nIf-Modified-Since: Fri, 23 Mar 2001 11:47:42 GMT\r\nUser-Agent: UdmSearch/3.1.12\r\nHost: www.cimaglobal.com\r\n\r\n", host=0xefffcc4c "www.cimaglobal.com", port=80) at proto.c:370
 #13 0x1488c in UdmIndexNextURL (Indexer=0x7d8f0, index_flags=0) at indexer.c:712
 #14 0x12324 in thread_main (arg=0x0) at main.c:256
 #15 0x12d04 in main (argc=1, argv=0xecd3) at main.c:596

Reply: http://search.mnogo.ru/board/message.php?id=1993





Webboard: Need help with regexp in config

2001-04-16 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
The first one, with a space between.

 
 Is the Disallow line above one command or two?
 i.e.: Disallow *ubbmisc.cgi<space here>*privatesend* ?
 
 or
 
 Should it be *ubbmisc.cgi*privatesend* ?
 


Reply: http://search.mnogo.ru/board/message.php?id=2008





Webboard: BIG BUGS descriptions fixes

2001-04-16 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
  Hello!


 I posted a message with the topic "Too many open files" a week ago and
there have been no replies to it. That makes me sad. I like your project, but your
support and programming culture are BAD ENOUGH!!!
 
 1) The first problem was in the "too many open files" topic. As I guessed
the first time, you forgot to close the TCP socket when the connection to a host fails
(times out). Look at line 265 in proto.c (function open_host)... I found this bug in
20 minutes by simply reading and text-searching through all your source. I do
not understand why the developers did not react to my message. They get money from
clients for installing and supporting their system. It seems that it's senseless to
pay them money for support. Hey, guys, do not lose your clients.


This is the first time we have developed such a big project. So you are
probably right, our programming culture may still not be good enough.
Three years ago I personally didn't even know what a socket is. So
we are learning how to open and close them while developing the project.
Yes, that's my code. And I'm VERY VERY VERY sorry that I forgot to close the socket
when the connection fails. Probably this is because I didn't have many timeouts to test
this case with.
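
A minimal sketch of the pattern at issue in point 1, not the actual proto.c code: a connect helper that releases its descriptor on every failure path, so repeated timeouts cannot exhaust the process's open-file limit. The name connect_or_close and its signature are assumptions made for illustration only.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, NOT the real open_host() from proto.c.
 * Returns a connected socket descriptor, or -1 on failure.
 * Every failure path after socket() closes the descriptor. */
int connect_or_close(const char *ip, int port)
{
    struct sockaddr_in addr;
    int fd;

    assert(ip != NULL);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;                 /* nothing to close yet */

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);

    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        close(fd);                 /* bad address: release the descriptor */
        return -1;
    }
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        close(fd);                 /* refused or timed out: release it too */
        return -1;
    }
    return fd;                     /* on success the caller owns and closes fd */
}
```

The point of the sketch is only the two close(fd) calls: without them, each failed connection would leak one descriptor.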


We are currently very busy developing the new 3.2 branch;
it will have many nice new features. It may seem that we don't react because
we are doing our best to deliver the first release as soon as possible.
However, all bug reports are collected and we consider how to fix
them. All fixes will be incorporated into both the 3.1.13 release and
the new 3.2.x releases.


 2) Your UdmEscapeURL() function from udmutils.c (line 394) is WRONG. It does not
escape Russian characters. A more accurate and precise variant of the while statement is
the following one:
 
 for ( ; *s; s++,d++){
 if (isalnum(*s)) *d=*s;
 else if (*s==' ') *d='+';
 else {
 sprintf(d,"%%%02X", (unsigned char)*s);
 d+=2;
 }
 }

This is a known RFC incompatibility.
Unfortunately, your variant DOES NOT WORK under Apache with mod_charset,
available from apache.lexa.ru. Incorrect behaviour appears when
somebody follows the "Next page" link and the browser and the CGI script work
in different character sets. Links become broken: all letters
in the range 128-255 are no longer in the original form posted by the
user. And the CGI does not even know the original form. It receives an already
recoded query string.

So we didn't implement this because:

  1. This DOES NOT affect non-Russian users. At least we have never
 received such bug reports from non-Russians. All national characters
 work fine for Germans, Czechs, Hebrew speakers and many, many other people.
  2. Apache with mod_charset is the MOST POPULAR server in the Russian world.
 I will hardly be far wrong if I say that 95% of Russian web
 servers run Apache with mod_charset.
  3. Your variant DOES NOT WORK under Apache with mod_charset.
  4. Our version DOES WORK under Apache with mod_charset.
  5. Our version DOES WORK under almost any HTTP server without
 built-in charset processing.

I agree we can add something like --enable-escaping to
configure, with conditional compilation for this piece of
code. But trust me, we had a lot of other work to do before
implementing this. We had only one related bug report; it was from
the guys who are porting msearch to OS/2. So we spent our time
on other, more requested things.
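
For reference, a bounds-checked sketch of the escaping proposed in point 2. This is not mnoGoSearch's UdmEscapeURL(); the function name escape_query, its signature, and the choice to return -1 on a short buffer are assumptions for illustration. As noted above, such escaping may still interact badly with a server that recodes the query string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, NOT mnoGoSearch's UdmEscapeURL().
 * Copies src into dst, keeping ASCII letters and digits, turning ' '
 * into '+', and writing every other byte as %XX.
 * Returns the escaped length, or -1 if dst (dstsize bytes) is too small. */
int escape_query(char *dst, size_t dstsize, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);

    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;  /* avoid negative char values */

        if (c < 128 && isalnum(c)) {
            if (n + 1 >= dstsize) return -1;    /* need room for char + NUL */
            dst[n++] = (char)c;
        } else if (c == ' ') {
            if (n + 1 >= dstsize) return -1;
            dst[n++] = '+';
        } else {
            if (n + 3 >= dstsize) return -1;    /* need room for %XX + NUL */
            snprintf(dst + n, 4, "%%%02X", (unsigned)c);
            n += 3;
        }
    }
    if (n >= dstsize) return -1;                /* no room for the final NUL */
    dst[n] = '\0';
    return (int)n;
}
```

Casting to unsigned char before isalnum() and before formatting matters for bytes in the range 128-255, which is exactly the range under discussion.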


 3) Your HTML parsing is wrong in some cases. For example, when parsing the META tags
Content-Type/charset and Refresh/URL. Can you imagine that "URL" may be in
lower or mixed case???
 I've repaired a lot of your lines like these (parsehtml.c, line 190):
 
 if(!strcasecmp(tag.name,"refresh")){
 if((href=strstr(tag.content,"URL=")))
 href+=4;
 }else
 
 The right code is:
 
 if(!strcasecmp(tag.name,"refresh")){
 if((href=strcasestr(tag.content,"URL=")))
 href+=4;
 }else
 
 Don't look for strcasestr in the manuals. It's a handwritten function. I hope you are able
to write it in 5 minutes.
 

This is done several lines above:

// Make lower string
for(l=s;*l;*l=tolower(*l),l++);
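
For readers who cannot lowercase the buffer in place, here is a sketch of the "handwritten" case-insensitive search mentioned in point 3. The name find_nocase is an assumption; strcasestr() itself is a GNU/BSD extension rather than standard C, which is why a portable stand-in is shown.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: a portable stand-in for the non-standard strcasestr().
 * Returns a pointer to the first case-insensitive match of needle in hay,
 * or NULL if there is none. */
const char *find_nocase(const char *hay, const char *needle)
{
    assert(hay != NULL && needle != NULL);

    if (*needle == '\0')
        return hay;                 /* an empty needle matches at the start */

    for (; *hay; hay++) {
        size_t i = 0;
        /* Compare byte by byte; the loop stops at hay's NUL because
         * tolower(0) is 0 and never equals a non-NUL needle byte. */
        while (needle[i] != '\0' &&
               tolower((unsigned char)hay[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (needle[i] == '\0')
            return hay;             /* whole needle matched here */
    }
    return NULL;
}
```

With this, find_nocase(tag_content, "URL=") would locate "url=", "Url=" or "URL=" alike without modifying the tag content.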


 4) You have some bugs in the spelling module when using two different languages. It does
not work properly for some placements of the Affix and Spell lines in the config file.

Ivan, please provide more information. Your third bug report
was very informative; unfortunately, the bug had been fixed before the report appeared.




Reply: http://search.mnogo.ru/board/message.php?id=2016





Webboard: HOW TO protect indexer from DoS attacks?

2001-04-16 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Imagine that there is a magic index.php file.
 It has only a link to index.php?id=1. That one has a link to index.php?id=2,
etc.; the N-th one (index.php?id=N) has a link to index.php?id=N+1...

 So, this is a simple example of a DoS attack on a mnoGoSearch-based search system. Is there
a way to protect the system from it?
 I think a good one is to set a limit on the number of URLs per server. But it should not
require a lot of additional resources...
 
 Any ideas?


Use the MaxHops indexer.conf command.



Reply: http://search.mnogo.ru/board/message.php?id=2017





Webboard: Remote Querying

2001-04-17 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 I was wondering: if I want to query my database from a remote server,
 how do I get the cache data for the query?
 
 So for example, 
 ServerA is the web server with index.html.
 ServerB is the search server with all SQL/cache data stored on it.
 
 How do I get ServerA to query ServerB for words?

There is a solution:

 1. Install Apache on ServerB.
 2. Configure Apache on ServerA to treat some path, for example
http://ServerA/search/, as remote data from ServerB. You have
to add mod_proxy; as far as I remember, it is not built by default.
Then add something like this to httpd.conf on ServerA:


ProxyRequests    On
ProxyPass        /search/ http://ServerB/search/
ProxyPassReverse /search/ http://ServerB/search/

 3. Put search.cgi into the /search/ directory of ServerB.

After that, all requests to http://ServerA/search/search.cgi will
cause ServerA to request http://ServerB/search/search.cgi
and return its result to the client.

This is a well-known technique for distributing a web server across several
machines while keeping all resources available under the same
server name, i.e. http://ServerA/ in your case. Note that all machines
except ServerA may be hidden in an internal network that is not
directly reachable from the Internet, or protected by firewalls.




Reply: http://search.mnogo.ru/board/message.php?id=2023





Webboard: 404 URL's

2001-04-17 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 My pages are accessed by something like main.php?p=1400&mt=1 where
"mt" is the section and "p" is the page name. In the database I have indexed
only pages like 1400.php (status 200), but URLs like main.php?p=1400&mt=1 have
status 404 - not found! Any suggestions?
 

indexer stores the server's response. Check Apache's access_log.
What is written there for main.php?p=1400&mt=1 ? Does it have
status 200 or 404?


Reply: http://search.mnogo.ru/board/message.php?id=2025





Webboard: HOW TO protect indexer from DoS attacks?

2001-04-17 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
There are many features planned for future releases.
I think this one has low priority.

Do you really have such a problem with DoS attacks?



 Hm, nice feature.
 
 But it helps only against a linear DoS attack. If the links form a tree, for example a
decimal one, this feature is useless. Look:
 
 index.php has 10 links:
 index.php?id=0, index.php?id=1,...,index.php?id=9.
 
 ...
 
 index.php?id=abc has 10 links too:
 index.php?id=abc0, index.php?id=abc1,...,index.php?id=abc9.
 etc.
 
 So, for example, take MaxHops=8. That is a small enough value and I do not want to decrease
it further. But in this case it is possible to flood 10^8 links before the limit
kicks in.
 
 A MaxDocuments variable would protect the indexer much better.
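
The arithmetic behind the 10^8 figure can be checked with a small sketch. It assumes that the start page counts as hop 0 and that every page links to a fixed number of previously unseen pages; the function name urls_within_hops is made up for this illustration and is not part of mnoGoSearch.

```c
#include <assert.h>

/* Hypothetical helper: number of distinct URLs reachable within max_hops
 * link hops when every page links to `branching` new pages, i.e.
 * 1 + b + b^2 + ... + b^max_hops. */
unsigned long long urls_within_hops(unsigned branching, unsigned max_hops)
{
    unsigned long long total = 1;   /* the start page, hop 0 */
    unsigned long long level = 1;   /* number of pages at the current depth */
    unsigned h;

    assert(branching > 0);

    for (h = 1; h <= max_hops; h++) {
        level *= branching;         /* each page spawns `branching` children */
        total += level;
    }
    return total;
}
```

Under those assumptions, a linear chain (branching 1) with 8 hops yields 9 URLs, while a decimal tree (branching 10) yields 111,111,111 URLs, of which 10^8 sit at the deepest level. A hop limit therefore bounds depth but not total volume.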
 

Reply: http://search.mnogo.ru/board/message.php?id=2027





Webboard: 404 URL's

2001-04-17 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Status in Apache access_log is 200...

Try to reindex those pages using:

indexer -am -s404


Reply: http://search.mnogo.ru/board/message.php?id=2031





Webboard: Parsing URL Values

2001-04-19 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
  Hi!


Take a look at the 'Using alias in Realm command' section of alias.txt.
This is a very powerful feature. I hope it's what you need.



 Hi guys!
 
 This doesn't have much to do with the mnogosearch engine, but I really need help and I know
that you people have more experience than me :)
 
 What I need to do is something like the Google directory structure.
 
 Let's suppose I have a URL like this:
http://www.site.com/section1/lesson1/computers/computer.html
 
 The components of /section1/lesson1/computers/computer.html are nothing but variables (it
could also look like this: ?sec=1&les=1&topic=computers&file=computer - but this is
not readable by the common user).
 These variables (section1, lesson1, computers, computer) would then be related to
IDs in a MySQL database.
 What I need is a way (using PHP and Apache) to parse those variables without getting
an ERROR 404! How can I do it? Is it possible with PHP4? Do I need to make a DHandler
(default handler) in Apache, and how? Where I work we use Mason and Perl, but we are
switching to PHP.
 If you go to the Google Directory you will understand what I need.
 
 Hoping for an answer... Thanks, guys! :)
 
 Sergio
 

Reply: http://search.mnogo.ru/board/message.php?id=2046





Re: Index 2 or more sites problem! - please help

2001-04-19 Thread Alexander Barkov

Please try 

  indexer -amv6


Tek Guy wrote:
 
 Hello,
 
 I still have problems with indexing. I would like to be able to index
 3 of our websites, but it seems to accept only 1 site and index only 1
 page.
 
 I'm using version 3.1.12 with MySQL support  -
 Configuration file i specified:
 
 Robots no
 Server site http://www.domainX.com
 
 --Log of output
 
 Indexer[16469]: indexer from mnogosearch-3.1.12/MySQL started with '/usr/local/mnogosearch/etc/indexer.conf'
 Indexer[16469]: [1] http://www.domainX.com/
 Indexer[16469]: [1] Server 'http://www.domainX.com/'
 Indexer[16469]: [1] Allow by default
 Indexer[16469]: [1] HTTP/1.1 200 OK
 Indexer[16469]: [1] Date: Thu, 19 Apr 2001 17:08:41 GMT
 .
 Indexer[16485]: [1] "/opalbum/": Allow by default
 Indexer[16485]: [1] "/chatting/": Allow by default
 Indexer[16485]: [1] Done (0 seconds)
 
 --log ends--
 
  Hi!
 
  Check your robots.txt as well as the main page for links.
  Note that indexer does not follow links inside JavaScript
  code. You can also run indexer with these options:
 
 indexer -amv6 http://www.domainX.com/
 
  and check its output. indexer will display a lot of debug
  information, including all links found and the reasons why indexer
  accepts or rejects each of them.
 
 
 
  Tek Guy wrote:
 
  Hello,
 
  I have 2 "Server site http://www.domainX.com/" lines with different
  values of X in the configuration file "indexer.conf", but for some
  reason it only indexes the first site and ignores the second one.  I'm
  using mnogosearch-3.1.12 and the original dist conf indexer.conf-dist
  after renaming it to indexer.conf.
 
  1.  My problem: it only indexes 1 site instead of all the sites specified
  with the "Server" command.
 
  2.  Indexing only works for the first page, i.e. "index.html", instead of
  all the pages under the http://www.domainX.com/ setting.  What I
  mean is that it doesn't follow the links on the main page and index those
  pages.  In the conf file, I have "Follow site".
 
  If anyone has a working configuration, could I have a copy that can index
  multiple pages within the same domain by following the links, and also
  multiple domains?
 
 




Webboard: robots.txt problem

2001-04-19 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
It seems that robots.txt still hasn't been indexed. 
Run

   indexer -amu http://servername/robots.txt

 then run indexer in the usual manner.


 I created a robots.txt file and placed it in the root of my web site.  The contents 
of the robots.txt are as follows:
 
 User-agent: *
 Disallow: /
 
 This should keep all search engines from indexing my site.  When I run mnoGoSearch, 
it still indexes all the pages.  I have the robots.txt option checked in the servers 
tab.  Any ideas?  Thanks.

Reply: http://search.mnogo.ru/board/message.php?id=2057





Webboard: $DD cut off: solution?

2001-04-21 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 I found in udm_indexer.h a line:
 #define UDM_MAXDESCSIZE   100
 I changed to 
 #define UDM_MAXDESCSIZE   254
 
 Heiko
 
 

Yes, that's the right place to change the description size.

Reply: http://search.mnogo.ru/board/message.php?id=2068





Re: Webboard: Indexing only .iso files

2001-04-24 Thread Alexander Barkov

Try this combination instead of yours:

Allow *.iso
HrefOnly  *




duncan wrote:
 
 Here is the output:
 
 Indexer[31136]: indexer from mnogosearch-3.1.12/UdmDB started with
 '/home/mnogo/etc/indexer.conf'
 Indexer[31136]: [1] http://www.redhat.com/robots.txt
 Indexer[31136]: [1] Server 'http://www.redhat.com/download/'
 Indexer[31136]: [1] Disallow NoCase  *
 Indexer[31136]: [1] http://www.redhat.com/download/mirror.html
 Indexer[31136]: [1] Server 'http://www.redhat.com/download/'
 Indexer[31136]: [1] Disallow NoCase  *
 Indexer[31136]: [1] Done (0 seconds)
 
 thank you!
 
 On Tue, 24 Apr 2001, Alexander Barkov wrote:
 
  Please run
 
indexer -amv6
 
  and check it's output. It will print an information about all
  found links.
 
 
  duncan wrote:
  
   Hello, and thanks matthew-
  
   I tried what you suggested, and in fact, here is the whole conf file:
  
   Allow */
   CheckOnly *.iso
   Disallow *
  
   Server http://www.redhat.com/download/mirror.html
  
   and it only returns this:
  
   Indexer[30131]: indexer from mnogosearch-3.1.12/UdmDB started with
   '/home/mnogo/etc/indexer.conf'
   Indexer[30131]: [1] http://www.redhat.com/robots.txt
   Indexer[30131]: [1] http://www.redhat.com/download/mirror.html
   Indexer[30131]: [1] Done (0 seconds)
  
   Something isn't right there, and I don't know what to do with it.  I feel
   like I've tried so many things... Is there more documentation out there, or more
   examples?
  
   Thanks, I appreciate your response.
 
 
 --
 ||  ||  ||  ||  ||  ||
 duncan shannon
 [EMAIL PROTECTED]




Webboard: Ignoring navigation text

2001-04-24 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
If it is your site, you can use the <!--UdmComment--> or <NOINDEX>
tags. Check the documentation.


 Hi all,
 
 We have set up our pages using HTML for the navigation. This means that when the 
indexer runs, it indexes the navigation words, so our display 
shows things like
 
 "Local Info Americas Africa/Middle East Asia Australasia Europe Corporate 
profile Financial management Any questions Research & sponsorship Malta CIMA 
university award This year's ceremony took place at the University of Malta on 23 
Novemb"
 
 where Local Info Americas Africa/Middle East Asia Australasia Europe Corporate 
profile Financial management Any questions Research & sponsorship are all part of 
our navigation (this looks pretty dumb). Is there a way to get the indexer to ignore 
these?
 
 thanks
 Michael
 
 
 

Reply: http://search.mnogo.ru/board/message.php?id=2094





Re: next_index_time query

2001-04-25 Thread Alexander Barkov

Hello!

Yes, you are right.

You may also use a large Period value for those pages.

- Original Message - 
From: Anand Raman [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, April 25, 2001 10:50 AM
Subject: next_index_time query


 HI guys
 I am operating mnogosearch under db mode in postgresql.
 
 I want to stop some urls from reindexing.. Can i just change the
 next_index_time column in the url table to some value and prevent this
 from happening..
 
 Any comments
 
 Thanks
 Anand




Re: Webboard: Ignoring navigation text

2001-04-25 Thread Alexander Barkov

Gavin Love wrote:
 
 If you use <!--UdmComment--> or <NOINDEX>,
 does the indexer still follow the links contained
 within the area enclosed by the tags?
 Or will it simply not store the text
 in the txt field in the url table?
 


It will follow links, but it will not add those words to the word index,
nor will it add them to the TXT field.




Webboard: charset

2001-04-25 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Check HTTP headers which are sent by your web-server.

Try this:  wget -s http://localhost/

What can you see in Content-Type  header?


 Hello All,
 
 I tried to index web site with Cyrillic koi8-r charset,
 but indexer didn't store any russian words in dict table, only latin.
 As result, I can search latin words, but not russian
 
 indexer.conf:
 -
 # This is a minimal sample indexer config file
 
 DBAddr mysql://user@pass:localhost/mnogosearch/
 #DBMode crc
 
 LocalCharset koi8-r
 CharSet koi8-r
 
 #Ispellmode text
 #Affix ru /usr/local/share/ispell/russian.aff
 #Spell ru /usr/local/share/ispell/russian.dict
 
 ServerTable server
 
 Server  http://localhost/ 
 
 # Allow some known extensions and directory index
 Allow *.html *.htm *.shtml *.txt */
 
 # Disallow everything else
 Disallow *

Reply: http://search.mnogo.ru/board/message.php?id=2104





Webboard: charset

2001-04-25 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Please contact me by email

 % wget -s http://localhost/
 --22:45:45--  http://localhost/
            => `index.shtml'
 Connecting to localhost:80... connected!
 HTTP request sent, awaiting response... 200 OK
 Length: unspecified [text/html]
 
 
 index.shtml has the following header
 
 <html>
 <head>
 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=koi8-r">

Reply: http://search.mnogo.ru/board/message.php?id=2110





Webboard: 3.1.12 and MacOSX

2001-05-03 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Are there any debugging tools like gdb under MacOSX? 

 Hi all,
 For anyone interested: mnogo installs correctly on the new 
MacOSX platform. It compiles with only one warning.
 (OSX 10.0.0 - Tenon iTools 6.01 (Apache) - MySql 3.23.27)
 
 The only problem is that it produces a "segmentation fault" at the end of 
indexing. Even so, the database seems to be correct, and searching runs fine with 
search.cgi.
 
 You can see it (in french) at http://mno.imotep.com/cgi-bin/search.cgi
 (this is a special Porsche crawl ;-)

Reply: http://search.mnogo.ru/board/message.php?id=2137





Webboard: 3.1.12 search.cgi remote gaining shell access exploit fix

2001-05-03 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Thanks. This is fixed in the 3.1.13 sources.
 
 Bad news. I just checked your very recent search.c v1.23 via WWW cvs and see that you 
added tmplt= variable parsing there. The previous buffer overflow (which I posted the patch for) 
overflows the data segment and stack by some indirect tricks, but the new tmplt= parsing 
allows direct writing to the stack because template[] is on the stack of main(). 
The dangerous code is:
 sprintf(template,"%s%s%s",UDM_CONF_DIR,UDMSLASHSTR,token+6);
 It overflows even with my posted fix because the UDMSTRSIZ characters allowed for token 
are increased by the character count of UDM_CONF_DIR+UDMSLASHSTR. If someone has a 
UDM_CONF_DIR long enough for shell code, he'll get it.
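
A minimal sketch of how such a path can be built without writing past the destination buffer, assuming a C99 snprintf is available. CONF_DIR and SLASHSTR are hypothetical stand-ins for UDM_CONF_DIR and UDMSLASHSTR, and build_template_path is an illustrative helper; this is not the code that went into 3.1.13.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for UDM_CONF_DIR and UDMSLASHSTR. */
#define CONF_DIR "/usr/local/mnogosearch/etc"
#define SLASHSTR "/"

/*
 * Build "<CONF_DIR>/<name>" into dst.
 * Returns 0 on success, -1 if the result would not fit in dstsize bytes
 * (dst may then hold a truncated, NUL-terminated string and must not be used).
 */
static int build_template_path(char *dst, size_t dstsize, const char *name)
{
    int n;

    assert(dst != NULL && name != NULL && dstsize > 0);

    /* snprintf never writes more than dstsize bytes, including the NUL. */
    n = snprintf(dst, dstsize, "%s%s%s", CONF_DIR, SLASHSTR, name);
    if (n < 0 || (size_t)n >= dstsize)
        return -1;
    return 0;
}
```

The caller is expected to reject the request when -1 comes back instead of opening whatever ended up in the buffer.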
 
 

Reply: http://search.mnogo.ru/board/message.php?id=2138





Webboard: Linux binary available?

2001-05-04 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Is there a Linux binary install of mnogosearch available that can be installed 
without "root" privilege and works with MySQL?  Thanks.


There are no binaries. You may install from sources without
having root access. 

Reply: http://search.mnogo.ru/board/message.php?id=2142





Re: question

2001-05-07 Thread Alexander Barkov

No.

La Rocca Network wrote:
 
 Hi !
 
 is there any way to include a site in more than one category ?
 
 Regards,
 Nelson
 




Re: Few random things

2001-05-10 Thread Alexander Barkov

Briggs, Gary wrote:
 
 Has anyone here got a way of indexing powerpoint or visio documents?
 
 Changing the document is not viable; I need a way to get the strings out of
 it.
 
 strings is not too bad on powerpoint, but for visio it's not worth the
 effort.


You may use a so-called external parser: any program which can convert
visio documents into text or html. Check doc/parsers.txt


 Also, Is there any way to convert documents with this in them:
  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252">
 ?
 I'd ideally like to convert them to something more standard... Can I do
 this?


What format do you want to get after conversion?


 As in, I can't change anything. At all. I need a way to do all these things
 in the search engine.




Webboard: indexing multiple sites

2001-05-10 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
You may use URL limits. Take a look at the default search.htm.
<SELECT NAME="ul"> is responsible for them.


 I need to know how I can index several sites so that the search form lets me search 
in: 
 - All the sites 
 - Each site in particular form 
 - Some section of some site .
 Example:
 I have the following URLs to index:
 - http://www.tercera.cl/
 - http://www.tercera.cl/sitios/
 - http://www.tercera.cl/casos/
 - http://www.lacuarta.cl/
 - http://www.lacuarta.cl/temas/
 - http://www.lacuarta.cl/sitios/
 - http://www.deportivo.cl/
 - http://www.mouse.cl/
 and some others...
 Can someone send me an indexer.conf and a search.htm?
 thanks a lot.
 
 
 
 
 

Reply: http://search.mnogo.ru/board/message.php?id=2172





Webboard: htdb and his first entry

2001-05-10 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Try  this indexer.conf command
  URLWeight 0


 hello,
 my question: how do i get rid of the first entry in the url table, the one with all the 
other urls produced by htdblist inside? why? because this entry comes out first 
when somebody searches for a word included in the url!
 thanks,
 manu :-)

Reply: http://search.mnogo.ru/board/message.php?id=2173





Webboard: Errors During Make

2001-05-10 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 RH 5.2 i686 linux
 Getting these errors during make
 AM_PROG_LIBTOOL not found in lib
 AM_DISABLE_SHARED not found in lib
 What do you think?

Which version of msearch are you using? 
Is it taken from CVS? If so, probably you have
to upgrade automake and autoconf.



Reply: http://search.mnogo.ru/board/message.php?id=2174





Webboard: Long URLs and 3.2 branch

2001-05-10 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Hello!

To fix this in 3.1.x:

1. Change SQL url table structure, make url field longer.
2. Change UDM_URLSIZE definition in udm_common.h
3. Recompile



 Hello All!
 
 I just stumbled across a problem that is hopefully going to be solved. In
 the mnogo 3.1 branch, URLs which are longer than 128 bytes are
 apparently not supported. I found a mail that says this will be
 tackled in the 3.2 branch.
 
 Question:
 Is there a timeline for the 3.2 branch? When will it be released?
 or
 Does anyone have a patch, which solves the problem for mnogosearch
 with mysql?
 
 Thanks for your answers,
 
 Markus
 

Reply: http://search.mnogo.ru/board/message.php?id=2175





Webboard: resulting url when using frameset

2001-05-10 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Is there a simple way to give as answer to a search the frameset file
 rather than the file appearing in the tags <Frame src= ...
 

Unfortunately it's not implemented.


Reply: http://search.mnogo.ru/board/message.php?id=2176





Webboard: need to decode Intag field

2001-05-10 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
When phrase is "yes":

It is combined using the word position and its weight:

  pos*0x1+weight

When phrase is "no", the word appearance count is used instead of its position:

  count*0x1+weight

Reply: http://search.mnogo.ru/board/message.php?id=2177





Re: multilanguage text

2001-05-10 Thread Alexander Barkov


The 3.2.x branch will have a language guesser. It's already implemented
and works very well for single-language pages or even mostly
single-language pages. I hope the first release of 3.2.x will be
available in May.



Danil Lavrentyuk wrote:
 
 [ On Wed, 9 May 2001, Maxime Zakharov wrote: ]
 
 MZ  And what if a site has many texts uploaded by users?
 MZ  Do I have to manually edit them all, setting lang attributes? :)
 MZ  Do I have to demand it from the uploader? They will not.
 MZ
 MZ Users may upload big mega gifs as .html files :)
 
 It would be an obvious fraud...
 
 MZ Let talk about W3C recommendations.
 
 ... but ignoring a far-away committee's recommendations could be
 simple laziness.
 Not all software uses all of the recommendations.
 Not all users know all of the recommendations. Not all users even think of
 using such recommendations.
 
 Text could be converted to HTML from some other text format.
 Who, for example, will check texts such as big multi-volume books
 (like Amber by Zelazny or Wheel Of Time by
 Jordan or even bigger) for foreign phrases? :)
 
 Let's talk about the real world, where we would have to index multilanguage texts
 without lang attributes.
 
 MZ  What if I have to index texts placed somewhere in the internet, not locally?
 MZ  What if a site contains texts of many books (something like www.lib.ry, for
 MZ  example)?
 MZ
 MZ Sometimes, without an explicit language definition, it's impossible to uniquely
 MZ select the language for a word.
 MZ For example, the word 'test' may be English or German.
 
 I know.
 I think it is feasible (but hard, I see) to make a system which could guess what the
 text's language is. It could use 2 steps (plus an optional third):
 1) Create a list of encodings this text could be written in (simply by
 testing whether all of the word's characters are alphabetic in this encoding). Here we
 could assume that two or more successive foreign words are from the same
 language.
 2) Check (using ispell tables) all the languages which use encodings from the list
 (created above), looking for one where these words are correct.
 3) (optional) If more than one language is suitable, select the one that was
 selected for the previous phrase.
 
 OK. This method does not guarantee that the selection will always be correct. But
 in most cases it will.
 
 Yes, I know, this method is not too quick... But it is better than no
 method at all. In any case it would be good to be able to turn it off in the
 indexer.conf file or by a command line option.
 
 
 Danil Lavrentyuk
 Communiware.net
 Programmer
 




Webboard: indexing multiple sites

2001-05-12 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
indexer.conf is OK. Don't forget to configure the SELECT with
name "ul" in your search.htm


 I need to know how I can index several sites so that the search form lets me search 
in: 
- All the sites 
- Each site in particular form 
- Some section of some site . 

 Is this OK in indexer.conf?
 #
 Period 1d
 Server        http://www.quepasa.cl/
 Server  path  http://www.quepasa.cl/sitios/
 Server  path  http://www.quepasa.cl/sitios/especiales/
 Server  path  http://www.quepasa.cl/sitios/enfoco/
 Server        http://deportivo.tercera.cl/
 Server        http://dirigible.tercera.cl/
 Server        http://mouse.tercera.cl/
 Server        http://www.lacuarta.cl/
 Server  path  http://www.lacuarta.cl/sitios/
 Server  path  http://www.lacuarta.cl/temas/
 Server        http://www.lahora.cl/
 Server        http://mujer.tercera.cl/
 Server        http://siglo20.tercera.cl/
 Server        http://www.radiozero.cl/
 Server        http://papasfritas.tercera.cl/
 Server        http://www.tercera.cl/
 Server  path  http://www.tercera.cl/sitios/
 Server  path  http://www.tercera.cl/casos/

Reply: http://search.mnogo.ru/board/message.php?id=2189





Webboard: Mp3 Search

2001-04-23 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Hello.
 I am creating an mp3 search engine and i have a such problem.
 When I index mp3s, some of them get status 206 (partially ok)

It's OK. The indexer does not download the whole file. It checks only
those parts of the document where MP3 headers are expected to be
found.

 and I can't search for those mp3s, but as I see in the url table, it has written the 
description for them

Probably those files have empty MP3 tags.

Reply: http://search.mnogo.ru/board/message.php?id=2078





Re: Webboard: Titles incorrect for pdf files

2001-04-26 Thread Alexander Barkov

Richard Wall wrote:
 
 - Original Message -
 From: Richard Wall [EMAIL PROTECTED]
 
  Yeah, it works really well. In fact it accepts a third argument, the URL of
  the page so I've modified your shell script as follows, using the $UDM_URL
  environment variable set by mnogosearch...
 
 Actually, I've discovered a problem. When indexing certain pdf documents,
 the doc2html perl script hangs and uses 100% processor resources.
 
 It always gets stuck at the same place...
 confident that the automotive sector can
 
 But I can't understand why.
 
 Alexander, could you try indexing this document with doc2html.pl...
 
 http://elkie.coventry-id.co.uk/~richard/wb58.pdf
 
 to see if you get the same problem.


pdfinfo called from doc2html does not return anything to
stdout. It warns about bad format to stderr:

/usr/home/bar  pdfinfo wb58.txt 
Error: May not be a PDF file (continuing anyway)
Error (0): PDF file is damaged - attempting to reconstruct xref table...
Error: Couldn't find trailer dictionary
Error: Couldn't read xref table


  So, doc2html seems to wait for pdfinfo output forever.




Re: mnogosearch on intranet

2001-05-15 Thread Alexander Barkov

Can you read Russian?


Florin Andrei wrote:
 
 On 12 May 2001 15:35:36 +0500, Alexander Barkov wrote:
  We tested up to 5 million documents in the so-called cache mode storage.
  But this mode is still in beta and people report that it does
  not work properly in some cases.
 
 What is the typical error for cache mode?
 If it's not something important or obvious, then i hope i could use it.
 
 --
 Florin Andrei
 
 Remember, son:
 if you never try, you never fail - Homer Simpson
 




Webboard: Doc Relevance ($DR)

2001-05-15 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 All of my returned results show the same relevance/rating [1] when using $DR... 
Perhaps I don't understand the meaning of this. I expected results at the top to have 
a higher rating. I have a rough idea of how this is being calc'ed (how many times 
does the word appear in dict.word)...
 
 Suggestions?? Comments??
 
 Thanks

It's covered in the doc/relevancy.txt 

Reply: http://search.mnogo.ru/board/message.php?id=2208





Webboard: htdb and his first entry

2001-05-15 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 it doesn't work! the entry is still there!!!
 (of course deleted first the db: indexer -C)
 ???
 manu :-)

Try this patch


--- sql.c.orig  Thu Mar 29 19:04:08 2001
+++ sql.c   Tue May 15 13:22:24 2001
@@ -4405,7 +4405,7 @@
 #ifdef HAVE_MYSQL
        MYSQL_ROW row;
        row=mysql_fetch_row(db->res);
-       sprintf(s,"<a href=\"%s\">%s</a><br>\n",*row,*row);
+       sprintf(s,"<a href=\"%s\"></a><br>\n",*row);
        s=UDM_STREND(s);
 #else
        sprintf(UDM_STREND(s),"<a href=\"%s\">%s</a><br>\n",



Reply: http://search.mnogo.ru/board/message.php?id=2209





Webboard: different indexing/splitters

2001-05-15 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 what would happen if I indexed some urls with mnogo+sql+cachemode, but
 then split the cachelogs using a different version that was 
mnogo+sql+phrase+cachemode+fasttag+fastcat?
 
 

That seems to be a cause of empty results: pages with 0 file size,
no title and so on.


Reply: http://search.mnogo.ru/board/message.php?id=2210





Re: 3.1.12 fix

2001-05-15 Thread Alexander Barkov

  Hello, Thomas!

Sorry for such a late reply. We are working on the 3.2.x branch now,
so I just had no time to check your function. It works fine
and I replaced the old one with your version. It will appear
in the 3.1.13 release, which will be available this week. At least
we hope so.  Thank you very much for your contribution!


Thomas Olsson wrote:
 
 Hi,
 
 Thanks for a very nice program :-)
 
 I've got a small contribution, since I found a bug in mnogosearch 3.1.12
 and fixed it. It is the hilightcpy() function in search.c. This function
 chops off the last char if the input string ends in a single word
 character. E.g. a title string called RFC 7 will be output as RFC .
 
 I must admit I couldn't be bothered to work out exactly why it didn't
 work, so I just wrote a replacement. You can decide yourself if you want
 to fix the original bug, or use this replacement. I've been using it for
 some time now, and it does seem to work.
 
 I've tried to keep the style from the code, though it is pretty far from
 my usual formatting :-)
 
 static char *hilightcpy(int LCharset, char *dst, char *src, char *w_list,
                         char *start, char *stop) {
     char *t = dst, *s = src, *word = src;
     char real_word[64];
 
     if (*s) {
         do {
             if (!UdmWordChar(*s, LCharset)) {
                 if (word < s) {
                     /* End of a word: terminate it temporarily */
                     char save = *s;
                     *s = 0;
                     /* w_list entries are space-delimited, so pad the word */
                     sprintf(real_word, " %.61s ", word);
                     UdmTolower(real_word, LCharset);
                     if (strstr(w_list, real_word)) {
                         sprintf(t, "%s%s%s", start, word, stop);
                     }else{
                         strcpy(t, word);
                     }
                     t += strlen(t);
                     /* Restore and copy the delimiter itself */
                     *t++ = *s++ = save;
                     word = s;
                 }else{
                     *t++ = *s++;
                     word++;
                 }
             }else{
                 s++;
             }
         } while (s[-1]);
     }
     *t = 0;
     return dst;
 }
 
 Regards,
 Thomas
 
 --
 Thomas Olsson
 http://www.armware.dk/
 




Webboard: htdb and his first entry

2001-05-15 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
You are right! Thanks!


 hi,
 thank you very much, the patch does the work! but for the distributed 
 version it should look as follows (note the line number):
 
 --- mnogosearch-3.1.12/src/sql.c.SQL    Tue May 15 11:45:46 2001
 +++ mnogosearch-3.1.12/src/sql.c        Tue May 15 11:47:45 2001
 @@ -4406,7 +4406,7 @@
  #ifdef HAVE_MYSQL
         MYSQL_ROW row;
         row=mysql_fetch_row(db->res);
 -       sprintf(s,"<a href=\"%s\">%s</a><br>\n",*row,*row);
 +       sprintf(s,"<a href=\"%s\"></a><br>\n",*row);
         s=UDM_STREND(s);
  #else
         sprintf(UDM_STREND(s),"<a href=\"%s\">%s</a><br>\n",
 
 spasiba!
 manu :-)
 
 

Reply: http://search.mnogo.ru/board/message.php?id=2213





Webboard: htdb and reindex

2001-05-16 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
To add new entries you have to reindex the page which is generated
by the HTDBList query, using the -am arguments. Then run indexer in
the usual manner w/o arguments. Also take a look at the MySQL
query log; it may help to check what happens.


 hi,
 i am indexing with mnogosearch ver 3.1.12 and method htdb (mysql). 
 every clean index (after a indexer -C) is done really fine, but if i 
 want to reindex (indexer -a) or add the new entries in the db 
 (indexer w/o args), the indexer reads the urls but nothing is 
 indexed, so i have to delete and index the whole db again, this takes 
 a long time. any solution?
 thanks,
 manu :-)
 

Reply: http://search.mnogo.ru/board/message.php?id=2220





Re: Windows character sets

2001-05-17 Thread Alexander Barkov

Not all charsets can be converted to the ascii or latin1 charset.
windows-1251 can't be converted to latin1/ascii, at least not its Cyrillic
part.
Don't worry about windows-1252; it is letter-compatible with latin1.
windows-1250 can be converted to latin1/ascii without losing 
major information, but the 3.1.x branch does not have such powerful charset 
conversion code.

The answer is NO, you can't do the translation in indexer. You may try
to do it in PHP.
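
As a rough illustration of the kind of translation meant here (written in C rather than PHP, purely as a sketch): windows-1252 stores the curly quotes and dashes that Word emits at bytes 0x91-0x97, and these can be folded to plain ASCII. cp1252_punct_to_ascii is a hypothetical helper, not part of mnoGoSearch, and it covers only a handful of punctuation bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Replace a few windows-1252 punctuation bytes with ASCII look-alikes,
 * in place. Every other byte is left untouched.
 */
static void cp1252_punct_to_ascii(char *s)
{
    assert(s != NULL);

    for (; *s; s++) {
        switch ((unsigned char)*s) {
        case 0x91:              /* left single quotation mark  */
        case 0x92:              /* right single quotation mark */
            *s = '\'';
            break;
        case 0x93:              /* left double quotation mark  */
        case 0x94:              /* right double quotation mark */
            *s = '"';
            break;
        case 0x96:              /* en dash */
        case 0x97:              /* em dash */
            *s = '-';
            break;
        default:
            break;
        }
    }
}
```

Because each replacement is one byte for one byte, the string length does not change and no second buffer is needed.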

Briggs, Gary wrote:
 
 I'm outputting XML from my search engine for use in other people's websites,
 and I'm having a small problem.
 
 Some of the sites I'm indexing are made in word [I've no control over this],
 and outputted as html.
 
 And they're in strange character sets like windows-125{0,1,2}.
 
 When I output the XML, it contains things like 92s, which are the word
 equivalent of a normal '. Is there any way I can do translations on this,
 either in the indexer, or in the php? [I'm using the php front end, and
 crc-multi DB schema].
 
 Basically, I'd like to see nothing more than US-ASCII or friends; much
 easier to use, and won't break perl scripts on unix boxes.
 
 Anybody?
 
 Ta,
 Gary (-;
 
 PS I never got any response to my RFC on my code for putting stuff INTO the
 database from XML. Does anyone have anythign to add to it?




Webboard: index by time

2001-05-17 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 I want to index a block of url withing mnogo.url, so I set
 
 last_index_time=unix_timestamp()-3600,
 last_mod_time=unix_timestamp()-1800,
 next_index_time=unix_timestamp()
 and all status=209
 
 but when I run indexer(usu. -m -s 209), it doesnt seem to care about what the dates 
are, does anyone have the same situation where 
 indexer seems to ignore the time values?
 

What do you mean by "it does not care about dates"?
If you want to exclude some URLs from indexing this way,
you have to set their next_index_time to some time in the future,
for example a month from now:

next_index_time=unix_timestamp()+30*24*60*60
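
The arithmetic above, sketched in C. defer_until is a hypothetical helper, not part of mnoGoSearch; it only shows how the "one month ahead" value is derived from a Unix timestamp.

```c
#include <assert.h>
#include <time.h>

#define SECONDS_PER_DAY (24L * 60L * 60L)

/*
 * Return the Unix time that lies `days` days after `now`.
 * A URL whose next_index_time holds this value is not due for
 * reindexing until that moment has passed.
 */
static time_t defer_until(time_t now, long days)
{
    assert(days >= 0);
    return now + (time_t)(days * SECONDS_PER_DAY);
}
```

Calling defer_until(time(NULL), 30) gives the same offset as unix_timestamp()+30*24*60*60, that is, 2592000 seconds from now.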


Reply: http://search.mnogo.ru/board/message.php?id=2227





Re: why indexer indexes url with ?

2001-05-17 Thread Alexander Barkov

You have Allow * before those Disallow commands.
Note that Allow/Disallow commands are checked in the order of their
appearance in indexer.conf, so indexer finds Allow * before the
others. You have to move this command after all Disallow commands.
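
A small sketch of why the order matters, assuming a first-match-wins evaluation as described above. The struct rule type and url_allowed are hypothetical, this is not indexer's actual code, and for brevity it uses POSIX fnmatch() wildcards with an escaped '?' where the real configuration uses a regex. What happens when no rule matches is also an assumption.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

struct rule {
    int allow;            /* 1 = Allow, 0 = Disallow */
    const char *pattern;  /* shell-style wildcard pattern */
};

/*
 * Walk the rules in configuration order and return the verdict of the
 * first one whose pattern matches the URL.
 */
static int url_allowed(const struct rule *rules, size_t n, const char *url)
{
    size_t i;

    assert(url != NULL);

    for (i = 0; i < n; i++) {
        if (fnmatch(rules[i].pattern, url, 0) == 0)
            return rules[i].allow;
    }
    return 1; /* assumption: nothing matched, so the URL is accepted */
}
```

With the rules { Allow "*", Disallow "*\?*" } a URL containing '?' is accepted, because "*" matches first; with the two rules swapped it is rejected, while URLs without '?' are still accepted.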


FL wrote:
 
 Here it is, I have cut the URLs (about 1000).
 
 Thanks.
 
 François
 
 At 11:10 15/05/01 +0500, Alexander Barkov wrote:
 Please send your indexer.conf
 
 
 FL wrote:
 
  Hi! I don't want to index url with '?'.
 
  This is my indexer file (no modifications from the default) :
 
  # Exclude cgi-bin and non-parsed-headers using string match:
  Disallow */cgi-bin/* *.cgi */nph-*
  # Exclude anything with '?' sign in URL. Note that '?' sign has a
  # special meaning in string match, so we have to use regex match here:
  Disallow Regex  \?
 
  But I can see URL indexed like :
 
 
 http://www.premier-ministre.gouv.fr/spihtm/sig_ie4/theme/r_t.cfm?t1=Culture;
  t2=Histoire
 
  What's wrong ?
 
  François
 
 
 
 
 
   
  Name: sample.zip
sample.zipType: Zip Compressed Data (application/x-zip-compressed)
  Encoding: base64




Webboard: AliasProg

2001-05-17 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Thanks for reporting! It's fixed in 3.1.13 sources, it will be
available today.

 The very nature of any URL that is passed in AliasProg's $1 is wrong
 and messed up. That's why it's being aliased.
 
 My issue is with URLs that contain a single quote (') or a double
 quote ("). Whenever I try to pass one to a script to be processed (via
 $1), the shell interprets the single (or double) quote and waits for another
 ending quote.  The problem would also occur with asterisks (*), because
 essentially the raw URL is being processed by the shell.  Any
 occurrence of this type of URL will crash "indexer".
 
 Either the URL should be passed on stdin to a shell script (specified in
 AliasProg) or the URL needs to be escaped. Otherwise this problem will
 persist. It's an easy fix too. Escaping the URL would probably make
 the most sense, and wouldn't change the syntax of AliasProg. Maybe
 adding an "AliasProgStdin" directive would do it too.
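
One common way to do the escaping suggested above is to wrap the argument in single quotes and rewrite every embedded single quote as '\''. The sketch below assumes a POSIX shell; shell_quote is a hypothetical helper and not the fix that was applied in 3.1.13.

```c
#include <assert.h>
#include <string.h>

/*
 * Write a single-quoted copy of src into dst so that a POSIX shell
 * treats it as one literal word. Returns 0 on success, -1 if dst is
 * too small.
 */
static int shell_quote(char *dst, size_t dstsize, const char *src)
{
    size_t o = 0;

    assert(dst != NULL && src != NULL);

    if (dstsize < 3)                    /* room for '' and the NUL */
        return -1;
    dst[o++] = '\'';
    for (; *src; src++) {
        if (*src == '\'') {
            /* close the quote, emit an escaped quote, reopen: '\'' */
            if (o + 4 >= dstsize - 1)
                return -1;
            memcpy(dst + o, "'\\''", 4);
            o += 4;
        } else {
            if (o + 1 >= dstsize - 1)
                return -1;
            dst[o++] = *src;
        }
    }
    dst[o++] = '\'';
    dst[o] = '\0';
    return 0;
}
```

Inside single quotes the shell expands nothing, so double quotes and asterisks in the URL need no extra handling; only the single quote itself has to be rewritten.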
 
 -justin

Reply: http://search.mnogo.ru/board/message.php?id=2230





Re: Webboard: Indexing product categories which are dynamically updated from base.

2001-05-17 Thread Alexander Barkov

Hello!

It seems there is some way to work around this. I haven't yet figured
out exactly how, but it looks possible. What is stored on the file
system, and in what form?


As for the SELECT, you don't have to generate the whole search.htm,
only that part, and include it via $if(/file/to/select.htm)
You can also include it as an external URL: 
$iurl(http://www.blalba.ru/select.sgi)



Author: Alexander  
Email: [EMAIL PROTECTED] 
Message:   
I need to attach some kind of search engine to an online
store (product search). I chose mnoGoSearch. OS: 
Linux, database: Oracle. I have installed everything and it all seems to work. 
   
But there are some problems.
   
1. The products must be indexed from the file system, not over www.  
2. I need search by section (with a corresponding select), where the  
sections are updated _dynamically_ from the database (search section == product 
catalog section). I used the category field for this.  
   
Combining 1. and 2., I wrote a script which, for each category 
in turn: 
   
- generates a file tree (files of the form  
../perl-cgi/product_card?prod_id=nnn ), pulling from the database the necessary 
information about the products to be indexed (description, price, etc.).   
Each file product_card?prod_id=nnn contains an HTML document with a title, 
keywords, the product description and so on. The product cards on www   
have the corresponding paths (
http://www.blablabla.ru/perl-cgi/product_card?prod_id=nnn);
   
- generates indexer.conf, inserting the current category into the Category field;   
   
- runs the indexer.
   
 All of this is repeated for each product category. At the end of its run, 
the script generates search.htm, inserting a select tag with all the categories   
as options. 
   
This was the first thing that came to mind. But there is a problem with removing   
products from the database: the indexer either keeps all products or deletes the products
that were indexed for the previous categories. And the whole thing is
rather clumsy: creating indexer.conf every time, running the indexer
many times, and so on... :(
What would be the best approach? Thanks in advance.




Re: Problems deleting urls with 3.1.10

2001-05-17 Thread Alexander Barkov

[EMAIL PROTECTED] wrote:
 
 I saw this question posted back in February, but I didn't see that an answer had 
been given. I'm using 3.1.10 with cachemode, with about 6.4 million urls indexed. I'm 
trying to delete a set of urls that match a certain pattern but when I attempt to do 
so I get the following:
 
 indexer -C -u http://some.url.com%;;
 You are going to delete database 'search' content
 Are you sure?(YES/no)YES
 Indexer[3617]: Error: 'Can't write to logd: Socket operation on non-socket'
 Deleting...Done
 
 Cachelogd is indeed running and indexing works just fine... Any suggestions?


Is indexing running simultaneously? 
What happens after restarting cachelogd?




Webboard: index just first page

2001-05-17 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Try to clear the database using  ./indexer -Cw 
then start it with a high verbosity level:  ./indexer -v6
It will display information about every found link among
other useful information.


 
 I execute ./indexer -a ./indexer.conf
 and I get this result:
indexer from mnogosearch-3.1.12/MySQL started with './indexer.conf'
[1] Done (0 seconds)
 
 I execute ./indexer -S ./indexer.conf
 and I get an empty table with Total 0
 
 My indexer.conf is :
 
DBaddr mysql://...
Robots no
Follow yes
Allow *
 
Server http://apache.cadrus.fr/najean/
 
 
 (dams.com is local domain name in Intranet.)
 
 I don't understand why only the first page (index.htm) is indexed and not the other 
pages.
 
 Sorry for my bad English, I'm French.
 
 Thanks for answering.
 
 
 

Reply: http://search.mnogo.ru/board/message.php?id=2233





Webboard: Sort by date?

2001-05-17 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Unfortunately, there is no sorting by date and no way to
display a percentage.

 Hi,
 
 We are using mnoGoSearch to search in a list of newsitems. Is it possible to sort 
the result by the date a newsitem was posted instead of sorting by relevance? 
 And is it possible to show the relevance behind the page title as a percentage 
instead of a small number? Most people don't understand why there is a (1) or (3) 
behind a result.
 
 Thank you for your help.
 
 Ilan Shemes

Reply: http://search.mnogo.ru/board/message.php?id=2234





Announce 3.1.13

2001-05-17 Thread Alexander Barkov

  Hello!

mnogosearch-3.1.13 is available from 
our site http://www.mnogosearch.org

Regards!

P.S.

ChangeLog:

 * Added installation script install.pl to simplify the installation process.
 * HTTPS support has been added. Thanks to Dubun Guillaume
   [EMAIL PROTECTED].
 * search.cgi now accepts a tmplt parameter. It can be used
   to specify an alternative search template to be opened.
 * Content-Language: HTTP header support for detecting document
   language.
 * The language of normalized words is now used for document language detection.
 * Now all programs can accept an alternative /var working
   directory. This allows putting built-in and cache-mode
   databases in non-default directories without having to
   recompile the package. indexer, spelld and search.cgi
   take the path from the VarDir command in indexer.conf,
   spelld.conf and search.htm respectively. splitter and
   cachelogd take the working directory value from the -w command
   line argument.
 * A problem with quotes in AliasProg has been fixed.
   Thanks to Justin [EMAIL PROTECTED] for reporting.
 * Fixed that SGML entities (like &amp; &quot; &auml;) were
   not unescaped in META KEYWORDS and DESCRIPTION.
   Thanks to Danil Lavrentyuk [EMAIL PROTECTED] for reporting.
 * A bug where basic authorization did not work when
   ServerTable is used has been fixed.
 * A bug where META NAME=Refresh Content=... was not processed
   properly in some cases has been fixed.
   Thanks to Ivan Mikhnevich [EMAIL PROTECTED].
 * A bug where text highlighting did not work properly in some cases
   has been fixed. Thanks to Thomas Olsson [EMAIL PROTECTED].
 * A bug that caused spelld to hang has been fixed.
 * Some bugs and possible exploits in search.cgi have been fixed.
 * Fixed a bug where the socket was not closed when connect() failed.
   Thanks to Ivan Mikhnevich [EMAIL PROTECTED].
 * A trap while fetching overly large newsgroup lists has been fixed.
 * Fixed a bug where the Host: HTTP header was composed incorrectly
   when the port is not 80.
 * A minor bug in the built-in database has been fixed.
 * A bug where a line in indexer.conf containing only spaces
   caused an "Error in config file" has been fixed.
 * A bug where indexer crashed when a URL command argument had no
   corresponding Server/Realm command has been fixed.
 * ISO 10646 character entity reference skipping bug fixed.




Webboard: index just first page / Allow NoCase

2001-05-19 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Please send me your indexer.conf, I'll try it on my box.

 
 Thanks for answering me.
 
 I see the links, but I don't understand what "Allow NoCase" means.
 
 Maybe it is the answer to my problem?
 
 Could anyone help me, please?
 
 
 
 
 
 [1] http://apache.cadrus.fr/najean/
 [1] Server 'http://apache.cadrus.fr/najean/'
 [1] Allow NoCase  *
 [1] HTTP/1.1 200 OK
 [1] Date: Fri, 18 May 2001 08:30:24 GMT
 [1] Server: Apache/1.3.17 (Unix) PHP/4.0.4pl1
 [1] Last-Modified: Thu, 17 May 2001 09:37:41 GMT
 [1] ETag: "3084-434-3b039be5"
 [1] Accept-Ranges: bytes
 [1] Content-Length: 1076
 [1] Connection: close
 [1] text/html
 [1] HTTP/1.1 200 OK text/html 1076
 [1] "http://apache.cadrus.fr/najean/docpgsql/admin": Allow NoCase  *
 [1] "http://apache.cadrus.fr/najean/docpgsql/programmer": Allow NoCase  *
 [1] "docpgsql/postgres": Allow NoCase  *
 [1] "docpgsql/tutorial": Allow NoCase  *
 [1] "docpgsql/user": Allow NoCase  *
 [1] "mail/mail.htm": Allow NoCase  *
 [1] Done (0 seconds)
 

Reply: http://search.mnogo.ru/board/message.php?id=2237





Webboard: index by time

2001-05-21 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Use Server command argument with trailing slash, i.e.:

Server  http://www.altec.com/  
  instead of
Server  http://www.altec.com


 works FINE, this is what I was having problems with in ACTUALLY:
 
 I inserted some debuging commands into search.c and got some results
 after running indexer:
 
 disallow *.com
 www.altec.com :: this url disallowed by default, deleting
 
 
 the database reads this as url="http://www.altec.com"
  but when it reads it as url="http://www.altec.com/" it works fine.
 Can this be fixed so that indexer doesn't need a trailing slash
 when indexing URLs that have an explicit disallow for URLs that are not
 files?
 

Reply: http://search.mnogo.ru/board/message.php?id=2242





Webboard: Indexer 1146 table not exist

2001-05-21 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
You have to create tables from scripts which can be found
in /create directory of sources. 
Take a look into INSTALL file.


 When I try to run the indexer to catalog my site I get the following:
 
 Indexer[429]: indexer from mnogosearch-3.1.13/MySql start with 
 'usr/local/mnogosearch/etc/indexer.conf'
 Indexer[429]: [1]Error: '#1146: Table 'mnoSearch.url' doesn't exist'
 
 My indexer.conf file is:
 # This is a minimal sample indexer config file
 
 DBAddr mysql://nogo:nogo@localhost/mnoGoSearch/
 #DBAddr mysql://root:@localhost/mnoGoSearch/
 DBMode multi
 
 DeleteNoServer no
 
 Server http://smurf.lollydom.ass/
 Server http://localhost
 Server http://smurf.lollydom.ass/~prwb/   file:/home/prwb/public_html/
 
 # Allow some known extensions and directory index
 Allow *.html *.htm *.shtml *.txt */ 
 
 # Disallow everything else
 Disallow *
 
 What is this table that the error refers to, it doesn't appear to be 
 valid and its not in any sql scripts for mysql in the create dir.
 
 Thanks for your help I've been pulling my hair out trying to guess 
 this one?
 
 PB-)
 

Reply: http://search.mnogo.ru/board/message.php?id=2243





Webboard: Date/Time-Format

2001-05-21 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Hi,
 
 How can I change the output date/time format? I would like to use, for example, the German 
format ("25.05.2000").
 
 Thanx!
 

Find this code in search.c near line 630 :


case 'M':
   UdmTime_t2HttpStr(Doc?Doc->last_mod_time:0, buf); 
   sprintf(UDM_STREND(Target),"%s",buf);break;

 Doc->last_mod_time is a Unix time stamp. You can convert it with
localtime() and pass the result to strftime() together with the format
you want. Check the strftime man page.
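
As a rough illustration of the strftime() suggestion, here is a minimal standalone sketch. The helper name format_german_date is hypothetical and not part of mnoGoSearch, and it assumes the time stamp should be interpreted in UTC via gmtime(); localtime() would use the server's time zone instead.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper (not part of mnoGoSearch): format a Unix time stamp
 * as a German-style date, DD.MM.YYYY, into buf.
 * Returns the number of characters written, or 0 on failure
 * (conversion error or buffer too small). */
size_t format_german_date(time_t t, char *buf, size_t buflen)
{
    struct tm *tm = gmtime(&t);   /* assumption: UTC; use localtime() for local time */
    if (tm == NULL)
        return 0;
    return strftime(buf, buflen, "%d.%m.%Y", tm);
}
```

In search.c a call such as format_german_date(Doc->last_mod_time, buf, sizeof(buf)) could stand in for the UdmTime_t2HttpStr() call in the 'M' case, provided buf is a local array; that wiring is an assumption and has not been tried against the real sources.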



Reply: http://search.mnogo.ru/board/message.php?id=2244





Re: Webboard: Compil warning with GCC under OSX 10.0.0

2001-05-21 Thread Alexander Barkov

Is core file created? If so, try to inspect it using gdb.
Check doc/bugs.txt file for an explanation of how it
should be done.



richard riegert wrote:
 
 At 2:13 +0400 18/05/2001, Maxime Zakharov wrote:
 
That's worst effect (see the end of my log); spelld.o won't compile :(
 
 Try current version from CVS.
 
 
 That's compile now without any warning (but the all-* - that's
 would be normal?)
 
 You are really efficient. Thanks a lot.
 
 Unfortunately, parsing more than about 50 Server entries always ends with a
 segmentation fault (on Rhapsody and Darwin). The search seems to work
 without problems, but the errors worry me.
 
 I can't use the 'period' config with this problem. It is not very
 annoying since the crontab, but.. if that could be corrected.
 
 Anyhow I'm happy with MnogoSearch, it is a great product.




Re: Russian letter 'io' ('ё')

2001-05-23 Thread Alexander Barkov

Danil Lavrentyuk wrote:
 
 Hello!
 
 How does mnoGoSearch treat the Russian letter 'io' ('ё')?
 
 Does it treat this letter as equal to the Russian 'ie' ('е')?
 Or not?
 
 Have I to use this letter in ispell dictionaries or not to use?


It's not equal to 'ie'; it's treated as a separate letter.

In ispell it's treated as a separate letter too.




Webboard: How do I exclude subdirs in indexer.conf?

2001-05-23 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
To index your site, just use only the first command.

 Hi,
 
 I'd like to index http://www.mydomain.com/ with all subdirs
 and pages except every subdirectory like:
 http://www.mydomain.com/0/ or http://www.mydomain.com/123/ etc.
 (only numbers).
 
 In my indexer.conf I said:
 
 Server http://www.mydomain.com/
 Realm Regex NoMatch ^http://www\.mydomain\.com/[0-9]*/
 
 but it won't work at all :(. With the first line only
 everything is fine, but when I add the NoMatch line,
 also other pages are indexed which do not start with
 www.mydomain.com.
 
 How do I set up my indexer.conf correctly?
 
 tnx!
 
 cu Markus
 

Reply: http://search.mnogo.ru/board/message.php?id=2256





Webboard: Doc Relevance

2001-05-23 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
DR means the number of unique words found in the document. It is always 1
if you search for only one word. However, the most relevant document
is always displayed first. 

In 3.2 we want to add the ability to display something like a 
percentage.


 While performing a search, I realised that the document
 relevancy $DR is always 1. This is rather weird, because 
 I always thought that the document relevancy value should 
 be derived from the search text. How is it possible that 
 a document which contains more occurrences of the search text 
 has the same document relevancy value as documents with fewer occurrences?
 
 How should I configure indexer.conf during indexing
 so that the document relevancy can be taken into account
 when a search is issued?
 
 Any help on the matter is much appreciated.
 
 --
 Jenson

Reply: http://search.mnogo.ru/board/message.php?id=2258





Webboard: compiling 3.1.13 failed on spelld.c

2001-05-23 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Replace socklen_t with int

Thanks for reporting.

 FreeBSD 3.3
 GNU make 3.79.1
 mnoGoSearch 3.1.13
 
 ./configure --with-mysql
 gmake
 ...
 ...
 spelld.c: In function `main':  
 spelld.c:235: `socklen_t' undeclared (first use this function) 
 spelld.c:235: (Each undeclared identifier is reported only once
 spelld.c:235: for each function it appears in.)
 spelld.c:235: parse error before `addrlen' 
 spelld.c:241: `addrlen' undeclared (first use this function)   
 gmake[1]: *** [spelld.o] Error 1   
 gmake[1]: Leaving directory `/usr/local/src/mnogosearch-3.1.13/src'
 gmake: *** [all-recursive] Error 1 
 
 thanks in advance.

Reply: http://search.mnogo.ru/board/message.php?id=2259





Webboard: Cookies Support

2001-05-23 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Hello!

It's still on TODO. 

Probably one of the possible solutions is to hack the function 
UdmAddURL() and cut SESS=XXX substrings before inserting into
database.
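
To make the idea concrete, here is a minimal sketch of cutting a SESS=XXX parameter out of a URL before it is stored. The function strip_sess_param is hypothetical and not part of mnoGoSearch; the parameter name SESS is taken from the question below, and where exactly such a call would go inside UdmAddURL() is an assumption.

```c
#include <string.h>

/* Hypothetical helper (not part of mnoGoSearch): remove every
 * "SESS=value" parameter from the query string of url, in place.
 * URLs without a query string are left unchanged. */
void strip_sess_param(char *url)
{
    char *p = strchr(url, '?');
    if (p == NULL)
        return;
    p++;                                    /* first parameter */
    while (*p != '\0') {
        char *end = strchr(p, '&');         /* end of this parameter */
        if (strncmp(p, "SESS=", 5) == 0) {
            if (end != NULL) {
                /* drop "SESS=...&" and re-check the parameter that moved up */
                memmove(p, end + 1, strlen(end + 1) + 1);
                continue;
            }
            /* last parameter: cut it together with the preceding '?' or '&' */
            p[-1] = '\0';
            return;
        }
        if (end == NULL)
            break;
        p = end + 1;
    }
}
```

With this sketch, "http://host/page.php?id=7&SESS=ksjfhsjkdf45zefD" becomes "http://host/page.php?id=7", so two visits with different session IDs would map to the same stored URL.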


 I have posted here several times asking about cookie support in mnoGoSearch. 
Someone answered that it was on the TODO list.
 
 I would just like to know whether the coders have an idea of when it will be done. I'm really 
interested ;p because I'm using sessions (PHP) on my website, and if the 
browser (or the parser) doesn't support cookies, the session is passed in the URL 
(like SESS=ksjfhsjkdf45zefD). The problem is that when I try to delete only the 
session from the URLs in my database, MySQL answers that the URL is already there. After some 
research I discovered that (it goes without saying) mnoGoSearch treats different 
sessions as different web pages ...
 
 I hope that someone has understood me ;p And thank you if someone has a solution
 
 Cheers,
 

Reply: http://search.mnogo.ru/board/message.php?id=2262





Webboard: How do I exclude subdirs in indexer.conf?

2001-05-24 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Sorry for my previous post. I didn't exactly understand what
you wanted.

Try something like this:

Server http://www.mydomain.com/

Disallow regex ^http://www\.mydomain\.com/[0-9]*/

Write this Disallow command BEFORE any other Allow/Disallow commands.
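
To check which URLs a pattern like this would catch, it can be tried outside the indexer. The sketch below uses POSIX regcomp()/regexec() with extended syntax; the helper url_matches is hypothetical, and whether indexer compiles its patterns with exactly these flags is an assumption.

```c
#include <stddef.h>
#include <regex.h>

/* Hypothetical test helper (not part of mnoGoSearch).
 * Returns 1 if url matches pattern, 0 if it does not,
 * and -1 if the pattern cannot be compiled. */
int url_matches(const char *pattern, const char *url)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, url, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With the pattern "^http://www\\.mydomain\\.com/[0-9]*/" (backslashes doubled for the C string), http://www.mydomain.com/123/page.html matches while http://www.mydomain.com/docs/ does not. Note that [0-9]* also accepts an empty run of digits, so a URL starting with http://www.mydomain.com// would match as well; in extended syntax [0-9]+ would require at least one digit.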


 I'd like to index http://www.mydomain.com/ with all subdirs
 and pages except every subdirectory like:
 http://www.mydomain.com/0/ or http://www.mydomain.com/123/ etc.
 (only numbers).
 
 In my indexer.conf I said:
 
 Server http://www.mydomain.com/
 Realm Regex NoMatch ^http://www\.mydomain\.com/[0-9]*/
 
 but it won't work at all :(. With the first line only
 everything is fine, but when I add the NoMatch line,
 also other pages are indexed which do not start with
 www.mydomain.com.
 
 How do I set up my indexer.conf correctly?
 
 tnx!
 
 cu Markus
 

Reply: http://search.mnogo.ru/board/message.php?id=2263





Re: Bug report

2001-05-24 Thread Alexander Barkov

Hello!

There is a name conflict between libudmsearch and php
sources. One of users found a workaround for this. Take
a look here:

   http://www.php.net/manual/en/ref.mnogo.php

We'll fix this name conflict in 3.1.14

Thanks for reporting!



Miguel Feitosa wrote:
 
 UdmSearch version: 3.1.13
 Platform:  linux pentium III  Asus dual board only one processor
 OS:rh7.1 2.4 kernel
 Database:  mysql-3.23.36
 Statistics:
 
 php-4.0.5
 Hello Developers!
 I have been using udmsearch for more than eight months and it certainly RULES in its 
field.
 
 I havent been able to compile on a new setup into php
 because I get the following errors while compiling.
 
 I am going to try to get mngosearch 3.10.10.
 
 Thanks,
 Sincerely,
 Miguel Feitosa
 Brasil - 3499 2143
 
 /bin/sh /usr/local/php-4.0.5/libtool --silent --mode=link gcc  -I. 
-I/usr/local/php-4.0.5/ -I/usr/local/php-4.0.5/main -I/usr/local/php-4.0.5 
-I/usr/include/apache -I/usr/local/php-4.0.5/Zend -I/usr//include 
-I/usr/include/freetype -I/usr/local/include -I/usr/local/mnogosearch/include 
-I/usr//include/mysql -I/usr/local/include/ucd-snmp 
-I/usr/local/php-4.0.5/ext/xml/expat/xmltok 
-I/usr/local/php-4.0.5/ext/xml/expat/xmlparse -I/usr/local/php-4.0.5/TSRM  
-I/usr/include/apache -I/usr/local/php-4.0.5/Zend -I/usr//include 
-I/usr/include/freetype -I/usr/local/include -I/usr/local/mnogosearch/include 
-I/usr//include/mysql -I/usr/local/include/ucd-snmp -DLINUX=22 -DMOD_SSL=208101 
-DEAPI -DEAPI_MM -DUSE_EXPAT -DSUPPORT_UTF8 -DXML_BYTE_ORDER=12 -g -O2   -o 
libphp4.la -rpath /usr/local/php-4.0.5/libs -avoid-version -L/usr//lib 
-L/usr/local/lib -L/usr//lib/mysql -L/usr/local/mnogosearch/lib  -R /usr//lib -R 
/usr/local/lib -R /usr//lib/mysql -R /usr/local/mnogosearch/lib stub.lo  Zend/libZend.la 
sapi/apache/libsapi.la main/libmain.la regex/libregex.la 
ext/bcmath/libbcmath.la ext/bz2/libbz2.la ext/calendar/libcalendar.la 
ext/dbase/libdbase.la ext/ftp/libftp.la ext/gd/libgd.la ext/imap/libimap.la 
ext/ldap/libldap.la ext/mnogosearch/libmnogosearch.la ext/mysql/libmysql.la 
ext/openssl/libopenssl.la ext/pcre/libpcre.la ext/posix/libposix.la 
ext/recode/librecode.la ext/session/libsession.la ext/snmp/libsnmp.la 
ext/sockets/libsockets.la ext/standard/libstandard.la ext/xml/libxml.la 
ext/zlib/libzlib.la TSRM/libtsrm.la -lpam -lrecode -lc-client -ldl -lz -lssl -lcrypto 
-lsnmp -lmysqlclient -ludmsearch -lz -lm -lmysqlclient -lldap -llber -lttf -lz -lpng 
-lgd -lbz2 -lssl -lcrypto -lresolv -lm -ldl -lcrypt -lnsl -lresolv
 /usr/local/mnogosearch/lib/libudmsearch.a(ftp.o): In function `ftp_close':
 /usr/local/mnogosearch-3.1.13/src/ftp.c:475: multiple definition of `ftp_close'
 ext/ftp/.libs/libftp.al(ftp.lo):/usr/local/php-4.0.5/ext/ftp/ftp.c:179: first 
defined here
 /usr/bin/ld: Warning: size of symbol `ftp_close' changed from 69 to 76 in ftp.o
 /usr/local/mnogosearch/lib/libudmsearch.a(ftp.o): In function `ftp_login':
 /usr/local/mnogosearch-3.1.13/src/ftp.c:247: multiple definition of `ftp_login'
 ext/ftp/.libs/libftp.al(ftp.lo):/usr/local/php-4.0.5/ext/ftp/ftp.c:223: first 
defined here
 /usr/bin/ld: Warning: size of symbol `ftp_login' changed from 155 to 298 in ftp.o
 /usr/local/mnogosearch/lib/libudmsearch.a(ftp.o): In function `ftp_list':
 /usr/local/mnogosearch-3.1.13/src/ftp.c:369: multiple definition of `ftp_list'
 ext/ftp/.libs/libftp.al(ftp.lo):/usr/local/php-4.0.5/ext/ftp/ftp.c:419: first 
defined here
 /usr/bin/ld: Warning: size of symbol `ftp_list' changed from 41 to 204 in ftp.o
 /usr/local/mnogosearch/lib/libudmsearch.a(ftp.o): In function `ftp_get':
 /usr/local/mnogosearch-3.1.13/src/ftp.c:392: multiple definition of `ftp_get'
 ext/ftp/.libs/libftp.al(ftp.lo):/usr/local/php-4.0.5/ext/ftp/ftp.c:499: first 
defined here
 /usr/bin/ld: Warning: size of symbol `ftp_get' changed from 469 to 138 in ftp.o
 /usr/local/mnogosearch/lib/libudmsearch.a(ftp.o): In function `ftp_mdtm':
 /usr/local/mnogosearch-3.1.13/src/ftp.c:411: multiple definition of `ftp_mdtm'
 ext/ftp/.libs/libftp.al(ftp.lo):/usr/local/php-4.0.5/ext/ftp/ftp.c:639: first 
defined here
 /usr/bin/ld: Warning: size of symbol `ftp_mdtm' changed from 263 to 160 in ftp.o
 collect2: ld returned 1 exit status
 




Webboard: mysql compiling

2001-05-24 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Try  --with-mysql=/usr/



 ok, newbie to working with mysql here.  I'm trying to ./configure udmsearch-3.0.23, 
mysql distro 3.23.36 on a redhat 7.1 i386 system.  During the ./configure 
--with-mysql it fails when it tries to find the mysql.h file, saying "Invalid 
MySql Directory - unable to find mysql.h". I've looked high and low and haven't 
found this file.  Doing a "which mysql" gives me /usr/bin/mysql.  I tried 
./configure --with-mysql-/usr/bin/ and it didn't find what it wanted.  So the 
question is, what am I missing?  Should the MySql server be stopped during the 
configure and install? Is there a mysql file that has disappeared from the computer, 
am I pointing to the wrong place, am I entering the command wrong?  Any 
suggestions/advice would be much appreciated.  
 
 -jon-

Reply: http://search.mnogo.ru/board/message.php?id=2266





Re: UDMsearch 3.1.13 cachelogd problem: proper action to correct

2001-05-28 Thread Alexander Barkov

Thanks for reporting!
We've fixed it in 3.1.14 sources.


[EMAIL PROTECTED] wrote:
 
 Gents,
 
 When compiling udmsearch, a typo makes cachelogd fail.
 
 When making, gcc gets as argument: -DUDM_VAR_DIR=\"udmsearch_path/var\".
 
 The best way to fix this bug definitively and properly is to:
 
 Add in src/cachelogd.c right after line 384, just before 
sprintf(pidname,"%s%s",vardir,"cachelogd.pid")
 the following:
 
 if (vardir[strlen(vardir)-1]!='/')
 { strcat(vardir,"/"); }
 
 Consider the -w arg as a temporary workaround if you don't dare change the 
source code.
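
 The same check can be sketched as a standalone helper that builds the pid file name without modifying vardir. The name join_path is hypothetical and this is not the code that went into 3.1.14; unlike the two-line fix above, it also copes with an empty directory string, where strlen(vardir)-1 would index before the start of the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not the actual 3.1.14 fix): write "<dir>/<name>"
 * into out, adding the slash only when dir does not already end in one.
 * An empty dir yields just <name>.
 * Returns 0 on success, -1 if out is too small. */
int join_path(char *out, size_t outlen, const char *dir, const char *name)
{
    size_t n = strlen(dir);
    const char *sep = (n == 0 || dir[n - 1] == '/') ? "" : "/";
    int written = snprintf(out, outlen, "%s%s%s", dir, sep, name);

    return (written >= 0 && (size_t)written < outlen) ? 0 : -1;
}
```

 For example, join_path(pidname, sizeof(pidname), vardir, "cachelogd.pid") gives the same result whether vardir is "/some/path/var" or "/some/path/var/".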
 
 @+
 




Re: recent optimizations?

2001-05-30 Thread Alexander Barkov

Hi!

Tonu Samuel improved the MySQL-related code, so indexing with MySQL
now runs faster. If I haven't forgotten anything, it is the only 
major improvement.


Damon Tkoch wrote:
 
 Hello,
 have there been any major optimizations to mnogosearch with mysql between
 3.1.11 and 3.1.14?  I just upgraded mnogosearch on one of my machines and it
 seems to run circles around the older, non-upgraded indexers.
 
 (whatever you guys did, it rocks!)




Re: Problem indexing

2001-05-31 Thread Alexander Barkov

Dovli wrote:
 
 Hello!
 
 I have another question.
 
 If I use the Server directive the URLs are indexed, but if I want to
 index all the URLs in a given domain using Realm I get the error "no
 'Server' command for url...deleted" for each and every URL the indexer
 is processing.
 
 Thank you very much for your help.
 

You probably wrote an incorrect Realm argument.




Webboard: indexer loops !

2001-06-01 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
What is the Period command argument in your indexer.conf?

 The problem now is different - the indexer doesn't want to stop !!!
 
 Does it not tag URLs it has visited already and skips them ? It looks like it is 
visiting the same ones again and again and again.
 
 I'll give SWISH a try :-)
 
 marcio
 

Reply: http://search.mnogo.ru/board/message.php?id=2300





Webboard: search.cgi//blank html page//getting closer/

2001-06-01 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
If you include search.cgi using SSI, you have to set up an
environment variable UDM_TEMPLATE with a path to template file.
You can do it using SetEnv and PassEnv apache's httpd.conf directives.


 Well, I decided to try accessing search.cgi using SSI. At least I now get the "can't open 
template file" message instead of a blank page when accessing it through the browser. It still finds the 
template in telnet and prints the HTML form. I have tried permission settings and moving 
to a different directory; when I do that, telnet says no good. Any ideas at all?
 Thanks

Reply: http://search.mnogo.ru/board/message.php?id=2303





Re: Bug report

2001-06-01 Thread Alexander Barkov

Kelvin Chen wrote:
 
 UdmSearch version: 3.1.14
 Platform:  PII XEON
 OS:Redhat 6.2
 Database:  Oracle 8.1.16
 Statistics:Unknown
 
 4.0.1
 I don't know if this is a problem.
 First, I found it is not so easy for me to install mnogosearch.
 
 After I installed the whole package and indexed the site, I can't use 
search.cgi to search. I guess maybe I can't use the web to connect to Oracle, since 
when I use search.cgi in the shell, it works correctly.


What can you see in browser after pressing Search button?




Webboard: *Weight not working?

2001-06-04 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Sergey, please check this. Probably there is a bug in PHP
module...


 Using 3.1.13 as a backend, I've indexed a bunch of pages. Didn't set any of the 
*Weight directives in indexer.conf, as I was happy with the defaults.
 
 However, using php-4.0.5 w/ 3.1.14 (for the built-in functions) with:
 
   udm_set_agent_param($udm,UDM_PARAM_WEIGHT_FACTOR,"F8421");
 
 causes weird behavior. Namely, the weighting doesn't actually happen; pages that 
have the query string in the META KEYWORDS, for instance, are frequently listed lower 
than pages that just have the term in the body (and nowhere else).
 
 What gives? If you need more info, I'd be happy to oblige.
 
 Peter

Reply: http://search.mnogo.ru/board/message.php?id=2331





Webboard: mnogosearch.url table?

2001-06-04 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 I get this error, when I try to index:
 
 Indexer[1460]: indexer from mnogosearch-3.1.14/MySQL started 
 with '/usr/local/mnogosearch/etc/indexer.conf'
 Indexer[1462]: [1] Error: '#1146: Table 'mnogosearch.url' doesn't 
 exist'
 
 Any ideas?

You didn't create the table structure. Take a look at the INSTALL file.

Reply: http://search.mnogo.ru/board/message.php?id=2332





Webboard: Will Mnogo index multiple websites?

2001-06-05 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Yes, you can index several web sites.
Morphology support can be added using ispell dictionaries.
Take a look into ispell.txt which is supplied with mnogosearch
sources.


 ha? And if yes, then will it support morphology of Russian?

Reply: http://search.mnogo.ru/board/message.php?id=2334





Re: 3.1.14 bad configure...

2001-06-05 Thread Alexander Barkov

[EMAIL PROTECTED] wrote:
 
 Gents,
 
 Problem with configure in 3.1.14.
 
 When compiled with openssl, it sets in the makefiles (./ and ./src) -L 
openssl_path/include (instead of lib), so linking fails...
 


  Hello!


Thanks for reporting!

This patch for configure.in fixes the bug.

292c292
< SSL_LFLAGS="-L$SSL_INCDIR -lcrypto -lssl"
---
> SSL_LFLAGS="-L$SSL_LIBDIR -lcrypto -lssl"


You may use this patch if you have autoconf installed
on your machine. Apply this patch then run:


autoconf
configure
make




Webboard: update

2001-06-06 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 Can I update from mnogoSearch 3.1.12 to 3.1.14 and do not lose the urls database? 
(cache storage mode)
 
 Will my users search into the last database?
 
 Thanks
 

Yes, you can.

Reply: http://www.mnogosearch.org/board/message.php?id=2342





Webboard: timeout/doc limit breaks infinite loop but I get duplicates !

2001-06-06 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:

 Look, I did not try to reindex anything. I know you claim it does not loop, but in 
my attempt it did loop. That's what I have been saying for the past messages. There's 
still no solution from you guys, that's too bad.

I can't reproduce this bug.

 Nevermind, I tried ht://Dig and it worked just fine. As did Greenstone. None of them 
looped forever like mnoGoSearch.
 
 If you are really interested in tracing this bug in mnoGoSearch, post your email 
address and I can send you my email and/or yahoo messenger ID.
 

Please do it, my email is [EMAIL PROTECTED]


Reply: http://www.mnogosearch.org/board/message.php?id=2343





Re: bugs

2001-06-06 Thread Alexander Barkov

 xiao shibin wrote:
 
 version: mnogosearch 3.1.14 for windows.
 
 when I use multi-threads(6) run mnogosearch spider, mysql locked.
 
 I use mysqladmin process, the result is:
 | 4  | root | localhost | test | Query   | 460  | Locked | UPDATE url
 SET status=504,next_index_time=989192576 WHERE rec_id=12870 |
 ...
 


It seems that a concurrent thread did something wrong.
Can you see any other processes in the mysqladmin process output?




Webboard: passing variables to search.cgi

2001-06-06 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
You have to hack search.c.


 As best as I can determine, the only variables that I can pass to search.cgi are 
"ul", "ps", "m", "q", "o", and 
"t".
 
 I would like to pass a few extra variables (of my own) to search.cgi.  My desire is 
that I would be able to display the values of these additional user-defined variables 
on the search results page by referencing them from inside the search.htm template.
 
 Does anyone know a way that I can do this?
 
 Thanks in advance.
 
 

Reply: http://www.mnogosearch.org/board/message.php?id=2345





Webboard: What about the size of the index?

2001-06-06 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
It depends on the storage mode, phrase support and ispell support:
within 30-100 percent of the original document size.


 Hi, I have a question about the size of a normal index. How much disk space would about 
one million URLs require?
 
 I am looking forward to your answers... Best regards Tim Block

Reply: http://www.mnogosearch.org/board/message.php?id=2346





Webboard: update. problem

2001-06-06 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
What database and DBMode do you use?

 I updated mnogosearch today, but my indexer.conf and search.htm are the same 
as I have had since version 3.1.10.
 
 
 With this new search.cgi, I always get 'Sorry, but search returned no results'. I 
had to delete this new script, and the old 'search.cgi' is currently running fine.
 
 any idea for this problem?
 
 
 Thanks
 
 
 
 
 
 
 

Reply: http://www.mnogosearch.org/board/message.php?id=2350





Webboard: How can I hide parts of my html-pages from the indexer?

2001-06-06 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
There are such tags, but their names are <!--UdmComment--> and
<!--/UdmComment-->
 Hi,
 
 I want to exclude certain parts on html pages from being indexed.
 
 How can I do that? 
 
 Is it possible to use tags like
 
 <!-- mnogosearch-hide --> Text to hide <!-- mnogosearch-show --> ?
 
 I don't want to hide them from visitors so I can't use 
 <!-- Text to hide -->

Reply: http://www.mnogosearch.org/board/message.php?id=2351





Webboard: update. problem

2001-06-06 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
 DBMode  cache


You probably used different --enable-fast-cat, --enable-fast-site
and --enable-fast-tag configure parameters when compiling
3.1.10 and 3.1.14.

Reply: http://www.mnogosearch.org/board/message.php?id=2354





Webboard: Indexing listprocessor archives?

2001-06-07 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
How does listprocessor store its messages?

 Hi there!
 
 I would like to know if it is possible to index listprocessor archives with
mnogosearch. Listprocessor is a relatively old email list manager. If mnogosearch
cannot do this, how hard would it be to implement it?
 
 Cheers,
   Gerhard

Reply: http://www.mnogosearch.org/board/message.php?id=2358





Re: Cache mode - incorrect results returned

2001-06-07 Thread Alexander Barkov

Joe Frost wrote:
 
 Hi,
 
 I've just set up the following:
 
 RedHat 7.1 with Reiserfs
 Mnogosearch 3.1.14
 PostgreSQL 7.0.3 (as shipped with Redhat)
 
 I've set the system up to work in cache mode and the indexer has run okay. I
 can see that there are entries in the URL table in postgres, but when I do a
 search, words that I can clearly see listed in the keywords field in the URL
 table return no results.
 I know that these words are on the test site that I've indexed, and the
 indexing process seemed to complete okay.
 Some words work okay and others return nothing. Is this a known problem with
 cache mode?
 
 Thanks for your help and best regards,
 


Did you run splitter?




Re: Cache mode - incorrect results returned

2001-06-07 Thread Alexander Barkov

Joe Frost wrote:
 
 
  Did you run splitter?
 
 Yes, I ran:
 
 cachelogd
 indexer
 kill -HUP `cat /var/mnogosearch/cachelogd.pid`
 splitter -p
 splitter
 
 Is this okay?
 
 Joe


That's OK.
Do you have any files under the usr/local/mnogosearch/var/tree/
directory?

How long did splitter -p and splitter run?




Re: uncomplete indexing

2001-06-08 Thread Alexander Barkov

Rokky Irvayandi wrote:
 
 On Thu, 7 Jun 2001, Alexander Barkov wrote:
 
  Rokky Irvayandi wrote:
  
   Hi, I want to index all of the mp3 files on a server, but I always find that
   it fails to index all of them. Some files in the same directory can't be
   indexed. I did not find anything wrong with those files.
   Can anyone help me?
 
 
  How do you index them, from local disk or via http?
 
 via http
 

You probably do not have links to all of those files.

You have to create a page with links to them and index it.
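
One way to build such a page is to generate it from a directory listing. The sketch below is only an assumption about the setup: the directory, the base URL http://example.com/mp3/ and the function name build_link_page are placeholders, not anything taken from this thread. It emits one link per .mp3 file and URL-quotes each file name so that spaces and similar characters do not break the link.

```python
import html
from pathlib import Path
from urllib.parse import quote


def build_link_page(filenames, base_url):
    """Return an HTML page with one link per .mp3 file name."""
    items = []
    for name in sorted(filenames):
        if not name.lower().endswith(".mp3"):
            continue  # skip anything that is not an mp3 file
        # Percent-encode the file name, then escape it for use in an attribute.
        href = base_url.rstrip("/") + "/" + quote(name)
        items.append(
            '<li><a href="%s">%s</a></li>' % (html.escape(href), html.escape(name))
        )
    return "<html><body>\n<ul>\n%s\n</ul>\n</body></html>\n" % "\n".join(items)


if __name__ == "__main__":
    # Placeholder directory and URL; adjust both to the real server layout.
    mp3_dir = Path(".")
    names = [p.name for p in mp3_dir.iterdir() if p.is_file()]
    print(build_link_page(names, "http://example.com/mp3/"))
```

The output would be saved as a page on the web server and indexed like any other page, so that every file becomes reachable through a link.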



