UdmSearch: Webboard: Failing to index titles (udmsearch 3.0.23)

2000-11-11 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
Indexer does not seem to be able to index title tags in pages generated by a CGI.

I've indexed http://www.beautycommercial.com with indexer.conf set to accept cgis,
nphs and ?s, and it spiders through the site fine.
Checking the MySQL database shows that for most of the CGIs (.mxs) the title was
not indexed even though a title exists.
 
 

It makes no difference to indexer what it indexes: static HTML or HTML generated
by scripts.


Reply: http://search.mnogo.ru/board/message.php?id=727

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Links problem. identified it, but I dont get the code

2000-11-11 Thread Alexander Barkov

Hi!

Igor has fixed the problem. Mario, thanks for help!


Mario Lang wrote:
 
> Hi.
>
> I just identified my problem with symlinks on the ftp protocol.
> This code in ftp.c is most likely causing the problem:
>         case 'l':
>                 ch = strstr (fname, " -> ");
>                 if (!ch)
>                         break;
>                 ch +=4;
>                 if (ch[0] == '.'){
>
>                         len = len_h+len_p+strlen(ch);
>                         udm_snprintf(buf_out+cur_len, len+1, "<a href=\"ftp://%s%s%s/\"></a>",
>                                 connp->hostname, path, ch);
>                 }else{
>                         len = len_h+strlen(ch);
>                         udm_snprintf(buf_out+cur_len, len+1, "<a href=\"ftp://%s%s/\"></a>",
>                                 connp->hostname, ch);
>                 }
>
>  ...
>
> What is the reason for checking for links to /^\./ files?
> On our ftp, we use links to files starting with a . to avoid listing
> the target paths in a normal ls.
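
As an illustration of the parsing step discussed in this message, here is a minimal C sketch that pulls the link target out of an "ls -l"-style listing line and builds an ftp:// URL from it. It is not the fix that went into ftp.c. The function names symlink_target and build_ftp_url, the example host, and the rule that only targets starting with '/' count as absolute are all assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/* Return the symlink target in an "ls -l"-style line, or NULL if the
 * line has no " -> " separator. The returned pointer aliases `line`. */
const char *symlink_target(const char *line)
{
    const char *arrow = strstr(line, " -> ");
    return arrow ? arrow + 4 : NULL;   /* 4 == strlen(" -> ") */
}

/* Hypothetical helper: write an ftp:// URL for `target` into `out`.
 * Assumption: a target beginning with '/' is absolute; anything else,
 * including dot-files such as ".hidden", is resolved against `dir`,
 * which is expected to start and end with '/'.
 * Returns 0 on success, -1 if the URL did not fit in `out`. */
int build_ftp_url(char *out, size_t out_size, const char *host,
                  const char *dir, const char *target)
{
    int n;

    if (target[0] == '/')
        n = snprintf(out, out_size, "ftp://%s%s", host, target);
    else
        n = snprintf(out, out_size, "ftp://%s%s%s", host, dir, target);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With this rule a link whose target is a dot-file in the current directory is handled like any other relative target, rather than being singled out by its leading '.'.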





Re[4]: UdmSearch: UdmSearch PHP Frontend - is it possible to search by the URL, not by keyword listed in dict?

2000-11-11 Thread Sergey Kartashoff

Hi!

Saturday, November 11, 2000, 12:30:22 AM, you wrote:

AS> The Dict table has information in it, just not enough information!
AS> Here is my allow/disallow statement:
AS> # Exclude Apache and Squid directory lists in different sort order
AS> Disallow \?D=A$ \?D=A$ \?D=D$ \?M=A$ \?M=D$ \?N=A$ \?N=D$ \?S=A$ \?S=D$
AS> # Exclude ./.  and ./.. from directory list
AS> Disallow /[.]{1,2} /\%2e /\%2f
AS> # Retrieve only directory list, check other files.
AS> CheckOnly [^/]$

So you are applying CheckOnly to ALL URLs that do not end with '/'.
If file.html contains links to other pages, they will not be
added to the database because of this CheckOnly statement.
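
To see which URLs the pattern [^/]$ catches, here is a minimal C sketch using the POSIX regex API. It only tests the pattern; it does not reproduce what indexer does with a match. The function name url_matches and the example URLs are made up for the illustration, and POSIX extended regex semantics are assumed.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if `url` matches the POSIX extended regex `pattern`,
 * 0 if it does not, and -1 if the pattern fails to compile. */
int url_matches(const char *pattern, const char *url)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, url, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Called with "[^/]$", this returns 1 for a URL such as http://host/dir/file.html and 0 for http://host/dir/, so every ordinary page falls under the CheckOnly rule and only directory URLs escape it.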

-- 
Regards, Sergey aka gluke.






Re: UdmSearch: File HTTP search / FTP search CONF file Questions

2000-11-11 Thread Craig Small

On Tue, Nov 07, 2000 at 06:42:51PM -0800, Ari Shomair wrote:
> I've got two questions.
> A) Is it possible, with UDM search, to set up a HTTP file search, where you
> enter in the url of a site and it catalogs links to all of the files of
> certain extensions on that site? How would I go about doing this?
Not sure what you mean by this. "catalogs links"?

Perhaps you mean you have a site where there are two sets of files: stuff
where you just need to grab hrefs, and stuff you want to index. Take, say,
the entire Debian email website. You want to follow links for
thread.html, subject.html and author.html, but you only want to
index (grab keywords) out of msg*.html.

You'd use HrefOnly on thread, subject and author so it followed the
links.

-- 
Craig Small VK2XLZ  GnuPG:1C1B D893 1418 2AF4 45EE  95CB C76C E5AC 12CA DFA5
Eye-Net Consulting http://www.eye-net.com.au/[EMAIL PROTECTED]
MIEEE [EMAIL PROTECTED] Debian developer [EMAIL PROTECTED]




Re: UdmSearch: Webboard: indexer.conf - disallow problem

2000-11-11 Thread Craig Small

On Fri, Nov 10, 2000 at 02:20:05PM +0300, Martin M wrote:
> after putting a line like the following in the indexer.conf
>
> Disallow some.html
>
> everything works fine.
> But after removing the line, the some.html file
> is not reindexed.

It may have something to do with the timeouts.  It's not going to reindex
it for a week (using the defaults).
Try running indexer with -a -u %some.html

  - Craig
-- 
Craig Small VK2XLZ  GnuPG:1C1B D893 1418 2AF4 45EE  95CB C76C E5AC 12CA DFA5
Eye-Net Consulting http://www.eye-net.com.au/[EMAIL PROTECTED]
MIEEE [EMAIL PROTECTED] Debian developer [EMAIL PROTECTED]




UdmSearch: Bug report

2000-11-11 Thread FILLON



UdmSearch version: 3.1.8
Platform:          PC Intel III 500
OS:                Linux 2.2.16
Database:          Mysql 2.22.32
Statistics:        0

3.1.2.3 PHP
Hi !

When I try to execute search.php I get:
Fatal error: Call to unsupported or undefined function read_template() in init.inc on line 126
I am using PHP 3.0.16.
Could you help me?
Thanks





UdmSearch: Webboard: CRC and DBMode MULIT - halfway

2000-11-11 Thread Catriona Murphy

Author: Catriona Murphy
Email: [EMAIL PROTECTED]
Message:
Hello,
UDM is running very well on my intranet; it has over 80K URLs.
I have been using it without CRC, in DBMode single.
Now everyone loves it and wants the searches to be much faster, and wants all the
other docs from years gone by indexed as well.
Is there any way that I can convert all the existing data to CRC and DBMode multi
without reindexing?

If not, will it harm UDM if I start using CRC and multi mode now?

Any help will be appreciated.

Cat


Reply: http://search.mnogo.ru/board/message.php?id=728





Re: UdmSearch: PgSQL: DELETE INDEX url_url;

2000-11-11 Thread The Hermit Hacker

On Sat, 11 Nov 2000, Alexander Barkov wrote:

> The Hermit Hacker wrote:
> >
> > okay, can someone make the following changes to the source code, so that
> > the search avoids using the index ... this will at least give a temporary
> > fix until our LIKE optimizer is fixed:
> >
> > SELECT ndict.url_id,ndict.intag
> >   FROM ndict,url
> >  WHERE ndict.word_id=1971739852
> >    AND url.rec_id=ndict.url_id
> >    AND ( (url.url || ' ') LIKE 'http://www.postgresql.org/% ');
>
>
> I don't think the best solution is to change search to suit a buggy
> LIKE optimizer and then change it back once the optimizer is fixed.

After sending this out, I went through the 3.1.7 code, and it looks like
there might be a bug in udmsearch itself: it is technically coded to do
this, but it isn't doing it. From sql.c:1894:

        if(c->DBType==UDM_DB_PGSQL)
                sprintf(UDM_STREND(c->urlstr),"(url.url || '') LIKE '%s')",URL);
        else
                sprintf(UDM_STREND(c->urlstr),"url.url LIKE '%s')",URL);
        return(0);

Any idea why this isn't, in fact, working?

My queries are coming out as the second of the two conditions, even though
my 'connect string' looks like:

DBAddr  pgsql:[EMAIL PROTECTED]/udmsearch/

Is my 'connect string' wrong?
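
For readers following the sql.c excerpt, here is a self-contained C sketch of the same two-way branch, written with a bounded snprintf instead of sprintf. It is an illustration only, not the udmsearch source: the enum values, the function name url_like_condition and the buffer handling are assumptions, and the trailing ')' from the original format strings is left out because the surrounding expression is not shown in this thread.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the UDM_DB_* constants named in the thread. */
enum db_type { DB_MYSQL = 1, DB_PGSQL = 2 };

/* Write the URL-limit condition into `out`. For PostgreSQL the column
 * is concatenated with an empty string, as in the quoted excerpt, so the
 * condition is no longer a plain comparison on url.url.
 * `url_pattern` must already be safe to embed in an SQL string literal.
 * Returns 0 on success, -1 if the text did not fit in `out`. */
int url_like_condition(char *out, size_t out_size, enum db_type type,
                       const char *url_pattern)
{
    int n;

    if (type == DB_PGSQL)
        n = snprintf(out, out_size, "(url.url || '') LIKE '%s'", url_pattern);
    else
        n = snprintf(out, out_size, "url.url LIKE '%s'", url_pattern);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

If the generated query shows the plain "url.url LIKE" form, the type value reaching this branch is not the PostgreSQL one, which is the symptom described above.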

Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org 





UdmSearch: Grrrr!

2000-11-11 Thread Zenon Panoussis

 
mnogosearch 3.1.8, mysql 3.23.22

This happened: 

The search worked fine. Then I re-installed MySQL (3.23 instead 
of 3.22) and Apache, and the directory structure of both changed. 
I moved the old search.cgi to the new cgi-bin. I exported the old 
database with mysqldump and re-imported it in the new MYI/MYD 
format in the same (deleted and re-created) database. The indexer 
works fine in the new setup with the old configuration. The search 
does not; it returns "an error occured". 

This is what I tried: 

- Searched the Apache and MySQL error logs. Nothing there. Most 
  important, there are no "access denied" messages in the mysql log, 
  meaning that the search never even reaches mysql before it fails.
- Recompiled and reinstalled mnogosearch and copied the new search.cgi 
  to cgi-bin. It didn't help.
- Double-checked search.htm. This shouldn't be necessary since both 
  the database and search.htm are the same as before, but anyway. The 
  DBAddr statement is identical to the one in indexer.conf, including 
  trailing slash. So are the DBMode and charset statements. 
- Beat my wife, screamed to the dog, kicked my children and broke my 
  monitor. That didn't help either. 

Finally I straced search.cgi, but I don't understand the output. If 
you do, you'll find it below. 

Any ideas? 

Z

=strace.out=

execve("/var/www/cgi-bin/search.cgi", ["/var/www/cgi-bin/search.cgi"], [/* 24 vars */]) = 0
_sysctl({{CTL_KERN, KERN_OSRELEASE}, 2, "2.2.16-22", 9, NULL, 0}) = 0
brk(0) = 0x80908c0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40016000
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 4
fstat64(4, 0xb32c) = -1 ENOSYS (Function not implemented)
fstat(4, {st_mode=S_IFREG|0644, st_size=21769, ...}) = 0
old_mmap(NULL, 21769, PROT_READ, MAP_PRIVATE, 4, 0) = 0x40017000
close(4) = 0
open("/usr/lib/mysql/libmysqlclient.so.9", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=196204, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 d\0\000"..., 4096) = 4096
old_mmap(NULL, 172480, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x4001d000
mprotect(0x40036000, 70080, PROT_NONE) = 0
old_mmap(0x40036000, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x18000) = 0x40036000
old_mmap(0x40047000, 448, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40047000
close(4) = 0
open("/lib/libm.so.6", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=493588, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300I\0"..., 4096) = 4096
old_mmap(NULL, 125352, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40048000
mprotect(0x40066000, 2472, PROT_NONE) = 0
old_mmap(0x40066000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x1d000) = 0x40066000
close(4) = 0
open("/usr/lib/libz.so.1", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=58940, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\\36\0"..., 4096) = 4096
old_mmap(NULL, 54064, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40067000
mprotect(0x40073000, 4912, PROT_NONE) = 0
old_mmap(0x40073000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0xb000) = 0x40073000
close(4) = 0
open("/lib/libc.so.6", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=4686077, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\230\270"..., 4096) = 4096
old_mmap(NULL, 1167368, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40075000
mprotect(0x40189000, 36872, PROT_NONE) = 0
old_mmap(0x40189000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x113000) = 0x40189000
old_mmap(0x4018f000, 12296, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4018f000
close(4) = 0
open("/lib/libnsl.so.1", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=392107, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0p?\0\000"..., 4096) = 4096
old_mmap(NULL, 93120, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40193000
mprotect(0x401a7000, 11200, PROT_NONE) = 0
old_mmap(0x401a7000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x13000) = 0x401a7000
old_mmap(0x401a8000, 7104, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x401a8000
close(4) = 0
open("/lib/libcrypt.so.1", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=82333, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200\17"..., 4096) = 4096
old_mmap(NULL, 184252, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x401aa000
mprotect(0x401af000, 163772, PROT_NONE) = 0
old_mmap(0x401af000, 4096, PROT_READ|PROT_WRITE,