Re: BOUNCE general@mnogosearch.org: Non-member submission from [Zenon Panoussis lrh@xs4all.nl]

2001-03-01 Thread Maxime Zakharov

Hi,

Is he by any chance using the multithreaded version?
This looks a lot like what happened when there was not enough memory for a
thread: it crashed in exactly the same place.

On Thu, 01 Mar 2001 09:15:37 +0400
Alexander Barkov [EMAIL PROTECTED] wrote:

AB OK. Please check also this:
AB 
AB print realsize
AB print *Doc
AB 
AB 
AB 
AB Zenon Panoussis wrote:
AB   please run the following commands in gdb:
AB  
AB   frame 1
AB   print content_type
AB   print Method
AB   print Doc
AB   print Doc->content
AB   print Doc->url
AB  
AB  #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
AB  97  _CRC32_(crc, *p) ;
AB  (gdb) frame 1
AB  #1  0x804d7f8 in UdmIndexNextURL (Indexer=0x807ca50, index_flags=4) at indexer.c:1150
AB  1150        crc32=UdmCRC32(Doc->content, (size_t)realsize);
AB  (gdb) print content_type
AB  $1 = 0x4021c027 "application/unknown"
AB  (gdb) print Method
AB  $2 = 1
AB  (gdb) print Doc
AB  $3 = (UDM_DOCUMENT *) 0x91ef7d8
AB  (gdb) print Doc->content
AB  $4 = 0x4021c03e ""
AB  (gdb) print Doc->url
AB  $5 = 0x91f0548 "http://www.xs4all.nl/~fishman/ls/."
AB  
AB  See my (bounced) posting from [EMAIL PROTECTED] on
AB  Tue, 27 Feb 2001 15:46:37 +0100 for details about this and
AB  the other URLs that the indexer crashes on.
AB ___
AB If you want to unsubscribe send "unsubscribe general"
AB to [EMAIL PROTECTED]
AB 
AB 





Re: BOUNCE general@mnogosearch.org: Non-member submission from [Zenon Panoussis lrh@xs4all.nl]

2001-03-01 Thread Alexander Barkov

realsize -1 means that there was an error while downloading the document.

I found that there is no check for this in indexer.c. Please
find a patch here:
http://gw.udmsearch.izhnet.ru/~bar/crc32.indexer.c.patch.gz
It should fix the crash.
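
The patch itself is not reproduced in this thread. As a rough sketch only of the kind of check described above, assuming a negative realsize marks a failed download: checksum() and checksum_if_downloaded() are hypothetical names made up for illustration, and checksum() is a toy stand-in, not the real UdmCRC32.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the checksum routine; NOT the real UdmCRC32. */
static uint32_t checksum(const char *buf, size_t size)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < size; i++)
        sum = sum * 31u + (unsigned char)buf[i];
    return sum;
}

/*
 * Checksum a downloaded body only if the download succeeded.
 * Assumption: realsize < 0 means the fetch failed (the thread reports -1).
 * Returns 0 and stores the checksum in *out on success, -1 otherwise.
 */
static int checksum_if_downloaded(const char *content, int realsize, uint32_t *out)
{
    if (content == NULL || realsize < 0)
        return -1;              /* never cast a negative length to size_t */
    *out = checksum(content, (size_t)realsize);
    return 0;
}
```

With a guard like this, a call with realsize == -1 returns early instead of asking the checksum loop to walk 4294967295 bytes past the start of the buffer.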



Take a look at proto.c.  UDM_NET_ERROR (which is -1) is returned in only
two places:

1. in the open_host() function, when the port is 0.
2. in the UdmHTTPGet() function, when select() returns an error.

I have no idea what is happening.



Zenon Panoussis wrote:
 
 Alexander Barkov skrev:
 
 
  OK. Please check also this:
 
  print realsize
  print *Doc
 
 (gdb) frame 1
 #1  0x804d7f8 in UdmIndexNextURL (Indexer=0x807ca50, index_flags=4) at indexer.c:1150
 1150        crc32=UdmCRC32(Doc->content, (size_t)realsize);
 (gdb) print realsize
 $1 = -1
 (gdb) print *Doc
 $2 = {url_id = 12018, status = 0, size = 0, rating = 0, order = 0, referrer = 0, tag = 0, hops = 3,
   indexed = 0, url = 0x91f0548 "http://www.xs4all.nl/~fishman/ls/.", content_type = 0x0, title = 0x0,
   keywords = 0x0, description = 0x0, text = 0x0, category = 0x0, content = 0x4021c03e "",
   last_mod_time = 0, last_index_time = 983253816, next_index_time = 0, crc32 = 0}
 
 Z
 
 --
 oracle@everywhere: The ephemeral source of the eternal truth...




Re: BOUNCE general@mnogosearch.org: Non-member submission from [Zenon Panoussis lrh@xs4all.nl]

2001-02-28 Thread Alexander Barkov

I'm just not sure which version you are using.
Is it here: 

}else{
    /* Unknown Content-Type */
    if(Method!=UDM_HEAD){
        crc32=UdmCRC32(Doc->content, (size_t)realsize);
        changed=!(crc32==Doc->crc32);
        if(CurSrv->use_clones){
            origin=UdmFindOrigin(Indexer, crc32, size);
            origin=((origin==Doc->url_id)?0:origin);
        }
    }
}
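
A side note on the (size_t)realsize cast in the snippet above: if realsize were ever negative, the cast would not fail but wrap around to the largest unsigned value. The sketch below only illustrates that arithmetic; as_size() and as_size32() are hypothetical helpers, and the 32-bit variant is there because the backtraces in this thread come from a build where size_t is evidently 32 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a signed length to size_t the way the quoted call does. */
static size_t as_size(int realsize)
{
    return (size_t)realsize;    /* -1 wraps to SIZE_MAX */
}

/* The same conversion on a 32-bit unsigned type. */
static uint32_t as_size32(int realsize)
{
    return (uint32_t)realsize;  /* -1 wraps to 4294967295 */
}
```

as_size32(-1) yields 4294967295, which is the size value shown in frame #0 of the backtraces.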



please run the following commands in gdb:


frame 1
print content_type
print Method
print Doc
print Doc->content
print Doc->url




Zenon Panoussis wrote:
 
 ./indexer -c [600 | 15000] segfaults on 3.1.11 patched (see my
 earlier postings in this thread for details)
 
 Segfault #1:
 
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 97  _CRC32_(crc, *p) ;
 (gdb) print crc
 $1 = 1181568253
 (gdb) print *p
 Cannot access memory at address 0x40499000
 (gdb) print p
 $2 = 0x40499000 <Address 0x40499000 out of bounds>
 (gdb) backtrace
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 #1  0x804d7f8 in UdmIndexNextURL (Indexer=0x807ca50, index_flags=4) at indexer.c:1150
 #2  0x804a050 in thread_main (arg=0x0) at main.c:256
 #3  0x804a9e4 in main (argc=3, argv=0xbab4) at main.c:596
 #4  0x4009cbfc in __libc_start_main (main=0x804a16c <main>, argc=3, ubp_av=0xbab4,
     init=0x80496a8 <_init>, fini=0x806abfc <_fini>, rtld_fini=0x4000d674 <_dl_fini>, stack_end=0xbaac)
     at ../sysdeps/generic/libc-start.c:118
 
 Segfault #2:
 
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 97  _CRC32_(crc, *p) ;
 (gdb) print crc
 $1 = 4285190670
 (gdb) print *p
 Cannot access memory at address 0x404d3000
 (gdb) print p
 $2 = 0x404d3000 <Address 0x404d3000 out of bounds>
 (gdb) backtrace
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 #1  0x804d7f8 in UdmIndexNextURL (Indexer=0x8094480, index_flags=4) at indexer.c:1150
 #2  0x804a050 in thread_main (arg=0x0) at main.c:256
 #3  0x804a9e4 in main (argc=3, argv=0xbab4) at main.c:596
 #4  0x4009cbfc in __libc_start_main (main=0x804a16c <main>, argc=3, ubp_av=0xbab4,
     init=0x80496a8 <_init>, fini=0x806abfc <_fini>, rtld_fini=0x4000d674 <_dl_fini>, stack_end=0xbaac)
     at ../sysdeps/generic/libc-start.c:118
 
 Segfault #3:
 
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 97  _CRC32_(crc, *p) ;
 (gdb) print crc
 $1 = 2724492306
 (gdb) print *p
 Cannot access memory at address 0x40432000
 (gdb) print p
 $2 = 0x40432000 <Address 0x40432000 out of bounds>
 (gdb) backtrace
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 #1  0x804d7f8 in UdmIndexNextURL (Indexer=0x8094480, index_flags=4) at indexer.c:1150
 #2  0x804a050 in thread_main (arg=0x0) at main.c:256
 #3  0x804a9e4 in main (argc=3, argv=0xbab4) at main.c:596
 #4  0x4009cbfc in __libc_start_main (main=0x804a16c <main>, argc=3, ubp_av=0xbab4,
     init=0x80496a8 <_init>, fini=0x806abfc <_fini>, rtld_fini=0x4000d674 <_dl_fini>, stack_end=0xbaac)
     at ../sysdeps/generic/libc-start.c:118
 
 Segfault #4:
 
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 97  _CRC32_(crc, *p) ;
 (gdb) print crc
 $1 = 2252292711
 (gdb) print *p
 Cannot access memory at address 0x40432000
 (gdb) print p
 $2 = 0x40432000 <Address 0x40432000 out of bounds>
 (gdb) backtrace
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 #1  0x804d7f8 in UdmIndexNextURL (Indexer=0x807ca50, index_flags=4) at indexer.c:1150
 #2  0x804a050 in thread_main (arg=0x0) at main.c:256
 #3  0x804a9e4 in main (argc=3, argv=0xbab4) at main.c:596
 #4  0x4009cbfc in __libc_start_main (main=0x804a16c <main>, argc=3, ubp_av=0xbab4,
     init=0x80496a8 <_init>, fini=0x806abfc <_fini>, rtld_fini=0x4000d674 <_dl_fini>, stack_end=0xbaac)
     at ../sysdeps/generic/libc-start.c:118
 
 Segfault #5:
 
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 97  _CRC32_(crc, *p) ;
 (gdb) print crc
 $1 = 879758289
 (gdb) print *p
 Cannot access memory at address 0x4054f000
 (gdb) print p
 $2 = 0x4054f000 <Address 0x4054f000 out of bounds>
 (gdb) backtrace
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 #1  0x804d7f8 in UdmIndexNextURL (Indexer=0x807ca50, index_flags=4) at indexer.c:1150
 #2  0x804a050 in thread_main (arg=0x0) at main.c:256
 #3  0x804a9e4 in main (argc=3, argv=0xbab4) at main.c:596
 #4  0x4009cbfc in __libc_start_main (main=0x804a16c <main>, argc=3, ubp_av=0xbab4,
     init=0x80496a8 <_init>, fini=0x806abfc <_fini>, rtld_fini=0x4000d674 <_dl_fini>, stack_end=0xbaac)
     at ../sysdeps/generic/libc-start.c:118
 
 Segfault #6:
 
 #0  0x80600ca in 

Re: BOUNCE general@mnogosearch.org: Non-member submission from [Zenon Panoussis lrh@xs4all.nl]

2001-02-28 Thread Zenon Panoussis


Alexander Barkov skrev:

 
 I'm just not sure which version you are using.

3.1.11 with the add_url.3.1.11.diff patch only. 

 Is it here:
 
 }else{
     /* Unknown Content-Type */
     if(Method!=UDM_HEAD){
         crc32=UdmCRC32(Doc->content, (size_t)realsize);
         changed=!(crc32==Doc->crc32);
         if(CurSrv->use_clones){
             origin=UdmFindOrigin(Indexer, crc32, size);
             origin=((origin==Doc->url_id)?0:origin);
         }
     }
 }
 
 please run the following commands in gdb:
 
 frame 1
 print content_type
 print Method
 print Doc
 print Doc->content
 print Doc->url

#0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
97  _CRC32_(crc, *p) ;
(gdb) frame 1
#1  0x804d7f8 in UdmIndexNextURL (Indexer=0x807ca50, index_flags=4) at indexer.c:1150
1150        crc32=UdmCRC32(Doc->content, (size_t)realsize);
(gdb) print content_type
$1 = 0x4021c027 "application/unknown"
(gdb) print Method
$2 = 1
(gdb) print Doc
$3 = (UDM_DOCUMENT *) 0x91ef7d8
(gdb) print Doc->content
$4 = 0x4021c03e ""
(gdb) print Doc->url
$5 = 0x91f0548 "http://www.xs4all.nl/~fishman/ls/."

See my (bounced) posting from [EMAIL PROTECTED] on 
Tue, 27 Feb 2001 15:46:37 +0100 for details about this and 
the other URLs that the indexer crashes on. 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...




Re: BOUNCE general@mnogosearch.org: Non-member submission from [Zenon Panoussis lrh@xs4all.nl]

2001-02-28 Thread Alexander Barkov

OK. Please check also this:

print realsize
print *Doc



Zenon Panoussis wrote:
  please run the following commands in gdb:
 
  frame 1
  print content_type
  print Method
  print Doc
  print Doc->content
  print Doc->url
 
 #0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
 97  _CRC32_(crc, *p) ;
 (gdb) frame 1
 #1  0x804d7f8 in UdmIndexNextURL (Indexer=0x807ca50, index_flags=4) at indexer.c:1150
 1150        crc32=UdmCRC32(Doc->content, (size_t)realsize);
 (gdb) print content_type
 $1 = 0x4021c027 "application/unknown"
 (gdb) print Method
 $2 = 1
 (gdb) print Doc
 $3 = (UDM_DOCUMENT *) 0x91ef7d8
 (gdb) print Doc->content
 $4 = 0x4021c03e ""
 (gdb) print Doc->url
 $5 = 0x91f0548 "http://www.xs4all.nl/~fishman/ls/."
 
 See my (bounced) posting from [EMAIL PROTECTED] on
 Tue, 27 Feb 2001 15:46:37 +0100 for details about this and
 the other URLs that the indexer crashes on.




Re: BOUNCE general@mnogosearch.org: Non-member submission from [Zenon Panoussis lrh@xs4all.nl]

2001-02-27 Thread Zenon Panoussis


Hi

 I tested several pages from your site and everything seems to work fine.

What you see on the web is 3.1.10. The 3.1.11 (patched yesterday) 
runs in a separate directory with a separate database; this way I 
can keep the search functional while the 3.1.11 bugs are worked 
out :)  You can use 3.1.11 by going to 
http://search.freewinds.cx/cgi-bin/v4.cgi . On the other hand, it 
is *indexer* that's crashing; the search part works fine (apart 
from the little ul= problem I reported yesterday).


 Does indexer crash always on the same URL? 

This is new since yesterday's patch: indexer crashes after a few 
minutes, always in the middle of a URL, like this

Indexer[21800]: [1] http://www.scientology-kills.org/dead.htm
Indexer[21800]: [1] http://www.xs4all.nl/~fishman/ls/.
Tue 27 08:26:06 [21283] Client #0 left
Segmentation fault (core dumped)

This particular URL is not very long and contains no spaces or 
other funny stuff; what is missing after 
   http://www.xs4all.nl/~fishman/ls/ 
is something like 
   ls02b.html 

After a crash I restart indexer and it goes on with status 0 URLs 
in the order it has them, so it won't go back and won't crash on 
the same URL.


 Please send also "backtrace" gdb command output.

#0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
97  _CRC32_(crc, *p) ;
(gdb) print crc
$1 = 1181568253
(gdb) print p
$2 = 0x40499000 <Address 0x40499000 out of bounds>
(gdb) print *p
Cannot access memory at address 0x40499000
(gdb) backtrace
#0  0x80600ca in UdmCRC32 (buf=0x4021c03e "", size=4294967295) at crc32.c:97
#1  0x804d7f8 in UdmIndexNextURL (Indexer=0x807ca50, index_flags=4) at indexer.c:1150
#2  0x804a050 in thread_main (arg=0x0) at main.c:256
#3  0x804a9e4 in main (argc=3, argv=0xbab4) at main.c:596
#4  0x4009cbfc in __libc_start_main (main=0x804a16c <main>, argc=3, ubp_av=0xbab4,
    init=0x80496a8 <_init>, fini=0x806abfc <_fini>, rtld_fini=0x4000d674 <_dl_fini>, stack_end=0xbaac)
    at ../sysdeps/generic/libc-start.c:118

I'm saving the core, 


-- 
oracle@everywhere: The ephemeral source of the eternal truth...