SQUAT: Unknown error 1 (Closing index)

2003-06-10 Thread Dylan Martin
I've been trying to run down the problem I've been having with squatter, 
and it looks like quite a few people on the list are having the same 
problem.  Here's what I've got so far, and I'll post more if/when I get 
it.

It looks like in squat_build.c, in write_trie_word_data, if len > 2 it
calls write_trie_word_data on the SquatWordTable new_t.  When it breaks,
new_t has these values:  new_t->first_valid_entry = 256 and
new_t->last_valid_entry = 0.  When it doesn't break, first_valid_entry is
less than or equal to last_valid_entry.

I don't really know what values mean what, so I can't really say what this 
means or even if it's significant.  I'll see if I can find more.  Let me 
know if this means anything to any of you.

Thanks
-Dylan




Re: SQUAT: Unknown error 1 (Closing index)

2003-06-10 Thread Christian Schulte
For me it fails at exactly the same place with the same error! I
collected some of the messages for which squatter reproducibly fails, but
I cannot say which little difference in them squatter does not like. They
are from different mailers, in different charsets and encodings. The only
thing they seem to have in common is Content-Transfer-Encoding: 8bit in
at least one body part of a MIME message. Even messages I personally sent
using SquirrelMail could not be processed by squatter without crashing!

During tracing I had a closer look at a receiver function in squatter.c
(I do not remember its name) to figure out whether Cyrus has problems
decoding the messages and building the strings to index, but that all
looked reasonable. One thing I did not quite understand was that Cyrus
does not seem to pay attention to the charset used in the message being
indexed. I think there were messages which were explicitly (and
correctly) declared as charset iso-8859-15, but during index
canonicalization 8-bit characters got replaced by 'X' characters, so the
index would never correctly contain words with e.g. German umlauts (maybe
I am totally wrong here, of course!). All messages for which squatter
fails seem to use this charset, but on my system 99% of all messages use
iso-8859-15, so this may not be an issue.

I tried setting reject8bit and stopped all MTA mail conversions, but
messages which cause squatter to crash still come in! The error I get
seems to be

#define EPERM 1  /* Operation not permitted */
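
A quick way to test the guess that the "1" in the error is an errno value
is to ask the C library what it calls that code. The snippet below is only
a sketch: the helper names is_eperm and describe_code are made up for
illustration, and nothing here confirms that squatter's error code really
is an errno value. EPERM is 1 on Linux and the BSDs, but the number is not
fixed by the C standard, so the comparison is done against the macro.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: does this numeric code match EPERM on this platform? */
static int is_eperm(int code)
{
    return code == EPERM;
}

/* Hypothetical helper: the C library's own description of an errno value. */
static const char *describe_code(int code)
{
    return strerror(code);
}
```

Calling describe_code(1) on the machine where squatter fails shows what the
local libc thinks error 1 means, which can then be compared with the
message squatter actually logs.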

--Christian



Re: SQUAT: Unknown error 1 (Closing index)

2003-06-10 Thread Dylan Martin
Unfortunately, I don't have any more time to work on this, so I'm going to 
have to give up and stop using squat until it's fixed by someone else.

It looks like something is making a SquatWordTable, but then not filling
it in.  Inside the write_trie_word_data function, it checks to see
if t->first_valid_entry >= VECTOR_SIZE(offsets), and this is what actually
makes it fail.  I don't understand the code well enough to figure out
where it might be forgetting to set t->first_valid_entry to something
other than 256.  I'm assuming that (t->first_valid_entry >=
VECTOR_SIZE(offsets)) should never be true, and the fact that it is
indicates a bug in squatter.  I could easily be wrong.
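
The state described above (first_valid_entry = 256, last_valid_entry = 0)
is what a table looks like if its range is initialised to "nothing seen
yet" and no entry is ever added. The following is a minimal sketch of that
pattern, not the actual squat_build.c code: the WordTable struct, the
present[] array and the three functions are hypothetical stand-ins, and
only the two field names and the VECTOR_SIZE comparison come from this
thread.

```c
#include <string.h>

#define VECTOR_SIZE(v) (sizeof(v) / sizeof((v)[0]))

/* Hypothetical stand-in for SquatWordTable: just the two range fields
   named in the thread, plus a presence map of 256 byte values. */
typedef struct {
    int first_valid_entry;
    int last_valid_entry;
    unsigned char present[256];
} WordTable;

/* Start with an inverted range: first > last means "no entries yet". */
static void table_init(WordTable *t)
{
    memset(t->present, 0, sizeof(t->present));
    t->first_valid_entry = 256;
    t->last_valid_entry = 0;
}

/* Record one byte value and widen the valid range to include it. */
static void table_mark(WordTable *t, unsigned char ch)
{
    t->present[ch] = 1;
    if (ch < t->first_valid_entry)
        t->first_valid_entry = ch;
    if (ch > t->last_valid_entry)
        t->last_valid_entry = ch;
}

/* Same shape as the check described in the thread: true only when the
   table was created but never filled in. */
static int table_is_empty(const WordTable *t)
{
    return t->first_valid_entry >= (int)VECTOR_SIZE(t->present);
}
```

In this sketch a table that has had table_mark called on it even once can
never satisfy table_is_empty, which matches the guess that the failing
table was allocated and then never populated.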

Hope this is some help to someone...
-Dylan
