Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-16 Thread Teodor Sigaev

Fixes are committed to CVS, hope, they will help you.
--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Hannes Dorbath
with
the imagination that it is, therefore, obscure and, on the contrary, that
what is to prove it is clear, and so we understand it easily.

41. Epigrams of Martial.--Man loves malice, but not against one-eyed men nor
the unfortunate, but against the fortunate and proud. People are mistaken in
thinking otherwise.

For lust is the source of all our actions, and humanity, etc. We must please
those who have humane and tender feelings. That epigram about two one-eyed
people is worthless, for it does not console them and only gives a point to
the author's glory. All that is only for the sake of the author is
worthless. Ambitiosa recident ornamenta.[4]

42. To call a king "Prince" is pleasing, because it diminishes his rank.

43. Certain authors, speaking of their works, say: "My book," "My
commentary," "My history," etc. They resemble middle-class people who have a
house of their own and always have "My house" on their tongue. They would do
better to say: "Our book," "Our commentary," "Our history," etc., because
there is in them usually more of other people's than their own.

44. Do you wish people to believe good of you? Don't speak.

45. Languages are 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Hannes Dorbath
to the rich, that, if they do it not in the sight of God, they
depart from the command of religion.

SECTION VI: THE PHILOSOPHERS

339. I can well conceive a man without hands, feet, head (for it is only
experience which teaches us that the head is more necessary than feet). But
I cannot conceive man without thought; he would be a stone or a brute.

340. The arithmetical machine produces effects which approach nearer to
thought than all the actions of animals. But it does nothing which would
enable us to attribute will to it, as to the animals.

341. The account of the pike and frog of Liancourt. They do it always, and
never otherwise, nor any other thing showing mind.

342. If an animal did by mind what it does by instinct, and if it spoke by
mind what it speaks by instinct, in hunting and in warning its mates that
the prey is found or lost, it would indeed also speak in regard to those
things which affect it closer, as example, "Gnaw me this cord which is
wounding me, and which I cannot reach."

343. The beak of the parrot, which it wipes, although it is clean.

344. Instinct and reason, marks of two natures.

345. Reason commands us far more imperiously than a master; for in
disobeying the one we are unfortunate, and in disobeying the other we are
fools.

346. Thought constitutes the greatness of man.

347. Man is but a reed, the most feeble thing in nature; but he is a
thinking reed. The entire universe need not arm itself to crush him. A
vapour, a drop of water suffices to kill him. But, if the universe were to
crush him, man would still be more noble than that which killed him, because
he knows that he dies and the advantage which the univer



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Hannes Dorbath
believe that matters were reversed? In short, as we often dream that
we dream, heaping dream upon dream, may it not be that this half of our
life, wherein we think ourselves awake, is itself only a dream on which the
others are grafted, from which we wake at death, during which we have as few
principles of truth and good as during natural sleep, these different
thoughts which disturb us being perhaps only illusions like the flight of
time and the vain fancies of our dreams?

These are the chief arguments on one side and the other.

I omit minor ones, such as the sceptical talk against the impressions of
custom, education, manners, country and the like. Though these influence the
majority of common folk, who dogmatise only on shallow foundations, they are
upset by the least breath of the sceptics. We have only to see their books
if we are not sufficiently convinced of this, and we shall very quickly
become so, perhaps too much.

I notice the only strong point of the dogmatists, namely, that, speaking in
good faith and sincerely, we cannot doubt natural principles. Against this
the sceptics set up in one word the uncertainty of our origin, which
includes that of our nature. The dogmatists have been trying to answer this
objection ever since the world began.

So there is open war among men, in which each must take a part and side
either with dogmatism or scepticism. For he who thinks to remain neutral is
above all a sceptic. This neutrality is the essence of the sect; he who is
not against them is essentially for them. In this appears their a



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Tom Lane
I think I found at least one part of the problem.  I was able to
reproduce a crash similar to yours by running the german_ispell
dictionary against long random words, and what I found out is that
it's possible to overrun the fixed-length "buf" buffer declared at
line 1542 of spell.c.

Run till exit from #0  CheckCompoundAffixes (ptr=0x7fffc0a66a90, 
word=0xb31ec4 "whcfbrvhbcifxdbyrizvdcnggpgtdqzzxbeyszftboyyrzwihvdkyqqefvxil
wihxmfpllimzgkpfgxmbbwyrmcxhzfsvnzszehkppvowfdblanlbjksvmpclutggsyfwfomukzqpcrzd
eleouwjglkrfexlwuqszeetodjdfacifnlshhbnsmexpahlvxdubgopjxso"..., len=285, 
CheckInPlace=0 '\0') at spell.c:1482
0x005e273e in SplitToVariants (Conf=0xb38d88, snode=0x0, orig=0x0, 
word=0xb31ec0 "nokgwhcfbrvhbcifxdbyrizvdcnggpgtdqzzxbeyszftboyyrzwihvdkyqqef
vxilwihxmfpllimzgkpfgxmbbwyrmcxhzfsvnzszehkppvowfdblanlbjksvmpclutggsyfwfomukzqp
crzdeleouwjglkrfexlwuqszeetodjdfacifnlshhbnsmexpahlvxdubgop"..., wordlen=289, 
startpos=0, minpos=-1) at spell.c:1537
1537while (level > startpos && (lenaff = CheckCompoundAffixe
s(&caff, word + level, wordlen - level, (node) ? true : false)) >= 0)
Value returned is $29 = 283
(gdb) s
1545lenaff = level - startpos + lenaff;
(gdb) 
1547if (!notprobed[startpos + lenaff - 1])
(gdb) 
1550if (level + lenaff - 1 <= minpos)
(gdb) 
1553if (lenaff > 0)
(gdb) 
1554memcpy(buf, word + startpos, lenaff);
(gdb) p lenaff
$31 = 287
(gdb) p sizeof(buf)
$32 = 256
(gdb) 

I still don't trust the limited-size stem[] arrays a bit, either.

I am now fairly seriously worried about what other buffer-overrun
bugs may exist in the tsearch code.  I would recommend a wholesale
campaign to try to break it with long random words.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Hannes Dorbath

Tom Lane wrote:

This is not the same test data as in your previous concurrent-index
problem, then?  I still had a copy of that, so I tried the case, and
it doesn't crash on that data ...


Teodor Sigaev wrote:

I tryed to reproduce the bug but without success.
Could you provide a dump of text column? 


It's not the same data, no. It's about 7 times as large as the old test 
data. I'd upload it as well, but I'm limited to 8KB/sec upstream at the 
moment, which makes it a bit of pain.



However, I may not be duplicating your dictionary exactly.  I didn't do
anything with the compound.pl script in that tarfile --- was I supposed
to?


No, compound.pl is already applied to the files in the archive. I've 
only converted them to UTF-8. I've used this dict with 8.2.x for quite a 
while in production.


Maybe it's faster when I setup a virtual machine and grant SSH access to 
it. The problem is that I'm running out of time. I'm probably not able 
to do it before friday :(


Is there anything quick I can do? Can I make the backtrace more 
meaningful by dropping -O2 or add -ggdb? Add some debug to SplitToVariants?



--
Best regards,
Hannes Dorbath

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Tom Lane
Hannes Dorbath <[EMAIL PROTECTED]> writes:
> Alvaro Herrera wrote:
>> Can you please provide a backtrace from gdb?

> I hope that contains it: http://theendofthetunnel.de/backtrace.log

Hmmm --- one thing that jumps out at me is that SplitToVariants assumes
(in four places) that the SplitVar.stem arrays can't overflow.  Now 1024
entries does sound like plenty, but perhaps a malformed dictionary could
cause the code to overrun that?  Or an extremely long input word?
Personally I'd put some checks on nstem in there ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Tom Lane
Hannes Dorbath <[EMAIL PROTECTED]> writes:
> Crash happens about 7 minutes after issuing the UPDATE statement with 
> current CVS HEAD. The table has around 5 million rows. It's always 
> reproducible.

This is not the same test data as in your previous concurrent-index
problem, then?  I still had a copy of that, so I tried the case, and
it doesn't crash on that data ...

> ISpell dict used: 
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz

However, I may not be duplicating your dictionary exactly.  I didn't do
anything with the compound.pl script in that tarfile --- was I supposed
to?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Teodor Sigaev

I tryed to reproduce the bug but without success.
Could you provide a dump of text column?


Hannes Dorbath wrote:
Crash happens about 7 minutes after issuing the UPDATE statement with 
current CVS HEAD. The table has around 5 million rows. It's always 
reproducible.

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Tom Lane
Hannes Dorbath <[EMAIL PROTECTED]> writes:
> Hannes Dorbath wrote:
>> Yes, I'll try to catch the row.

> I just slammed a BEFORE UPDATE FOR EACH ROW trigger with RAISE NOTICE on 
> the table to catch the row ID where it fails.

> The row ID is never the same. Neither can I reproduce it when I use the 
> last reported row ID.

So most likely it's triggered by a sequence of operations rather than
any individual row :-(.  You'll have to think in terms of producing
a multi-table-row test case.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Alvaro Herrera
Hannes Dorbath wrote:
> Hannes Dorbath wrote:
>> Yes, I'll try to catch the row.
>
> I just slammed a BEFORE UPDATE FOR EACH ROW trigger with RAISE NOTICE on 
> the table to catch the row ID where it fails.
>
> The row ID is never the same. Neither can I reproduce it when I use the 
> last reported row ID.
>
> Is NOTICE logging / printing asynchron? Is there some sort of queuing that 
> eats the ID where it really fails?
>
> Any ideas to track this down?

Turn off autovacuum and see if it makes any difference.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Hannes Dorbath

Hannes Dorbath wrote:

Yes, I'll try to catch the row.


I just slammed a BEFORE UPDATE FOR EACH ROW trigger with RAISE NOTICE on 
the table to catch the row ID where it fails.


The row ID is never the same. Neither can I reproduce it when I use the 
last reported row ID.


Is NOTICE logging / printing asynchron? Is there some sort of queuing 
that eats the ID where it really fails?


Any ideas to track this down?


--
Best regards,
Hannes Dorbath

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Hannes Dorbath

Alvaro Herrera wrote:

Can you please provide a backtrace from gdb?


I hope that contains it: http://theendofthetunnel.de/backtrace.log


It'd be nice to have text string which cause segfault and sql script for 
configuration.


Yes, I'll try to catch the row.


--
Best regards,
Hannes Dorbath

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Oleg Bartunov

It'd be nice to have text string which cause segfault
and sql script for configuration.

Oleg
On Tue, 15 Jan 2008, Hannes Dorbath wrote:

Crash happens about 7 minutes after issuing the UPDATE statement with current 
CVS HEAD. The table has around 5 million rows. It's always reproducible.


ISpell dict used: 
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz

(iconved to UTF-8)

Welcome to psql 8.3RC1, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
  \h for help with SQL commands
  \? for help with psql commands
  \g or terminate with semicolon to execute query
  \q to quit

test=# \dFd+ german_ispell
List of text search dictionaries
Schema | Name  | Template  | Init options 
| Description

+---+---+---+-
public | german_ispell | pg_catalog.ispell | dictfile = 'german', afffile = 
'german', stopwords = 'german' |

(1 row)

test=# \dF+ german_ispell
Text search configuration "pg_catalog.german_ispell"
Parser: "pg_catalog.default"
 Token  |   Dictionaries
-+---
asciihword  | german_ispell,german_stem
asciiword   | german_ispell,german_stem
email   | simple
file| simple
float   | simple
host| simple
hword   | german_ispell,german_stem
hword_asciipart | german_ispell,german_stem
hword_numpart   | simple
hword_part  | german_ispell,german_stem
int | simple
numhword| simple
numword | simple
sfloat  | simple
uint| simple
url | simple
url_path| simple
version | simple
word| german_ispell,german_stem


test=# SELECT ts_debug('autofahren');
   ts_debug
-
(asciiword,"Word, all 
ASCII",autofahren,"{german_ispell,german_stem}",german_ispell,"{autofahren,auto,fahren}")

(1 row)

test=# \d+ test
Table "public.test"
Column |   Type   | Modifiers | Description
+--+---+-
id | integer  | not null default nextval('test_id_seq'::regclass) |
text   | text | not null  |
tsv| tsvector |   |
Indexes:
   "test_pkey" PRIMARY KEY, btree (id)
Has OIDs: no

test=# UPDATE test SET tsv = to_tsvector(text);
server closed the connection unexpectedly
   This probably means the server terminated abnormally
   before or while processing the request.
The connection to the server was lost. Attempting reset: WARNING: terminating 
connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the 
current transaction and exit, because another server process exited 
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat 
your command.

Failed.
!>

Jan 15 11:32:50 brainchild postmaster[14815] general protection rip:5bb287 
rsp:7fff4ec80290 error:0


Jan 15 11:32:50 brainchild postgres[14788]: [3-1] LOG:  server process (PID 
14815) was terminated by signal 11: Segmentation fault
Jan 15 11:32:50 brainchild postgres[14788]: [4-1] LOG:  terminating any other 
active server processes
Jan 15 11:32:50 brainchild postgres[14843]: [3-1] WARNING:  terminating 
connection because of crash of another server process
Jan 15 11:32:50 brainchild postgres[14843]: [3-2] DETAIL:  The postmaster has 
commanded this server process to roll back the current transaction and exit, 
because another server
Jan 15 11:32:50 brainchild postgres[14843]: [3-3]  process exited abnormally 
and possibly corrupted shared memory.
Jan 15 11:32:50 brainchild postgres[14843]: [3-4] HINT:  In a moment you 
should be able to reconnect to the database and repeat your command.
Jan 15 11:32:50 brainchild postgres[14788]: [5-1] LOG:  all server processes 
terminated; reinitializing
Jan 15 11:32:50 brainchild postgres[14788]: [6-1] LOG:  shmctl(1343488, 0, 0) 
failed: Invalid argument
Jan 15 11:32:50 brainchild postgres[14844]: [7-1] LOG:  database system was 
interrupted; last known up at 2008-01-15 11:32:33 CET
Jan 15 11:32:50 brainchild postgres[14844]: [8-1] LOG:  database system was 
not properly shut down; automatic recovery in progress
Jan 15 11:32:50 brainchild postgres[14844]: [9-1] LOG:  redo starts at 
8/DC07E0A0
Jan 15 11:32:54 brainchild postgres[14844]: [10-1] LOG:  unexpected pageaddr 
8/BD59C000 in log file 8, segment 236, offset 5881856
Jan 15 11:32:54 brainchild postgres[14844]: [11-1] LOG:  redo don

Re: [GENERAL] Segmentation fault with 8.3 FTS ISpell

2008-01-15 Thread Alvaro Herrera
Hannes Dorbath wrote:

> test=# UPDATE test SET tsv = to_tsvector(text);
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.

> Jan 15 11:32:50 brainchild postmaster[14815] general protection rip:5bb287 
> rsp:7fff4ec80290 error:0
>
> Jan 15 11:32:50 brainchild postgres[14788]: [3-1] LOG:  server process (PID 
> 14815) was terminated by signal 11: Segmentation fault

Can you please provide a backtrace from gdb?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/