Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
On Thu, Jan 29, 2009 at 6:53 PM, Tommy Gildseth
 wrote:
> Thanks a lot. Exceptional response time :D
> Less than 2.5 hours from problem reported, till a patch was made. Don't
> think there's many projects or commercial products that can compete with
> that ;-)

Oh, wait , it still has to go through 14 days Q&A, doesn't it ?
:D

-- 
GJ

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth

Teodor Sigaev wrote:
I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, 
patch is attached and I'm going to commit it





Thanks a lot. Exceptional response time :D
Less than 2.5 hours from problem reported, till a patch was made. Don't 
think there's many projects or commercial products that can compete with 
that ;-)


--
Tommy Gildseth

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev

char" issue?  Does this affect the old contrib/tsearch2 code?


Checked - No, that was improvement for 8.3 :).

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tom Lane
Teodor Sigaev  writes:
> Tom Lane wrote:
>> Hmm, seems it's not so much a "64 bit" error as a "signed vs unsigned
>> char" issue?  

> Yes, but I don't understand why it worked in 32-bit box.

You were casting to unsigned int.  So the offset added to the base
pointer for, say, 255 in the char would be equivalent to -1 on a 32-bit
box, or 0x on 64-bit.  The latter would likely provoke SIGSEGV
due to indexing out of the allocated process workspace, the former just
in scribbling on the byte adjacent to where it should have.  Still
broken, but not a segfault.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tom Lane
Gregory Stark  writes:
> I really think he should just change all the "unsigned int" into "unsigned
> char" and not do the type punning with pointer casts. That's just evil.

Oh, I see.  That would work too, but I don't really see that it's a huge
improvement.

What *would* be an improvement IMHO is to declare the pointer as
unsigned char * in the first place ;-).  However, that might result
in having to introduce a lot of casts elsewhere ...

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Gregory Stark

Tom Lane  writes:

> Gregory Stark  writes:
>> Maybe I'm missing something but I don't understand how this fixes the 
>> problem.
>> s is a "char*" so type punning it to an unsigned char * before dereferencing
>> it is really the same as casting it to unsigned char directly
>
> No, it isn't.  If char is signed then you'll get quite different results
> from a high-bit-set byte value, because sign extension will happen
> before the value is reinterpreted as unsigned.

What I wrote is correct. There's no sign extension if you're casting from
signed char to unsigned char since there's no extension.

I really think he should just change all the "unsigned int" into "unsigned
char" and not do the type punning with pointer casts. That's just evil.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tom Lane
Teodor Sigaev  writes:
> Tom Lane wrote:
>> Please try to make the commits in the next eight hours, as we have
>> release wraps scheduled for tonight.

> Minor versions or beta of 8.4?

This is just back-branch update releases.  8.4 beta is still a good
ways off :-(

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
On Thu, Jan 29, 2009 at 4:06 PM, Gregory Stark  wrote:
> Gregory Stark  writes:

> Ah, I understand how this fixes the problem. You were casting to unsigned
> *int* not unsigned char so it was sign extending first and then overflowing.
:)

> It still seems to me if you put a few "unsigned" in variable declarations you
> could remove piles upon piles of casts and make all of the code more readable.

which is one of the main problems I see with that code, overall.


-- 
GJ

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Gregory Stark
Gregory Stark  writes:

> Teodor Sigaev  writes:
>
>> I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, 
>> patch
>> is attached and I'm going to commit it
> ...
>
>> !Conf->flagval[(unsigned int) *s] = (unsigned char) val;
> ...
>> !Conf->flagval[*(unsigned char*) s] = (unsigned char) val;
>
> Maybe I'm missing something but I don't understand how this fixes the problem.

Ah, I understand how this fixes the problem. You were casting to unsigned
*int* not unsigned char so it was sign extending first and then overflowing.
So char<255> was coming out as MAX_INT instead of 255.

#include 

main()
{
  volatile signed char a = -1;
  printf("ud=%ud\n", (unsigned int)a);
}

$ ./a.out
ud=4294967295d


If you just make these all casts to (unsigned char) it should work just as
well as the pointer type punning -- and be a whole lot less scary.

> What really boggles me is why you don't just use unsigned chars everywhere and
> remove all of these casts. or would that just move the casts to strcmp and
> company?

It still seems to me if you put a few "unsigned" in variable declarations you
could remove piles upon piles of casts and make all of the code more readable.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev

To be honest, looking through that file, I am quite worried about few
points. I don't know too much about insights of ispell, but I see few
suspicious things in mkSPNode too.
I generally don't want to get involve in reviewing code for stuff I
don't know, But if Teodor (and Oleg) don't mind, I can raise my
points, and see if anything useful comes out of it.

If you see bug/mistake/suspicious point, please, don't be quiet



Also, about that patch - it doesn't seem to apply cleanly to 8.4,
perhaps that file has changed too much (I based my 'review' above on
8.4's code)

will tweak
--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Devrim GÜNDÜZ
On Thu, 2009-01-29 at 19:00 +0300, Teodor Sigaev wrote:
> > Please try to make the commits in the next eight hours, as we have
> > release wraps scheduled for tonight.
> 
> Minor versions or beta of 8.4? 

Minor versions.

-- 
Devrim GÜNDÜZ, RHCE
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
   http://www.gunduz.org


signature.asc
Description: This is a digitally signed message part


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tom Lane
Gregory Stark  writes:
> Maybe I'm missing something but I don't understand how this fixes the problem.
> s is a "char*" so type punning it to an unsigned char * before dereferencing
> it is really the same as casting it to unsigned char directly

No, it isn't.  If char is signed then you'll get quite different results
from a high-bit-set byte value, because sign extension will happen
before the value is reinterpreted as unsigned.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev



Tom Lane wrote:

Teodor Sigaev  writes:

I reproduced the bug with a help of Grzegorz's point for 64-bit box.


Hmm, seems it's not so much a "64 bit" error as a "signed vs unsigned
char" issue?  


Yes, but I don't understand why it worked in 32-bit box.


Does this affect the old contrib/tsearch2 code?


Will check.


Please try to make the commits in the next eight hours, as we have
release wraps scheduled for tonight.


Minor versions or beta of 8.4? if last, I'd like to commit btree_gin and 
fast_update_gin.  For both patches all pointed issues was resolved and Jeff, 
seems, haven't objections.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Gregory Stark
Teodor Sigaev  writes:

> I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, patch
> is attached and I'm going to commit it
...

> ! Conf->flagval[(unsigned int) *s] = (unsigned char) val;
...
> ! Conf->flagval[*(unsigned char*) s] = (unsigned char) val;

Maybe I'm missing something but I don't understand how this fixes the problem.
s is a "char*" so type punning it to an unsigned char * before dereferencing
it is really the same as casting it to unsigned char directly and casting it
to unsigned int really ought to have done the same thing anyways.

All of the changes are of this type so I can't see how your patch could have
fixed the problem.

And in general casting the pointer before dereferencing it is a whole lot
scarier code which should raise eyebrows a lot faster than just a simple cast
to unsigned char like you had it originally.

What really boggles me is why you don't just use unsigned chars everywhere and
remove all of these casts. or would that just move the casts to strcmp and
company?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tom Lane
Teodor Sigaev  writes:
> I reproduced the bug with a help of Grzegorz's point for 64-bit box.

Hmm, seems it's not so much a "64 bit" error as a "signed vs unsigned
char" issue?  Does this affect the old contrib/tsearch2 code?

Please try to make the commits in the next eight hours, as we have
release wraps scheduled for tonight.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
On Thu, Jan 29, 2009 at 3:32 PM, Teodor Sigaev  wrote:
>
>
>> Than I have quite few notes about that function:
>> - affix is not checked on entry, and should be unsigned,
>
> Could be Assert( affix>=0 && affix < Conf->nAffixData )
>
wouldn't that crash pg backend too ?

The structure that this file parses, does it come straight from ispell
file, or is it being already parsed (and checked for errors) ?


-- 
GJ

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
On Thu, Jan 29, 2009 at 3:26 PM, Teodor Sigaev  wrote:
> I reproduced the bug with a help of Grzegorz's point for 64-bit box. So,
> patch is attached and I'm going to commit it

:)

To be honest, looking through that file, I am quite worried about few
points. I don't know too much about insights of ispell, but I see few
suspicious things in mkSPNode too.
I generally don't want to get involve in reviewing code for stuff I
don't know, But if Teodor (and Oleg) don't mind, I can raise my
points, and see if anything useful comes out of it.

Also, about that patch - it doesn't seem to apply cleanly to 8.4,
perhaps that file has changed too much (I based my 'review' above on
8.4's code)


-- 
GJ

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev




Than I have quite few notes about that function:
- affix is not checked on entry, and should be unsigned,


Could be Assert( affix>=0 && affix < Conf->nAffixData )


- for sake of safety uint32_t should be used instead of unsigned int,
in the cast

see patch

- there should be some safety limit for lenght of str,

It's a C-string

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev
I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, patch 
is attached and I'm going to commit it


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/
*** src/backend/tsearch/spell.c.orig2009-01-29 18:18:03.0 +0300
--- src/backend/tsearch/spell.c 2009-01-29 18:20:09.0 +0300
***
*** 521,527 
(errcode(ERRCODE_CONFIG_FILE_ERROR),
 errmsg("multibyte flag character is not 
allowed")));
  
!   Conf->flagval[(unsigned int) *s] = (unsigned char) val;
Conf->usecompound = true;
  }
  
--- 521,527 
(errcode(ERRCODE_CONFIG_FILE_ERROR),
 errmsg("multibyte flag character is not 
allowed")));
  
!   Conf->flagval[*(unsigned char*) s] = (unsigned char) val;
Conf->usecompound = true;
  }
  
***
*** 654,660 
ptr = repl + (ptr - prepl) + 1;
while (*ptr)
{
!   aflg |= Conf->flagval[(unsigned int) 
*ptr];
ptr++;
}
}
--- 654,660 
ptr = repl + (ptr - prepl) + 1;
while (*ptr)
{
!   aflg |= Conf->flagval[*(unsigned char*) 
ptr];
ptr++;
}
}
***
*** 735,741 
  
if (*s && pg_mblen(s) == 1)
{
!   Conf->flagval[(unsigned int) *s] = 
FF_COMPOUNDFLAG;
Conf->usecompound = true;
}
oldformat = true;
--- 735,741 
  
if (*s && pg_mblen(s) == 1)
{
!   Conf->flagval[*(unsigned char*) s] = 
FF_COMPOUNDFLAG;
Conf->usecompound = true;
}
oldformat = true;
***
*** 791,797 

(errcode(ERRCODE_CONFIG_FILE_ERROR),
 errmsg("multibyte flag 
character is not allowed")));
  
!   flag = (unsigned char) *s;
goto nextline;
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0 || STRNCMP(recoded, 
"COMPOUNDMIN") == 0 ||
--- 791,797 

(errcode(ERRCODE_CONFIG_FILE_ERROR),
 errmsg("multibyte flag 
character is not allowed")));
  
!   flag = *(unsigned char*) s;
goto nextline;
}
if (STRNCMP(recoded, "COMPOUNDFLAG") == 0 || STRNCMP(recoded, 
"COMPOUNDMIN") == 0 ||
***
*** 851,857 
  
while (str && *str)
{
!   flag |= Conf->flagval[(unsigned int) *str];
str++;
}
  
--- 851,857 
  
while (str && *str)
{
!   flag |= Conf->flagval[*(unsigned char*) str];
str++;
}
  

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
if it's
static uint32
makeCompoundFlags(IspellDict *Conf, int affix)
{
uint32  flag = 0;
char   *str = Conf->AffixData[affix];

while (str && *str)
{
flag |= Conf->flagval[(unsigned int) *str];
str++;
}

return (flag & FF_DICTFLAGMASK);
}


Than I have quite few notes about that function:
- affix is not checked on entry, and should be unsigned,
- for sake of safety uint32_t should be used instead of unsigned int,
in the cast
- there should be some safety limit for lenght of str,

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth

Teodor Sigaev wrote:

How do I make a backtrace?


- if you have coredump, just execute gdb /PATH1/postgres gdb /PATH2/core 
and type bt. Linux doesn't make core by default, so you allow to do it 
by ulimit -c unlimited for postgresql user
- connect to db, and attach gdb to backend process: gdb /PATH1/postgres 
BACKEND_PID and type run in gdb, next, execute CREATE DICTIONARY and 
type bt in gdb


I am completely unfamiliar with gdb, but hopefully, this is what you are 
after?



(gdb) continue
Continuing.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 182898870848 (LWP 606)]
0x005a6f65 in makeCompoundFlags ()
(gdb) bt
#0  0x005a6f65 in makeCompoundFlags ()
#1  0x005a712a in mkSPNode ()
#2  0x005a709b in mkSPNode ()
#3  0x005a709b in mkSPNode ()
#4  0x005a709b in mkSPNode ()
#5  0x005a71a2 in mkSPNode ()
#6  0x005a734d in NISortDictionary ()
#7  0x005a4dae in dispell_init ()
#8  0x00638823 in OidFunctionCall1 ()
#9  0x004f4800 in verify_dictoptions ()
#10 0x004f48da in DefineTSDictionary ()
#11 0x0059c4fe in PortalRunUtility ()
#12 0x0059c5dc in PortalRunMulti ()
#13 0x0059bf28 in PortalRun ()
#14 0x005977d2 in exec_simple_query ()
#15 0x0059a874 in PostgresMain ()
#16 0x00572821 in BackendRun ()
#17 0x00572112 in BackendStartup ()
#18 0x0057034c in ServerLoop ()
#19 0x0056f938 in PostmasterMain ()
#20 0x00529f5b in main ()


--
Tommy Gildseth

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Grzegorz Jaśkiewicz
to get coredump:
sudo su - postgres
ulimit -c unlimited
pg_ctl restart


Also, Teodor - mind the fact, that his machine is 64, and you've
tested it on 32bits.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev

How do I make a backtrace?


- if you have coredump, just execute gdb /PATH1/postgres gdb /PATH2/core and 
type bt. Linux doesn't make core by default, so you allow to do it by ulimit -c 
unlimited for postgresql user
- connect to db, and attach gdb to backend process: gdb /PATH1/postgres 
BACKEND_PID and type run in gdb, next, execute CREATE DICTIONARY and type bt in gdb




Teodor Sigaev wrote:

Could you provide a backtrace? Do you use unchanged norwegian.stop file?
I'm not able to reproduce the bug - postgres just works.

Tommy Gildseth wrote:
While trying to create a new dictionary for use with PostgreSQL text 
search, I get a segfault. My Postgres version is 8.3.5








--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth

I tried without specifying a StopWords-list as well, but same thing happens.


Teodor Sigaev wrote:

Could you provide a backtrace? Do you use unchanged norwegian.stop file?
I'm not able to reproduce the bug - postgres just works.

Tommy Gildseth wrote:
While trying to create a new dictionary for use with PostgreSQL text 
search, I get a segfault. My Postgres version is 8.3.5






--
Tommy Gildseth

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth
Yes, originaly I used a customized norwegian.stop-file, but I changed 
that to the one that comes with PostgreSQL, and got the same error.


How do I make a backtrace?

Teodor Sigaev wrote:

Could you provide a backtrace? Do you use unchanged norwegian.stop file?
I'm not able to reproduce the bug - postgres just works.

Tommy Gildseth wrote:
While trying to create a new dictionary for use with PostgreSQL text 
search, I get a segfault. My Postgres version is 8.3.5






--
Tommy Gildseth

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth

It works on one of my servers:
SELECT version();
 version
-
 PostgreSQL 8.3.4 on i686-pc-linux-gnu, compiled by GCC cc (GCC) 3.2.3 
20030502 (Red Hat Linux 3.2.3-59)


The one it fails on is running:
SELECT version();
version

 PostgreSQL 8.3.5 on x86_64-unknown-linux-gnu, compiled by GCC cc (GCC) 
3.2.3 20030502 (Red Hat Linux 3.2.3-59)


Oleg Bartunov wrote:

Tommy,

I tried your example and didn't find any problem.
My postgresql version is 8.3.3 and I didn't use stopwords, since I don't
have them.

arxiv=# select version();
version

 PostgreSQL 8.3.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.2.3
(1 row)

arxiv=# select ts_lexize('no_ispell', 
'overbuljongterningpakkmesterassistent');

  ts_lexize
--
 {over,buljong,terning,pakk,mester,assistent}
(1 row)

Time: 922.364 ms


Oleg

On Thu, 29 Jan 2009, Tommy Gildseth wrote:

While trying to create a new dictionary for use with PostgreSQL text 
search, I get a segfault. My Postgres version is 8.3.5


The dictionary I use is the "Norwegian Bokm?l & Nynorsk (Norway) pack 
for OOo 2.x" downloaded from 
http://wiki.services.openoffice.org/wiki/Dictionaries#Norwegian_.28Norway.29 



I've also uploaded the dictionary and affix-file used to this location:
http://folk.uio.no/tommygi/postgres/nb_no.dict
http://folk.uio.no/tommygi/postgres/nb_no.affix

These are unpacked from the zip-file I got from the above location, 
and converted to UTF-8 with the following commands:

 - iconv -f latin1 -t utf-8 nb_NO.dic > nb_no.dict
 - iconv -f latin1 -t utf-8 nb_NO.aff > nb_no.affix

The command I use is this:
CREATE TEXT SEARCH DICTIONARY no_ispell (
   TEMPLATE = ispell,
   DictFile = nb_no,
   AffFile =  nb_no,
   StopWords = norwegian
);


I've uploaded a strace of the process at 
http://folk.uio.no/tommygi/postgres/strace.txt
This is captured with strace -p. Since this is about the extent 
of my knowledge of strace, please let me know if there's any other 
options I should specify.


Relevant parts of the postgres log file is included below.

[2009-01-29 13:55:11.721 CET] [pgtest02] [:] [] [26466] [0] LOG:  
server process (PID 12002) was terminated by signal 11: Segmentation 
fault
[2009-01-29 13:55:11.721 CET] [pgtest02] [:] [] [26466] [0] LOG: 
terminating any other active server processes
[2009-01-29 13:55:11.725 CET] [pgtest02] [:] [] [26466] [0] LOG:  all 
server processes terminated; reinitializing
[2009-01-29 13:55:11.931 CET] [pgtest02] [:] [] [12018] [0] LOG: 
database system was interrupted; last known up at 2009-01-29 13:42:05 CET
[2009-01-29 13:55:11.931 CET] [pgtest02] [:] [] [12018] [0] LOG: 
database system was not properly shut down; automatic recovery in 
progress
[2009-01-29 13:55:11.934 CET] [pgtest02] [:] [] [12018] [0] LOG:  
record with zero length at 0/4A2F9A0
[2009-01-29 13:55:11.934 CET] [pgtest02] [:] [] [12018] [0] LOG:  redo 
is not required
[2009-01-29 13:55:11.986 CET] [pgtest02] [fotoportal:postgres] 
[192.168.6.137(49650)] [12019] [0] FATAL:  the database system is in 
recovery mode
[2009-01-29 13:55:11.988 CET] [pgtest02] [fotoportal:postgres] 
[192.168.6.137(49655)] [12020] [0] FATAL:  the database system is in 
recovery mode
[2009-01-29 13:55:12.010 CET] [pgtest02] [:] [] [12023] [0] LOG: 
autovacuum launcher started
[2009-01-29 13:55:12.011 CET] [pgtest02] [:] [] [26466] [0] LOG: 
database system is ready to accept connections






Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



--
Tommy Gildseth
DBA, Gruppe for databasedrift
Universitetet i Oslo, USIT
m: +47 45 86 38 50
t: +47 22 85 29 39

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Teodor Sigaev

Could you provide a backtrace? Do you use unchanged norwegian.stop file?
I'm not able to reproduce the bug - postgres just works.

Tommy Gildseth wrote:
While trying to create a new dictionary for use with PostgreSQL text 
search, I get a segfault. My Postgres version is 8.3.5



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Text search segmentation fault

2009-01-29 Thread Oleg Bartunov

Tommy,

I tried your example and didn't find any problem.
My postgresql version is 8.3.3 and I didn't use stopwords, since I don't
have them.

arxiv=# select version();
version

 PostgreSQL 8.3.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.2.3
(1 row)

arxiv=# select ts_lexize('no_ispell', 'overbuljongterningpakkmesterassistent');
  ts_lexize
--
 {over,buljong,terning,pakk,mester,assistent}
(1 row)

Time: 922.364 ms


Oleg

On Thu, 29 Jan 2009, Tommy Gildseth wrote:

While trying to create a new dictionary for use with PostgreSQL text search, 
I get a segfault. My Postgres version is 8.3.5


The dictionary I use is the "Norwegian Bokm?l & Nynorsk (Norway) pack for OOo 
2.x" downloaded from 
http://wiki.services.openoffice.org/wiki/Dictionaries#Norwegian_.28Norway.29


I've also uploaded the dictionary and affix-file used to this location:
http://folk.uio.no/tommygi/postgres/nb_no.dict
http://folk.uio.no/tommygi/postgres/nb_no.affix

These are unpacked from the zip-file I got from the above location, and 
converted to UTF-8 with the following commands:

 - iconv -f latin1 -t utf-8 nb_NO.dic > nb_no.dict
 - iconv -f latin1 -t utf-8 nb_NO.aff > nb_no.affix

The command I use is this:
CREATE TEXT SEARCH DICTIONARY no_ispell (
   TEMPLATE = ispell,
   DictFile = nb_no,
   AffFile =  nb_no,
   StopWords = norwegian
);


I've uploaded a strace of the process at 
http://folk.uio.no/tommygi/postgres/strace.txt
This is captured with strace -p. Since this is about the extent of my 
knowledge of strace, please let me know if there's any other options I should 
specify.


Relevant parts of the postgres log file is included below.

[2009-01-29 13:55:11.721 CET] [pgtest02] [:] [] [26466] [0] LOG:  server 
process (PID 12002) was terminated by signal 11: Segmentation fault
[2009-01-29 13:55:11.721 CET] [pgtest02] [:] [] [26466] [0] LOG: terminating 
any other active server processes
[2009-01-29 13:55:11.725 CET] [pgtest02] [:] [] [26466] [0] LOG:  all server 
processes terminated; reinitializing
[2009-01-29 13:55:11.931 CET] [pgtest02] [:] [] [12018] [0] LOG: database 
system was interrupted; last known up at 2009-01-29 13:42:05 CET
[2009-01-29 13:55:11.931 CET] [pgtest02] [:] [] [12018] [0] LOG: database 
system was not properly shut down; automatic recovery in progress
[2009-01-29 13:55:11.934 CET] [pgtest02] [:] [] [12018] [0] LOG:  record with 
zero length at 0/4A2F9A0
[2009-01-29 13:55:11.934 CET] [pgtest02] [:] [] [12018] [0] LOG:  redo is not 
required
[2009-01-29 13:55:11.986 CET] [pgtest02] [fotoportal:postgres] 
[192.168.6.137(49650)] [12019] [0] FATAL:  the database system is in recovery 
mode
[2009-01-29 13:55:11.988 CET] [pgtest02] [fotoportal:postgres] 
[192.168.6.137(49655)] [12020] [0] FATAL:  the database system is in recovery 
mode
[2009-01-29 13:55:12.010 CET] [pgtest02] [:] [] [12023] [0] LOG: autovacuum 
launcher started
[2009-01-29 13:55:12.011 CET] [pgtest02] [:] [] [26466] [0] LOG: database 
system is ready to accept connections






Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Text search segmentation fault

2009-01-29 Thread Tommy Gildseth
While trying to create a new dictionary for use with PostgreSQL text 
search, I get a segfault. My Postgres version is 8.3.5


The dictionary I use is the "Norwegian Bokmål & Nynorsk (Norway) pack 
for OOo 2.x" downloaded from 
http://wiki.services.openoffice.org/wiki/Dictionaries#Norwegian_.28Norway.29


I've also uploaded the dictionary and affix-file used to this location:
http://folk.uio.no/tommygi/postgres/nb_no.dict
http://folk.uio.no/tommygi/postgres/nb_no.affix

These are unpacked from the zip-file I got from the above location, and 
converted to UTF-8 with the following commands:

  - iconv -f latin1 -t utf-8 nb_NO.dic > nb_no.dict
  - iconv -f latin1 -t utf-8 nb_NO.aff > nb_no.affix

The command I use is this:
CREATE TEXT SEARCH DICTIONARY no_ispell (
TEMPLATE = ispell,
DictFile = nb_no,
AffFile =  nb_no,
StopWords = norwegian
);


I've uploaded a strace of the process at 
http://folk.uio.no/tommygi/postgres/strace.txt
This is captured with strace -p. Since this is about the extent of 
my knowledge of strace, please let me know if there's any other options 
I should specify.


Relevant parts of the postgres log file is included below.

[2009-01-29 13:55:11.721 CET] [pgtest02] [:] [] [26466] [0] LOG:  server 
process (PID 12002) was terminated by signal 11: Segmentation fault
[2009-01-29 13:55:11.721 CET] [pgtest02] [:] [] [26466] [0] LOG: 
terminating any other active server processes
[2009-01-29 13:55:11.725 CET] [pgtest02] [:] [] [26466] [0] LOG:  all 
server processes terminated; reinitializing
[2009-01-29 13:55:11.931 CET] [pgtest02] [:] [] [12018] [0] LOG: 
database system was interrupted; last known up at 2009-01-29 13:42:05 CET
[2009-01-29 13:55:11.931 CET] [pgtest02] [:] [] [12018] [0] LOG: 
database system was not properly shut down; automatic recovery in progress
[2009-01-29 13:55:11.934 CET] [pgtest02] [:] [] [12018] [0] LOG:  record 
with zero length at 0/4A2F9A0
[2009-01-29 13:55:11.934 CET] [pgtest02] [:] [] [12018] [0] LOG:  redo 
is not required
[2009-01-29 13:55:11.986 CET] [pgtest02] [fotoportal:postgres] 
[192.168.6.137(49650)] [12019] [0] FATAL:  the database system is in 
recovery mode
[2009-01-29 13:55:11.988 CET] [pgtest02] [fotoportal:postgres] 
[192.168.6.137(49655)] [12020] [0] FATAL:  the database system is in 
recovery mode
[2009-01-29 13:55:12.010 CET] [pgtest02] [:] [] [12023] [0] LOG: 
autovacuum launcher started
[2009-01-29 13:55:12.011 CET] [pgtest02] [:] [] [26466] [0] LOG: 
database system is ready to accept connections



--
Tommy Gildseth
DBA, Gruppe for databasedrift
Universitetet i Oslo, USIT
m: +47 45 86 38 50
t: +47 22 85 29 39

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general