[HACKERS] 7.0.3 reproduceable serious select error

2001-01-18 Thread Rob van Nieuwkerk

Hello,

I've selected postgresql 7.0.3 for our (critical) application and while
doing my first experiments I've found a bug which makes me worry very
much.

The problem is that a SELECT with a certain LIKE condition in combination
with a GROUP BY does not find the proper records when there is an index on
the particular column present.  When the index is removed the SELECT *does*
return the right answer.

Fortunately I managed to strip down our database and create a simple
single table with which the bug can be easily reproduced.

I've been searching in the Postgres bug-database and this problem
might be related to this report:

http://www.postgresql.org/bugs/bugs.php?4~111

Below you find a psql-session that demonstrates the bug.

I've made a dump of the test-database available as:

http://dutepp0.et.tudelft.nl/~robn/demo.dump.bz2

(it is 46100 bytes long in compressed form but 45 MB when uncompressed,
 I tried to trim it down but then the bug isn't reproducable anymore !)

The table is filled with all Spaces execpt for the "town" column.


Sysinfo:  

- well-maintained Linux Red Hat 6.2
- kernel 2.2.18
- Intel Pentium III
- postgresql-7.0.3-2 RPMs from the Postgresql site
  (the problem also occurs with locally rebuilt Source RPM)

Any help is much appreciated !

        Friendly greetings,
Rob van Nieuwkerk


psql session:
***
demo=> \d 
 List of relations
Name| Type  | Owner 
+---+---
 demo_table | table | robn
(1 row)

demo=> \d demo_table
   Table "demo_table"
 Attribute |   Type   | Modifier 
---+--+--
 postcode  | char(7)  |
 odd_even  | char(1)  |
 low   | char(5)  |
 high  | char(5)  |
 street| char(24) | 
 town  | char(24) |   
 area  | char(1)  |

demo=> \di
No relations found.
demo=> SELECT town FROM demo_table WHERE town  LIKE 'ZWO%' GROUP BY town;
   town   
--
 ZWOLLE
(1 row)

demo=> SELECT town FROM demo_table WHERE town  LIKE 'Z%' GROUP BY town;

<<<<<< here 86 towns are correctly found (output removed) >>>>>>

demo=> CREATE INDEX demo_table_town_idx ON demo_table(town);
CREATE
demo=> SELECT town FROM demo_table WHERE town  LIKE 'Z%' GROUP BY town;
 town
--
(0 rows)
<<<<<< This is wrong !! >>>>>>>

demo=> SELECT town FROM demo_table WHERE town  LIKE 'ZWO%' GROUP BY town;
   town
--
 ZWOLLE
(1 row)

demo=> DROP INDEX demo_table_town_idx;
DROP
demo=> SELECT town FROM demo_table WHERE town  LIKE 'Z%' GROUP BY town;

<<<<<< here 86 towns are correctly found again >>>>>>
***



Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-18 Thread Rob van Nieuwkerk

Tom Lane wrote:
> 
> [EMAIL PROTECTED] (Rob van Nieuwkerk) writes:
> > The problem is that a SELECT with a certain LIKE condition in combination
> > with a GROUP BY does not find the proper records when there is an index on
> > the particular column present.  When the index is removed the SELECT *does*
> > return the right answer.
> 
> Are you running the postmaster in a non-ASCII locale?  This sounds like
> the old LIKE index optimization problem that we've struggled with for
> quite a while now.  7.1 works around it by disabling the optimization
> in non-ASCII locales, which is unpleasant but at least it gives right
> answers ...

Hi Tom,

I don't think I'm running postmaster in a non-ASCII locale.
At least I did not explicitly do anything to accomplish it.
I'm running with the default settings from the RPMs and didn't
change any default setting.

I peeked in some manual pages but couldn't find info quickly about
this setting.  Please tell me how to check it if you want to know !

Thank you for your reaction.

greetings,
Rob van Nieuwkerk



[HACKERS] Re: 7.0.3 reproduceable serious select error

2001-01-18 Thread Rob van Nieuwkerk


Hi Mark,

I just checked: the "demo.dump" file does not contain any characters
above 0x7F; it's just plain ASCII.  So that can't be the reason.

    greetings,
    Rob van Nieuwkerk


> Rob van Nieuwkerk wrote:

Ehm .., *you* wrote this ! :-)

> I tried to reproduce this bug on 7.0.2 and 7.0.3 with both 8K and 32K block
> sizes, and could not reproduce the error.
> 
> I am running RedHat 6.2 kernel 2.2.16.
> 
> I don't know enough to even be close, but I wonder if there are any subtle
> differences between the way characters are treated for indexes vs the way they
> are treated for table scans? If there are even slight differences in the way
> this happens, a misinterpretation of ascii conversions for instance, (I am
> assuming you may be using ascii characters above 0x7F), it could behave
> something like this, and explain why I wouldn't see it. .Like I said, however,
> I don't know  much so don't read too much into what I say.


> > Hello,
> >
> > I've selected postgresql 7.0.3 for our (critical) application and while
> > doing my first experiments I've found a bug which makes me worry very
> > much.
> >
> > The problem is that a SELECT with a certain LIKE condition in combination
> > with a GROUP BY does not find the proper records when there is an index on
> > the particular column present.  When the index is removed the SELECT *does*
> > return the right answer.
> >
> > Fortunately I managed to strip down our database and create a simple
> > single table with which the bug can be easily reproduced.
> >
> > I've been searching in the Postgres bug-database and this problem
> > might be related to this report:
> >
> > http://www.postgresql.org/bugs/bugs.php?4~111
> >
> > Below you find a psql-session that demonstrates the bug.
> >
> > I've made a dump of the test-database available as:
> >
> > http://dutepp0.et.tudelft.nl/~robn/demo.dump.bz2
> >
> > (it is 46100 bytes long in compressed form but 45 MB when uncompressed,
> >  I tried to trim it down but then the bug isn't reproducable anymore !)
> >
> > The table is filled with all Spaces execpt for the "town" column.
> >
> > Sysinfo:
> > 
> > - well-maintained Linux Red Hat 6.2
> > - kernel 2.2.18
> > - Intel Pentium III
> > - postgresql-7.0.3-2 RPMs from the Postgresql site
> >   (the problem also occurs with locally rebuilt Source RPM)
> >
> > Any help is much appreciated !
> >
> > Friendly greetings,
> > Rob van Nieuwkerk
> >
> > psql session:
> > ***
> > demo=> \d
> >  List of relations
> > Name| Type  | Owner
> > +---+---
> >  demo_table | table | robn
> > (1 row)
> >
> > demo=> \d demo_table
> >Table "demo_table"
> >  Attribute |   Type   | Modifier
> > ---+--+--
> >  postcode  | char(7)  |
> >  odd_even  | char(1)  |
> >  low   | char(5)  |
> >  high  | char(5)  |
> >  street| char(24) |
> >  town  | char(24) |
> >  area  | char(1)  |
> >
> > demo=> \di
> > No relations found.
> > demo=> SELECT town FROM demo_table WHERE town  LIKE 'ZWO%' GROUP BY town;
> >town
> > --
> >  ZWOLLE
> > (1 row)
> >
> > demo=> SELECT town FROM demo_table WHERE town  LIKE 'Z%' GROUP BY town;
> >
> > <<<<<< here 86 towns are correctly found (output removed) >>>>>>
> >
> > demo=> CREATE INDEX demo_table_town_idx ON demo_table(town);
> > CREATE
> > demo=> SELECT town FROM demo_table WHERE town  LIKE 'Z%' GROUP BY town;
> >  town
> > --
> > (0 rows)
> > <<<<<< This is wrong !! >>>>>>>
> >
> > demo=> SELECT town FROM demo_table WHERE town  LIKE 'ZWO%' GROUP BY town;
> >town
> > --
> >  ZWOLLE
> > (1 row)
> >
> > demo=> DROP INDEX demo_table_town_idx;
> > DROP
> > demo=> SELECT town FROM demo_table WHERE town  LIKE 'Z%' GROUP BY town;
> >
> > <<<<<< here 86 towns are correctly found again >>>>>>
> > ***
> 





Re: [HACKERS] 7.0.3 reproduceable serious select error

2001-01-18 Thread Rob van Nieuwkerk


Tom Lane wrote:

> > I don't think I'm running postmaster in a non-ASCII locale.
> > At least I did not explicitly do anything to accomplish it.
> 
> Did you have LANG, LOCALE, or any of the LC_xxx family of
> environment variables set when you started the postmaster?
> Some Linux distros tend to set those in system profile scripts ...

Checking whith ps and looking in /proc reveiled that postmaster indeed
had LANG set to "en_US" in its environment.  I disabled the system script
that makes this setting, restarted postgres/postmaster and reran my tests.

The problem query returns the *right* answer now !
Turning LANG=en_US back on gives the old buggy behaviour.

I know very little about this LANG, LOCALE etc. stuff.
But for our application it is very important to support "weird" characters
like "éõåÊ ..." etc. for names.  Basically we need all letter symbols
in ISO-8859-1 (Latin 1).  A quick experiment shows that without the
LANG setting I can still insert & select strings containing these
symbols.

Do I lose any postgresql functionality by just getting rid of the LANG
environment variable ?  Will I be able to use full ISO-8859-1 in table
fields without problems ?

Please tell if you want me to do any other tests !

greetings,
Rob van Nieuwkerk



Re: [HACKERS] like and optimization

2001-01-20 Thread Rob van Nieuwkerk

On Sun, 21 Jan 2001 00:25:17 + (UTC), Tom Lane <[EMAIL PROTECTED]> wrote:
>Juriy Goloveshkin <[EMAIL PROTECTED]> writes:
>> Hello, I didn't know pgsql-sources close,
>> so I wrote this code just as example of idea.
>> Can somebody review and make patch for pgsql?
>
>AFAICT this only deals with the issue of single-byte characters that
>sort in an order different from their numeric order.  The existing
>make_greater_string() code already deals with that case.  Where it
>falls down is cases where sorting is context-dependent (multi-pass
>sort rules, digraphs, things like that).  But I don't see anything
>here that would make such cases work.
>
>If you're trying to tell us that the 7.0.* code works correctly for
>KOI8-R locale, we'd be glad to re-enable LIKE optimization for that
>locale ...

Hello,

I have no knowledge of postgres internals at all (yet !), and I'm not
quite sure what this thread is exactly about.

But if anybody thinks that selects with LIKE on indexed columns with
single-byte non-ASCII characters are working OK: they are not !!  See my
posting and following thread "7.0.3 reproduceable serious select error"
from a couple of days ago.

I made a reproduceable example of things going wrong with a "en_US"
locale which is the widely-used (single-byte) ISO-8859-1 Latin 1 charset.

Please excuse me if this has nothing to do with what you are talking
about.  I'm just very eager to get rid of this (for our application)
extremely nasty bug !

friendly greetings,
Rob van Nieuwkerk



[HACKERS] pg_dump consistent on live database with 7.0.3 ?

2001-03-11 Thread Rob van Nieuwkerk

Hello,

We have built a database application build around PostgreSQL v7.03.  
And we're very happy with PostgreSQL functionality & performance.

In a live situation 50-100 clients are inserting and updating database
records.  It is very important that we have all the data available on
the back-up servers in case the main server fails.

After looking briefly at the the RServ 0.1 replication software we decided
it was not an option (for now) (it had serious problems in our tests,
there is no documentation available and inquiries about it to PosgreSQL,
Inc. remain unanswered unfortunately).

So we decided to use a compromise and do frequent pg_dump runs from the
backup machines to the main server.  It all seems to work fine.

But I still have some worries: is it OK to run pg_dump on *live* databases ?
Do you get consistent dumps from it when there is inserting and updating
going on ? 

In the v7.0.3 documentation I can't find any info on this issue.
But in the v7.1 docu:

http://www.de.postgresql.org/devel-corner/docs/admin/backup.html

I read:

   "Dumps created by pg_dump are internally consistent, that is, updates
to the database while pg_dump is running will not be in the dump."

That looks great.  But we are using v7.03.  Does this mean that the v7.1
documentation is just more detailed and that the same applies to v7.03 ?
Or is this a feature that is available only with v7.1 but not with v7.03 ?

Any help is much appreciated !

Friendly greetings,
    Rob van Nieuwkerk

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html