[HACKERS] string freeze

2002-10-28 Thread Dennis Björklund
Is there a planned string freeze for postgresql?

There was a (good) change some weeks ago that altered lots of witespace.
Remember that it's enough to change a single space to make that
translation of that string not work. To do "psql --help" and get half the
entries in swedish and half in english would be ugly. I sent an update a
couple of days ago that fixes the swedish translation. I'm not sure about
the status of the other languages. Now I'm just planning for the release
and to prevent us from getting into a situation where strings are altered
the day before the release or something.

There have been a lot work with the translations and would be nice if it
works in the release.

-- 
/Dennis


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Implicit coercions, choosing types for constants, etc

2002-11-11 Thread Dennis Björklund
On Sun, 10 Nov 2002, Tom Lane wrote:

> In the last episode of this long-running issue, we decided that life
> would be better if we make small-enough integer constants be initially
> typed as int2 rather than int4, and then auto-promote them up to int4
> when necessary.

What kind of type system do postgresql (or SQL in general) use? I don't 
know much about the postgresql codebase but I do know something about type 
inference of functional languages.

It seems to me that one would like to have a polymorphic typesystem with 
constraints to handle overloading, subtyping or whatever is needed in 
SQL.

>  This would solve problems with, for example,
>   SELECT ... WHERE smallintcol = 42
> not using an index.

Using a suitable typesystem 42 could be said to have a type like

isInt z => z

which should be read that z is the type and isInt z is a constraint on z
saying that z is an integer type (that means for example that z can never
be instantiated to Bool). Then the use of smallintcol = 42 where
smallintcol has type int2 and where equality = is overloaded for types
Int2 -> Int2 -> Bool, Int4 -> Int4 -> Bool (and so on) would force 42 to
be the type Int2 in this case, since the first argument of = had type
Int2.

Is there anywhere I can read about the typesystem in SQL in general and 
postgresql in particular?

There are a number of type systems in the functional world with properties 
like this. Some very advanced and some simpler. I have a feeling from 
reading this list that the type inference in postgresql as a bit ad hook. 
But i've not read the source at all so it might be unfair to say such a 
thing.

In the functional language Haskell there is also a defaulting rule that is
used if you end up with constants like 42 still with type isInt z => z. If
the type was left like this it just imply that any numeric type for 42
would do. In this case maybe z is defaulted to Int4. In most cases they
way you use the constant would force it to be a particular type.

Of the examples you gave in the letter I don't see anything that shouldn't 
work with a more advanced typesystem like this. But I'm sure there are 
other strange constructs in postgresql that I don't know about.

I would love to make some small toy implementation to try out things but
in the nearest future I don't have time for that. But this interests me so
maybe I'll give it a try some day (like next summer :-). At least I would
like to know more about how it works in postgresql today. It's possible
that what I talk about do not apply to SQL or that postgresql already
implements similar things.

-- 
/Dennis


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] 7.4 Wishlist

2002-12-03 Thread Dennis Björklund
On Tue, 3 Dec 2002, Magnus Naeslund(f) wrote:

> Now convert this query so that it only evaluates the date_part thing
> ONCE:
> 
> select t.id, date_part('days',now()-t.stamp) from table_name t where
> date_part('days',now()-t.stamp) > 20;

Something like this could work:

select *
  from (select t.id, date_part('days',now()-t.stamp) AS d
  from table_name t) AS t1
 where t1.d > 20;

That aside I also would like some sort of local names. Something like the
let construct used in many functional languages (not exaclty what you want
above, but still):

let t1 = select * from foo;
t2 = select * from bar;
in select * from t1 natural join t2;

But even though I would like to give name to subexpressions like above, I
still think postgresql should stick to standards as close as possible.

-- 
/Dennis


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] 7.4 Wishlist

2002-12-03 Thread Dennis Björklund
On 3 Dec 2002, Hannu Krosing wrote:

> the standard way of doing it would be SQL99's WITH :

Great! I havn't looked too much at sql99 yet so I've missed this. It's 
exactly what I want. Now I know what I will use in the future (when it's 
all implemented).

-- 
/Dennis


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] PostgreSQL Password Cracker

2003-01-02 Thread Dennis Björklund
On Fri, 3 Jan 2003, Justin Clift wrote:

> Very Cool.  The URL for the .pgpass stuff is:
> 
> http://developer.postgresql.org/docs/postgres/libpq-files.html

There is a typo on that page. First it talkes about the file .pgpass and 
then it says: "chmod 0600 .pgaccess".

I had no idea that one could store the passwords like this. This feature
is something I'm going to use from now on (now that I know about it).

-- 
/Dennis


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



[HACKERS] translation stats

2003-02-18 Thread Dennis Björklund
What is the translation stats on

  http://webmail.postgresql.org/~petere/nls.php

based on? I've not updated my translation in a long time (since 7.3.0)  
and it's still at 100% (except the big file that wasn't 100% before).  
Seems strange that there havn't been any new or changed strings since
then. Not that I'm complaining, no changes is nice :-)

-- 
/Dennis


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



[HACKERS] website charset

2003-03-07 Thread Dennis Björklund
There is no declaration of charset in the main webpage. Something like



would be nice. What is worse is that there are several charsets used. In
the International part I have to set mozilla to use utf-8 for it to be
rendered correctly and in the Event part I have to tell mozilla to use
iso-8859-1. There is not one setting that can show the whole page.

There really should be only one charset on the page, and it should be
specified so that the browser can show it right and not have to guess.

-- 
/Dennis


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly


[HACKERS] Character encoding

2003-06-09 Thread Dennis Björklund
I've been playing with character encodings and found a problem/bug. I
still use 7.3.2, so it's possible (but I think not) that some of this have
been fixed.

When you run psql with a different language then english the strings are
usually in a character set that is not pure ascii. For example to
represent swedish you need either latin1 or unicode. Therefor the po file
for swedish is in latin1.

Now, these strings are used to create queries that are sent to postgres. 
For example if you perform \d in a swedish psql you get

# \d
 Lista med relationer
 Schema |   Namn|   Typ   | Ägare
+---+-+
 public | boz   | tabell  | dennis
 public | boz_a_seq | sekvens | dennis

where Owner is translated to "Ägare". The problem is now if
the database uses utf-8. Then psql still seems to create queries with
latin1 characters in them which is invalid in utf-8. So I get this:

# \d
ERROR:  Invalid UNICODE character sequence found (0xe47273)

It has to be translated to utf-8 before it's sent to the backend.

Actually, in the example above it's not the string "Ägare" that gives the
error message but the value that maps relkind 's' to 'särskild' in
swedish. Seems like column names and column values are treated different 

My guess is that the backend don't care what the column name is and
just sends it back. Which is broken if there are different character 
encodings at play.

I have also another problem with character sets. I have a unicode
database, and when I set the client encoding to unicode I get nice utf-8
strings back. However, my terminal can not show them so when I run psql I
get strings like "armbåge" (which is what a utf-8 string looks like in
latin1). My client program written using libpq works fine and I get good
utf-8 back.

However, I tried to set the client encoding in psql to latin1 so that it
would show the strings correctly. Then the string above really should be
showed as "armbåge", but it is showed as "armbge".

It should work fine since I know that my strings really are latin1 strings
even when represented as utf-8. Also, the manual says that it should work
for also characters where there is no conversion, it should then become
the hexdecimal value in parentheses.

-- 
/Dennis


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Character encoding

2003-06-10 Thread Dennis Björklund
On Tue, 10 Jun 2003, Peter Eisentraut wrote:

> we should try to keep the (translated) column headers within the client,
> to side-step this issue.  Do you want to investigate that?

That is the obvious solution, there is no real need to send the strings to
the server in the first place.

The problem is not just with column headers though as it's also used in
the data in the table. (which of course has the same solution, examin the
data when it get to the client and substitute for the translated string).

I'll take a look at it and probably fix it by the weekend (if not sooner).

-- 
/Dennis


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


[HACKERS] Alter strings that don't belong to the application

2003-06-11 Thread Dennis Björklund
I've been looking into the code of psql to fix a problem with charsets and
noticed that psql changes the strings it gets back from functions like
PQfname() and PQgetvalue().

I can't find in the docs anything that says that it is okay to alter the
returned string. It works since it's allocated and stored in the PGresult
(I assume), so it's local to one result. But it's not obvious to me just
looking at the API that it's allowed.

Or should I read that the return value is "char *" and not "const char *" 
as it's okay to alter it?

-- 
/Dennis


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Alter strings that don't belong to the application

2003-06-12 Thread Dennis Björklund
On Thu, 12 Jun 2003, Tom Lane wrote:

> I'd call that a bug in psql.  Where do you see it happening exactly?

It's the utf-8 validation function (mbvalidate) that removes characters 
that it does not understand.

> No, I think the return value ought to be treated as const char *.
> We're hesitant to actually declare it that way because we'd cause compile
> errors in many client applications that aren't being const-paranoid.

An easy solution is to add the const as a macro that can be turned off by 
those apps.

ps. I'm being blacklisted by your email server. The reason I get back is
that my letter come from a telia IP-number. Telia is the major ISP in
Sweden with something like 80-90% of the market. I actually don't belong
to those since I use another ISP (bostream), but I guess they got their
adresses from telia or something. Well, not really a problem. I send
everything to the list anyway.

-- 
/Dennis


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Alter strings that don't belong to the application

2003-06-12 Thread Dennis Björklund
On Thu, 12 Jun 2003, Tom Lane wrote:

> I think the string ought to be copied first ... although that might
> create memory-leak problems.  Could you take a look at fixing this,
> while you're in the area?

Sure.

> I can whitelist you if you have a stable IP address --- is that a static
> or dynamic assignment?

It's static and I've had it for years.

-- 
/Dennis


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly


[HACKERS] sql_ascii

2003-06-12 Thread Dennis Björklund
Why do we have SQL_ASCII?

I could understand it if we only could store 7-bit strings there. But 
SQL_ASCII lets you store 8-bit values. Should I understand SQL_ASCII 
simply as 8-bit strings of unknown charset?

In the first database I created I used it to store latin1 strings, and
that was a mistake. I'm sure others have made the same mistake.

I've looked at the code and as far as I can tell postgres converts ascii
to utf-8 using a function in conv.c called pg_ascii2mic() which simply
strips off the upper bit. Still when I have a SQL_ASCII database and a
table with strings like "Björklund" (notice the ö which is not 7-bit
ascii) and change client encoding to UTF-8 I still get 8-bit values out
(yes, it's stange, that's why I ask).

I know how to fix my database, no problem. I'm just interested in why I
get 8-bit values out when I've set the client encoding to utf-8. I guess 
my problem is the same as above, I don't know what sql_ascii really is.

I don't need to know these things to fix the small problems in psql (which
I'll do in the weekend), but I figured that it's just as good to
understand all of it an anyway.

-- 
/Dennis


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[HACKERS] client encodings

2003-06-15 Thread Dennis Björklund
I've fixed the problems in psql that was there before:

 * psql alters the strings in a PQresult
 * psql sends non validating strings to server

This is however not the solution to the general problem with client 
encodings. When you normally run psql in a terminal, the encoding used by 
that terminal is the only reasonable encoding one can use. However, if you 
redirect the output you very well might want to produce a utf-8 file even 
if the terminal does not suppert it. So it could be usable to change the 
client encoding in psql.

However, if you want to produce a utf-8 file, how should that work with 
respect to gettext()? If the message catalog is in latin1 then we need to 
know that and convert that into utf-8.

The easiest way as I see it is to demand that all po files are stored in
utf-8 and then you can convert that into whatever client encoding you have
set in psql. Of course you can't make that translation lossfree in
general, but if you have a language that demands some characters that
don't exist in the target charset you have lost anyway. The best you can
do is to convert it to something similar (or even just through it away).

To store all po files as utf-8 is not a big problem. The translator can
very well still work using some other charset and then you use iconv to
convert it before checking it in. As long as you don't change that file
(and use other characters) the translator can later on use iconv again to
get it back to his charset. The good thing about this is that psql knows
what charset all strings are in and can convert when needed.

Would it be acceptable to have all po-files as utf-8?

-- 
/Dennis


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] confused with encodings

2003-06-16 Thread Dennis Björklund
On Mon, 16 Jun 2003, Oleg Bartunov wrote:

> I thought I could specify different encodings
> for different databases and these encodings will be used in text operations
> (sort, upper,lower), not just for conversion.

En encoding does not imply any sort order. UTF-8 can be used to store 
strings in many languages, each having different sort order (and other 
properties). It's the locale that determines these things.

It would be nice to be able to set the locale per database, or even per 
column.

-- 
/Dennis


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] client encodings

2003-06-17 Thread Dennis Björklund
On Mon, 16 Jun 2003, Peter Eisentraut wrote:

> > However, if you want to produce a utf-8 file, how should that work with
> > respect to gettext()? If the message catalog is in latin1 then we need to
> > know that and convert that into utf-8.
> 
> I don't think all gettext implementations support automatic character set
> conversion.

I agree. They don't.

>  We might have to roll our own sometime

That was why I asked if we could simply have all message catalogs as 
utf-8, then we know what charset the strings are in and can easily convert 
it to whatever we have set our client encoding to.

> but for now it's not an option.
 
What has to be decided is if we are going to generate output that is only 
in the client encoding or not. If you just output the strings in the 
message catalog then we will not produce validating output. Then the best 
thing we can do is simply to take the message catalog string and discard 
everything that does not work in the current client encoding.

-- 
/Dennis


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Feature freeze and the great elog changeover

2003-06-25 Thread Dennis Björklund
On Wed, 25 Jun 2003, Paul Ramsey wrote:

> Ignorance on my part, probably. You mentioned elog() so I grepped for it 
> and found lots of this stuff:
> 
> elog(FATAL, "data directory %s was not found", checkdir)
> elog(FATAL, "could not read permissions of directory %s: %m", 
>checkdir);
> 
> I am probably just misunderstanding something.

It's taken care of by the gettext system.

One thing that I would like to see in the future (but probably wont for
many years still) is a log file that is saved in a format like

"data directory %s was not found", "/usr/"

where you can view the log with some (gui och command-line) tool that
translates it when you look at it. Then anyone can look at the log using
their language and you don't have to decide one one language when you 
start the server.

Maybe something for postgresql X.

-- 
/Dennis


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: CVS tip compile failure (was Re: [HACKERS] Missing array support)

2003-06-28 Thread Dennis Björklund
On Sat, 28 Jun 2003, Joe Conway wrote:

> > Do I need to do something to get new language files?
> 
> causing me problems. Did a "cvs add" get missed somewhere, or am I doing 
> something wrong?

Yes, a couple of cvs add was forgotten.

Peter made an update with the comment "Merge PO file updates from 7.3
branch.". I checked out a new copy with tag REL7_3_2 and there are the
missing files (at least the one I checked, but probably the rest also).

-- 
/Dennis


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] FROM clause omitted

2003-07-16 Thread Dennis Björklund
I had a bug in one of my queries that wasn't detected by pg because if 
filled in the from clause by itself. Take for example a querie like

select foo.a;

which I guess is transformed to

select foo.a
from foo;

Is this really a good thing to do? Is it part of the standard? Can it be
turned of? In my case it hid a bug and made my query work but produce the
wrong result.

Isn't this yet another case of "helpful" parsing that will only hurt in
the end? Look at how hard it is to parse html-pages because all browsers
accept broken code, but different broken code.

What about an example like this (the transformed code above but with alias
x added):

select foo.a
from foo x;

By adding the alias x the query still workes but gives a different result.

-- 
/Dennis


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [ADMIN] concat_ws

2003-08-03 Thread Dennis Björklund
On Sun, 3 Aug 2003, Tom Lane wrote:

> I added code to inline_function to stop inlining if a parameter
> expression to be substituted multiple times has cost greater than
> 10*cpu_operator_cost (which roughly means that it contains more than
> 10 operators or functions).

When is this inlining taking place and what is the logic? I just want to
make sure that there is no code in pg that will unfold forever, say for
example for a recursive fac() function. From the above it sounds like that
might be a problem.

-- 
/Dennis


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


[HACKERS] cvs branch to use?

2003-08-14 Thread Dennis Björklund
I'm looking over the last bits of translations of pg. Is it to late to 
update the translations for 7.4? Should I work on the cvs head version or 
is there some branching going on?

-- 
/Dennis


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Date input changed in 7.4 ?

2003-08-28 Thread Dennis Björklund
On Wed, 27 Aug 2003, Kevin Brown wrote:

> There are some cases where it's extremely useful for PostgreSQL to
> accept dates of any format it knows about (ambiguities should be
> resolved either by looking at the current DateStyle or, failing that, by
> applying the recognition in a well-defined order

And the argument bhen this was that it only leads to wrong data. As I see
it, the only time you have dates in different styles is when you get it
from a human entering dates. Then he/she will enter 01/30/03 and it is
interpreted as 2003 January 30, he/she feels happy and enters another date
in january, say 01/10/03 and now maybe it is interpreted as 2003 October
1. Of course that error is not noticed since it worked the previous time..

Even when the dates are generated by a program one should set the
datertyle to match what the program outputs, otherwise one are in trouble
anyway. If the program generate 01/10/03 pg must know what it means and
can not just guess.

I think it is a great change. Having the database guess what you mean
should at least not be the default. Having GuessDates as a variable could
be useful and I thought that was the decision back then (what the
variable was called I don't remember).

-- 
/Dennis


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] ALTER TABLE ... TO ... to change related names

2003-08-30 Thread Dennis Björklund
On Sat, 30 Aug 2003, Tom Lane wrote:

> It'd probably be reasonable to rename only those sequences that are
> connected to the target table/column by internal dependencies --- this
> indicates that they were created by a SERIAL column definition and not
> by manual operations.

I don't understand why the serial columns sequence should be visible as
other sequences. As a user (and not thinking of how it would be
implemented) I think it's much more logical if the serial column sequence
is hidden in the namespace of the table in some way (there is no such
namespace now I guess). Anyway, so that you can use it like this:

create table foo (x serial);
select nextval('foo.x');
select * from foo.x;

This also solves the problem to know what the sequence name is which you
have to know what to use in currval() and such.

Renaming a sequence in this setting is the same as renaming the column. Of 
someone tries to use nextval('foo.x') somewhere else and then rename the 
column they would expect that the nextval above would not work any more.

Also, just to make it clear, I think the notation foo.x should only work
when it is a serial column. If someone has created the sequence explictly
(visible outside the table) and have given it a name, then that is the
name to use. The database should not try to figure out that sequence from
the default or anything like that.

-- 
/Dennis


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] ALTER TABLE ... TO ... to change related names

2003-08-30 Thread Dennis Björklund
On Sat, 30 Aug 2003, Tom Lane wrote:

> > I don't understand why the serial columns sequence should be visible as
> > other sequences.
> 
> Backwards compatibility, if nothing else.  Are you prepared to break
> every existing dump file that has
>   CREATE TABLE ser (f1 serial);
>   SELECT pg_catalog.setval('ser_f1_seq', 1, false);

I think breaking backward compability not always bad. If

a) it the change is sound and good for the long term
b) there is a solution that can be included for a number
   of revisions and where you preferably can turn on/off
   the feature with a variable.

The question is if the change is important enough to warrent such a 
breakage. Logically I think it's preferable.

> > create table foo (x serial);
> > select nextval('foo.x');
> 
> This conflicts with the existing provisions for accessing sequences
> using ordinary schema-qualified names ('schema.sequence').

That can of course be fixed by using some other selector then . (dot)  
like nextval('foo#x') or one can keep the dot and resolve it any way. But
I'm not trying to come up with the best syntax here. I was more interested
in the namespace change, to keep serial sequence names out of the "global"
namespace.

The exact syntax is the last problem. The backward compability is a much
bigger issue (with 7.3 since older dumps has no problem).
 
> The work I would actually like to see getting done in this area is
> the existing TODO item about using Oracle-compatible syntax for nextval
> et al, namely that you can write
>   sequence.nextval
> or
>   schema.sequence.nextval
> rather than nextval('sequence') or nextval('schema.sequence').

It's a separate issue that I also would welcome.

-- 
/Dennis


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])