[GENERAL] v8.3 + UTF8 errors when restoring DB

2008-03-09 Thread Mitchell D. Russell
New to the list, so please forgive me in advance :) 


I've been running 8.2 on Windows Server 2003 for quite some time now. The
database that I take care of stores records in various languages
(Russian, Chinese, etc.) and has been working fine.

I did a pg_dump of the database tonight and went to restore it to my newly
created database (WIN1252 encoding), which worked fine, but when I go to
retrieve the data via my code (.NET C#), I get errors like:

character 0x8f of encoding WIN1252 has no equivalent in UTF8
character 0x81 of encoding WIN1252 has no equivalent in UTF8
character 0x9d of encoding WIN1252 has no equivalent in UTF8

The database driver is set to Unicode mode, so I turned that off, but of
course I end up with gibberish instead of the Russian and Chinese that were
there before.

So then I tried creating a new database with Unicode (UTF8) encoding and
restoring my 8.2 dump file to that; when I do, I get these errors
(using psql):

ERROR: invalid byte sequence for encoding UTF8: 0x92
HINT: This error can also happen if the byte sequence does not match the
encoding expected by the server, which is controlled by client_encoding.

CONTEXT: COPY cin_tbl_data, line 6
ERROR: invalid byte sequence for encoding UTF8: 0x99
HINT: This error can also happen if the byte sequence does not match the
encoding expected by the server, which is controlled by client_encoding.

So I just don't know what to do. My 8.2 database used SET client_encoding =
'SQL_ASCII'; according to the dump file, so I can't wrap my head around why
I can't get this data to come out.

Any help would be very much appreciated.



Re: [GENERAL] v8.3 + UTF8 errors when restoring DB

2008-03-09 Thread Mitchell D. Russell
Just as an update to this matter: I created another new database with
SQL_ASCII encoding and imported the data, and it worked fine. However, I
don't think that's the best solution, because SQL_ASCII isn't a good
encoding to leave the database set to, right?

 

Thanks again for any suggestions.

 

Mitch

 

 

From: Mitchell D. Russell [mailto:[EMAIL PROTECTED] 
Sent: Sunday, March 09, 2008 5:46 AM
To: 'pgsql-general@postgresql.org'
Subject: v8.3 + UTF8 errors when restoring DB

 




Re: [GENERAL] v8.3 + UTF8 errors when restoring DB

2008-03-09 Thread Martijn van Oosterhout
On Sun, Mar 09, 2008 at 06:12:14AM -0400, Mitchell D. Russell wrote:
 Just as an update to this matter, I created another new database with
 SQL_ASCII encoding and imported the data, and it worked fine, however, I
 don't think that's the best solution because SQL_ASCII's not a good way to
 have the database encoding set, right?

SQL_ASCII means no encoding. Everything becomes a bunch of bytes and
there's no checking whatsoever.
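To make the distinction concrete, here is a minimal psql sketch (the database name is a placeholder):

```sql
-- Create the target database with a real encoding instead of SQL_ASCII:
CREATE DATABASE mydb ENCODING 'UTF8';

-- The server-side encoding is fixed at database creation time:
SHOW server_encoding;

-- The client-side encoding can be changed per session; it tells the
-- server how to interpret the bytes the client sends:
SET client_encoding = 'WIN1252';
```

With a real server encoding, the server validates and converts incoming bytes; with SQL_ASCII it stores whatever bytes arrive, unchecked.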

 I did a PG_dump of the database tonight, and went to restore it to my newly
 created database (WIN1252 encoding), which worked fine, but when I go to
 retrieve the data via my code (.NET c#), I get errors like :

Any particular reason you're using that encoding? Why not just use UTF-8?

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [GENERAL] v8.3 + UTF8 errors when restoring DB

2008-03-09 Thread Dean Gibson (DB Administrator)

On 2008-03-09 01:45, Mitchell D. Russell wrote:





Well, you partially hosed yourself when you did the dump using a 
'client_encoding' of 'SQL_ASCII'.  If the 8.2 database is still around, 
redo pg_dump using the '-E' parameter to set the client_encoding to 
UTF-8, then import the data into your 8.3 database created with a UTF-8 
encoding.


If you don't have the 8.2 server around any more, it's a bit more 
complicated.  How did you do the pg_dump (what format)?


--
Mail to my list address MUST be sent via the mailing list.
All other mail to my list address will bounce.



Re: [GENERAL] ER Diagram design tools (Linux)

2008-03-09 Thread David Fetter
On Sat, Mar 08, 2008 at 10:49:02AM -0800, Colin Fox wrote:
 David Fetter wrote:
  On Fri, Mar 07, 2008 at 01:22:49PM -0800, Colin Fox wrote:

  I've created a pg_foundry project for this.
 
  Assuming the project gets approved, I'll post the link here.
  
 
  Wouldn't it be better just to send that XSLT to the upstream
  project?
 
  Cheers, David.

 I don't know which project you mean. The postgresql project?

Actually, I meant the xml2ddl project. http://xml2ddl.berlios.de/

 I have a couple of files that go along with it - instructions, some
 documentation, examples, etc.
 
 I'm certainly more than willing to provide this to the PG team, if
 they're interested.
 
 It just seems that the pg_foundry is a logical place to put all the
 project information.

See above :)

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] loading a psql input file on win xp

2008-03-09 Thread Reid Thompson

akshay bhat wrote:

Hello,
I have a psql file which is to be used for loading a database.

It was downloaded from this link: http://conceptnet.media.mit.edu/
The description says:
The ConceptNet 3 database, as a PostgreSQL input file. You will need to
be running a PostgreSQL server to install ConceptNet 3.

I was earlier suggested to use the following command to add it to a database:

psql -d your_database_name -f conceptnet-2007-09-25.psql

However, I do not know where to execute this command. I am using
Windows XP with PostgreSQL installed in

D:\Program Files\PostgreSQL\8.3\

and the file is located in the folder

D:\conc\

Now, can someone please tell me where to execute the above command?
Should I use DOS (I mean the command prompt)?

Please help, and sorry for my foolish question.

--
akshay uday bhat.
t.y.c.e.
department of chemical engineering
university institute of chemical technology
mumbai India

On action alone be thy interest,
Never on its fruits.
Let not the fruits of action be thy motive,
Nor be thy attachment to inaction.

Ask and it shall be given to you; seek and you shall find; knock and it 
shall be opened to you


you'll also probably have to add the parameter -h 127.0.0.1
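Putting Reid's suggestion together with the paths from the question, the full invocation from the Windows command prompt might look something like this (a sketch; adjust the user name to match your installation):

```
cd /d D:\conc
"D:\Program Files\PostgreSQL\8.3\bin\psql.exe" -h 127.0.0.1 -U postgres -d your_database_name -f conceptnet-2007-09-25.psql
```

Quoting the full path to psql.exe avoids having to add the bin directory to PATH.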



Re: [GENERAL] how do you write aggregate function

2008-03-09 Thread Justin
I got the aggregate function for weighted average done. I was finally left
alone for more than 10 minutes to actually get it written. It takes two
input values, Weight and Value. It sums the weighted entries to
create the constant, then does the normal formula, but produces an
averaged number rather than a percentage. A simple change on the return
line lets it do percentages.


I did a couple of things a little oddly. Instead of a multi-dimensional
array I used a one-dimensional array where the odd elements hold the
Weight and the even elements hold the Value. This made the loop through
the array look stupid.

I tested it across 50,000 records with a GROUP BY; it took 3.3 seconds to
run.


Without the GROUP BY clause, performance is terrible, taking several
minutes (371,563 ms) just to do the sfunc part.


The array seems to be the performance hit. Any advice? It could be the
way I'm appending to the array, which has a growing cost as the array
gets bigger and bigger.



[EMAIL PROTECTED] wrote:

Richard Broersma wrote:

On Tue, Mar 4, 2008 at 8:44 AM, Justin [EMAIL PROTECTED] wrote:


I searched the archive of the mail list and did not find anything


Search the documentation. There are a couple great examples posted at

http://www.postgresql.org/docs/8.0/interactive/sql-createaggregate.html




create or replace function wcost_average_sf (numeric[], numeric, numeric)
returns numeric[] as
$Body$
declare
    _state numeric[];
begin
    -- append this row's weight ($2) and value ($3) to the state array
    _state := $1;
    _state := array_append(_state, $2);
    _state := array_append(_state, $3);
    return _state;
end;
$Body$
LANGUAGE 'plpgsql' VOLATILE;

create or replace function wcost_average_ff (numeric[])
returns numeric as
$Body$
declare
    iState        alias for $1;
    _sumedWeight  numeric;
    _sumedWxV     numeric;
    _elementCount integer;
    _icounter     integer;
begin
    _elementCount := array_upper(iState, 1);
    _sumedWeight  := 0;
    _sumedWxV     := 0;
    _icounter     := 0;

    -- first pass: total up the weights (odd elements of the state array)
    loop
        _sumedWeight := _sumedWeight + iState[_icounter + 1];
        _icounter := _icounter + 2;

        if (_icounter = _elementCount) then
            exit;
        end if;
    end loop;

    -- second pass: weighted sum of the values (even elements)
    _icounter := 0;
    loop
        _sumedWxV := _sumedWxV
            + ((iState[_icounter + 1] / _sumedWeight) * iState[_icounter + 2]);
        _icounter := _icounter + 2;

        if (_icounter = _elementCount) then
            exit;
        end if;
    end loop;

    return _sumedWxV;
end;
$Body$
LANGUAGE 'plpgsql' VOLATILE;


create aggregate wcost_average (numeric, numeric) (
    sfunc = wcost_average_sf,
    stype = numeric[],
    initcond = '{0,0}',
    finalfunc = wcost_average_ff
);
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how do you write aggregate function

2008-03-09 Thread dmp
Array appends are usually a performance hit, as you said. I'm not sure,
though, with PostgreSQL. Why not try it with two arrays and see what
happens? At least you would be splitting the single array and eliminating
one of the appends.

danap.
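Another option, sketched here as an untested alternative: skip the array accumulation entirely and keep just two running sums in the state, since sum(weight * value) / sum(weight) gives the same answer as the two-pass loop over the array. All names here are hypothetical:

```sql
-- Hypothetical rewrite: state is {sum_of_weights, sum_of_weight_times_value}
create or replace function wcost_average2_sf (numeric[], numeric, numeric)
returns numeric[] as
$Body$
begin
    -- accumulate running sums instead of appending to a growing array
    return ARRAY[$1[1] + $2, $1[2] + $2 * $3];
end;
$Body$
LANGUAGE 'plpgsql' VOLATILE;

create or replace function wcost_average2_ff (numeric[])
returns numeric as
$Body$
begin
    if $1[1] = 0 then
        return null;  -- no weight accumulated
    end if;
    return $1[2] / $1[1];
end;
$Body$
LANGUAGE 'plpgsql' VOLATILE;

create aggregate wcost_average2 (numeric, numeric) (
    sfunc = wcost_average2_sf,
    stype = numeric[],
    initcond = '{0,0}',
    finalfunc = wcost_average2_ff
);
```

The state stays at two elements regardless of group size, so the per-row append cost disappears entirely.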






Re: [GENERAL] how do you write aggregate function

2008-03-09 Thread Justin
If I use two arrays, how do I append new entries into these arrays?
array_append only allows for one dimension.


What other methods are there to add new entries to the array?









Re: [GENERAL] v8.3 + UTF8 errors when restoring DB

2008-03-09 Thread Mitchell D. Russell
Dean:

 

I did the dump as so:

 

psql -Upostgres databasename > c:\temp\dump.sql

 

I think the database was set to SQL_ASCII before I dumped it, because when I
did the 2nd restore last night to a new SQL_ASCII database, it worked fine.
So that's where I am at right now: the database is running fine on 8.3 as
SQL_ASCII, but I don't want to keep it like that.

 

So what do you suggest I do now? Make another 8.3 database in UTF8 and dump the
8.3 SQL_ASCII one using -E with UTF8? What will it do to the characters that
can't be interpreted?

 

Thanks

Mitch

 

 

 

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dean Gibson (DB 
Administrator)
Sent: Sunday, March 09, 2008 9:58 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] v8.3 + UTF8 errors when restoring DB

 




Re: [GENERAL] Watch your PlanetPostgreSQL.org blogs

2008-03-09 Thread Josh Berkus
Lewis,

  I have no problem with having a policy or following a policy.  My
  problem is:  Where is it?  When was the last time someone's blog was
  booted from planetpostgresql?  Is it documented, i.e. an open 
  discussed event?

Actually, the policy was discussed on the pgsql-www mailing list in November, I
believe, due to formatting issues on the postgresql.org home page.  Devrim
says he e-mailed you (and every other planetpostgresql.org feed) about it in
January.

The policy will also be going up somewhere linked from planetpostgresql.org
since we've had this discussion; not having it on the web was an oversight.

-- 
Josh Berkus
PostgreSQL Project
Core Team Member
(any opinions expressed are my own)



[Fwd: Re: [GENERAL] how do you write aggregate function]

2008-03-09 Thread Justin


DMP, you did give me an idea on how to call the append array.


sfunc looks like this

create or replace function wcost_average_sf (numeric[],  numeric, numeric)
returns numeric[] as
$Body$
  begin
  return array_append(array_append($1, $2), $3);
  end;
$Body$
LANGUAGE 'plpgsql' VOLATILE;

This yanked out 140,000 ms (2.3 minutes) from the run time, a big improvement,
but nowhere near where I want it to be.


Are there speed improvements in the other languages, e.g. Tcl?
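Before reaching for another procedural language, one cheaper experiment is rewriting the sfunc as a plain SQL function, which avoids the plpgsql interpreter overhead on every row. An untested sketch of the same sfunc:

```sql
-- Same behavior as the plpgsql version, but as a plain SQL function:
create or replace function wcost_average_sf (numeric[], numeric, numeric)
returns numeric[] as
$Body$
    select array_append(array_append($1, $2), $3);
$Body$
LANGUAGE sql;
```

Whether this actually helps would need benchmarking against the same 50,000-row test.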










Re: [GENERAL] how do you write aggregate function

2008-03-09 Thread dmp

Justin,
I'm really not an expert in this area, on how to code this, or on functions in
PostgreSQL. All I can offer is some ideas which you might try; sometimes
that is all it takes. Perhaps someone else will respond who might be more
helpful. That said, I have read in the documentation about the use of other
languages, and if you are looking for higher performance, that is probably
the way it is going to come about.

danap.


DMP   you did give me an idea on changing how to call the append array
sfunc looks like this

create or replace function wcost_average_sf (numeric[],  numeric, 
numeric)

returns numeric[] as
$Body$
   begin
   return array_append(array_append($1, $2), $3);
   end;
$Body$
LANGUAGE 'plpgsql' VOLATILE;

this yanked out 140,000 ms   on the run time,  a big improvement but 
no where i want it to be.


are there speed improvements in the other languages TCL









Re: [GENERAL] v8.3 + UTF8 errors when restoring DB

2008-03-09 Thread Dean Gibson (DB Administrator)

On 2008-03-09 11:49, Mitchell D. Russell wrote:


Dean:

 


I did the dump as so:  psql -Upostgres databasename > c:\temp\dump.sql


I assume you meant pg_dump, not psql.

 

I think the database was set to SQL_ASCII before I dumped it, because 
when I did the 2nd restore last night to a new SQL_ASCII database, it 
worked fine, so that's where I am at right now; the database is 
running fine on 8.3 as SQL_ASCII but I don't want to keep it like that.


 

So what do you suggest I do now? Make another 8.3 database in UTF8 and 
dump the 8.3 SQL_ASCII one using -E with UTF8? What will it do to the 
characters that can't be interpreted?


 


I don't think your idea will work.

Here's what I would try:  Your 'dump.sql' file is an ASCII file, which 
you can edit.  Hopefully you have an editor that can handle it (or if on 
Linux, you can use the 'sed' program to do what I'm about to suggest):


The file should contain a "SET client_encoding = 'SQL_ASCII';" line near 
the beginning. Make a copy of that file, change that line to "SET 
client_encoding = 'WIN1252';" in the copy, and then try importing 
the modified dump file into a UTF-8 database.


If that doesn't work, I have another suggestion if you are running on Linux.


--
Mail to my list address MUST be sent via the mailing list.
All other mail to my list address will bounce.



Re: [GENERAL] v8.3 + UTF8 errors when restoring DB

2008-03-09 Thread Raymond O'Donnell

On 09/03/2008 18:49, Mitchell D. Russell wrote:

I think the database was set to SQL_ASCII before I dumped it, because 
when I did the 2^nd restore last night to a new SQL_ASCII database, it 
worked fine, so that’s where I am at right now, the database is running 
fine on 8.3 as SQL_ASCII but I don’t want to keep it like that.


As an aside, I'm wondering if you're confusing the encoding of the 
database and the encoding used by the client (in this case pg_dump). 
They are two different things; the database encoding is set at creation 
time, while the encoding used by the client is usually set when that 
client connects (with a SET client_encoding statement); and Postgres 
translates happily between the two.


Forgive me if I'm wrong... just reading between the lines. :-)

Ray.

---
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
[EMAIL PROTECTED]
---



Re: [GENERAL] loading a psql input file on win xp

2008-03-09 Thread Raymond O'Donnell

On 09/03/2008 15:34, Reid Thompson wrote:


you'll also probably have to add the parameter -h 127.0.0.1


psql defaults to connecting to the local machine unless you tell it 
otherwise, so this is redundant; though it certainly won't do any harm 
either.


Ray.

---
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
[EMAIL PROTECTED]
---



Re: [GENERAL] data import - duplicates

2008-03-09 Thread brian



Webb Sprague wrote:

I haven't tested but this is what I would do (uses arrays, which are
handy when you need them), with the names changed to protect the
innocent:

begin;
-- create a table with some duplicates in one of the columns (y is ck);
wsprague=# select x, x%4 as y into temp fbar from generate_series(1,10) as x(x);
SELECT

wsprague=# select * from fbar;
 x  | y
----+---
  1 | 1
  2 | 2
  3 | 3
  4 | 0
  5 | 1
  6 | 2
  7 | 3
  8 | 0
  9 | 1
 10 | 2
(10 rows)

-- create a table with the pk, an array of the duplicate keys, and the
length of that array
select y, x_list, array_upper(x_list, 1) as x_len into baz from
(select y, array_accum(x) as x_list from fbar group by y ) a ;

wsprague=# select * from baz;
 y |  x_list  | x_len
---+--+---
 3 | {3,7}| 2
 2 | {2,6,10} | 3
 1 | {1,5,9}  | 3
 0 | {4,8}| 2
(4 rows)

-- delete all rows that don't have ck in the first element of the pk list
wsprague=# delete from fbar where not exists (select 1 from baz where
fbar.x=baz.x_list[1]);
DELETE 6
wsprague=# select * from fbar;
 x | y
---+---
 1 | 1
 2 | 2
 3 | 3
 4 | 0
(4 rows)

commit;

I hope that makes sense.  It should be easy to make into a function
(like clean_table(table_name text, pk_name text, dup_key_name text).
I don't know how well it will work with indexes.  You could probably
even write a function to do the entire import inside postgres,
starting with a copy and moving to merging tables and ending with some
consistency checks, and thus benefit from  transaction isolation of
the whole process
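For anyone trying to reproduce the session above: array_accum is not a built-in aggregate; it is the example custom aggregate from the PostgreSQL documentation, defined as:

```sql
-- Collects input values into an array (example from the PostgreSQL docs):
CREATE AGGREGATE array_accum (anyelement)
(
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);
```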




Hey, neat idea. I changed things a little:

test=# SELECT * FROM table1;
 id | ck | content
----+----+---------
  1 |  1 | string1
  2 |  2 | string2
  3 |  2 | string3
  4 |  4 | string1
  5 |  5 | string1
  6 |  1 | string3
  7 |  2 | string1
(7 rows)

test=# SELECT key_list, array_upper(key_list, 1) AS key_num, content
test-# INTO table2 FROM (
test-# SELECT array_accum(ck) AS key_list, content FROM table1 GROUP BY 
content

test=# ) a;
SELECT

test=# SELECT * FROM table2;
 key_list  | key_num | content
-----------+---------+---------
 {2,1}     |       2 | string3
 {1,4,5,2} |       4 | string1
 {2}       |       1 | string2
(3 rows)


I had to modify your DELETE suggestion a bit because the table1.id (your 
x) is non-unique (although it must be within a given key_list). It'll be 
a performance hog though because it's comparing each string.



test-# DELETE FROM table1 WHERE NOT EXISTS (
test-# SELECT 1 FROM table2
test-# WHERE table1.content = table2.content
test-# AND table1.ck = table2.key_list[1]
test-# );
DELETE 4

test=# SELECT * FROM table1;
 id | ck | content
----+----+---------
  1 |  1 | string1
  2 |  2 | string2
  3 |  2 | string3
(3 rows)

From here I can drop the table1.ck column. But, in order to place the 
table2.key_list ints into a join table, I need the PK from table1. I 
have the content column with which to fetch that from table1 but that 
hardly seems an efficient method. So, I modified your idea for gathering 
the dupe CKs a little:


test-# SELECT key_list, array_upper(key_list, 1) AS key_num, content
test-# INTO table2 FROM (
test-# SELECT array_cat(ARRAY[array_accum(id)], ARRAY[array_accum(ck)]) 
AS key_list, content

test-# FROM table1 GROUP BY content) a;
SELECT

test=# select * from table2;
       key_list        | key_num | content
-----------------------+---------+---------
 {{3,6},{2,1}}         |       2 | string3
 {{1,4,5,7},{1,4,5,2}} |       2 | string1
 {{2},{2}}             |       2 | string2
(3 rows)

Huzzah! IOW, I accumulate both the PKs and ck from table1 for each 
string. Then, I can ditch all but the first PK, which should make 
transferring these values in to the join table a snap.


# lightbulb comes on ...

Actually, what I'll do is remove the extraneous PKs and use them to 
DELETE FROM table1. Much more efficient, that.


Thanks for your ideas. And if anyone else has any suggestion please feel 
free to chime in. When I get the rest of the data I'll post back here on 
how it went in case google points someone here in the future.


b



Re: [GENERAL] Cumulative (Running) Sum

2008-03-09 Thread Matt Culbreth
On Mar 8, 4:46 pm, [EMAIL PROTECTED] (hubert depesz lubaczewski)
wrote:


 you might find this 
 useful:http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-s...

 please also read comments.

 depesz


This is very helpful.  It seems that there's a way to do it without
hacking it into the actual PG source, which I was tempted to do.
Thanks,




[GENERAL] Nested xpath() calls lose namespace support

2008-03-09 Thread Matt Magoffin
I was trying to extract XML fragments within a pl/pgsql function by nested
xpath() function calls, and found that when namespaces are used in the
XML, only the first xpath() call is able to correctly use namespaces.

First here is an example that works, when no namespaces are used:

BEGIN;
CREATE TEMPORARY TABLE tmp_xpath_test (x xml) ON COMMIT DROP;
INSERT INTO tmp_xpath_test VALUES (
'<foo><bar x="y">bar1</bar><bar x="y">bar2</bar></foo>'::xml);
SELECT (xpath('/foo/bar[1]', t.x))[1] FROM tmp_xpath_test t;
SELECT xpath('/bar/@x', (xpath('/foo/bar[1]', t.x))[1]) FROM
tmp_xpath_test t;
COMMIT;

The first select is just there to show the result of the inner call to
xpath() in the second select, and the second select returns:

SELECT xpath('/bar/@x', (xpath('/foo/bar[1]', t.x))[1]) FROM
tmp_xpath_test t;
 xpath
---
 {y}
(1 row)

Now if I use XML with namespaces, the first SELECT works, but the second
never returns the expected results:

BEGIN;
CREATE TEMPORARY TABLE tmp_xpath_test (x xml) ON COMMIT DROP;
INSERT INTO tmp_xpath_test VALUES (
'<a:foo xmlns:a="a:urn"><a:bar x="y">bar1</a:bar><a:bar
x="y">bar2</a:bar></a:foo>'::xml);
SELECT (xpath('/a:foo/a:bar[1]', t.x, ARRAY[ARRAY['a','a:urn']]))[1] FROM
tmp_xpath_test t;
SELECT xpath('/a:bar/@x', (xpath('/a:foo/a:bar[1]', t.x,
ARRAY[ARRAY['a','a:urn']]))[1],
ARRAY[ARRAY['a','a:urn']]) FROM tmp_xpath_test t;
COMMIT;

The select results are

SELECT (xpath('/a:foo/a:bar[1]', t.x, ARRAY[ARRAY['a','a:urn']]))[1] FROM
tmp_xpath_test t;
           xpath
---------------------------
 <a:bar x="y">bar1</a:bar>
(1 row)

SELECT xpath('/a:bar/@x', (xpath('/a:foo/a:bar[1]', t.x,
ARRAY[ARRAY['a','a:urn']]))[1],
ARRAY[ARRAY['a','a:urn']]) FROM tmp_xpath_test t;
 xpath
-------
 {}
(1 row)

For the second select, I expected a single XML text node containing "y",
just like the no-namespace result.

Am I doing anything wrong here?
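One workaround worth trying: the fragment returned by the inner xpath() call no longer carries its xmlns:a declaration, so the outer call cannot resolve the prefix. Doing the whole extraction in a single XPath expression sidesteps the nesting entirely (a sketch against the same temp table):

```sql
-- Extract the attribute of the first bar element in one expression:
SELECT xpath('/a:foo/a:bar[1]/@x', t.x, ARRAY[ARRAY['a','a:urn']])
FROM tmp_xpath_test t;
```

Here the namespace map is applied to the original document, which still declares the a prefix, so the attribute should be found.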

-- m@






Re: [GENERAL] loading a psql input file on win xp

2008-03-09 Thread Reid Thompson

On Sun, 2008-03-09 at 19:41 +, Raymond O'Donnell wrote:
 On 09/03/2008 15:34, Reid Thompson wrote:
 
  you'll also probably have to add the parameter -h 127.0.0.1
 
 psql defaults to connecting to the local machine unless you tell it 
 otherwise, so this is redundant; though it certainly won't do any harm 
 either.
 
Not redundant.  Unless something has changed, psql's default connection
method is via a Unix-domain socket, and this socket does not exist on
Windows. The -h is required to have psql connect to the TCP socket,
which is available on Windows.




[GENERAL] large object import

2008-03-09 Thread postgre
Hi all,
I have a stored function in plperlu which is called from a PHP script.
It selects data from a table, exports it to a file, and zips the file. The
problem is that it should store this file in a temporary table and then
return some identifier to PHP, so that the user can download it via PHP.
The problem is that PostgreSQL doesn't support server-side large object
operations for non-superuser roles.

Can someone please give me a suggestion how I can solve this problem? I mean,
can I import the file some other way, or are there some other usual
procedures for doing this?

Thanks in advance,
Lukas Houf



[GENERAL] gsoc ideas

2008-03-09 Thread longlong
Hi, all.

I want some advice about ideas for GSoC. I don't know if it is appropriate
to send an email here, so if you feel uncomfortable, please accept my
apology.

1. Release 8.2 made COPY TO able to copy the output of an arbitrary SELECT
statement. So I think maybe COPY FROM could get data from such output and
insert it into designated columns. The format of the command would need to
be discussed.

2. This comes from the TODO list: COPY always behaves like a unit of work
that consists of some INSERT commands; if there is any error, it rolls back.
But sometimes we only care about the data that can be inserted. In that
situation I used to use try/catch and insert row by row to skip the errors,
which takes much time because it examines every row. So:
Allow COPY to report error lines and continue.
This is a good idea.

3. Sometimes I want to copy data from one database to another. I think using
COPY would simplify the code. I want the content from COPY TO to be stored
not in a file but in memory, so that I can COPY FROM the memory (I don't
know whether COPY with STDIN and STDOUT can do this or not).
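For ideas 1 and 3, it is worth noting what already works in 8.2+: COPY TO accepts a SELECT, and COPY FROM accepts a column list, so a cross-database transfer can already be streamed through a client without an intermediate file. A sketch (table and column names are illustrative):

```sql
-- Server-side: COPY the output of an arbitrary SELECT (works in 8.2+):
COPY (SELECT id, name FROM src_table WHERE active) TO STDOUT;

-- COPY FROM can already target designated columns:
COPY dst_table (id, name) FROM STDIN;
```

From a shell, the two ends can be chained through psql, e.g. piping one `psql -c "COPY (...) TO STDOUT"` into another `psql -c "COPY dst_table (id, name) FROM STDIN"`, which is close to the in-memory transfer described in idea 3.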

What do you think of these ideas?