Re: [SQL] Sql select like question.

2005-05-26 Thread hatuan
My version is "PostgreSQL 8.0.3 on i686-pc-mingw32, compiled by GCC gcc.exe
(GCC) 3.4.2 (mingw-special)"
- Original Message -
From: "Tom Lane" <[EMAIL PROTECTED]>
To: "hatuan" <[EMAIL PROTECTED]>
Cc: 
Sent: Thursday, May 26, 2005 4:19 AM
Subject: Re: [SQL] Sql select like question.
> "hatuan" <[EMAIL PROTECTED]> writes:
> > I have a select query: Select * from A where id like 'US%'
> > In pgAdmin I get the error: "ERROR:  unsupported type: 136452".
>
> What PG version is this?
>
> There were some bugs of that ilk in 8.0..8.0.2, but I think they
> are all fixed in 8.0.3.  Leastwise your example doesn't fail for
> me ...
>
> regards, tom lane




[SQL] unsubscribe

2005-05-26 Thread U K Laxmi
 
 





Re: [SQL] Sql select like question.

2005-05-26 Thread Tom Lane
"hatuan" <[EMAIL PROTECTED]> writes:
> My version is "PostgreSQL 8.0.3 on i686-pc-mingw32, compiled by GCC gcc.exe
> (GCC) 3.4.2 (mingw-special)"

Well, if it's 8.0.3 then you've found a new variant of the problem, but
we cannot fix it without a complete test case.  Please read the bug
reporting guidelines:
http://www.postgresql.org/docs/8.0/static/bug-reporting.html

regards, tom lane



Re: [SQL] Sql select like question.

2005-05-26 Thread hatuan
I have fixed it. I have pg 8.0.3, but the database was created under 8.0. I dumped
the database and reloaded it, and it works well now.
Thanks for your help.
- Original Message -
From: "Tom Lane" <[EMAIL PROTECTED]>
To: "hatuan" <[EMAIL PROTECTED]>
Cc: 
Sent: Thursday, May 26, 2005 4:19 AM
Subject: Re: [SQL] Sql select like question.


> "hatuan" <[EMAIL PROTECTED]> writes:
> > I have a select query: Select * from A where id like 'US%'
> > In pgAdmin I get the error: "ERROR:  unsupported type: 136452".
>
> What PG version is this?
>
> There were some bugs of that ilk in 8.0..8.0.2, but I think they
> are all fixed in 8.0.3.  Leastwise your example doesn't fail for
> me ...
>
> regards, tom lane




Re: [SQL] Duplicated records

2005-05-26 Thread Andrew Hammond

[EMAIL PROTECTED] wrote:
> Thanks
> The CTAS (CREATE TABLE AS SELECT) command works fine, with great performance!
> I think it is the best way to correct the data (apparently).
> I didn't know about SELECT DISTINCT; I am going to read about it.

You might want to wrap this in some locking and throw in some
constraints to avoid dupes in the future...

BEGIN;
LOCK TABLE lanctos IN ACCESS EXCLUSIVE MODE;
CREATE TABLE lanctos_distinct AS SELECT DISTINCT * FROM lanctos;
DROP TABLE lanctos;
ALTER TABLE lanctos_distinct RENAME TO lanctos;
ALTER TABLE lanctos ALTER id SET NOT NULL;
CREATE UNIQUE INDEX lanctos_id_idx ON lanctos (id);
ALTER TABLE lanctos ADD CONSTRAINT lanctos_id_pkey PRIMARY KEY (id);
COMMIT;

As always, don't forget to ANALYZE the new table.
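
A variation on the same recipe, just a sketch under the same assumptions
as above (the table really is named lanctos and id is meant to be unique),
for the case where the original table also carries NOT NULL column
constraints that a plain CREATE TABLE AS would drop:

BEGIN;
LOCK TABLE lanctos IN ACCESS EXCLUSIVE MODE;
-- LIKE copies column names, types and NOT NULL constraints,
-- which CREATE TABLE AS does not.
CREATE TABLE lanctos_distinct (LIKE lanctos);
INSERT INTO lanctos_distinct SELECT DISTINCT * FROM lanctos;
DROP TABLE lanctos;
ALTER TABLE lanctos_distinct RENAME TO lanctos;
ALTER TABLE lanctos ADD CONSTRAINT lanctos_id_pkey PRIMARY KEY (id);
COMMIT;
ANALYZE lanctos;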

--
Andrew Hammond    416-673-4138    [EMAIL PROTECTED]
Database Administrator, Afilias Canada Corp.
CB83 2838 4B67 D40F D086 3568 81FC E7E5 27AF 4A9A



[Fwd: [SQL] unsubscribe]

2005-05-26 Thread noor
-------- Original Message --------
Subject: [SQL] unsubscribe
From:    "U K Laxmi" <[EMAIL PROTECTED]>
Date:    Thu, May 26, 2005 1:52 pm
To:      "hatuan" <[EMAIL PROTECTED]>,
         "Tom Lane" <[EMAIL PROTECTED]>
Cc:
--------------------------------------------------------------------------








Re: [SQL] ARRAY() returning NULL instead of ARRAY[] resp. {}

2005-05-26 Thread Joe Conway

Tom Lane wrote:

> I think he's got a good point, actually.  We document the ARRAY-with-
> parens-around-a-SELECT syntax as
>
>     The resulting one-dimensional array will have an element for
>     each row in the subquery result, with an element type matching
>     that of the subquery's output column.
>
> To me, that implies that a subquery result of no rows generates a
> one-dimensional array of no elements, not a null array.


OK, looks like I'm outnumbered.

But as far as I know, we have never had a way to produce a 
one-dimensional empty array. Empty arrays thus far have been dimensionless.
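
As a quick illustration of that dimensionless behaviour (an illustrative
psql session only, not part of the attached patch; the NULL result
displays as blank):

regression=# select array_dims('{}'::int[]);
 array_dims
------------
 
(1 row)

regression=# select array_dims('{1,2}'::int[]);
 array_dims
------------
 [1:2]
(1 row)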


Assuming we really want an empty 1D array, I created the attached patch. 
This works fine, but now leaves a few oddities to be dealt with, e.g.:


regression=# select array_dims(array(select 1 where false));
 array_dims
------------
 [1:0]
(1 row)

Any thoughts on how this should be handled for an empty 1D array?


> The point Markus is complaining about seems like it should
> be easily fixable.


Well, "easily" is a relative term. My Postgres hacking neurons have 
gotten kind of rusty lately -- but then maybe that was your underlying 
point ;-)


Joe

Index: src/backend/executor/nodeSubplan.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/executor/nodeSubplan.c,v
retrieving revision 1.69
diff -c -r1.69 nodeSubplan.c
*** src/backend/executor/nodeSubplan.c	6 May 2005 17:24:54 -	1.69
--- src/backend/executor/nodeSubplan.c	26 May 2005 18:52:16 -
***
*** 215,220 
--- 215,221 
  	ListCell   *pvar;
  	ListCell   *l;
  	ArrayBuildState *astate = NULL;
+ 	Oid			element_type = planstate->ps_ResultTupleSlot->tts_tupleDescriptor->attrs[0]->atttypid;
  
  	/*
  	 * We are probably in a short-lived expression-evaluation context.
***
*** 259,268 
  	 *
  	 * For EXPR_SUBLINK we require the subplan to produce no more than one
  	 * tuple, else an error is raised. For ARRAY_SUBLINK we allow the
! 	 * subplan to produce more than one tuple. In either case, if zero
! 	 * tuples are produced, we return NULL. Assuming we get a tuple, we
! 	 * just use its first column (there can be only one non-junk column in
! 	 * this case).
  	 */
  	result = BoolGetDatum(subLinkType == ALL_SUBLINK);
  	*isNull = false;
--- 260,269 
  	 *
  	 * For EXPR_SUBLINK we require the subplan to produce no more than one
  	 * tuple, else an error is raised. For ARRAY_SUBLINK we allow the
! 	 * subplan to produce more than one tuple. In the former case, if zero
! 	 * tuples are produced, we return NULL. In the latter, we return an
! 	 * empty array. Assuming we get a tuple, we just use its first column
! 	 * (there can be only one non-junk column in this case).
  	 */
  	result = BoolGetDatum(subLinkType == ALL_SUBLINK);
  	*isNull = false;
***
*** 432,458 
  		}
  	}
  
! 	if (!found)
  	{
  		/*
  		 * deal with empty subplan result.	result/isNull were previously
! 		 * initialized correctly for all sublink types except EXPR, ARRAY,
  		 * and MULTIEXPR; for those, return NULL.
  		 */
  		if (subLinkType == EXPR_SUBLINK ||
- 			subLinkType == ARRAY_SUBLINK ||
  			subLinkType == MULTIEXPR_SUBLINK)
  		{
  			result = (Datum) 0;
  			*isNull = true;
  		}
  	}
- 	else if (subLinkType == ARRAY_SUBLINK)
- 	{
- 		Assert(astate != NULL);
- 		/* We return the result in the caller's context */
- 		result = makeArrayResult(astate, oldcontext);
- 	}
  
  	MemoryContextSwitchTo(oldcontext);
  
--- 433,459 
  		}
  	}
  
! 	if (subLinkType == ARRAY_SUBLINK)
! 	{
! 		if (!astate)
! 			astate = initArrayResult(element_type, oldcontext);
! 		/* We return the result in the caller's context */
! 		result = makeArrayResult(astate, oldcontext);
! 	}
! 	else if (!found)
  	{
  		/*
  		 * deal with empty subplan result.	result/isNull were previously
! 		 * initialized correctly for all sublink types except EXPR
  		 * and MULTIEXPR; for those, return NULL.
  		 */
  		if (subLinkType == EXPR_SUBLINK ||
  			subLinkType == MULTIEXPR_SUBLINK)
  		{
  			result = (Datum) 0;
  			*isNull = true;
  		}
  	}
  
  	MemoryContextSwitchTo(oldcontext);
  
***
*** 925,930 
--- 926,932 
  	ListCell   *l;
  	bool		found = false;
  	ArrayBuildState *astate = NULL;
+ 	Oid			element_type = planstate->ps_ResultTupleSlot->tts_tupleDescriptor->attrs[0]->atttypid;
  
  	/*
  	 * Must switch to child query's per-query memory context.
***
*** 1010,1016 
  		}
  	}
  
! 	if (!found)
  	{
  		if (subLinkType == EXISTS_SUBLINK)
  		{
--- 1012,1033 
  		}
  	}
  
! 	if (subLinkType == ARRAY_SUBLINK)
! 	{
! 		/* There can be only one param... */
! 		int			paramid = linitial_int(subplan->setParam);
! 		ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);
! 
! 		prm->execPlan = NULL;
! 
! 		if (!astate)
! 			astate = initArrayResult(element_type, oldcontext);
! 
! 		/* We build the result in query context so it won't disappear */

[SQL] 3 tables, slow count(*), order by Seq Scan in Query Plan

2005-05-26 Thread ogjunk-pgjedan
Hello,

I have 3 tables (2 tables + 1 lookup table that ties them), and running
a straightforward aggregate count(*) query with a couple of joins
takes about 10 seconds (and I need it to be sub-second or so).
Also, I am wondering if this approach is scalable with my row counts
and my hardware (below).

My slow query is this:
--
SELECT keyword.name, count(*)
FROM user_data, user_data_keyword, keyword
WHERE (user_data.user_id = 1)
  AND (user_data.id = user_data_keyword.user_data_id)
  AND (user_data_keyword.keyword_id = keyword.id)
  GROUP BY keyword.name
  ORDER BY COUNT(*) DESC LIMIT 10;


QUERY PLAN from EXPLAIN ANALYZE:

 Limit  (cost=27820.00..27820.03 rows=10 width=114) (actual time=9971.322..9971.401 rows=10 loops=1)
   ->  Sort  (cost=27820.00..27822.50 rows=1000 width=114) (actual time=9971.312..9971.338 rows=10 loops=1)
         Sort Key: count(*)
         ->  HashAggregate  (cost=27767.67..27770.17 rows=1000 width=114) (actual time=9955.457..9963.051 rows=2005 loops=1)
               ->  Hash Join  (cost=4459.64..27738.80 rows=5774 width=114) (actual time=1140.776..9919.852 rows=5516 loops=1)
                     Hash Cond: ("outer".keyword_id = "inner".id)
                     ->  Hash Join  (cost=4437.14..27600.81 rows=5774 width=4) (actual time=21.781..7804.329 rows=5516 loops=1)
                           Hash Cond: ("outer".user_data_id = "inner".id)
                           ->  Seq Scan on user_data_keyword  (cost=0.00..17332.29 rows=1154729 width=8) (actual time=2.717..3834.186 rows=1154729 loops=1)
                           ->  Hash  (cost=4433.94..4433.94 rows=1278 width=4) (actual time=18.862..18.862 rows=0 loops=1)
                                 ->  Index Scan using ix_user_data_user_id_data_id on user_data  (cost=0.00..4433.94 rows=1278 width=4) (actual time=0.234..13.454 rows=1149 loops=1)
                                       Index Cond: (user_id = 1)
                     ->  Hash  (cost=20.00..20.00 rows=1000 width=118) (actual time=1118.616..1118.616 rows=0 loops=1)
                           ->  Seq Scan on keyword  (cost=0.00..20.00 rows=1000 width=118) (actual time=1.140..609.577 rows=105440 loops=1)
 Total runtime: 9972.704 ms
(15 rows)


Ouch :)

I'm trying to analyze the query plan (I'm not very good at it,
obviously), and I see two Sequential Scans: one on the _big_
"user_data_keyword" table, accounting for about 60% of the total cost,
and one on the "keyword" table.
I also see a HashAggregate with a high cost and a long actual time.

I'm not sure what to do or which indices I need to add, as the
"user_data_keyword" and "keyword" tables already have PK-based btree
indices:
  "user_data_keyword" has: ... btree (user_data_id, keyword_id)
  "keyword"   has: ... btree (id)


Here are my 3 tables:

user_data (about 300K rows currently, will grow to 10M+)
---------------------------------------------------------
  id   (PK),
  user_id  (FK)
  ... other columns ...

user_data_keyword (lookup table - size always 4 x user_data)
------------------------------------------------------------
  user_data_id (FK)
  keyword_id   (FK)
  PK(user_data_id, keyword_id)

keyword (size about 10 x smaller than user_data_keyword)
--------------------------------------------------------
  id   (PK)
  name VARCHAR(64) NOT NULL UNIQUE
  add_date TIMEZONE



Is there any way of speeding up my query?

Also, given the number of rows expected in those tables:
user_data:          10M
user_data_keyword:  40M
keyword:             4M

Any ideas how a query like this will scale when I hit those numbers?

These particular query timings are from 7.4.6 on a 1.5GHz laptop,
but the production server is running PG 8.0.3 on a 3.0GHz P4 with 2
SATA disks in a RAID1 config and 1GB RAM.  How realistic is it to get
sub-second responses on such hardware given the above numbers?

Thanks,
Otis

