Re: [GENERAL] pgsql 7.4.1 copy - end-of-copy marker corrupt

2004-02-06 Thread Jenny Zhang
On Thu, 2004-02-05 at 22:52, Martijn van Oosterhout wrote:
> On Fri, Feb 06, 2004 at 01:11:24AM -0500, Tom Lane wrote:
> > Jenny Zhang <[EMAIL PROTECTED]> writes:
> > >  ERROR:  end-of-copy marker corrupt
> > > CONTEXT:  COPY author, line 49: "49\aoD{\BABABABABARENG 
> > > \t5p.\19840426"
> > 
> > > and this is the line in the data file it complains:
> > > ===
> > > 49\aoD{\BABABABABARENG 
> > > \t5p.\19840426\.XfqWM+Ke,8+Ae-9PWPn)}#6eVa_Qu ...
> > ^^
> > 
> > It doesn't like the "\.", with good reason I should think.  Are you sure
> > 7.3 would take this file?
> > 
> > Anyway, either double or remove that backslash, depending on what you
> > want.
> 
> Umm, the copy is using the escape character as column seperator. I wonder if
> we should catch and warn for that.
> 
> I don't think there is a sensible way that can actually work.

Using \ as delimiter is a bad choice.  But it did work under 7.3.3.
It would be nice to have an error message for it.

Thanks,
Jenny


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [GENERAL] update slows down in pl/pgsql function

2003-12-16 Thread Jenny Zhang
Oops, I named the var name the same as the column name.  Changing it to
something else solved the problem.

Thanks,
Jenny
On Tue, 2003-12-16 at 15:54, Stephan Szabo wrote:
> On Tue, 16 Dec 2003, Jenny Zhang wrote:
> 
> > I have stored procedure written in pl/pgsql which takes about 13 seconds
> > to finish.  I was able to identify that the slowness is caused by one
> > update SQL:
> >
> > UPDATE shopping_cart SET sc_sub_total=sc_subtotal, sc_date=now()
> > WHERE sc_id=sc_id;
> 
> Umm, is that exactly the condition you're using? Isn't that going to
> update the entire table?
> 
> ---(end of broadcast)---
> TIP 9: the planner will ignore your desire to choose an index scan if your
>   joining column's datatypes do not match


---(end of broadcast)---
TIP 8: explain analyze is your friend


[GENERAL] update slows down in pl/pgsql function

2003-12-16 Thread Jenny Zhang
I have stored procedure written in pl/pgsql which takes about 13 seconds
to finish.  I was able to identify that the slowness is caused by one
update SQL:

UPDATE shopping_cart SET sc_sub_total=sc_subtotal, sc_date=now()
WHERE sc_id=sc_id;

If I comment this sql out, the stored procedure returns within 1 second.

What puzzles me is that if I execute the same update SQL in psql
interface, it returns very fast.  The following is the explain analyze
output for that SQL.  

#>explain analyze UPDATE shopping_cart SET sc_sub_total=1, sc_date=now()
where sc_id=260706;
QUERY
PLAN 
--
 Index Scan using shopping_cart_pkey on shopping_cart  (cost=0.00..5.01
rows=1 width=144) (actual time=0.22..0.37 rows=1 loops=1)
   Index Cond: (sc_id = 260706::numeric)
 Total runtime: 1.87 msec
(3 rows)

Is it true that using pl/pgsql increases the overhead that much?

TIA,
Jenny
-- 
Jenny Zhang
Open Source Development Lab
12725 SW Millikan Way, Suite 400
Beaverton, OR 97005
(503)626-2455 ext 31



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] passing array as argument and returning an array in

2003-12-16 Thread Jenny Zhang
I got this when I was searching for something else.  I will forward this
to you.
http://archives.postgresql.org/pgsql-general/2003-11/msg00852.php
CREATE or REPLACE FUNCTION foo(integer[]) RETURNS int AS
'DECLARE
a alias for $1;
index   integer := 1;
total   integer := 0;
BEGIN
WHILE a[index] > 0
LOOP
total := total + a[index];
index := index + 1;
END LOOP;

RETURN total;
END;
' LANGUAGE 'plpgsql';



test=> select foo('{1,2}');
 foo
-
   3
(1 row)
On Tue, 2003-12-16 at 03:25, K. Deepa wrote:
> Hi all,
>I am using postgresql7.4. How to handle arrays in plpgsql. How can
> I pass an array. Is it possible to retrieve values from an array by
> indexing it like
> 
> argument : '{1,2,3}'
> Return value : varchar array
> 
> Variables :
> -
> 
> a alias for $1
> b _varchar
> 
> Usage :
> -
> 
> b[1] = a[1];
> b[2] = a[2];
> 
> return b;
> 
> Is it possible.
> 
> TIA,
-- 
Jenny Zhang
Open Source Development Lab
12725 SW Millikan Way, Suite 400
Beaverton, OR 97005
(503)626-2455 ext 31



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[GENERAL] deadlock detected when inserting data

2003-12-15 Thread Jenny Zhang
I am running OSDL-dbt1 - an e-commerce workload
(http://www.osdl.org/lab_activities/kernel_testing/osdl_database_test_suite/osdl_dbt-1/)
against PostgreSQL: 7.3.3.  During the test, I saw a lot of messages in
the database log file:

NOTICE:  _sc_id=0, _add_flag=0, _itemcount=0, _i_id=597, _pp_i_id=159,
c_id=32760
ERROR:  deadlock detected
WARNING:  Error occurred while executing PL/pgSQL function shopping_cart
WARNING:  line 311 at SQL statement
WARNING:  ROLLBACK: no transaction in progress

The shopping_cart transaction does update/insert to shopping_cart and
shopping_cart_line tables, and if it is a new shopping_cart, it queries
the SEQUENCE scid to get a new shopping_cart.

I tried to search the mailing list, and got the impression that if there
are foreign key related to the tables, then deadlock happens.  But that
is not my case.  I do not have foreign key either referring to this two
tables, or defined on those tables.

Can anyone give some suggestions how I should analyze this problem?  Or
are there any documentation?  I am not driving the database very hard. 
The only comparison I have are the previous runs I did against SAPDB,
which performs better.

Let me know if you need more info.

TIA,

Jenny
-- 
Jenny Zhang
Open Source Development Lab
12725 SW Millikan Way, Suite 400
Beaverton, OR 97005
(503)626-2455 ext 31



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[GENERAL] how to get accurate values in pg_statistic

2003-08-22 Thread Jenny Zhang
Hi,

Last week I reported getting different execution plans for the same
query against the same database.  I did further investigation.  After
building the database, I did:
vacuumdb -z DBT3
psql DBT3 -c "analyze supplier"
psql DBT3 -c "analyze part"
psql DBT3 -c "analyze partsupp"
psql DBT3 -c "analyze customer"
psql DBT3 -c "analyze orders"
psql DBT3 -c "analyze lineitem"
psql DBT3 -c "analyze nation"
psql DBT3 -c "analyze region"

Then I check the data in pg_class and pg_statistic and get the execution
plans.

I did this twice (say run_8 and run_9).  I found that while the pg_class
are the same for run_8 and run_9, the pg_statistic output are
different.  I posted the query, the execution plan, pg_class and
pg_statistic output at:
http://www.osdl.org/archive/jenny/

The difference in pg_statistic results in a different execution plan for
query 18.sql.  In fact, I updated the pg_statistic table for run_9 with
the values got from run_8, then it gives me the run_8 execution plan,
and the cost is very close(within 1).  This makes me to believe that the
optimizer picks the plan based on the values in pg_statistic and
pg_class.  

But why the pg_statistic value changes each time I build the database? 
Did I do something wrong?

Thanks for your help,
Jenny



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[GENERAL] factors determining the execution plan

2003-08-18 Thread Jenny Zhang
The osdl-dbt3 test starts with building and vacuuming the database.  
The execution plans were taken after the vacuuming.

I did two tests with the same database parameters:
1. run two osdl-dbt3 runs on one box without rebooting the stystem. 
Though the execution plans are the same, the costs are different.  The
system status are different for the two runs, for example, some page
cache are not released after the first one.  Does that make the cost
different?

2. run two osdl-dbt3 runs on two boxes.  The hardware of the two boxes
are the same.  And each run starts from scratch (building linux kernel,
pgsql ect, and reboot).  To my surprise, not only the cost are different
between the two runs, the execution plan changed for Query 9.  
The execution plans can be found at:
http://khack.osdl.org/stp/277780/results/plan/power_query9.txt
http://khack.osdl.org/stp/29/results/plan/power_query9.txt

My test leads me to the following questions:
What are the factors which determine the execution plan and cost?
Does PostgreSQL check the system resource dynamically?  
If Inserts and Updates happened but vacuuming is not executed, dose the
execution plan change?

Thanks,
-- 
Jenny Zhang
Open Source Development Lab Inc 
12725 SW Millikan Way
Suite 400
Beaverton, OR 97005
(503)626-2455 ext 31


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [GENERAL] [osdldbt-general] Re: [PERFORM] OSDL Database Test Suite 3 is

2003-08-04 Thread Jenny Zhang
Thanks all for your feedback.

I think I should explain more about how to use this test kit. 

The main purpose of putting the test kit on Scalability Test
Platform(STP) is that testers can run the workload against the database
with different parameters and Linux kernels to see performance
differences.  Though the test kit picks up default parameters if they
are not provided, the command line parameters overwrite the default
ones.  Currently, the following parameters are supported:
-s  -n  -d '' -r
<{0|1}> -x <{0|1}> 

where:
-s  is tpc-h database scale factor, right now, only SF=1
is available.

-n  is the number of throughput test streams, which
corresponds number of simultaneous database connections during
throughput test.

-d  is the database parameters used when starting
postmaster.  for example: 
-B 12 -c effective_cache_size=393216 -c sort_mem=524288 -c
stats_command_string=true -c stats_row_level=true -c
stats_block_level=true

-r {0|1}: indicates if the database dir base//pgsql_tmp is
put on a separate disk drive

-x {0|1}: indicates if the WAL is put on a separate disk drive.

The other comments are in-lined:

On Mon, 2003-08-04 at 06:33, Manfred Koizar wrote:
> | effective_cache_size   | 1000
> 
> With 4GB of memory this is definitely too low and *can* (note that I
> don't say *must*) lead the planner to wrong decisions.
> 
I changed the default to effective_cache_size=393216 as calculated by
Scott.  Another way to check the execution plan is to go to the results
dir:
http://khack.osdl.org/stp/276917/results
There is a 'power_plan.out' file to record the execution plan.  I am
running a test with the changed effective_cache_size, I will see how it
affect the plan.

> | shared_buffers | 15200
> 
> ... looks reasonable.  Did you test with other values?
I have only one with shared_buffers=120 at:
http://khack.osdl.org/stp/276847/
The performance degraded.  
> 
> | sort_mem   | 524288
> 
> This is a bit high, IMHO, but might be ok given that DBT3 is not run
> with many concurrent sessions (right?).
> http://khack.osdl.org/stp/276912/results/plot/vmstat_swap.png shows
> some swapping activity towards the end of the run which could be
> caused by a too high sort_mem setting.
Right, I run only 4 streams.  Setting this parameter lower caused more
reading/writing to the pgsql/tmp.  I guess the database has to do it if
it can not do sorting in memory. 

On 4 Aug 2003 at 15:33, Manfred Koizar wrote:
> 
> I could not get postgresql .conf so I will combine the comments.
It is under database monitor data: database parameters
> 
> 1. Effective cache size, already mentioned
> 2. Sort memory already mentioned.
> 3. Was WAL put on different drive?
That run did not put WAL on different drive.  I changed it this morning
so that it is configurable.  Also I changed the result page so that the
testers can tell from the result page.
> 4. Can you try with autovacuum daemon and 7.4beta when it comes out..
I'd be happy to run it.  We would like to improve out Patch Life
Management(PLM) system so that it can accept PG patches and run
performance tests on those patches.  Right now PLM only manages Linux
Kernel patches.  I would like to ask the PostgreSQL community if this
kind of tools is of interest.
> 5. What was the file system? Ext2/Ext3/reiser/XFS?
> 
> 
> 
It is Ext2.  Yeah, it is not reported on the page.
> Is there any comparison available for other databases.. Could be interesting to 
> see..:-)
> 
> 
> 

Let me know if you have any suggestions about how to improve the test
kit (parameters, reported information, etc.), or how to make it more
useful to PG community.

Thanks,
-- 
Jenny Zhang
Open Source Development Lab Inc 
12725 SW Millikan Way
Suite 400
Beaverton, OR 97005
(503)626-2455 ext 31


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly