[HACKERS] Timestamps

2004-02-09 Thread Slavisa Garic
Hi all,

I am not sure if this should be sent to this mailing list. If I am wrong,
could someone please direct me to the correct one so I can subscribe
there?

I wanted to ask a simple question. Say I have a table with a timestamp
field. What is the best way to get all the records that were created, say,
2 hours before the query? One of the options would be to generate the
timestamp in the correct format on the client and then send a query of the
form:

SELECT * FROM <table> WHERE create_time < <timestamp>

Is there a better way than this?
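
One alternative worth noting is to let the server do the date arithmetic
with an interval expression, so no client-side timestamp formatting is
needed. A minimal sketch using the PyGreSQL classic pg module mentioned
later in this digest (the connection details and table name are
placeholders, not taken from the thread):

    import pg  # PyGreSQL classic interface

    # Connection details below are placeholders for illustration.
    db = pg.connect(dbname='mydb', host='localhost')

    # PostgreSQL evaluates the interval arithmetic itself, so the client
    # never has to format a timestamp string.
    result = db.query(
        "SELECT * FROM my_table "
        "WHERE create_time < CURRENT_TIMESTAMP - INTERVAL '2 hours'")
    for row in result.dictresult():
        print(row)

The same WHERE clause works in plain psql as well, independent of the
client language.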

Any help would be greatly appreciated,
Regards,
Slavisa





Re: [HACKERS] COPY from question

2004-02-04 Thread Slavisa Garic
Hi Kevin,

On Tue, 3 Feb 2004, Kevin Brown wrote:

> Slavisa Garic wrote:
> > Using the pg module in Python I am trying to run the COPY command to
> > populate a large table. I am using this to replace INSERTs, which take a
> > few hours to add 70,000 entries where COPY takes a minute and a half.
> 
> That difference in speed seems quite large.  Too large.  Are you batching
> your INSERTs into transactions (you should be in order to get good
> performance)?  Do you have a ton of indexes on the table?  Does it have
> triggers on it or some other thing (if so then COPY may well wind up doing
> the wrong thing since the triggers won't fire for the rows it inserts)?
> 
> I don't know what kind of schema you're using, but it takes perhaps a
> couple of hours to insert 2.5 million rows on my system.  But the rows
> in my schema may be much smaller than yours.

You are right about the indexes. There are quite a few of them (5-6 without
looking at the schema). The problem is that I do need those indexes, as I
have a lot of SELECTs on that table and the inserts only happen once.

You are also right about the rows (I think), as I have about 15-20 columns.
The table could be split into a few other tables, and it used to be, but I
merged them because of the requirement for faster SELECTs. With the
current schema, most of my modules that access the database no longer need
to do expensive JOINs as they used to. Because faster SELECTs are more
important to me than faster INSERTs, I had to do this. This wasn't a
problem for me until I started creating experiments with more than 20
thousand jobs, which translates to 20 thousand rows in this big table.

I do batch INSERTs into one big transaction (1000 rows at a time). While I
did get some improvement compared to one transaction per insert, it was
still not fast enough (well, not for me :) ). Could you please elaborate
on the triggers? I have no idea what kinds of triggers there are in
PostgreSQL or relational databases in general.
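
For reference, the batching described above might look something like the
sketch below; the table and column names are invented for illustration,
and real code should escape its values properly. Grouping many INSERTs
into one transaction means roughly one commit (and one disk flush) per
batch instead of per row.

    import pg

    db = pg.connect(dbname='mydb')   # placeholder connection details
    BATCH = 1000                     # rows per transaction, as in the thread

    def insert_batch(rows):
        # One explicit transaction per batch of rows.
        db.query("BEGIN")
        for jobname, status in rows:
            # Values are assumed safe/pre-escaped; a real loader should
            # quote them properly.
            db.query("INSERT INTO jobs (jobname, status) "
                     "VALUES ('%s', '%s')" % (jobname, status))
        db.query("COMMIT")

    all_rows = [("job-%d" % i, "ready") for i in range(20000)]
    for start in range(0, len(all_rows), BATCH):
        insert_batch(all_rows[start:start + BATCH])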

With regard to my problem, I did solve it by piping the data into COPY ...
FROM STDIN. Now I have about 75,000 rows inserted in 40 seconds, which is
extremely good for me.
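
A minimal sketch of that COPY ... FROM STDIN approach with the PyGreSQL
classic pg module, assuming the connection object exposes the
putline()/endcopy() wrappers around libpq's COPY protocol (check the
PyGreSQL documentation for your version; the table and columns here are
illustrative):

    import pg

    db = pg.connect(dbname='mydb')   # placeholder connection details

    def copy_rows(rows):
        # Put the backend into COPY IN mode, stream tab-separated lines,
        # then send the end-of-data marker and leave COPY mode.
        db.query("COPY jobs (jobname, status) FROM STDIN")
        for jobname, status in rows:
            db.putline("%s\t%s\n" % (jobname, status))
        db.putline("\\.\n")
        db.endcopy()

    copy_rows([("job-%d" % i, "ready") for i in range(75000)])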

Thank you for your help,
Regards,
Slavisa

 
> -- 
> Kevin Brown [EMAIL PROTECTED]
> 




[HACKERS] COPY from question

2004-02-02 Thread Slavisa Garic
Hi,

I have a question about the COPY statement. I am using PostgreSQL 7.3.4 with
Python 2.3 on a Red Hat 8 machine. The problem I have is the following.

Using the pg module in Python I am trying to run the COPY command to
populate a large table. I am using this to replace INSERTs, which take a
few hours to add 70,000 entries where COPY takes a minute and a half. Now
these stats come from a NetBSD machine I also use, which doesn't have this
problem but has the same Python and the same PostgreSQL installed.

My understanding is that COPY works FROM 'filename' or STDIN, where the data
is terminated by a '\.' line. I tried using COPY FROM 'filename' and, as I
said, NetBSD is not complaining, but I get the following error on the Linux
machine even though permissions on the data file are 777:

  _pg.error: ERROR:  COPY command, running in backend with effective uid
  26, could not open file '/home/slavisa/.nimrod/experiments/demo/ejdata'
  for reading.  Errno = Permission denied (13).

I can't figure out why this would be occurring, so I wanted to switch to the
FROM STDIN option, but I got stuck there due to a lack of knowledge, I have
to admit.

So I would like to ask anyone who knows anything about this: if you know what
the problem is with the FROM 'filename' option, or you know how to get COPY
FROM STDIN working from within a Python (or any other) program, help would
be greatly appreciated.
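
For what it's worth, COPY ... FROM 'filename' is opened by the backend
process itself (hence the "effective uid 26" in the error above), so every
directory on the path has to be accessible to that user, while COPY ...
FROM STDIN is fed by the client connection and only needs the file to be
readable by the client program. A rough sketch of streaming the existing
file through STDIN with the pg module, under the same putline()/endcopy()
assumption as the example earlier in this digest (the target table name is
a placeholder):

    import pg

    db = pg.connect(dbname='mydb')   # placeholder connection details
    datafile = '/home/slavisa/.nimrod/experiments/demo/ejdata'

    # COPY ... FROM STDIN: the file is opened by this client process, so
    # the backend's filesystem permissions no longer matter.
    db.query("COPY mytable FROM STDIN")
    f = open(datafile)
    for line in f:
        db.putline(line)
    f.close()
    db.putline("\\.\n")
    db.endcopy()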

Regards,
Slavisa




Re: [HACKERS] INSERT extremely slow with large data sets

2003-11-14 Thread Slavisa Garic
Hi Dann

Here is the schema, and could you also be a bit more specific about the COPY
command? Also, does talking directly to the API speed things up? (I am new
to databases but I am learning quickly.)

-- NimrodEnfJob --
 
create table NimrodEnfJob(
    exp_id      INTEGER not null references NimrodEnfExperiment,
    task_id     INTEGER not null references NimrodTask,
    pgroup_id   INTEGER not null references NimrodParameterGroup,
    agent_id    INTEGER references NimrodAgent on delete set null,
    jobname     varchar(255) not null,
    admin       char(1) not null default 'F'
                check (admin in ('F', 'T')),
    taskname    varchar(255) not null,
    param_text  TEXT not null,
    open        char(1) not null default 'F'
                check (open in ('F', 'T')),
    control     varchar(8) not null default 'start'
                check (control in ('nostart', 'start', 'stop')),
    status      varchar(16) not null default 'ready'
                check (status in ('ready', 'executing', 'failed', 'done')),
    cpulength   real not null,
    sleeptime   real not null,
    filesize    real not null,
    cputime     real,
    waittime    real,
    filetime    real,
    filebytes   integer,
    priority    integer not null default 100,
    create_time timestamp not null default CURRENT_TIMESTAMP,
    start_time  timestamp,
    finish_time timestamp,
    budget      real not null default 0.0,
    servername  varchar(255),
    error_info  varchar(255) not null default '',
    more_info   TEXT not null default '',
    primary key (exp_id, jobname),
    foreign key (exp_id, taskname) references NimrodEnfTask
);

Also, these are the indexes on this table. I created them on the columns
that are most commonly accessed:
create unique index nej_idx
ON NimrodEnfJob (exp_id, pgroup_id);
 
create unique index nej_idx1
ON NimrodEnfJob (pgroup_id);
 
create index nej_idx2
ON NimrodEnfJob (status);
 
create unique index nej_idx3
ON NimrodEnfJob (status, pgroup_id);
 
create index nej_idx4
ON NimrodEnfJob (status, agent_id);
 
create index nej_idx5
ON NimrodEnfJob (agent_id);

I did notice that removing those indexes doesn't speed up the import by much.
Similar behaviour is observed; it just takes a bit less time to insert
(about 0.01 s less than usual at 60,000 records).
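
If the load really happens only once, the usual bulk-load pattern is to
drop the secondary indexes, load the data (e.g. with COPY), and recreate
the indexes afterwards, so index maintenance becomes one bulk build per
index instead of one update per index per row. This is a general technique
rather than something suggested in the thread, and as noted above dropping
the indexes did not change much in this particular case. A rough sketch
with the pg module (connection details are placeholders):

    import pg

    db = pg.connect(dbname='mydb')   # placeholder connection details

    # Secondary indexes listed above; any index backing a constraint that
    # must hold during the load should of course be kept.
    for idx in ('nej_idx', 'nej_idx1', 'nej_idx2',
                'nej_idx3', 'nej_idx4', 'nej_idx5'):
        db.query("DROP INDEX %s" % idx)

    # ... bulk load here, e.g. COPY ... FROM STDIN as elsewhere in this digest ...

    db.query("CREATE UNIQUE INDEX nej_idx ON NimrodEnfJob (exp_id, pgroup_id)")
    db.query("CREATE UNIQUE INDEX nej_idx1 ON NimrodEnfJob (pgroup_id)")
    db.query("CREATE INDEX nej_idx2 ON NimrodEnfJob (status)")
    # ... and so on for nej_idx3, nej_idx4 and nej_idx5 ...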

Regards,
Slavisa

On Fri, 14 Nov 2003, Dann Corbit wrote:

> > -Original Message-
> > From: Slavisa Garic [mailto:[EMAIL PROTECTED] 
> > Sent: Thursday, November 13, 2003 11:37 PM
> > To: [EMAIL PROTECTED]
> > Subject: [HACKERS] INSERT extremely slow with large data sets
> > 
> > 
> > Hi Everyone,
> > 
> > This is my first post here so please tell me to go somewhere 
> > else if this is the wrong place to post questions like this.
> > 
> > I am using PostgreSQL 7.3.2 and have used earlier versions 
> > (7.1.x onwards) and with all of them I noticed same problem 
> > with INSERTs when there is a large data set. Just to so you 
> > guys can compare time it takes to insert one row into a table 
> > when there are only few rows present and when there are thousands:
> > 
> > Rows Present    Start Time      Finish Time
> > 
> > 100             1068790804.12   1068790804.12
> > 1000            1068790807.87   1068790807.87
> > 5000            1068790839.26   1068790839.27
> > 10000           1068790909.24   1068790909.26
> > 20000           1068791172.82   1068791172.85
> > 30000           1068791664.06   1068791664.09
> > 40000           1068792369.94   1068792370.0
> > 50000           1068793317.53   1068793317.6
> > 60000           1068794369.38   1068794369.47
> > 
> > As you can see, it takes an awfully long time for me just to
> > have those values inserted. Now to make the picture a bit
> > clearer for you, this table has lots of information in it,
> > about 25 columns. There are also a few indexes that I created
> > so that selecting values from it is faster, which by the way
> > works fine. Selecting anything takes under 5 seconds.
> > 
> > Any help would be greatly appreciated [...]

Re: [HACKERS] INSERT extremely slow with large data sets

2003-11-14 Thread Slavisa Garic


On Fri, 14 Nov 2003, Alvaro Herrera wrote:

> On Fri, Nov 14, 2003 at 06:36:41PM +1100, Slavisa Garic wrote:
> 
> > Rows Present    Start Time      Finish Time
> > 
> > 100             1068790804.12   1068790804.12
> > 1000            1068790807.87   1068790807.87
> > 5000            1068790839.26   1068790839.27
> > 10000           1068790909.24   1068790909.26
> > 20000           1068791172.82   1068791172.85
> > 30000           1068791664.06   1068791664.09
> > 40000           1068792369.94   1068792370.0
> > 50000           1068793317.53   1068793317.6
> > 60000           1068794369.38   1068794369.47
> 
> > [too slow]
> 
> Ok, so inserting 60000 rows seems to take 0.09 seconds, and inserting
> 5000 takes only 0.01.  And your problem is exactly what?

You didn't understand the question. Inserting ONE ROW when there are already
5000 ROWS present takes 0.01 seconds. Inserting ONE ROW when there are
already 60,000 ROWS present takes 0.09 seconds. In other words, it takes
about 9 times longer to insert ONE ROW when there is a larger set of data
already in the database. As my experiments grow and more data is inserted,
this is starting to take too long. Inserting 70,000 rows takes just over an
hour. Inserting 5000 takes about a minute and a half.

I don't know if this is the behaviour to be expected, so that is why I posted
the question, and that is my problem. I also wanted to know what can be done
to improve this, if it can be.
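
For clarity, the timings quoted above can be reproduced with a small
harness along these lines (a sketch only; the table name and pg-module
usage are illustrative): each sample is the wall-clock time of one
additional INSERT once the table has reached the given size.

    import time
    import pg

    db = pg.connect(dbname='mydb')   # placeholder connection details
    checkpoints = [100, 1000, 5000, 10000, 20000,
                   30000, 40000, 50000, 60000]

    n = 0
    for target in checkpoints:
        # Fill the table up to one row short of the next checkpoint.
        while n < target - 1:
            db.query("INSERT INTO jobs (jobname) VALUES ('job-%d')" % n)
            n += 1
        # Time a single INSERT at this table size.
        start = time.time()
        db.query("INSERT INTO jobs (jobname) VALUES ('job-%d')" % n)
        finish = time.time()
        n += 1
        print("%6d rows present  %.2f  %.2f  (%.3f s)"
              % (target, start, finish, finish - start))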

Regards,
Slavisa
 
> -- 
> Alvaro Herrera ()
> "Coge la flor que hoy nace alegre, ufana. ¿Quién sabe si nacera otra mañana?"
> 




[HACKERS] INSERT extremely slow with large data sets

2003-11-14 Thread Slavisa Garic
Hi Everyone,

This is my first post here so please tell me to go somewhere else if this
is the wrong place to post questions like this.

I am using PostgreSQL 7.3.2 and have used earlier versions (7.1.x onwards),
and with all of them I noticed the same problem with INSERTs when there is a
large data set. Just so you can compare, here is the time it takes to insert
one row into the table when there are only a few rows present and when there
are thousands:

Rows Present    Start Time      Finish Time

100             1068790804.12   1068790804.12
1000            1068790807.87   1068790807.87
5000            1068790839.26   1068790839.27
10000           1068790909.24   1068790909.26
20000           1068791172.82   1068791172.85
30000           1068791664.06   1068791664.09
40000           1068792369.94   1068792370.0
50000           1068793317.53   1068793317.6
60000           1068794369.38   1068794369.47

As you can see, it takes an awfully long time for me just to have those
values inserted. Now to make the picture a bit clearer for you, this table
has lots of information in it, about 25 columns. There are also a few
indexes that I created so that selecting values from it is faster, which by
the way works fine. Selecting anything takes under 5 seconds.

Any help would be greatly appreciated, even just pointing me in the right
direction to ask this question. By the way, I designed the database this way
because my application uses PostgreSQL heavily during execution, so there was
a huge need for fast SELECTs. Our experiments are getting larger and larger
every day, so fast inserts would be good as well.

Just to note, the times above are for the INSERTs only; nothing else is
included in them. The machine was also otherwise idle (this was the only
process running), an Intel(R) Pentium(R) 4 CPU at 2.40GHz.

Regards,
Slavisa



