Re: [PERFORM] printing results of query to file in different times

2017-09-05 Thread Mariel Cherkassky
I'm using an extension called orafce.
Yes, I added the RAISE NOTICE calls in order to see what is happening, but it
doesn't run any faster without them. The execution plan isn't relevant because
it happens for many queries, not for a specific one. I didn't understand what
you mean by REPEATABLE_READ.

2017-08-31 16:24 GMT+03:00 George Neuner :

> On Thu, 24 Aug 2017 16:15:19 +0300, Mariel Cherkassky <
> mariel.cherkas...@gmail.com> wrote:
>
> >I'm trying to understand what PostgreSQL is doing in an issue that I'm
> >having. Our app team wrote a function that runs a cursor over the results
> >of a query and, via the utl_file functions, writes some columns to a file.
> >I don't understand why, but PostgreSQL writes the data to the file in the
> >filesystem in parts. I mean that it runs the query, it takes time to get
> >the results back, and when the results come back PostgreSQL writes some of
> >the data to the file and then suddenly stops for X minutes. After those X
> >minutes it starts writing the data again, and it continues that way until
> >it is done. The query returns *100* rows in total. I want to understand
> >why it stops suddenly. There aren't any locks in the database during this
> >operation.
> >
> >my function looks like this:
> >
> >func(a, b, c, ...)
> >cursor cr for
> >    select ab, c, d, e...
> >begin
> >    raise notice 'starting loop time - %', timeofday();
> >    for cr_record in cr loop
> >        raise notice 'print to file - %', timeofday();
> >        utl_file.write(file, cr_record);
> >    end loop;
> >end;
> >
> >The log of the run shows the following output:
> >
> >starting loop 16:00
> >print to file : 16:03
> >print to file : 16:03
> >print to file : 16:07
> >print to file : 16:07
> >print to file : 16:07
> >print to file : 16:10
> >
> >..
> >
> >Can somebody explain this kind of behavior to me? Why is it taking so
> >much time to write, and at different minutes after the query has already
> >been executed and finished? Maybe I'm getting only part of the rows from
> >the cursor?
>
>
> First I'd ask where you got  utl_file  from.  PostgreSQL has no such
> facility, so you must be using an extension.  And not one I'm familiar with
> either -  EnterpriseDB has a utl_file implementation in their Oracle
> compatibility stuff, but it uses "get" and "put" calls rather than "read"
> and "write".
>
> Second, raising notices can be slow - I assume you added them to see what
> was happening?  How does the execution time compare if you remove them?
>
> I saw someone else asked about the execution plan, but I'm not sure that
> will help here because it would affect only the initial select ... the
> cursor would be working with the result set and should be able to skip
> directly to the target rows.  I might expect several seconds for the loop
> with I/O ... but certainly not minutes unless the server is severely
> overloaded.
>
> One thing you might look at is the isolation level of the query.  If you
> are using READ_COMMITTED or less, and the table is busy, other writing
> queries may be stepping on yours and even potentially changing your result
> set during the cursor loop.  I would try using REPEATABLE_READ and see what
> happens.
>
>
> George
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


[PERFORM] slow index scan performance

2017-09-05 Thread Soni M
Consider these two index scans produced by a query

->  Index Scan using response_log_by_activity on public.response_log rl2
     (cost=0.00..51.53 rows=21 width=8) (actual time=9.017..9.056 rows=0 loops=34098)
       Output: rl2.activity_id, rl2.feed_id
       Index Cond: (rl2.activity_id = rl.activity_id)
       Filter: rl2.success
       Buffers: shared hit=3357159 read=153313
->  Index Scan using activity_pkey on public.activity a
     (cost=0.00..51.10 rows=1 width=12) (actual time=0.126..0.127 rows=1 loops=34088)
       Output: a.status_id, a.activity_id, a.visit_id
       Index Cond: (a.activity_id = rl.activity_id)
       Buffers: shared hit=137925 read=32728


And their sizes:

conscopy=# select pg_size_pretty(pg_relation_size('response_log_by_activity'::regclass));
 pg_size_pretty
----------------
 7345 MB
(1 row)

conscopy=# select pg_size_pretty(pg_relation_size('activity_pkey'::regclass));
 pg_size_pretty
----------------
 8110 MB
(1 row)

The index scan on response_log_by_activity is far slower. The table has just
been repacked and the index rebuilt, but it is still slow.

Is there any other way to make it faster?

Why is Buffers: shared hit=3357159 read=153313 on response_log_by_activity
so much bigger than Buffers: shared hit=137925 read=32728 on activity_pkey,
while activity_pkey is the bigger index?
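
One avenue not yet tried: because every one of the 34098 loops applies
Filter: rl2.success only after the index lookup, a partial index restricted
to successful rows might cut the buffer traffic considerably. A sketch only;
the index name below is invented, but partial indexes and CREATE INDEX
CONCURRENTLY both exist in 9.1:

-- Hypothetical sketch: index only the rows the join actually keeps,
-- so the "Filter: rl2.success" step no longer discards fetched rows.
CREATE INDEX CONCURRENTLY response_log_by_activity_success
    ON public.response_log (activity_id)
    WHERE success;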

-- 
Regards,

Soni Maula Harriz


Re: [PERFORM] slow index scan performance

2017-09-05 Thread Soni M
It's Postgres 9.1.24 on RHEL 6.5

On Tue, Sep 5, 2017 at 8:24 PM, Soni M  wrote:

> Consider these two index scans produced by a query
>
> ->  Index Scan using response_log_by_activity on public.response_log rl2
>      (cost=0.00..51.53 rows=21 width=8) (actual time=9.017..9.056 rows=0 loops=34098)
>        Output: rl2.activity_id, rl2.feed_id
>        Index Cond: (rl2.activity_id = rl.activity_id)
>        Filter: rl2.success
>        Buffers: shared hit=3357159 read=153313
> ->  Index Scan using activity_pkey on public.activity a
>      (cost=0.00..51.10 rows=1 width=12) (actual time=0.126..0.127 rows=1 loops=34088)
>        Output: a.status_id, a.activity_id, a.visit_id
>        Index Cond: (a.activity_id = rl.activity_id)
>        Buffers: shared hit=137925 read=32728
>
>
> And their sizes:
>
> conscopy=# select pg_size_pretty(pg_relation_size('response_log_by_activity'::regclass));
>  pg_size_pretty
> ----------------
>  7345 MB
> (1 row)
>
> conscopy=# select pg_size_pretty(pg_relation_size('activity_pkey'::regclass));
>  pg_size_pretty
> ----------------
>  8110 MB
> (1 row)
>
> The index scan on response_log_by_activity is far slower. The table has just
> been repacked and the index rebuilt, but it is still slow.
>
> Is there any other way to make it faster?
>
> Why is Buffers: shared hit=3357159 read=153313 on response_log_by_activity
> so much bigger than Buffers: shared hit=137925 read=32728 on activity_pkey,
> while activity_pkey is the bigger index?
>
> --
> Regards,
>
> Soni Maula Harriz
>



-- 
Regards,

Soni Maula Harriz


Re: [PERFORM] slow index scan performance

2017-09-05 Thread Soni M
Trying on another server gives a different result.

->  Index Scan using response_log_by_activity on public.response_log rl2
     (cost=0.00..50.29 rows=17 width=8) (actual time=0.955..0.967 rows=0 loops=30895)
       Output: rl2.activity_id, rl2.feed_id
       Index Cond: (rl2.activity_id = rl.activity_id)
       Filter: rl2.success
       Buffers: shared hit=2311312 read=132342
->  Index Scan using activity_pkey on public.activity a
     (cost=0.00..49.79 rows=1 width=12) (actual time=13.747..13.762 rows=1 loops=30892)
       Output: a.status_id, a.activity_id, a.visit_id
       Index Cond: (a.activity_id = rl.activity_id)
       Buffers: shared hit=124463 read=30175

Now it is the index scan on activity_pkey that is much slower. Can someone
please explain this?

Thanks

On Tue, Sep 5, 2017 at 8:46 PM, Soni M  wrote:

> It's Postgres 9.1.24 on RHEL 6.5
>
> On Tue, Sep 5, 2017 at 8:24 PM, Soni M  wrote:
>
>> Consider these two index scans produced by a query
>>
>> ->  Index Scan using response_log_by_activity on public.response_log rl2
>>      (cost=0.00..51.53 rows=21 width=8) (actual time=9.017..9.056 rows=0 loops=34098)
>>        Output: rl2.activity_id, rl2.feed_id
>>        Index Cond: (rl2.activity_id = rl.activity_id)
>>        Filter: rl2.success
>>        Buffers: shared hit=3357159 read=153313
>> ->  Index Scan using activity_pkey on public.activity a
>>      (cost=0.00..51.10 rows=1 width=12) (actual time=0.126..0.127 rows=1 loops=34088)
>>        Output: a.status_id, a.activity_id, a.visit_id
>>        Index Cond: (a.activity_id = rl.activity_id)
>>        Buffers: shared hit=137925 read=32728
>>
>>
>> And their sizes:
>>
>> conscopy=# select pg_size_pretty(pg_relation_size('response_log_by_activity'::regclass));
>>  pg_size_pretty
>> ----------------
>>  7345 MB
>> (1 row)
>>
>> conscopy=# select pg_size_pretty(pg_relation_size('activity_pkey'::regclass));
>>  pg_size_pretty
>> ----------------
>>  8110 MB
>> (1 row)
>>
>> The index scan on response_log_by_activity is far slower. The table has just
>> been repacked and the index rebuilt, but it is still slow.
>>
>> Is there any other way to make it faster?
>>
>> Why is Buffers: shared hit=3357159 read=153313 on response_log_by_activity
>> so much bigger than Buffers: shared hit=137925 read=32728 on activity_pkey,
>> while activity_pkey is the bigger index?
>>
>> --
>> Regards,
>>
>> Soni Maula Harriz
>>
>
>
>
> --
> Regards,
>
> Soni Maula Harriz
>



-- 
Regards,

Soni Maula Harriz


Re: [PERFORM] printing results of query to file in different times

2017-09-05 Thread George Neuner

Hi Mariel,

Please don't top post in the Postgresql groups.

On 9/5/2017 7:28 AM, Mariel Cherkassky wrote:
> 2017-08-31 16:24 GMT+03:00 George Neuner:
>> One thing you might look at is the isolation level of the query.
>> If you are using READ_COMMITTED or less, and the table is busy,
>> other writing queries may be stepping on yours and even
>> potentially changing your result set during the cursor loop.  I
>> would try using REPEATABLE_READ and see what happens.
>
> I didn't understand what you mean by REPEATABLE_READ.
I was referring to transaction isolation levels.  When multiple
transactions are running concurrently, the DBMS may or may not prevent
them from seeing changes made by one another.  Consider two transactions
A and B running concurrently:


  T1:   A reads table X
  T2:   B writes to table X
  T3:   B commits
  T4:   A reads table X again.

Depending on the isolation levels [and the specific query, obviously], A 
may or may not be able to see what changes B made to X.


The default isolation level in PostgreSQL is READ COMMITTED, which does
allow transactions to see committed writes made by concurrently running
transactions.  REPEATABLE READ is a higher level of isolation which
effectively takes a snapshot of the table(s) when they are first read, and
guarantees that any further reads (e.g., by cursors) of the tables made
by the transaction continue to see the same results.



My thought was that your loop may be running slowly because the table is 
being changed underneath your cursor.  It may be better to pull the 
results into a temporary table and run your cursor loop over that.


For more information, see:
https://www.postgresql.org/docs/current/static/transaction-iso.html
https://www.postgresql.org/docs/9.6/static/sql-begin.html
https://stackoverflow.com/questions/6274457/set-isolation-level-for-postgresql-stored-procedures

George


Re: [PERFORM] printing results of query to file in different times

2017-09-05 Thread George Neuner


Geez ... I just saw how my last message got mangled.
Trying again.


On 9/5/2017 7:28 AM, Mariel Cherkassky wrote:

I didn't understand what you mean by REPEATABLE_READ.


I was referring to transaction isolation levels.  When multiple
transactions are running concurrently, the DBMS may or may not prevent
them from seeing changes made by one another.  Consider two transactions
A and B running concurrently:


  T1:   A reads table X
  T2:   B writes to table X
  T3:   B commits
  T4:   A reads table X again.

Depending on the isolation levels [and the specific query, obviously], A 
may or may not be able to see what changes B made to X.


The default isolation level in PostgreSQL is READ COMMITTED, which does
allow transactions to see committed writes made by concurrently running
transactions.  REPEATABLE READ is a higher level of isolation which
effectively takes a snapshot of the table(s) when they are first read, and
guarantees that any further reads (e.g., by cursors) of the tables made
by the transaction continue to see the same results.
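
As a minimal sketch of what using the higher isolation level looks like
(the function name and arguments here are placeholders, not your actual code):

-- Hypothetical sketch: run the cursor loop inside a REPEATABLE READ
-- transaction so every fetch sees the same snapshot of the data.
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT my_export_func(1, 2, 3);  -- placeholder for your function call
COMMIT;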



My thought was that your loop may be running slowly because the table is 
being changed underneath your cursor.  It may be better to pull the 
results into a temporary table and run your cursor loop over that.
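
A minimal sketch of that approach, with the table name and query invented
for illustration:

-- Hypothetical sketch: materialize the query result once, then let the
-- cursor loop read the temp table, which concurrent writers cannot touch.
CREATE TEMP TABLE export_rows AS
    SELECT ab, c, d, e          -- the columns your cursor selects
    FROM response_data;         -- placeholder for your real query

-- The function's cursor would then be declared over export_rows
-- instead of the live table.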


For more information, see:
https://www.postgresql.org/docs/current/static/transaction-iso.html
https://www.postgresql.org/docs/9.6/static/sql-begin.html
https://stackoverflow.com/questions/6274457/set-isolation-level-for-postgresql-stored-procedures 



George

