Re: [GENERAL] database is bigger after dump/restore - why? (60 GB to 109 GB)

2011-02-24 Thread marcin mank
On Tue, Feb 22, 2011 at 9:44 PM, Aleksey Tsalolikhin
atsaloli.t...@gmail.com wrote:

 Most of our data is in a single table, which on the old server is 50 GB in
 size and on the new server is 100 GB in size.


Maybe the table on the new server has a fillfactor less than 100?
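
You can check with something like this (a sketch; the table name is made up):

select reloptions from pg_class where relname = 'big_table';
-- e.g. {fillfactor=50} would leave half of every page empty;
-- an empty/NULL result means the default fillfactor of 100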

Greetings
Marcin



Re: [GENERAL] sudoku in an sql statement

2009-11-04 Thread marcin mank
 I think the Oracle guy's version could easily be adapted to PG 8.4 ---
 those little rownum subqueries seem to be just a substitute for not
 having generate_series(1,9), and everything else is just string-pushing.

indeed.

marcin=# with recursive x( s, ind ) as
( select sud, position( ' ' in sud )
  from  (select '53  7    6  195    98    6 8   6   34  8 3  17   2   6 6    28    419  5    8  79'::text as sud) xx
  union all
  select substr( s, 1, ind - 1 ) || z || substr( s, ind + 1 )
       , position( ' ' in repeat( 'x', ind ) || substr( s, ind + 1 ) )
  from x
     , (select gs::text as z from generate_series(1,9) gs) z
  where ind > 0
  and not exists ( select null
                   from generate_series(1,9) lp
                   where z.z = substr( s, ( ( ind - 1 ) / 9 ) * 9 + lp, 1 )
                   or    z.z = substr( s, mod( ind - 1, 9 ) - 8 + lp * 9, 1 )
                   or    z.z = substr( s, mod( ( ( ind - 1 ) / 3 ), 3 ) * 3
                                          + ( ( ind - 1 ) / 27 ) * 27 + lp
                                          + ( ( lp - 1 ) / 3 ) * 6
                                        , 1 )
                 )
)
select s
from x
where ind = 0;
                                        s
------------------------------------------------------------------------------------
 534678912672195348198342567859761423426853791713924856961537284287419635345286179
(1 row)

Time: 472.027 ms


btw: it is pretty cool to replace some of the numbers in the input with
spaces and see how the query finds multiple solutions.

btw2: is SQL with 'with recursive' Turing-complete? Anyone care to
try a Brainf*ck interpreter? :)

Greetings
marcin Mańk



Re: [GENERAL] sudoku in an sql statement

2009-11-04 Thread marcin mank
On Thu, Nov 5, 2009 at 12:31 AM, Richard Broersma
richard.broer...@gmail.com wrote:
 I don't get the same results:

This is due to my email client breaking the lines.
Try this: http://pastebin.com/f2a0884a1

Greetings
Marcin Mańk



Re: [GENERAL] sudoku in an sql statement

2009-11-04 Thread marcin mank
On Thu, Nov 5, 2009 at 12:39 AM, marcin mank marcin.m...@gmail.com wrote:
 On Thu, Nov 5, 2009 at 12:31 AM, Richard Broersma
 richard.broer...@gmail.com wrote:
 I don't get the same results:

 This is due to my email client breaking the lines.
 Try this: http://pastebin.com/f2a0884a1

doh.
http://www.pastie.org/684163

Greetings
Marcin Mańk



Re: [GENERAL] Partitioned tables as a poor mans columnar index?

2009-10-16 Thread marcin mank
On Fri, Oct 16, 2009 at 9:19 PM, Peter Hunsberger
peter.hunsber...@gmail.com wrote:

 The basic problem I have is that I have some tables that are
 potentially very long (100,000's to millions of rows) and very skinny,

 and I end up with maybe a total of 12 bits of data in each row.

Are You aware that there are some 20-ish bytes of metadata for each
row? Saving 4 bytes buys You nothing. Give it up.

Also, these are actually pretty small tables (i.e. they fit in memory
of any non-toy server).
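
A quick way to see this overhead (a sketch; the table name is made up):

create table skinny (flags smallint);    -- 2 bytes of actual payload
insert into skinny select 1 from generate_series(1, 100000);
select pg_relation_size('skinny') / 100000.0 as bytes_per_row;
-- comes out around 36 bytes per row: the ~23-byte tuple header, the
-- 4-byte line pointer and alignment padding dwarf the 2-byte column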

Greetings
Marcin



Re: [GENERAL] PostgreSQL Object-Oriented Database?

2009-04-27 Thread marcin mank
Is there any sane reason to use an array column, besides performance
(the values can be read with fewer disk seeks than in a
table-with-a-foreign-key scenario)?
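
For concreteness, the two designs being compared (a hypothetical schema):

-- array variant: the values sit inside the parent row itself
create table post (id int primary key, tags text[]);

-- foreign-key variant: one row per value, each possibly on a distant page
create table post_plain (id int primary key);
create table post_tag (post_id int references post_plain(id), tag text);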

Greetings
marcin



Re: [GENERAL] intermittant performance problem

2009-03-29 Thread marcin mank
I think (a part of) Your problem is that order by random() is O(N log N)
complexity, while You are after O(N).

The solution (in pseudocode):

random_sample(resultset, K):
  result := first K rows from resultset
  resultset.scrollto(K+1)
  p := K + 1
  while resultset.hasMoreRows():
    row := resultset.current()
    resultset.advance()
    if random() < K/p:
      replace a random element in result with row
    p := p + 1
  return result


the invariant being that at each step result contains a random sample
of K elements.

It should be fairly easy to implement in plpgsql.
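
For instance, a minimal sketch (it assumes a table items with an integer
column id; both names are made up):

create or replace function random_sample(k int) returns setof int as $$
declare
    result int[] := '{}';
    p      int   := 0;
    v      int;
begin
    for v in select id from items loop
        p := p + 1;
        if p <= k then
            -- fill the reservoir with the first K rows
            result := array_append(result, v);
        elsif random() < k::float8 / p then
            -- keep row p with probability K/p, evicting a random victim
            result[1 + floor(random() * k)::int] := v;
        end if;
    end loop;
    return query select unnest(result);
end;
$$ language plpgsql;

-- usage: select * from random_sample(10);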

Greetings
Marcin



Re: [GENERAL] intermittant performance problem

2009-03-29 Thread marcin mank
On Sun, Mar 29, 2009 at 10:24 AM, marcin mank marcin.m...@gmail.com wrote:
 I think (a part of) Your problem is that order by random() is O(N
 logN) complexity, while You are after O(N) .

 The solution (in pseudocode)


[snip]

OK, I may be guiding You the wrong way.

select g,g,g,g from generate_series(1,2500) as g order by random() limit 10;

executes in under thirty seconds, so I don't think the sort is a problem.

Greetings
Marcin



Re: [GENERAL] What is the best plan to upgrade PostgreSQL from an ancient version?

2009-01-26 Thread marcin mank
On Mon, Jan 26, 2009 at 10:37 PM, Dann Corbit dcor...@connx.com wrote:
 My notion is to do a character mode database dump as SQL statements and
 then load into the new version by execution of psql against the sql
 STATEMENTS.

 What are the gotchas we can expect with this approach?

 When I say 'ancient' I mean v7.1.3 and the target is v8.3.5.


One gotcha is that some ancient versions (and I think 7.1 is one of
them) silently truncated varchar values that don't fit into the
declared field width, while recent versions throw an error. Check
whether Your apps depend on this behaviour.
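
For example (hypothetical table; the 7.1 behaviour is from memory):

create table t (v varchar(3));
insert into t values ('abcd');
-- 7.1: silently stores 'abc'
-- 8.3: ERROR:  value too long for type character varying(3)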

good luck
Marcin



Re: [GENERAL] Query sometimes takes down server

2009-01-15 Thread marcin mank
 I have a dynamically built query that will periodically (2 times a day and
 becoming more frequent) make my server totally unresponsive.

Does this query involve more than geqo_threshold (default 12) tables?
If so, this is most probably geqo (the genetic query optimizer) kicking
in. Try fiddling with the geqo parameters (raising geqo_effort is
the first choice).
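
For example, to experiment within one session (a sketch; tune to taste):

show geqo_threshold;      -- default 12
set geqo_effort = 10;     -- default 5, range 1..10
-- or take geqo out of the picture entirely, to confirm the diagnosis:
set geqo = off;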

Greetings
Marcin



Re: [GENERAL] View vs Constantly Updated Table

2008-12-16 Thread marcin mank
It is generally better to save the balance. The general rule in
accounting systems design is: what can be printed should be stored
explicitly on disk. For an invoice, the value before tax, the tax
percentage, the value after tax, the total before tax, the total
after tax, etc. should all be saved explicitly.

An account should have a balance. Every operation should store the
balance before the operation, the value, and the balance after the
operation. You should never update an operation.

This way, when business rules change, all previous documents remain
stored in a consistent state.
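
A sketch of such a schema (table and column names are made up):

create table account (
    id      integer primary key,
    balance numeric not null
);

create table operation (
    id             serial primary key,
    account_id     integer not null references account(id),
    value          numeric not null,
    balance_before numeric not null,
    balance_after  numeric not null  -- stored explicitly, never recomputed
);
-- operations are insert-only; a mistake gets a new compensating row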

Greetings
Marcin Mańk



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-23 Thread marcin mank
 Yes, the figures are like this:
 * average number of raw inserts / second (without any optimization
  or previous aggregation): #clients (~ 100 thousand) * #sensors (~ 10)
  / 6 seconds = 166 thousand inserts / second...

This is the average?
166 000 * 20 bytes per record * 86400 seconds per day ≈ 287 GB / day,
not counting indices.

What time span do You want to keep the data for?

Greetings
Marcin



Re: [GENERAL] join question

2008-10-22 Thread marcin mank
Sort Method:  external sort  Disk: 1320kB

One simple speedup could be upping Your work_mem to 2MB for this query,
so the sorts happen in memory.
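
For example, for just this session (a sketch):

set work_mem = '2MB';
-- re-run the query under EXPLAIN ANALYZE; the line above should become
-- "Sort Method:  quicksort  Memory: ..." instead of the external sort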

btw: Last time I used Postgres, it did not show the sort method. Cool.

Greetings
Marcin Mank



Re: [GENERAL] query takes a long time

2008-08-23 Thread marcin mank

 select count(distinct attendance.studentid)
 from attendance
 where attendance.studentid not in (select studentid from attendance where
 auth not in ('C','E','F','G','H','I','M','N','O','R','S','T','U'))

I am quite tired right now, but isn't it the same as:

select count(distinct attendance.studentid)
from attendance
where auth in ('C','E','F','G','H','I','M','N','O','R','S','T','U')

?


Re: [GENERAL] Recovering deleted or updated rows

2006-11-15 Thread Marcin Mank

- Original Message - 
From: Florian G. Pflug [EMAIL PROTECTED]
To: pgsql-general@postgresql.org
Sent: Wednesday, November 15, 2006 11:19 AM
Subject: [GENERAL] Recovering deleted or updated rows


 Hi

 I'm looking for a way to recover deleted or old versions of
 accidentally updated rows from a postgres 7.4 database. I've
 verified that the relevant tables haven't been vacuumed since
 the accident took place.

 I was thinking that it might work to patch the clog so that
 the offending transactions look like they have never been
 comitted? Would that work? How could I patch the clog?

 If you have any other ideas, please tell me - I'm quite
 desperate ;-)


Be sure to back up the data files before trying any of my ideas.


1) pgfsck - the last supported version was iirc 7.3, but with some hacking it
may work for You (when I tried it with 7.4, it gave some errors about
unknown data types)

2) pg_resetxlog
- select xmin from table where id = id_of_a_badly_updated_row (if that was
updated in one transaction. If not, try to find the minimum xmin)
- stop postmaster
- reset the transaction counter to a value a hundred less than what You got
- start postmaster, and You should see the old data
- pg_dump the table. There may be some strange things in the dump - review
it manually
- stop postmaster
- restore datafiles from backup (pg_resetxlog may have messed up your data)
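
In commands, roughly (a sketch: the table name, id, xid and data directory
are made up; iirc 7.4's pg_resetxlog sets the next transaction id with -x):

select xmin from badly_updated_table where id = 42;   -- suppose: 12345
-- stop the postmaster, then:
--   pg_resetxlog -x 12245 /var/lib/pgsql/data        (12345 - 100)
-- start the postmaster, check the old data is visible, pg_dump it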

Greetings
Marcin




Re: [GENERAL] execute/perform and FOUND

2006-09-21 Thread Marcin Mank
 gives the result below. It seems inconsistent to me with
 regard to the FOUND variable. I would expect FOUND to always
 be false regardless of whether I use EXECUTE or PERFORM. I
 certainly do not expect it to be true for the third EXECUTE
 even assuming that PERFORM may have a bug. What is it that I
 am missing out on here ?


With:

 perform cmd;
 raise notice ''found (perform): %'', found;

You effectively do:
select 'select 1 from test where fk_item=1324314' ;
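
i.e. the string constant itself comes back as a one-row result, so FOUND
is always set to true:

         ?column?
-------------------------------------------
 select 1 from test where fk_item=1324314
(1 row)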


Try:

perform 1 from test where fk_item=1324314;

Greetings
Marcin



Re: [GENERAL] references/tutorial/tricks on dynamic generation of sql ( plpgsql functions) editing/coding

2006-09-11 Thread Marcin Mank
 I'm using pg 7.4 so $$ trick shouldn't work.

Should work. If you put enough $$, somebody might backport this for you ;)


how about:

create or replace function SP_GarbageCollectionGenerate()
returns bool as '
declare
    v_query text;
begin
    v_query := ''create or replace blah blah
:para1 blah blah
blah blah :para2
'';
    v_query := replace(v_query, '':para1'', quote_literal(value_1));
    v_query := replace(v_query, '':para2'', quote_ident(value_2));
    execute v_query;
    return true;
end;
' language plpgsql;

a bit more readable, I think.

Greetings
Marcin



Re: [GENERAL] I need help creating a query

2006-07-14 Thread Marcin Mank

- Original Message - 
From: Sergio Duran [EMAIL PROTECTED]
To: pgsql-general@postgresql.org
Sent: Thursday, July 13, 2006 9:20 PM
Subject: [GENERAL] I need help creating a query


 Hello,

 I need a little help creating a query. I have two tables, worker and
 position. For simplicity's sake, worker only has its ID and its name;
 position has the ID of the worker, the name of his position, a date, and
 his salary.

 worker:   worker_id, name
 position: position_id, worker_id, position, startdate, salary

 If I perform a query joining both tables, I can obtain all the workers and
 the positions they've had.

 SELECT name, startdate, position, salary FROM worker JOIN position
 USING (worker_id);
 worker1 | 2001-01-01 | boss       | 99
 worker2 | 2001-01-01 | cleaning   | 100
 worker2 | 2006-04-01 | programmer | 2
 worker2 | 2006-07-04 | management | 25000

 So far so good. Now I need to obtain all the workers, only with the
 position they had on a given date.
 If I wanted to know the positions on '2006-05-01', it would return:
 worker1 | 2001-01-01 | boss       | 99
 worker2 | 2006-04-01 | programmer | 2


This should work:

select distinct on (W.worker_id) W.name, P.startdate, P.position, P.salary
from worker W, position P
where P.worker_id = W.worker_id
and 'SOME DATE' >= P.startdate
order by W.worker_id, P.startdate desc

Cheers
Marcin

