Re: [GENERAL] PostgreSQL server listen on other port than 5432

2009-08-02 Thread Dotan Barak
Hi.

On Thu, Jul 30, 2009 at 7:33 PM, Tom Lane wrote:
> Dotan Barak  writes:
>> The weird thing is that I used this port in a service that I wrote
>> only a few seconds before this happened...
>
> Oh?  How'd you start that service exactly?
>
> I'm thinking maybe the postmaster inherited the open file from its
> parent process.  If it's not marked close-on-exec, which evidently
> it's not since the child processes have it too, then this could have
> happened as far as Postgres itself is concerned.  I'm having a bit of
> a hard time imagining how an open file could have gotten transmitted
> from some other initscript to this one, but it seems more probable
> than any other theory at the moment.
>
> Do any other processes besides PG have that socket open?  If you stop
> and restart the postmaster, does it open the socket again?

I guess you hit the spot: I have a service, written in Python,
which uses port 17583.
This process restarts the PostgreSQL service using
"/etc/init.d/postgres restart".

I think that this may be related to this problem ...

I will mark the socket as close-on-exec.
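A minimal sketch of doing that in Python (note: Python 3.4+ already sets close-on-exec on new file descriptors by default, per PEP 446; on the Python 2 of that era it had to be done by hand):

```python
import fcntl
import socket

def mark_cloexec(sock):
    # Set FD_CLOEXEC so children spawned via exec (such as
    # "/etc/init.d/postgres restart") do not inherit the listening socket.
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFD)
    fcntl.fcntl(sock.fileno(), fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mark_cloexec(sock)
flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFD)
print(bool(flags & fcntl.FD_CLOEXEC))  # True
```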

Thanks
Dotan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] building a binary-portable database

2009-08-02 Thread Martijn van Oosterhout
On Sun, Aug 02, 2009 at 01:42:13AM -0400, Alexy Khrabrov wrote:
> Well, my question, of course, is, how come all those differences might  
> affect PG binary data so much -- portable design would try to minimize  
> such effects, wouldn't it?  Does it optimize for all of the above  
> intentionally, is it a side-effect of its design, and/or is there a set 
> of options for the build time which might minimize binary  
> incompatibility?  I'd like to understand exactly why and how we get  
> binary incompatibility, and what exactly do we get for not having it,  
> and can it be a choice. There's a lot of databases out there. e.g.  
> Berkeley DB, where the backup is mv or ftp.  Performance is all right,
> too.  I wish I could configure some of my PG ones that way...

As long as you're only dealing with strings it's not a problem, but
when you start talking about integers or floating point there is no
"standard format". While it would theoretically be possible to make a
binary-compatible version, the cost would be an extra conversion layer
for each and every column access in every table.
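The endianness half of that is easy to illustrate (Python's struct module here, purely as an outside illustration - nothing PostgreSQL-specific):

```python
import struct

# The same 32-bit integer (0x00000102 = 258) serialized in the two byte orders
le = struct.pack('<i', 258)  # little-endian
be = struct.pack('>i', 258)  # big-endian
print(le.hex())  # 02010000
print(be.hex())  # 00000102
```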

This is before you have even taken into account the fact that different
CPUs have different alignment requirements for different types, so to
be portable you would have to take the worst case, which just wastes
space on architectures that don't need it.

Finally you have effects like on 64-bit architectures you can pass a
64-bit value in a register, whereas on 32-bit architectures you may need
to allocate memory and pass a pointer.

Binary compatibility takes work and costs performance, and we prefer to
focus on other things.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.




Re: [GENERAL] Division by zero

2009-08-02 Thread Oliver Kohll - Mailing Lists

On 31 Jul 2009, at 19:49, Jeff Davis wrote:


> Yes -- you can do what you want anyway. If you want it to be more
> readable, you can redefine the division operator yourself -- it's just a
> function and operator that happens to be named "/" (although that might
> not be wise). Maybe you can call it "//" to avoid confusion with people
> used to the SQL standard behavior.


Great idea - that's a very powerful feature, being able to redefine an
operator. I did that as you suggested and it seems to work fine. My
users access Postgres through a web app layer, so I modified the
application to replace any cases of / with // in calculations as
they're created.


In case there are any improvements to suggest, and for the benefit of
anyone else who wants to swallow division by zero, the function and
operator definitions are below. I only use integer and double precision
numbers. I assume that using the more general 'numeric' rather than all
combinations of these would have a performance penalty?


Regards
Oliver Kohll

oli...@gtwm.co.uk / 0845 456 1810 / 07814 828608
www.gtwm.co.uk - company
www.gtportalbase.com - product

---

-- This routine creates an alternative division operator
-- that doesn't throw an error on a divide by zero
-- but rather returns null

CREATE OR REPLACE FUNCTION gtpb_divide(integer, integer) RETURNS integer
AS 'SELECT $1 / NULLIF($2,0);'
LANGUAGE SQL
IMMUTABLE
RETURNS NULL ON NULL INPUT;

CREATE OR REPLACE FUNCTION gtpb_divide(double precision, double precision) RETURNS double precision
AS 'SELECT $1 / NULLIF($2,0);'
LANGUAGE SQL
IMMUTABLE
RETURNS NULL ON NULL INPUT;

CREATE OR REPLACE FUNCTION gtpb_divide(double precision, integer) RETURNS double precision
AS 'SELECT $1 / NULLIF($2,0);'
LANGUAGE SQL
IMMUTABLE
RETURNS NULL ON NULL INPUT;

CREATE OR REPLACE FUNCTION gtpb_divide(integer, double precision) RETURNS double precision
AS 'SELECT $1 / NULLIF($2,0);'
LANGUAGE SQL
IMMUTABLE
RETURNS NULL ON NULL INPUT;

CREATE OPERATOR // (
PROCEDURE = gtpb_divide,
LEFTARG = integer,
RIGHTARG = integer
);

CREATE OPERATOR // (
PROCEDURE = gtpb_divide,
LEFTARG = double precision,
RIGHTARG = double precision
);

CREATE OPERATOR // (
PROCEDURE = gtpb_divide,
LEFTARG = double precision,
RIGHTARG = integer
);

CREATE OPERATOR // (
PROCEDURE = gtpb_divide,
LEFTARG = integer,
RIGHTARG = double precision
);
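With those definitions loaded, the operator behaves like this (a quick sketch):

```sql
SELECT 10 // 2;    -- 5
SELECT 10 // 0;    -- NULL instead of a "division by zero" error
SELECT 10 // NULL; -- NULL (RETURNS NULL ON NULL INPUT)
```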





Re: [GENERAL] Division by zero

2009-08-02 Thread Sam Mason
On Sun, Aug 02, 2009 at 12:08:28PM +0100, Oliver Kohll - Mailing Lists wrote:
> -- This routine creates an alternative division operator
> -- that doesn't throw an error on a divide by zero
> -- but rather returns null
> 
> CREATE OR REPLACE FUNCTION gtpb_divide(integer, integer) RETURNS integer
> AS 'SELECT $1 / NULLIF($2,0);'
> LANGUAGE SQL
> IMMUTABLE
> RETURNS NULL ON NULL INPUT;

If I were you I'd remove the "RETURNS NULL ON NULL INPUT" flag.  I used
to think of it as just a "hint" to the planner as to its behavior,
but it turns out that it's interpreted much more strongly by PG.  That
interpretation means that the function doesn't end up getting inlined
where I'd expect it to be, and hence the optimizer doesn't get as much
freedom to rewrite your queries as you may want.

Admittedly it's going to be less of an issue with division than with
other operators, but it's worth bearing in mind.  The "IMMUTABLE" option
is a good one to specify though - keep that!

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] questions on (parallel) COPY and when to REINDEX

2009-08-02 Thread Alban Hertroys

On 1 Aug 2009, at 23:24, Janet Jacobsen wrote:


Both of the big tables (now up to > 15 M rows each) have
indexes on several of the columns.  The indexes were
created using CREATE INDEX CONCURRENTLY...
Both tables have one or two foreign key constraints.


Others have answered some of your questions already, so I snipped a few.


My questions are:
(2) Should I REINDEX these two tables daily after the pipeline
completes?  Is this what other people do in practice?


No need, but as people are querying as soon as data arrives, an
ANALYZE of the table you just copied into should help performance - the
query plans will be derived from more accurate table statistics that
way.



(3) Currently the pipeline executes in serial fashion.  We'd
like to cut the wall clock time down as much as possible.
The data processing and data analysis can be done in parallel,
but can the loading of the database be done in parallel, i.e.,
can I execute four parallel COPY commands from four copies
of a script?  Our initial attempt at doing this failed.  I found one
posting in the archives about parallel COPY, but it doesn't seem
to be quite on point.


As long as you're not using the same database session in parallel you
should be fine. You can't do parallel database operations in the same
session. Last time I did something similar I used separate database
connections.
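For example, four COPYs on four independent connections could be launched like this (a sketch only; the database, table and file names are made up):

```shell
for f in chunk1.csv chunk2.csv chunk3.csv chunk4.csv; do
    psql -d mydb -c "\copy big_table FROM '$f' WITH CSV" &
done
wait   # block until all four loads have finished
```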



(5) If I drop the indexes and foreign key constraints, then is it
possible to COPY to a table from more than one script, i.e., do
parallel COPY?  It seems like a really bad idea to drop those
foreign key constraints.


You can COPY in parallel, but having no FKs does help insert
performance. In that case whether you should or shouldn't remove your
FKs depends on what's more important to you: insert performance or
data correctness.


As some of your users query the data while it's still coming in, I
guess that data correctness is in fact more important to you and you
should keep the FKs.


You wrote that your pipeline runs for a period of 4 hours and the  
table is about 15M rows now. What is taking up all that time? I  
understand why you'd want to parallelise that process, but where do  
you expect the gains?


From the above I'm guessing that part of the problem you want to
solve by parallelising is insert performance. In cases like these I've
seen people run into problems with the look-up speed of foreign keys
because the database chose a bad query plan. Regular ANALYZEs while
inserting data should prevent that.



Should we re-think where our database loading fits into the
overall pipeline, i.e., do the data processing and analysis in
parallel, but keep the data loading sequential?  The reason for
not doing all of the data loading at the end is that some of the
users *really* want to start querying the data and analysis
results as soon as they enter the database.


You seem to have a design where the database grows indefinitely. Your
database doesn't look to be very big yet, but I expect that at some
point in the future your data will grow so much that the ordering of
it on disk starts to matter for how fast records can be fetched.
That's a typical scenario where people here start to advise using
table partitioning.


The thing is though that partitioning only works well if the queries  
your users perform contain clauses of which the database knows they  
divide the data (the same clauses used to partition the table).


The people you work with are apparently knowledgeable enough to
create their own queries. They will have to take partition constraints
into account too if you choose to use table partitioning.



Looking forward to your replies.

Janet


Regards,
Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.







Re: [GENERAL] Division by zero

2009-08-02 Thread Pavel Stehule
2009/8/2 Sam Mason :
> On Sun, Aug 02, 2009 at 12:08:28PM +0100, Oliver Kohll - Mailing Lists wrote:
>> -- This routine creates an alternative division operator
>> -- that doesn't throw an error on a divide by zero
>> -- but rather returns null
>>
>> CREATE OR REPLACE FUNCTION gtpb_divide(integer, integer) RETURNS integer
>> AS 'SELECT $1 / NULLIF($2,0);'
>> LANGUAGE SQL
>> IMMUTABLE
>> RETURNS NULL ON NULL INPUT;
>
> If I were you I'd remove the "RETURNS NULL ON NULL INPUT" flag.  I used
> to think of it as just a "hint" to the planner as to its behavior,
> but it turns out that it's interpreted much more strongly by PG.  The
> interpretation means that the function doesn't end up getting inlined
> where I'd expect it to be and hence the optimizer doesn't get as much
> freedom to rewrite your queries as you may want.
>

I think that's not true - RETURNS NULL ON NULL INPUT is equivalent to the
STRICT flag, and it means: don't run the function when any parameter is
NULL. For the optimizer it means only one thing - when any parameter is a
constant NULL, the function evaluation can be replaced by NULL. But the
optimizer won't detect this case very often, so this is mostly a shortcut
for the evaluator.  This flag doesn't change inlining.

> Admittedly it's going to be less of an issue with division that other
> operators, but it's worth bearing in mind.  The "IMMUTABLE" options is a
> good one to specify though, keep that!
>

There is a paradox - an IMMUTABLE function breaks inlining :(. There is maybe a bug.
> --
>  Sam  http://samason.me.uk/
>



Re: [GENERAL] Run procedure at startup

2009-08-02 Thread Jasen Betts
On 2009-07-27, Saleem EDAH-TALLY  wrote:
> Hello,
>
> Is there a way to run a pl/pgsql automatically at server startup ?

pgagent, perhaps? Or modify the startup scripts?

> Is there a way to run a pl/pgsql function with an infinite loop as a daemon ?

Write a daemon (as a shell script if necessary).
OTOH crontab or pgagent may be enough?

> Is there a way to start a pl/pgsql function that would persist after the user 
> session has closed ?

pgagent perhaps?

> Is there a way for an unprivileged user to delegate a task (allowed by 
> superuser) to a superuser ?

The superuser specifies SECURITY DEFINER when defining the
function.






Re: [GENERAL] Division by zero

2009-08-02 Thread Sam Mason
On Sun, Aug 02, 2009 at 02:20:18PM +0200, Pavel Stehule wrote:
> 2009/8/2 Sam Mason :
> > On Sun, Aug 02, 2009 at 12:08:28PM +0100, Oliver Kohll - Mailing Lists 
> > wrote:
> >> CREATE OR REPLACE FUNCTION gtpb_divide(integer, integer) RETURNS integer
> >> AS 'SELECT $1 / NULLIF($2,0);'
> >> LANGUAGE SQL
> >> IMMUTABLE
> >> RETURNS NULL ON NULL INPUT;
> >
> > If I were you I'd remove the "RETURNS NULL ON NULL INPUT" flag. I used
> > to think of it as just a "hint" to the planner as to its behavior,
> > but it turns out that it's interpreted much more strongly by PG. The
> > interpretation means that the function doesn't end up getting inlined
> > where I'd expect it to be and hence the optimizer doesn't get as much
> > freedom to rewrite your queries as you may want.
> 
> I think that's not true - RETURNS NULL ON NULL INPUT is equivalent to the
> STRICT flag, and it means: don't run the function when any parameter is NULL.

Yes, this is how PG interprets it.

> For the optimizer it means only one thing - when any parameter is a
> constant NULL, the function evaluation can be replaced by NULL. But the
> optimizer won't detect this case very often, so this is mostly a shortcut
> for the evaluator.  This flag doesn't change inlining.

No, not unless things have changed since this discussion:

  http://archives.postgresql.org/message-id/20090604090045.gr5...@samason.me.uk

> > Admittedly it's going to be less of an issue with division than with other
> > operators, but it's worth bearing in mind.  The "IMMUTABLE" option is a
> > good one to specify though, keep that!
> 
> There is a paradox - an IMMUTABLE function breaks inlining :(. There is maybe a bug

Not in any tests I've done.

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] Division by zero

2009-08-02 Thread Pavel Stehule
2009/8/2 Sam Mason :
> On Sun, Aug 02, 2009 at 02:20:18PM +0200, Pavel Stehule wrote:
>> 2009/8/2 Sam Mason :
>> > On Sun, Aug 02, 2009 at 12:08:28PM +0100, Oliver Kohll - Mailing Lists 
>> > wrote:
>> >> CREATE OR REPLACE FUNCTION gtpb_divide(integer, integer) RETURNS integer
>> >> AS 'SELECT $1 / NULLIF($2,0);'
>> >> LANGUAGE SQL
>> >> IMMUTABLE
>> >> RETURNS NULL ON NULL INPUT;
>> >
>> > If I were you I'd remove the "RETURNS NULL ON NULL INPUT" flag. I used
>> > to think of it as just a "hint" to the planner as to its behavior,
>> > but it turns out that it's interpreted much more strongly by PG. The
>> > interpretation means that the function doesn't end up getting inlined
>> > where I'd expect it to be and hence the optimizer doesn't get as much
>> > freedom to rewrite your queries as you may want.
>>
>> I think that's not true - RETURNS NULL ON NULL INPUT is equivalent to the
>> STRICT flag, and it means: don't run the function when any parameter is NULL.
>
> Yes, this is how PG interprets it.
>
>> For the optimizer it means only one thing - when any parameter is a
>> constant NULL, the function evaluation can be replaced by NULL. But the
>> optimizer won't detect this case very often, so this is mostly a shortcut
>> for the evaluator.  This flag doesn't change inlining.
>
> No, not unless things have changed since this discussion:
>
>  http://archives.postgresql.org/message-id/20090604090045.gr5...@samason.me.uk
>
>> > Admittedly it's going to be less of an issue with division than with other
>> > operators, but it's worth bearing in mind.  The "IMMUTABLE" option is a
>> > good one to specify though, keep that!
>>
>> There is a paradox - an IMMUTABLE function breaks inlining :(. There is maybe a bug
>
> Not in any tests I've done.

I did - and in this case IMMUTABLE is the problem and STRICT is not. It's
new to me, because I was using rules that hold only for plpgsql or C
functions. What I see now is that the rules for SQL functions are totally
different.

Pavel

>
> --
>  Sam  http://samason.me.uk/
>



Re: [GENERAL] Division by zero

2009-08-02 Thread Sam Mason
On Sun, Aug 02, 2009 at 05:22:45PM +0200, Pavel Stehule wrote:
> 2009/8/2 Sam Mason :
> > On Sun, Aug 02, 2009 at 02:20:18PM +0200, Pavel Stehule wrote:
>> There is a paradox - an IMMUTABLE function breaks inlining :(. There is maybe a bug
> >
> > Not in any tests I've done.
> 
> I did - and in this case IMMUTABLE is the problem and STRICT is not.

I'm not sure what you're responding to here, but I'm pretty sure the OP
wants IMMUTABLE and does not want STRICT/RETURNS NULL ON NULL INPUT.

> It's new to me, because I was using rules that hold only for plpgsql or C
> functions. What I see now is that the rules for SQL functions are totally
> different.

SQL language functions are going to be different from anything else
because they can be.  The planner has intimate knowledge of SQL and hence
will try hard to expand these out and optimize them (in a similar way to
how it handles views).

The semantics of these keywords shouldn't change between SQL, plpgsql
and C functions though, it's just that the optimizer can look inside an
SQL function and not other functions.

Maybe you can say what you did and what result you got back?

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] building a binary-portable database

2009-08-02 Thread Alexy Khrabrov
How about portability between systems with the same endianness and  
bitness, e.g. Intel 64-bit ones?


Cheers,
Alexy



Re: [GENERAL] Division by zero

2009-08-02 Thread Pavel Stehule
2009/8/2 Sam Mason :
> On Sun, Aug 02, 2009 at 05:22:45PM +0200, Pavel Stehule wrote:
>> 2009/8/2 Sam Mason :
>> > On Sun, Aug 02, 2009 at 02:20:18PM +0200, Pavel Stehule wrote:
>> >> There is a paradox - an IMMUTABLE function breaks inlining :(. There is maybe a bug
>> >
>> > Not in any tests I've done.
>>
>> I did - and in this case IMMUTABLE is the problem and STRICT is not.
>
> I'm not sure what you're responding to here, but I'm pretty sure the OP
> wants IMMUTABLE and does not want STRICT/RETURNS NULL ON NULL INPUT.
>

I checked whether the function was inlined or not. When I marked the
function as STRICT it was inlined; when I marked it as IMMUTABLE it
wasn't. That's all - you can check it too.

>> It's new to me, because I was using rules that hold only for plpgsql or C
>> functions. What I see now is that the rules for SQL functions are totally
>> different.
>
> SQL language functions are going to be different from anything else
> because they can be.  The planner has intimate knowledge of SQL and hence
> will try hard to expand these out and optimize them (in a similar way to
> how it handles views).
>
> The semantics of these keywords shouldn't change between SQL, plpgsql
> and C functions though, it's just that the optimizer can look inside an
> SQL function and not other functions.
>
> Maybe you can say what you did and what result you got back?
>
> --
>  Sam  http://samason.me.uk/
>



Re: [GENERAL] building a binary-portable database

2009-08-02 Thread Martijn van Oosterhout
On Sun, Aug 02, 2009 at 12:02:41PM -0400, Alexy Khrabrov wrote:
> How about portability between systems with the same endianness and  
> bitness, e.g. Intel 64-bit ones?

Some parameters vary between compilers on the same platform. IIRC
whether a long on a 64-bit platform is 64-bit depends on the compiler
(Windows platforms leave long as 32-bit). Alignment also differs
between compilers, which translates to differences on disk.

Then you have things like time_t, which depends on the C library you
use; size_t depends on the memory model, or perhaps even on the compile
flags. Integer datetimes is a configure option.

There's nothing to stop you trying, but there's been no effort in
making it work.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.




Re: [GENERAL] Division by zero

2009-08-02 Thread Sam Mason
On Sun, Aug 02, 2009 at 06:03:11PM +0200, Pavel Stehule wrote:
> 2009/8/2 Sam Mason :
> > On Sun, Aug 02, 2009 at 05:22:45PM +0200, Pavel Stehule wrote:
> >> 2009/8/2 Sam Mason :
> >> > On Sun, Aug 02, 2009 at 02:20:18PM +0200, Pavel Stehule wrote:
> >> >> There is a paradox - an IMMUTABLE function breaks inlining :(. There is
> >> >> maybe a bug
> >> >
> >> > Not in any tests I've done.
> >>
> >> I did - and in this case IMMUTABLE is the problem and STRICT is not.
> >
> > I'm not sure what you're responding to here, but I'm pretty sure the OP
> > wants IMMUTABLE and does not want STRICT/RETURNS NULL ON NULL INPUT.
> 
> I checked if function was inlined or not. When I mark function as
> strict then it was inlined. When I mark function as IMMUTABLE then it
> wasn't inlined. That's all - you can check it too.

I will be checking different things; please say what you're testing.

Different things are inlined in different places; it's the different
places things get inlined that cause the optimizer to do different
things.

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] Division by zero

2009-08-02 Thread Pavel Stehule
2009/8/2 Sam Mason :
> On Sun, Aug 02, 2009 at 06:03:11PM +0200, Pavel Stehule wrote:
>> 2009/8/2 Sam Mason :
>> > On Sun, Aug 02, 2009 at 05:22:45PM +0200, Pavel Stehule wrote:
>> >> 2009/8/2 Sam Mason :
>> >> > On Sun, Aug 02, 2009 at 02:20:18PM +0200, Pavel Stehule wrote:
>> >> >> There is a paradox - an IMMUTABLE function breaks inlining :(. There is
>> >> >> maybe a bug
>> >> >
>> >> > Not in any tests I've done.
>> >>
>> >> I did - and in this case IMMUTABLE is the problem and STRICT is not.
>> >
>> > I'm not sure what you're responding to here, but I'm pretty sure the OP
>> > wants IMMUTABLE and does not want STRICT/RETURNS NULL ON NULL INPUT.
>>
>> I checked whether the function was inlined or not. When I marked the
>> function as STRICT it was inlined; when I marked it as IMMUTABLE it
>> wasn't. That's all - you can check it too.
>
> I will be checking different things, please say what you're testing.
>

Look at the thread "IMMUTABLE break inlining simple SQL functions."

Pavel

> Different things are inlined in different places; it's the different
> places things get inlined that cause the optimizer to do different
> things.
>
> --
>  Sam  http://samason.me.uk/
>



Re: [GENERAL] Division by zero

2009-08-02 Thread Tom Lane
Sam Mason  writes:
> I'm not sure what you're responding to here, but I'm pretty sure the OP
> wants IMMUTABLE and does not want STRICT/RETURNS NULL ON NULL INPUT.

Yeah --- STRICT will prevent inlining.  The function's expression
actually is strict, but the planner isn't smart enough about NULLIF
to realize that, so it doesn't inline.
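One way to see whether inlining actually happened is EXPLAIN VERBOSE; a sketch (table `t` and function `div2` are made-up names):

```sql
CREATE TABLE t (a integer);
CREATE FUNCTION div2(integer, integer) RETURNS integer
  AS 'SELECT $1 / NULLIF($2,0);' LANGUAGE SQL IMMUTABLE;

EXPLAIN VERBOSE SELECT div2(a, 0) FROM t;
-- If the function was inlined, the plan's output column shows the expanded
-- expression (a / NULLIF(0, 0)) rather than a call to div2(); redefine the
-- function with STRICT and the un-inlined call appears instead.
```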

regards, tom lane



Re: [GENERAL] building a binary-portable database

2009-08-02 Thread Tom Lane
Martijn van Oosterhout  writes:
> On Sun, Aug 02, 2009 at 12:02:41PM -0400, Alexy Khrabrov wrote:
>> How about portability between systems with the same endianness and  
>> bitness, e.g. Intel 64-bit ones?

> There's nothing to stop you trying, but there's been no effort in
> making it work.

Exactly.  It might work, but we don't promise it or test for it.
If it breaks you get to keep both pieces.

regards, tom lane



[GENERAL] Problem trying to load trigger

2009-08-02 Thread Michael Gould
CREATE OR REPLACE FUNCTION iss.accessor_trigger()  RETURNS "trigger" AS $$
BEGIN

IF(TG_WHEN = 'BEFORE') THEN
  IF(TG_OP = 'INSERT') THEN
NEW.createdatetime := current_timestamp
NEW.createuser := current_user
  ELSIF (TG_OP = 'UPDATE') THEN
INSERT into iss.auditaccessor SELECT 'B','C',OLD.*;
  ELSIF (TG_OP = 'DELETE') THEN
INSERT into iss.auditaccessor SELECT 'B','D',OLD.*;
  END IF;
  RETURN NEW;
ELSIF (TG_WHEN = 'AFTER') THEN
  IF(TG_OP = 'INSERT') THEN
  ELSIF (TG_OP = 'UPDATE') THEN
INSERT into iss.auditaccessor SELECT 'A','C',NEW.*;
  ELSIF (TG_OP = 'DELETE') THEN
  END IF;
  RETURN OLD;
END IF;

END $$ LANGUAGE plpgsql VOLATILE;

I'm trying to use a single trigger to do a couple of things...

The first is when a record is created to timestamp the createdatetime and
the createuser columns with the current date/time or user.  If there is a
update then I want to make before and after images of the record and if a
delete I want to keep the before image of the record.

when I try and load this I get the following errors.


I'm new to postgres so I'm not sure where I'm off here.  Any help is greatly
appreciated

ERROR:  syntax error at or near "$1"
LINE 1: SELECT  current_timestamp  $1  := current_user ELSIF ( $2  =...
   ^
QUERY:  SELECT  current_timestamp  $1  := current_user ELSIF ( $2  =
'UPDATE') THEN INSERT into iss.auditaccessor SELECT 'B','C', $3 .*
CONTEXT:  SQL statement in PL/PgSQL function "accessor_trigger" near line 8

** Error **

ERROR: syntax error at or near "$1"
SQL state: 42601
Context: SQL statement in PL/PgSQL function "accessor_trigger" near line 8





--
Michael Gould, Managing Partner
Intermodal Software Solutions, LLC
904.226.0978
904.592.5250 fax





Re: [GENERAL] Problem trying to load trigger

2009-08-02 Thread Alban Hertroys

On 2 Aug 2009, at 19:08, Michael Gould wrote:

CREATE OR REPLACE FUNCTION iss.accessor_trigger()  RETURNS "trigger"  
AS $$

BEGIN

IF(TG_WHEN = 'BEFORE') THEN
 IF(TG_OP = 'INSERT') THEN
   NEW.createdatetime := current_timestamp
   NEW.createuser := current_user
 ELSIF (TG_OP = 'UPDATE') THEN
   INSERT into iss.auditaccessor SELECT 'B','C',OLD.*;
 ELSIF (TG_OP = 'DELETE') THEN
   INSERT into iss.auditaccessor SELECT 'B','D',OLD.*;
 END IF;
 RETURN NEW;
ELSIF (TG_WHEN = 'AFTER') THEN
 IF(TG_OP = 'INSERT') THEN
 ELSIF (TG_OP = 'UPDATE') THEN
   INSERT into iss.auditaccessor SELECT 'A','C',NEW.*;
 ELSIF (TG_OP = 'DELETE') THEN
 END IF;
 RETURN OLD;
END IF;

END $$ LANGUAGE plpgsql VOLATILE;



ERROR:  syntax error at or near "$1"
LINE 1: SELECT  current_timestamp  $1  := current_user ELSIF ( $2   
=...

  ^
QUERY:  SELECT  current_timestamp  $1  := current_user ELSIF ( $2  =
'UPDATE') THEN INSERT into iss.auditaccessor SELECT 'B','C', $3 .*
CONTEXT:  SQL statement in PL/PgSQL function "accessor_trigger" near  
line 8


** Error **

ERROR: syntax error at or near "$1"
SQL state: 42601
Context: SQL statement in PL/PgSQL function "accessor_trigger" near  
line 8



You're missing a few semicolons after the first two assignments. I'm
also not sure whether the empty THEN clauses at lines 14 and 17 will
be accepted; it's probably better to leave them out. And lastly, you
don't need parentheses around your conditional expressions or identifier
quoting around the RETURN type.
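Putting those fixes together, the function might look like this (a sketch only, not tested against your schema):

```sql
CREATE OR REPLACE FUNCTION iss.accessor_trigger() RETURNS trigger AS $$
BEGIN
  IF TG_WHEN = 'BEFORE' THEN
    IF TG_OP = 'INSERT' THEN
      NEW.createdatetime := current_timestamp;  -- semicolons added
      NEW.createuser := current_user;
    ELSIF TG_OP = 'UPDATE' THEN
      INSERT INTO iss.auditaccessor SELECT 'B','C',OLD.*;
    ELSIF TG_OP = 'DELETE' THEN
      INSERT INTO iss.auditaccessor SELECT 'B','D',OLD.*;
    END IF;
    RETURN NEW;
  ELSIF TG_WHEN = 'AFTER' THEN
    -- the empty INSERT/DELETE branches have been dropped
    IF TG_OP = 'UPDATE' THEN
      INSERT INTO iss.auditaccessor SELECT 'A','C',NEW.*;
    END IF;
    RETURN OLD;
  END IF;
END $$ LANGUAGE plpgsql VOLATILE;
```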


Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.







Re: [GENERAL] Problem trying to load trigger

2009-08-02 Thread Tom Lane
Michael Gould  writes:
> CREATE OR REPLACE FUNCTION iss.accessor_trigger()  RETURNS "trigger" AS $$
> BEGIN

> IF(TG_WHEN = 'BEFORE') THEN
>   IF(TG_OP = 'INSERT') THEN
> NEW.createdatetime := current_timestamp
> NEW.createuser := current_user


You've forgotten to end these statements with semicolons ...

regards, tom lane



Re: [GENERAL] How to execute external script from a TRIGGER or FUNCTION ?

2009-08-02 Thread Denis BUCHER
Andy Colson a écrit :
>>> I need to execute an external script from Postgresql, it could be in
>>> perl, in bash/shell, in whatever...
>>>
>>> Any help would be appreciated ! I just need a very simple example if
>>> possible...
>>>
>>> I already searched on the web but found nothing...
>>
>> After hours of search, I searched just some more and I think I found the
>> solution, hope it can be useful to someone else :
>>
>> CREATE LANGUAGE plperlu;
>>
>> CREATE FUNCTION test_perl_external (integer) RETURNS boolean AS $$
>>$filename = '/tmp/somefile';
>> if (-e $filename) { return true; }
>> return false;
>> $$ LANGUAGE plperlu;
>>
>> SELECT test_perl_external(1);
> 
> You want to run a script?  The above would test to see if it exists, but
> not run it.
> 
> Use qx/cmd args/  or system('cmd args') to run it.  (In perl anyway, I
> dunno if plperl supports it)

Yes sorry that's correct, my example was just to show a solution to
access the outside world ! This is my real final version :

CREATE OR REPLACE FUNCTION hds_verifycustomer (integer) RETURNS boolean
AS $$
my $no_client = $_[0];
# Verify if customer exists in AS 400 system
$checkexitcode = system
("~postgres/scripts/checklive-as400-customer.pl $no_client >/dev/null
2>/dev/null");
if ($checkexitcode > 0) { return false; }
# Ok update has been done
return true;
$$ LANGUAGE plperlu;

Note, "LANGUAGE" is "plperlu" (u=unsafe) and not "plperl", because
otherwise you can't access the "outside world"...

This function is used in a very complex function that makes everything
transparent to use the "local" postgresql customer database :

SELECT * FROM hds_findcustomer(10234);

This function :
1. Returns the customer if present in postgres
2. Otherwise executes the external script (check and update)
3. And returns the customer if updated

Denis



[GENERAL] character 0xe29986 of encoding "UTF8" has no equivalent in "LATIN2"

2009-08-02 Thread Andreas Kalsch
The function "convert_to(string text, dest_encoding name)" will throw an 
error and so break my application when not supported characters are 
included in the unicode string.

So what can I do
- to filter out characters which have no counterpart in the Latin code sets,
- or to simply ignore the wrong characters?

Problem: Users will enter _any_ characters in my application and an 
error really doesn't help in this case.


What I am searching for is a function that reduces letters with 
diacritics to their plain equivalents.


An example is provided at
http://www.postgres.cz/index.php/PostgreSQL_SQL_Tricks#Diacritic_removing
but it does not work because of this error whenever I pass _any_ valid 
UTF8 character into the functions.


Best,

Andi



Re: [GENERAL] character 0xe29986 of encoding "UTF8" has no equivalent in "LATIN2"

2009-08-02 Thread Sam Mason
On Sun, Aug 02, 2009 at 08:45:52PM +0200, Andreas Kalsch wrote:
> Problem: Users will enter _any_ characters in my application and an 
> error really doesn't help in this case.

Then why don't you stop converting to LATIN2?

> What I am searching for is a function to undiacritic special letters to 
> simple ones.

It would be easy to write a regex to strip out the invalid characters if
that's what you want.
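
A minimal sketch, assuming that reducing the text to printable ASCII is
acceptable (note this is stricter than LATIN2, so it also drops valid LATIN2
letters; in exchange, convert_to() can no longer fail):

```sql
-- '[^ -~]' matches every character outside the printable-ASCII range
-- (space through tilde); the 'g' flag removes all of them.
SELECT convert_to(regexp_replace(input_text, '[^ -~]', '', 'g'), 'LATIN2')
FROM (SELECT 'na♦ve input'::text AS input_text) AS s;
```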

-- 
  Sam  http://samason.me.uk/



[GENERAL] Use of Postgres in REST/JSON with BLOBS

2009-08-02 Thread Dennis Gearon

I decided on a 'Hybrid Server' approach. I will be piping the BLOBs directly 
out of the filesystem over HTTP from an nginx server. I *MAY* store the 'real 
filename' in the database, but I will definitely be storing permissions in the 
database.

I've done EASILY 2 months worth of research and experiments in order to come to 
the solution that I am using. In order to share that with the Open Source 
world, I am posting my blog address to the related newsgroups that I'm on.

http://php-rest-i18n.blogspot.com/

Comment, save yourself some research, etc.

Dennis Gearon

Signature Warning

EARTH has a Right To Life

I agree with Bolivian President Evo Morales

# The right to life: "The right for no ecosystem to be eliminated by the 
irresponsible acts of human beings."

# The right of biosystems to regenerate themselves: "Development cannot be 
infinite. There's a limit on everything."

# The right to a clean life: "The right for Mother Earth to live without 
contamination, pollution. Fish and animals and trees have rights."

# The right to harmony and balance between everyone and everything: "We are all 
interdependent."


See the movie - 'Inconvenient Truth'
See the movie - 'Syriana'



Re: [GENERAL] Problem trying to load trigger

2009-08-02 Thread Michael Gould
Thanks to everyone who answered.  Getting used to Postgres's syntax can
take some time.

Best Regards

Michael Gould

"Tom Lane"  wrote:
> Michael Gould  writes:
>> CREATE OR REPLACE FUNCTION iss.accessor_trigger()  RETURNS "trigger" AS
$$
>> BEGIN
> 
>> IF(TG_WHEN = 'BEFORE') THEN
>>   IF(TG_OP = 'INSERT') THEN
>> NEW.createdatetime := current_timestamp
>> NEW.createuser := current_user
> 
> 
> You've forgotten to end these statements with semicolons ...
> 
> regards, tom lane
> 





Re: [GENERAL] questions on (parallel) COPY and when to REINDEX

2009-08-02 Thread Janet Jacobsen

Thanks for your reply.  Responses below, and one follow-up
question about when/how often to use analyze.

Janet


On 02/08/2009 05:09 a.m., Alban Hertroys wrote:

On 1 Aug 2009, at 23:24, Janet Jacobsen wrote:

My questions are:
(2) Should I REINDEX these two tables daily after the pipeline
completes?  Is this what other people do in practice?


No need, but as people are querying as soon as data is arriving, an 
analyse of the table you just copied to should help performance - the 
query plans will be derived from more accurate table statistics that way.


The files that are being copied into the table(s) are between 200 and 1,000
each, but there are hundreds of these small files every day.  Would you
recommend running analyze after every COPY?



(3) Currently the pipeline executes in serial fashion.  We'd
like to cut the wall clock time down as much as possible.
The data processing and data analysis can be done in parallel,
but can the loading of the database be done in parallel, i.e.,
can I execute four parallel COPY commands from four copies
of a script?  Our initial attempt at doing this failed.  I found one
posting in the archives about parallel COPY, but it doesn't seem
to be quite on point.


As long as you're not using the same database-session in parallel you 
should be fine. You can't do parallel database operations in the same 
session. Last time I did something similar I used separate database 
connections.


Yes, they are separate database connections because the connection
is made in the data loading script.



(5) If I drop the indexes and foreign key constraints, then is it
possible to COPY to a table from more than one script, i.e., do
parallel COPY?  It seems like a really bad idea to drop those
foreign key constraints.


You can COPY in parallel, but having no FK's does help 
insert-performance. In that case whether you should or shouldn't 
remove your FK's depends on what's more important to you; insert 
performance or data correctness.


As some of your users query the data while it's still coming in I 
guess that data correctness is in fact more important to you and you 
should keep the FK's.


Yes, I think we should keep the FK's.


You wrote that your pipeline runs for a period of 4 hours and the 
table is about 15M rows now. What is taking up all that time? I 
understand why you'd want to parallelise that process, but where do 
you expect the gains?



We're processing images, and the data processing and analysis takes
up most of the time, but the images can be processed/analyzed in parallel.
We've been doing all of the data loading at the end - one COPY at a
time.  Originally that made sense because the researchers wanted to
check the images before loading the data/analysis results into the db.

But now we want to load the data/analysis results as soon as they are
available, so if there are four copies of the processing/analysis script
running, we want to have the data loading script initiated at the end of
each processing/analysis script, which is equivalent to four COPYs
writing four different files into the same table.

In terms of time, we will not see a huge decrease in the wall clock time,
but the data/analysis results will be available to the users much sooner,
and this matters a lot to the researchers.


From the above I'm guessing that part of the problem you want to solve 
by parallelising is insert-performance. In cases like these I've seen 
people with problems with the look-up speed of foreign keys because 
the database chose a bad query plan. Regular analyses during inserting 
data should prevent that.


Okay, so I can try this.



Should re-think about where our database loading fits into the
overall pipeline, i.e., do the data processing and analysis in
parallel, but keep the data loading sequential?  The reason for
not doing all of the data loading at the end is that some of the
users *really* want to start querying the data and analysis
results as soon as they enter the database.


You seem to have a design where the database grows indefinitely? Your 
database doesn't look to be very big yet, but I expect that at some 
point in the future your data will grow so much that the ordering of 
it on disk starts to matter for how fast records can be fetched. 
That's a typical scenario where people here start to advise using 
table partitioning.


The thing is though that partitioning only works well if the queries 
your users perform contain clauses of which the database knows they 
divide the data (the same clauses used to partition the table).


The people you work with apparently are knowledgeable enough that they 
create their own queries. They will have to take partition constraints 
into account too if you choose to use table partitioning.




Yes, there are a couple of tables that it seems will grow indefinitely.
We are not in production yet, so there is some question about
should so much be going into the database.  For now the answer is
"yes, everything".


Re: [GENERAL] questions on (parallel) COPY and when to REINDEX

2009-08-02 Thread andy

Janet Jacobsen wrote:

Thanks for your reply.  Responses below to answer your questions,
and one follow-up question on REINDEX.

Janet



Both tables have one or two foreign key constraints.

(2) Should I REINDEX these two tables daily after the pipeline
completes?  Is this what other people do in practice?


it depends if an index exists on the table when you fill it with 
data.  But I repeat myself :-).  If an index exists you would not need 
to reindex it.  It may be faster to fill a table without an index, 
then add an index later.  But that would depend on if you need the 
index for unique constraints.




Ok.  Since data loading occurs daily, and the users query the table
while the data loading is going on, it seems like I should not drop
the indexes before the daily loading. 


I re-read the REINDEX pages.  I see the following statement,

"Also, for B-tree indexes a freshly-constructed index is somewhat
faster to access than one that has been updated many times, because
logically adjacent pages are usually also physically adjacent in a
newly built index. (This consideration does not currently apply to
non-B-tree indexes.) It might be worthwhile to reindex periodically
just to improve access speed."

This quote says "has been updated many times" and "worthwhile to reindex
periodically".  I'm not sure how to interpret "many times" and 
"periodically".


In our case, on a daily basis, 100K rows or so are added to two tables,
and a later script does 100K updates on one of the table. Does that make
us a candidate for daily REINDEXing?


It's tough to say.  I'd guess not every day.  Once a month?  The best way to 
find out is to do some timing.  Do a few indexed select statements and EXPLAIN 
ANALYZE them.  See what the numbers tell you.  Then REINDEX and do the same 
test.  Then wait a month and try the same test.  See if it's much slower.
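
For instance, with hypothetical table and column names:

```sql
EXPLAIN ANALYZE SELECT * FROM image_attrs WHERE obs_id = 12345;  -- note the timing
REINDEX TABLE image_attrs;
EXPLAIN ANALYZE SELECT * FROM image_attrs WHERE obs_id = 12345;  -- compare,
-- then repeat the comparison a month from now
```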

The difference between having an index and not having one is orders of 
magnitude.  The difference between a balanced index and an unbalanced one is 
minor.

A VACUUM ANALYZE might be more important than a REINDEX; depending on how many 
indexes you have, it will update the stats about the indexes and help the 
planner pick the best index.

I can't answer as to what others do; my pg database is 25 MB.  Yes, MB.  And 
260K rows.  It's embarrassing.  By next month I'll probably be up to 260.5K 
rows.  So I really have no experience with the volume of data you're pushing 
around.





(3) Currently the pipeline executes in serial fashion.  We'd
like to cut the wall clock time down as much as possible.
The data processing and data analysis can be done in parallel,
but can the loading of the database be done in parallel, i.e.,
can I execute four parallel COPY commands from four copies


We'd need more specifics.  Are you COPY'ing into two different tables 
at once?  (that should work).  Or the same table with different data 
(that should work too, I'd guess) or the same data with a unique key 
(that'll break)?




We'd like to run four identical scripts in parallel, each of which will
copy a different file into the same table.
Our initial attempt at doing this failed.  


What was the error?



If the return status from trying to do the COPY is 7, the script prints
a failure message, i.e., not the Postgres error. I will set up a test on
a test case to get more information.  (I didn't run the initial try.)


COPY won't return 7.  Not sure where the 7 comes from.  The COPY may fail, and 
whatever language you're programming in may raise an exception, which might be 
numbered 7... I suppose.  Multiple COPYs into the same table at the same time 
should work.  I think the error was on your side.

Note that while you are in a transaction your clients won't be able to see any 
of the data until you commit.  Since some of them want the data ASAP, you might 
want to break up your COPYs with a few commits.  I suppose that'd depend on how 
long it all takes, though.
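
Sketched out, with hypothetical table and file names:

```sql
BEGIN;
COPY image_attrs FROM '/data/batch_0001.dat';
COPY image_attrs FROM '/data/batch_0002.dat';
COMMIT;  -- these rows become visible to other sessions here
BEGIN;
COPY image_attrs FROM '/data/batch_0003.dat';
COMMIT;  -- and these here, rather than everything at the very end
```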



I found one
posting in the archives about parallel COPY, but it doesn't seem
to be quite on point.


They have added parallel copy to the pg_restore, but I think that does 
different tables, not the same table.  Was that what you saw?


Yes, I think so.  The reference is to "Andrews parallel restore patch":
http://archives.postgresql.org/pgsql-hackers/2008-09/msg01711.php
The subject line is "lock contention on parallel COPY ?"


Yeah, that's an internal lock on some really small variable deep in the guts of 
pg core.  Not an entire table lock.


-Andy



Re: [GENERAL] questions on (parallel) COPY and when to REINDEX

2009-08-02 Thread Andy Colson

On 1 Aug 2009, at 23:24, Janet Jacobsen wrote:

My questions are:
(2) Should I REINDEX these two tables daily after the pipeline
completes?  Is this what other people do in practice?


No need, but as people are querying as soon as data is arriving, an 
analyse of the table you just copied to should help performance - the 
query plans will be derived from more accurate table statistics that way.


The files that are being copied into the table(s) are between 200 to 1,000
each, but there are hundreds of these small files every day.  Would you
recommend running analyze after every COPY?


Just once, at the end.  Assuming you are only deleting 100K records and 
re-adding/updating another 100K in a batch, that's not so many records that 
it'll throw the stats out of whack.  (If you were dropping/updating 15M in a 
batch you might want to analyze in the middle.)
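
In other words, with hypothetical table and file names:

```sql
COPY image_attrs FROM '/data/batch_0001.dat';
COPY image_attrs FROM '/data/batch_0002.dat';
-- ... the rest of the day's small files ...
ANALYZE image_attrs;  -- refresh the planner statistics once, after the batch
```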



You wrote that your pipeline runs for a period of 4 hours and the 
table is about 15M rows now. What is taking up all that time? I 
understand why you'd want to parallelise that process, but where do 
you expect the gains?



We're processing images, and the data processing and analysis takes
up most of the time, but the images can be processed/analyzed in parallel.
We've been doing all of the data loading at the end - one COPY at a
time.  Originally that made sense because the researchers wanted to
check the images before loading the data/analysis results into the db.


Ah!  Images!  When you are doing the COPY, are you escaping the data?  You 
can't "just" COPY a binary file.

-Andy



[GENERAL] Problem with Numerics multiplication in C-function

2009-08-02 Thread Ilya Urikh
Hi, I have a strange problem with Numeric multiplication in C-function.

There are 2 functions, getRate and getVAT, which return Numeric. In a 3rd
function, calculateService, I try to multiply the results of getRate and
getVAT. After execution I get two kinds of failure: sometimes no message at
all, and sometimes the message "Invalid memory alloc ...".
If I initialize the Numeric variables inside calculateService and multiply
them, numeric_mul works fine.

PostgreSQL 8.3.7

Datum getRate(PG_FUNCTION_ARGS) {
int32 accountId = PG_GETARG_INT32(0);
int16 serviceId = PG_GETARG_INT16(1);
DateADT date = PG_GETARG_DATEADT(2);

bool isNull;
char command[QUERY_MAX_SIZE];
char message[MESSAGE_MAX_SIZE];

Numeric rate = DatumGetNumeric(DirectFunctionCall3(numeric_in,
CStringGetDatum(RATE_ERROR_CSTRING_VALUE), 0, -1));

//Build SQL query
snprintf(command, sizeof (command), "...");

SPI_connect();
SPI_execute(command, true, 0);
rate = DatumGetNumeric(SPI_getbinval(SPI_tuptable->vals[0],
 SPI_tuptable->tupdesc,
 1,
 &isNull));

#ifdef PR_DEBUG
        snprintf(message, sizeof (message), " getRate: Returns rate = %s.",
                DatumGetCString(DirectFunctionCall1(numeric_out,
                        NumericGetDatum(rate))));
elog(INFO, message);
#endif
SPI_finish();
PG_RETURN_NUMERIC(rate);
}

Datum getVAT(PG_FUNCTION_ARGS) {
// Like the getRate
}

Datum calculateService(PG_FUNCTION_ARGS) {
int32   accountId = PG_GETARG_INT32(0);
int16   serviceId = PG_GETARG_INT16(1);
DateADT date = PG_GETARG_DATEADT(2);
int32   transactionRegisterId = PG_GETARG_INT32(3);

Numeric rate;
Numeric vat;
Numeric amount;

rate = DatumGetNumeric(DirectFunctionCall3(getRate,
Int32GetDatum(accountId),
Int16GetDatum(serviceId),
DateADTGetDatum(date)));
vat = DatumGetNumeric(DirectFunctionCall1(getVAT,
DateADTGetDatum(date)));

#ifdef PR_DEBUG
        snprintf(message, sizeof (message), " calculateService: rate = %s, vat",
                DatumGetCString(DirectFunctionCall1(numeric_out,
                        NumericGetDatum(rate))));
elog(INFO, message);
#endif

amount = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
NumericGetDatum(rate),
NumericGetDatum(vat)));

// ERROR
...
}


-- 
Best regards,
Ilya Urikh.


Re: [GENERAL] questions on (parallel) COPY and when to REINDEX

2009-08-02 Thread Janet Jacobsen

Hi, Andy.  Thanks for the responses and information.

Just to let you know... what we are storing in the db are the image
attributes - about 40 of them - not the images.  So the COPY
is reading an ascii file of the image attributes.  It turns out to be
useful to have the image attributes handy - much better than reading
the image headers. The images are available on spinning disk, and
the image locations are in the db.

Thanks,
Janet


On 02/08/2009 05:59 p.m., Andy Colson wrote:

On 1 Aug 2009, at 23:24, Janet Jacobsen wrote:

My questions are:
(2) Should I REINDEX these two tables daily after the pipeline
completes?  Is this what other people do in practice?


No need, but as people are querying as soon as data is arriving, an 
analyse of the table you just copied to should help performance - 
the query plans will be derived from more accurate table statistics 
that way.


The files that are being copied into the table(s) are between 200 to 
1,000

each, but there are hundreds of these small files every day.  Would you
recommend running analyze after every COPY?


Just once, at the end.  Assuming you are only deleting 100K records 
and re-adding/updating another 100K in a batch, that's not so many 
records that it'll throw the stats out of whack.  (If you were 
dropping/updating 15M in a batch you might want to analyze in the middle.)




You wrote that your pipeline runs for a period of 4 hours and the 
table is about 15M rows now. What is taking up all that time? I 
understand why you'd want to parallelise that process, but where do 
you expect the gains?



We're processing images, and the data processing and analysis takes
up most of the time, but the images can be processed/analyzed in 
parallel.

We've been doing all of the data loading at the end - one COPY at a
time.  Originally that made sense because the researchers wanted to
check the images before loading the data/analysis results into the db.


Ah!  Images!  When you are doing the COPY are you escaping the data?  
You cant "just" copy a binary file.


-Andy





Re: [GENERAL] Drop Cluster]

2009-08-02 Thread sweta




>
> It seems you are trying to drop a replication cluster; to drop it
> you have to do as follows:
>

Yes it is a replication cluster. I am using Slony to replicate
my 8.1.11 database on CentOS 5.


> connect to the node 1 (i.e., the master DB) database and check whether
> the _replcluster schema already exists by giving the command "\dn"
>

 This gives:

[r...@quirinus pg_log]# su - postgres
-bash-3.2$ /dn
-bash: /dn: No such file or directory
-bash-3.2$ dn
-bash: dn: command not found



 Regards,
 Sweta






Re: [GENERAL] Drop Cluster]

2009-08-02 Thread Abbas
On Mon, Aug 3, 2009 at 11:11 AM,  wrote:

>
>
>
>
> >
> > It seems you are trying to drop a replication cluster, to drop a
> > this you have to do as follows:
> >
>
> Yes it is a replication cluster. I am using Slony to replicate
> my 8.1.11 database on CentOS 5.
>
>
> > connect to the node 1 (i.e., the master DB) database and check whether
> > the _replcluster schema already exists by giving the command "\dn"
> >
>
>  this give -->
>
> [r...@quirinus pg_log]# su - postgres
> -bash-3.2$ /dn
> -bash: /dn: No such file or directory
> -bash-3.2$ dn
> -bash: dn: command not found
>
> You need to run these commands from inside psql while connected to your
> database; \dn is a psql meta-command, not a shell command.


>
>
>  Regards,
>  Sweta
>
>
>
>


Re: [GENERAL] synchronous_commit=off doesn't always return immediately

2009-08-02 Thread tomrevam



Tom Lane-2 wrote:
> 
> tomrevam  writes:
> It might help to increase wal_buffers.
> 

Hi,

I increased the wal_buffers to 2 MB. Initially this improved the performance
very much, but after 4 days of continuous operation the system is back to
very long inserts and updates (at least as bad as it was before).
Is there a way to know what resource the DB is running out of?

Thanks,
Tomer




Re: [GENERAL] Drop Cluster

2009-08-02 Thread sweta
Hi,

 Thank you!  It's done :)

Regards,
Sweta

