Re: [GENERAL] moving from MySQL to pgsql
On 10/10/2012 10:47, Vineet Deodhar wrote:
> Hi !
> 3) Can I simulate MySQL's TINYINT data-type (using maybe the custom data type or something else)

Another way, and a good practice too, is to use SQL DOMAINs, which have been part of ISO SQL since 1992 and which MySQL doesn't have...

CREATE DOMAIN TINYINT AS SMALLINT CHECK (VALUE BETWEEN 0 AND 255)

Most modelling tools are able to use DOMAINs in their modelling process, like PowerDesigner, Mega, ERwin, Rational...

A +

--
Frédéric BROUARD - SQL and RDBMS expert - MVP SQL Server - 06 11 86 40 66
The site about the SQL language and RDBMS: http://sqlpro.developpez.com
Teacher at Arts & Métiers PACA, ISEN Toulon and CESI/EXIA Aix en Provence
Audit, consulting, expertise, training, modelling, tuning, optimisation
*** http://www.sqlspot.com *

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
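As a sketch of how the DOMAIN approach above behaves in practice (assuming a PostgreSQL session; the table and column names here are purely illustrative):

```sql
-- Create the domain once per database.
CREATE DOMAIN tinyint AS smallint
    CHECK (VALUE BETWEEN 0 AND 255);

-- Then use it like any built-in type.
CREATE TABLE product (
    id    serial PRIMARY KEY,
    stock tinyint NOT NULL
);

INSERT INTO product (stock) VALUES (200);   -- OK
INSERT INTO product (stock) VALUES (300);   -- rejected by the domain CHECK
```

Note the domain is still stored as a smallint (2 bytes), so this gives TINYINT's range restriction but not its 1-byte storage.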
[GENERAL] Storing large files in multiple schemas: BLOB or BYTEA
> Yeah, a pg_dump mode that dumped everything but large objects would be nice.

There is the -b option for pg_dump, which controls whether large objects are dumped or not. The problem is that with -b it dumps all large objects regardless of which schema you asked it to dump with -n. Otherwise it works fine.

> I'm now wondering about the idea of implementing a pg_dump option that
> dumped large objects into a directory tree like
>     lobs/[loid]/[lob_md5]
> and wrote out a restore script that loaded them using `lo_import`.
>
> During dumping, temporary copies could be written to something like
> lobs/[loid]/.tmp, with the md5 being calculated on the fly as the
> byte stream is read. If the dumped file had the same md5 as the existing
> one it'd just delete the tempfile; otherwise the tempfile would be
> renamed to the calculated md5.
>
> That way incremental backup systems could manage the dumped LOB tree
> without quite the same horrible degree of duplication as is currently
> faced when using lo in the database with pg_dump.
>
> A last_modified timestamp on `pg_largeobject_metadata` would be even
> better, allowing the cost of reading and discarding rarely-changed large
> objects to be avoided.

Definitely an interesting idea with incremental backups.
Re: [GENERAL] moving from MySQL to pgsql
On Thu, Oct 11, 2012 at 5:26 AM, Ondrej Ivanič wrote:
> Hi,
>
> On 10 October 2012 19:47, Vineet Deodhar wrote:
>> 3) Can I simulate MySQL's TINYINT data-type (using maybe the custom data
>> type or something else)
>
> What do you exactly mean? Do you care about storage requirements or
> constraints? The smallest numeric type in postgres is smallint: the range
> is +/- 32K and you need two bytes. You can use a check constraint to
> restrict the range (postgres doesn't have signed / unsigned types):
>
> create table T (
>     tint_signed smallint check ( tint_signed >= -128 and tint_signed <= 127 ),
>     tint_unsigned smallint check ( tint_unsigned >= 0 and tint_unsigned <= 255 )
> )

Yes. Considering the storage requirements, I am looking for a TINYINT kind of data type.

> if you care about storage then "char" (yes, with quotes) might be the
> right type for you.
>
> --
> Ondrej Ivanic
> (ondrej.iva...@gmail.com)
> (http://www.linkedin.com/in/ondrejivanic)

If I use "char" for a numeric field, would it be possible to do numeric operations and comparisons such as max(tint_unsigned)?

---
Vineet
Re: [GENERAL] Storing large files in multiple schemas: BLOB or BYTEA
On 10/11/2012 01:35 PM, tigran2-postg...@riatest.com wrote:
> Using files stored outside the database creates all sorts of problems.
> For starters you lose ACID guarantees. I would prefer to keep them in the
> database. We did a lot of experiments with Large Objects and they really
> worked fine (stored hundreds of LOBs ranging from a few MB up to 1GB).
> Postgres does a really good job with Large Objects. If it was not for the
> pg_dump problem I would not hesitate to use LOBs.

Yeah, a pg_dump mode that dumped everything but large objects would be nice. Right now I find storing large objects in the DB such a pain from a backup management point of view that I avoid it where possible.

I'm now wondering about the idea of implementing a pg_dump option that dumped large objects into a directory tree like

    lobs/[loid]/[lob_md5]

and wrote out a restore script that loaded them using `lo_import`.

During dumping, temporary copies could be written to something like lobs/[loid]/.tmp, with the md5 being calculated on the fly as the byte stream is read. If the dumped file had the same md5 as the existing one it'd just delete the tempfile; otherwise the tempfile would be renamed to the calculated md5.

That way incremental backup systems could manage the dumped LOB tree without quite the same horrible degree of duplication as is currently faced when using lo in the database with pg_dump.

A last_modified timestamp on `pg_largeobject_metadata` would be even better, allowing the cost of reading and discarding rarely-changed large objects to be avoided.

--
Craig Ringer
Re: [GENERAL] Storing large files in multiple schemas: BLOB or BYTEA
> I believe the general consensus around here is to not do that, if you can
> avoid it. File systems are much better equipped to handle files of that
> magnitude, especially when it comes to retrieving them, scanning through
> their contents, or really, any access pattern aside from simple storage.
>
> You're better off storing the blob on disk somewhere and storing a row
> that refers to its location. Either key pieces for a naming scheme or the
> full path.
>
> This is especially true if you mean to later access that data with PHP.
>
> --
> Shaun Thomas

Using files stored outside the database creates all sorts of problems. For starters you lose ACID guarantees. I would prefer to keep them in the database. We did a lot of experiments with Large Objects and they really worked fine (stored hundreds of LOBs ranging from a few MB up to 1GB). Postgres does a really good job with Large Objects. If it was not for the pg_dump problem I would not hesitate to use LOBs.
Re: [GENERAL] Storing large files in multiple schemas: BLOB or BYTEA
> Large Objects and bytea are the only ways.
>
> If you want to pg_dump only certain large objects, that won't work as far
> as I know (maybe using permissions and a non-superuser can help).
>
> You absolutely need to pg_dump parts of the database regularly?
>
> Yours,
> Laurenz Albe

It is not an absolute requirement but would be really nice to have. We have a multi-tenant database with each tenant's data stored in a separate schema. Using pg_dump seems to be the ideal way to migrate tenant data from one database to another when we need to do it to balance the load.
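A sketch of that per-tenant migration with pg_dump's schema filter (host, database, and schema names are illustrative):

```shell
# Dump only the tenant_a schema from the source database...
pg_dump -h src-host -U postgres -n tenant_a -F c -f tenant_a.dump appdb

# ...and restore it into the target database.
pg_restore -h dst-host -U postgres -d appdb tenant_a.dump
```

The catch, as discussed in this thread, is that a schema-filtered dump (-n) does not include large objects, so any LOBs belonging to the tenant have to be migrated separately.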
Re: [GENERAL] Shorthand syntax for triggers
On 10/11/2012 12:22 AM, Joe Van Dyk wrote:
> 3. Triggers can access a special CHANGED value that's either NEW for
> insert or updates, or OLD for deletes.

I'm not a big fan of the prior comments about small syntax changes, but this would simplify quite a bit of code. I'd *really* like a way to refer to "NEW for INSERT or UPDATE, OLD for DELETE" as an implicit automatic variable.

I do see the appeal of making trigger functions anonymous, but I really doubt it's worth the hassle.

> 4. Default for 'after insert' triggers is to return null, as I believe it
> doesn't matter what you return here.

This is a trivial convenience, but not one I'd be against.

> 5. Way less repetitive typing.

If you're repeating the same triggers over and over you may want to look at writing them to be re-usable. See eg:

http://wiki.postgresql.org/wiki/Audit_trigger_91plus

--
Craig Ringer
Re: [GENERAL] moving from MySQL to pgsql
On 10/10/2012 04:47 PM, Vineet Deodhar wrote:
> 2) I run MySQL from a USB stick. There is no installation required (on
> WinXP.). (not tried on Ubuntu) Is it the same for pgsql?

On Windows PostgreSQL is usually installed as a system service with its own user account (pre-9.2) or running in the network service account (9.2+). This isn't strictly required, though. You can keep the .zip binary releases on a USB key and use pg_ctl to start/stop them from your own scripts. If you're bundling Pg in your application this may be the best choice. See:

http://www.enterprisedb.com/products-services-training/pgbindownload

You *really* shouldn't keep the database itself on a USB key. Performance is likely to be terrible, and many USB keys have quite short write lifetimes, so a database on a USB key can wear some of them out in a real hurry.

Think about your backup process too. With PostgreSQL you have a couple of options, including log archiving, periodic dumps, and warm standby. Please read the backup chapter of the manual in detail.

--
Craig Ringer
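A minimal sketch of the pg_ctl approach described above (the paths are illustrative; it assumes the .zip binaries were unpacked to E:\pgsql on the USB key, with the cluster kept on the local disk as recommended):

```shell
rem One-time: create a cluster in a data directory on the local disk,
rem not on the USB key itself.
E:\pgsql\bin\initdb -D C:\pgdata -U postgres -E UTF8

rem Start and stop the server from your own scripts; no service install needed.
E:\pgsql\bin\pg_ctl -D C:\pgdata -l C:\pgdata\server.log start
E:\pgsql\bin\pg_ctl -D C:\pgdata stop
```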
Re: [GENERAL] Planner chooses multi-column index in 9.2 when maybe it should not
On Wed, Oct 10, 2012 at 09:24:32PM -0400, Tom Lane wrote:
> I'd not been thinking of that change as something we'd risk
> back-patching, but maybe we should consider putting it into 9.2. It
> seems like the index-only scan support has put a new level of premium on
> the quality of the planner's rowcount estimates.

Yes, please do! It's that or we globally disable index-only scans, which I'd prefer not to do. Let me know if you'd like me to test a patch; I can apply it and see if it fixes our issue at hand.

> Meanwhile, that range condition in itself looks a tad, er, klugy.
> Do you really need that, or is this a crummy way of stating
> foobar.id = m.id?

No, it's really needed. That's merely the tip of the kluginess; don't get me started! The dangers of an organically grown schema. :) This thing has been growing since Postgres v6. (Looking back at how far Postgres has come from 6.x to 9.2 is truly awe-inspiring.)

--
Greg Sabino Mullane g...@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8
Re: [GENERAL] moving from MySQL to pgsql
On 10/10/2012 02:18 AM, Sim Zacks wrote:
>> 2) I run MySQL from a USB stick. There is no installation required (on
>> WinXP.). (not tried on Ubuntu) Is it the same for pgsql?
>
> To use postgres on a USB stick, see
> http://www.postgresonline.com/journal/archives/172-Starting-PostgreSQL-in-windows-without-install.html
>
>> 3) Can I simulate MySQL's TINYINT data-type (using maybe the custom data
>> type or something else)
>
> You can either use bool or smallint with a constraint.
>
> Sim

Or he could create a custom type easily enough.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579
Re: [GENERAL] 'full_page_writes=off' , VACUUM and crashing streaming slaves...
>> Oct 5 15:00:25 db01 postgres[76648]: [5944-1] javafail@dbcluster 76648 0:
>> FATAL: pipe() failed: Too many open files in system
>
> This message must be coming from initSelfPipe(), and after poking around
> a bit I think the failure must be occurring while a new backend is
> attempting to do "OwnLatch(&MyProc->procLatch)" in InitProcess. The
> reason the postmaster treats this as a crash is that the new backend
> just armed the dead-man switch (MarkPostmasterChildActive) but it exits
> without doing ProcKill which would disarm it. So this is just an
> order-of-operations bug in InitProcess: we're assuming that it can't
> fail before reaching "on_shmem_exit(ProcKill, 0)", and the latch
> additions broke that. (Though looking at it, assuming that the
> PGSemaphoreReset call cannot fail seems a tad risky too.)
>
> So that explains the crashes, but it doesn't (directly) explain why you
> had data corruption.

I've since been able to repeat this with full_page_writes=on and have had identical corruption, so I don't think this is full_page_writes related any more.

I can also confirm that there was just one crash by the master database, because it required manual intervention to bring back up (a backup label was sitting stale in the data dir[1]). The slaves died during the post-crash VACUUM just as earlier. I'm completing the VACUUM now and am seeing multiple warnings from VACUUM.
WARNING: relation "tbl_a" page 2115352 is uninitialized --- fixing
WARNING: relation "tbl_a" page 2115353 is uninitialized --- fixing
WARNING: relation "tbl_a" page 2115354 is uninitialized --- fixing
WARNING: relation "tbl_a" page 2115355 is uninitialized --- fixing
WARNING: relation "tbl_a" page 2115356 is uninitialized --- fixing
WARNING: relation "tbl_a" page 2115357 is uninitialized --- fixing
WARNING: relation "tbl_a" page 2115358 is uninitialized --- fixing
WARNING: relation "tbl_a" page 2115359 is uninitialized --- fixing
WARNING: relation "tbl_a" page 2115360 is uninitialized --- fixing

On one of the slaves:

Oct 11 00:17:36 db02 postgres[66904]: [21-1] @ 66904 0: WARNING: page 120547 of relation base/16387/20196 is uninitialized
Oct 11 00:17:36 db02 postgres[66904]: [21-2] @ 66904 0: CONTEXT: xlog redo vacuum: rel 1663/16387/20196; blk 125016, lastBlockVacuumed 0
Oct 11 00:17:36 db02 postgres[66904]: [22-1] @ 66904 0: PANIC: WAL contains references to invalid pages
Oct 11 00:17:36 db02 postgres[66904]: [22-2] @ 66904 0: CONTEXT: xlog redo vacuum: rel 1663/16387/20196; blk 125016, lastBlockVacuumed 0
Oct 11 00:17:36 db02 postgres[66897]: [10-1] @ 66897 0: LOG: startup process (PID 66904) was terminated by signal 6: Abort trap
Oct 11 00:17:36 db02 postgres[66897]: [11-1] @ 66897 0: LOG: terminating any other active server processes

Hopefully this is helpful information. -sc

[1] The backup label was residual from the initial sync and wasn't cleaned up for some reason during a pg_stop_backup(). For now I'm chalking this up as a bug in repmgr even though repmgr completed cloning the slave successfully (supposedly).
START WAL LOCATION: 9F/3620 (file 0001009F0036)
CHECKPOINT LOCATION: 9F/37E9D6D8
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2012-10-06 17:48:10 UTC
LABEL: repmgr_standby_clone_1349545601

--
Sean Chittenden
s...@chittenden.org
Re: [GENERAL] Can two “SELECT FOR UPDATE” statements on the same table cause a deadlock?
Steve A writes:
> In a nutshell, I'm curious about the order in which PG will lock rows during
> a SELECT FOR UPDATE. If two simultaneous SELECT FOR UPDATE statements select
> intersecting rows from the same table, can PG be relied upon to lock the rows
> in a consistent manner that always avoids deadlock (e.g. in order of
> ascending primary key)?

Only if you use ORDER BY in each such query to constrain the rows to be locked in the same order. You probably want to spend some time perusing the fine manual very carefully:

http://www.postgresql.org/docs/9.2/static/sql-select.html#SQL-FOR-UPDATE-SHARE

regards, tom lane
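A sketch of the ORDER BY discipline Tom describes (the table and column names are illustrative): if every transaction acquires its row locks in the same order, two overlapping SELECT ... FOR UPDATE calls queue behind each other instead of deadlocking.

```sql
-- Every session that locks rows in this table must use the same ORDER BY,
-- so locks are always taken in ascending primary-key order.
BEGIN;
SELECT *
  FROM accounts
 WHERE id IN (7, 3, 12)
 ORDER BY id        -- deterministic lock order: 3, then 7, then 12
   FOR UPDATE;
-- ... update the locked rows ...
COMMIT;
```

Without the ORDER BY, one session might lock 3 then 12 while another locks 12 then 3, and each would end up waiting on the other.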
Re: [GENERAL] Planner chooses multi-column index in 9.2 when maybe it should not
Greg Sabino Mullane writes:
> -> Bitmap Heap Scan on foobar o (C=30389..835271 R=8980 W=8) (AT=0.06..0.07 R=1 L=1)
>      Recheck Cond: (((id) >= (m.id)) AND ((id) <= ((m.id) || '.99')))
>      Filter: (((status) <> ALL ('{panda,penguin}'[])) \
>          AND ((id) ~ (('^' || (m.id)) || '(\.[0-9.]+)?$')))
>      -> Bitmap Index Scan on foobar_pkey (C=0..30386 R=1888670 W=0) (AT=0.02..0.02 R=1 L=1)
>           Index Cond: (((id) >= (m.id)) AND ((id) <= ((m.id) || '.99')))

Actually, looking closer, I think the problem is not with the estimation of the index-only scan on the other index; the planner is estimating that as pretty expensive, which it is. The problem is that it thinks the above bitmap scan is pretty expensive, when it isn't. And the reason evidently is that it's totally off in the weeds about the selectivity of the range condition on foobar.id. Anytime you've got 1888670 estimated rows and 1 actual row, you've got a problem.

This is related to the problem I was on about a couple weeks ago:

http://archives.postgresql.org/message-id/17655.1348874...@sss.pgh.pa.us

namely that the planner fails to recognize pairs of clauses as a range constraint if they're join clauses. If it had recognized that, you'd have gotten an estimate that would still be far more than "1 row", but would be more than an order of magnitude less than this one, which would be enough to fix this problem.

I'd not been thinking of that change as something we'd risk back-patching, but maybe we should consider putting it into 9.2. It seems like the index-only scan support has put a new level of premium on the quality of the planner's rowcount estimates.

Meanwhile, that range condition in itself looks a tad, er, klugy. Do you really need that, or is this a crummy way of stating foobar.id = m.id?

regards, tom lane
Re: [GENERAL] Index only scan
On 11/10/12 12:41, Tom Lane wrote:
> Gavin Flower writes:
>> On 11/10/12 01:03, Lars Helge Øverland wrote:
>>> My question is: Would it be feasible and/or possible to implement index
>>> only scans in a way that it could take advantage of several,
>>> single-column indexes? For example, a query spanning columns a, b, c
>>> could take advantage of 3 single-column indexes put on columns a, b, c.
>>
>> Index only scans do use multiple indexes of single fields where
>> appropriate. Here the planner determined it only needed to scan 2 of the
>> 3 relevant single field indexes.
>
> But your example isn't an index-only scan ... it's a plain old bitmap
> scan, and so it does touch the heap.
>
> The difficulty with what Lars proposes is that there's no way to scan the
> different indexes "in sync" --- each one will be ordered according to its
> own column order. In principle I guess we could read out the index data
> and do a join using the ctid's, but it's far from clear that such a thing
> would be worth the trouble.
>
> regards, tom lane

Thanks for the correction!

Cheers,
Gavin
Re: [GENERAL] moving from MySQL to pgsql
I noticed something here that none of the other replies addressed.

Vineet Deodhar wrote:
> 3) Can I simulate MySQL's TINYINT data-type (using maybe the custom data
> type or something else)

The answer to your question depends on what you were using the TINYINT for. For example, many people use TINYINT in MySQL to store BOOLEAN (true/false) data because MySQL doesn't have a real BOOLEAN type; MySQL has the BOOLEAN keyword, but if you choose it then what you actually get is a TINYINT. And so, if you are actually using the field just for true/false or 1/0 values, then Postgres' BOOLEAN type (which is not a number) is the best thing for you to use. If you were storing actual numbers outside that range, then use SMALLINT or something like that.

Any reverse engineering of the MySQL schema will never know you used the BOOLEAN keyword and will just say you have a TINYINT, so you may have to study your schema and its uses more to know what kind of data/type you actually have.

--
Darren Duncan
Re: [GENERAL] Planner chooses multi-column index in 9.2 when maybe it should not
Found a good demonstration of the problem. Here's explain analyze of a query on 9.2 with enable_indexonlyscan = off; this produces the exact same plan as 8.3. The tables in question have been analyzed. Changing random_page_cost has no effect. The main foobar table has 17M rows. I did multiple runs of both to eliminate any caching effects.

foobar.id is VARCHAR(16)
foobar.status is VARCHAR(32)

Indexes:
    "foobar_pkey" PRIMARY KEY, btree (id) CLUSTER
    "foobar_status" UNIQUE, btree (status, id)

(8.3 and up, plus 9.2 with index-only scan disabled)

GroupAggregate (C=30389..1754503 R=1 W=22) (AT=0.3..0.3 R=1 L=1)
  -> Nested Loop Left Join (C=30389..1754147 R=23751 W=22) (AT=0.1..0.2 R=7 L=1)
       -> Nested Loop Left Join (C=30389..835374 R=8980 W=16) (AT=0.1..0.1 R=1 L=1)
            -> Index Scan using foobar_pkey on foobar m (C=0..13 R=1 W=8) (AT=0.03..0.03 R=1 L=1)
                 Index Cond: ((id) = '17464097')
                 Filter: ((id) !~~ '%.%')
            -> Bitmap Heap Scan on foobar o (C=30389..835271 R=8980 W=8) (AT=0.06..0.07 R=1 L=1)
                 Recheck Cond: (((id) >= (m.id)) AND ((id) <= ((m.id) || '.99')))
                 Filter: (((status) <> ALL ('{panda,penguin}'[])) \
                     AND ((id) ~ (('^' || (m.id)) || '(\.[0-9.]+)?$')))
                 -> Bitmap Index Scan on foobar_pkey (C=0..30386 R=1888670 W=0) (AT=0.02..0.02 R=1 L=1)
                      Index Cond: (((id) >= (m.id)) AND ((id) <= ((m.id) || '.99')))
       -> Index Scan using baz_id on baz (C=0..101 R=40 W=22) (AT=0.03..0.06 R=7 L=1)
            Index Cond: ((o.id) = (id))
            Filter: (price <> 0::numeric)
            Rows Removed by Filter: 3
Total runtime: 0.459 ms

Now, if we turn on index only scans, we get a terrible runtime:

GroupAggregate (C=0.00..1314945 R=1 W=22) (AT=34502..34502 R=1 L=1)
  -> Nested Loop Left Join (C=0.00..1314589 R=23751 W=22) (AT=31934..34502 R=7 L=1)
       -> Nested Loop Left Join (C=0.00..395816 R=8980 W=16) (AT=31934..34502 R=1 L=1)
            -> Index Only Scan using foobar_pkey on foobar m (C=0.00..13.81 R=1 W=8) (AT=0.029..0.034 R=1 L=1)
                 Index Cond: (id = '17464097')
                 Filter: ((id) !~~ '%.%')
                 Heap Fetches: 0
            -> Index Only Scan using foobar_status on foobar o (C=0.00..395713 R=8980 W=8) (AT=31934..34502 R=1 L=1)
                 Index Cond: ((id >= (m.id)) AND (id <= ((m.id) || '.99')))
                 Filter: (((status) <> ALL ('{panda,penguin}'[])) \
                     AND ((id) ~ (('^' || (m.id)) || '(\.[0-9.]+)?$')))
                 Heap Fetches: 0
       -> Index Scan using baz_id on baz (C=0..101 R=40 W=22) (AT=0.1..0.2 R=7 L=1)
            Index Cond: ((o.id) = (id))
            Filter: (price <> 0::numeric)
            Rows Removed by Filter: 3
Total runtime: 34502.670 ms

Yeah... 34 seconds versus near-instant. The first index-only scan does great, but that second one - ouch - even with no heap fetches at all!

--
Greg Sabino Mullane g...@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8
Re: [GENERAL] moving from MySQL to pgsql
Hi,

On 10 October 2012 19:47, Vineet Deodhar wrote:
> 3) Can I simulate MySQL's TINYINT data-type (using maybe the custom data
> type or something else)

What do you exactly mean? Do you care about storage requirements or constraints? The smallest numeric type in postgres is smallint: the range is +/- 32K and you need two bytes. You can use a check constraint to restrict the range (postgres doesn't have signed / unsigned types):

create table T (
    tint_signed smallint check ( tint_signed >= -128 and tint_signed <= 127 ),
    tint_unsigned smallint check ( tint_unsigned >= 0 and tint_unsigned <= 255 )
)

if you care about storage then "char" (yes, with quotes) might be the right type for you.

--
Ondrej Ivanic
(ondrej.iva...@gmail.com)
(http://www.linkedin.com/in/ondrejivanic)
Re: [GENERAL] Planner chooses multi-column index in 9.2 when maybe it should not
Greg Sabino Mullane writes:
> Found a good demonstration of the problem. Here's explain analyze of a
> query on 9.2 with enable_indexonlyscan = off; this produces the exact same
> plan as 8.3. The tables in question have been analyzed. Changing
> random_page_cost has no effect. The main foobar table has 17M rows.

Can you provide a self-contained test case for this (ie, sample data)?

regards, tom lane
Re: [GENERAL] Index only scan
Gavin Flower writes:
> On 11/10/12 01:03, Lars Helge Øverland wrote:
>> My question is: Would it be feasible and/or possible to implement
>> index only scans in a way that it could take advantage of several,
>> single-column indexes? For example, a query spanning columns a, b, c
>> could take advantage of 3 single-column indexes put on columns a, b, c.
>
> Index only scans do use multiple indexes of single fields where
> appropriate. Here the planner determined it only needed to scan 2 of
> the 3 relevant single field indexes.

But your example isn't an index-only scan ... it's a plain old bitmap scan, and so it does touch the heap.

The difficulty with what Lars proposes is that there's no way to scan the different indexes "in sync" --- each one will be ordered according to its own column order. In principle I guess we could read out the index data and do a join using the ctid's, but it's far from clear that such a thing would be worth the trouble.

regards, tom lane
Re: [GENERAL] Index only scan
Hi,

On 10 October 2012 23:03, Lars Helge Øverland wrote:
> We are now in the process of designing a new component for analytics
> and this feature got me thinking we could utilize postgres over other
> alternatives like column-oriented databases. Basically we will have a
> wide, denormalized table with 20+ columns with relatively low
> cardinality. Typically we will have queries which sum a fact column
> based on where/group by clauses on several dimension columns (standard
> data warehouse stuff). An example would be "select a, b, c, sum(d)
> from analytics where a=1 and b=2 group by a,b,c";
>
> Finally, is there anyone else who is using postgres for this purpose
> and has some good tips to share in order to achieve good performance,
> including index strategies, beyond the standard config best practices?

Yes, we had a fact table which had around 250 columns and 250mil rows. The question is whether you can partition your data set, for example into monthly partitions. This keeps indexes small, but all queries must be constrained by the same column as is used for partitioning (ie. monthly partitions -> every query should have "datetime between ... and ...").

From my experience postgres is not good with large group by queries. For example, your query:

select a, b, c, sum(d) from analytics where a=1 and b=2 group by a,b,c

could be executed over multiple connections:

insert into t select a, b, c, sum(d) as d from analytics where c >= val1 and c < val2 and a=1 and b=2 group by a,b,c
insert into t select a, b, c, sum(d) as d from analytics where c >= val2 and c < val3 and a=1 and b=2 group by a,b,c
...
insert into t select a, b, c, sum(d) as d from analytics where c >= valN-1 and c < valN and a=1 and b=2 group by a,b,c

and then get the final result:

select a, b, c, sum(d) from t group by a,b,c

You can use pgpool-II's parallel query feature instead of manual slicing.
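The slicing scheme above works because per-slice partial sums can be merged into the same totals as a single big GROUP BY. A small Python model of that merge step (purely illustrative; the tuples stand in for rows of the analytics table):

```python
from collections import defaultdict


def group_sum(rows):
    """select a, b, c, sum(d) ... group by a, b, c over (a, b, c, d) tuples."""
    acc = defaultdict(int)
    for a, b, c, d in rows:
        acc[(a, b, c)] += d
    return dict(acc)


def sliced_group_sum(rows, boundaries):
    """Aggregate each c-range slice independently (one 'connection' per
    slice, like the insert-into-t queries), then merge the partial sums
    (the final 'select ... from t group by a,b,c')."""
    partials = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        slice_rows = [r for r in rows if lo <= r[2] < hi]
        partials.append(group_sum(slice_rows))

    merged = defaultdict(int)
    for part in partials:
        for key, d in part.items():
            merged[key] += d
    return dict(merged)
```

Because sum() is associative, the sliced result is identical to the unsliced one; the win is only that the slices can run concurrently.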
--
Ondrej Ivanic
(ondrej.iva...@gmail.com)
(http://www.linkedin.com/in/ondrejivanic)
Re: [GENERAL] Index only scan
On 11/10/12 01:03, Lars Helge Øverland wrote:
> Hi all,
>
> first of all thanks for the great new "index only scan" feature in 9.2.
> We have managed to adapt our app (dhis2.org) to take advantage of it and
> it really speeds up several queries significantly.
>
> We are now in the process of designing a new component for analytics and
> this feature got me thinking we could utilize postgres over other
> alternatives like column-oriented databases. Basically we will have a
> wide, denormalized table with 20+ columns with relatively low
> cardinality. Typically we will have queries which sum a fact column based
> on where/group by clauses on several dimension columns (standard data
> warehouse stuff). An example would be "select a, b, c, sum(d) from
> analytics where a=1 and b=2 group by a,b,c";
>
> Now my initial idea was to simply put a single index on all of those
> columns, in the hope that "index only scans" would kick in. It seems this
> is not the case, as strictly one (single or multi-column) index is
> required to enable index only scans for a query.
>
> My question is: Would it be feasible and/or possible to implement index
> only scans in a way that it could take advantage of several,
> single-column indexes? For example, a query spanning columns a, b, c
> could take advantage of 3 single-column indexes put on columns a, b, c.
>
> Finally, is there anyone else who is using postgres for this purpose and
> has some good tips to share in order to achieve good performance,
> including index strategies, beyond the standard config best practices?
>
> best regards,
> Lars Helge Øverland

Index only scans do use multiple indexes of single fields where appropriate. Here the planner determined it only needed to scan 2 of the 3 relevant single field indexes.
Cheers,
Gavin

-- index_only_scan_001.sql

DROP TABLE IF EXISTS iostab;

CREATE TABLE iostab (
    id int PRIMARY KEY,
    a  int,
    b  int,
    c  int,
    d  int,
    z  text
);

INSERT INTO iostab (id, a, b, c, d, z)
VALUES (generate_series(1, 1000000),
        1000 * random(),
        1000 * random(),
        1000 * random(),
        1000 * random(),
        'qq' || random());

CREATE INDEX ON iostab (a);
CREATE INDEX ON iostab (b);
CREATE INDEX ON iostab (c);
CREATE INDEX ON iostab (d);

ANALYZE VERBOSE iostab;

EXPLAIN
SELECT i.*
  FROM iostab i
 WHERE i.a = 2
   AND i.b = 7
   AND i.c = 4;

-- Output:

psql:index_only_scan_001.sql:14: NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "iostab_pkey" for table "iostab"
CREATE TABLE
INSERT 0 1000000
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
psql:index_only_scan_001.sql:30: INFO: analyzing "public.iostab"
psql:index_only_scan_001.sql:30: INFO: "iostab": scanned 15385 of 15385 pages, containing 1000000 live rows and 0 dead rows; 30000 rows in sample, 1000000 estimated total rows
ANALYZE

                                  QUERY PLAN
Bitmap Heap Scan on iostab i (cost=41.37..45.39 rows=1 width=90)
  Recheck Cond: ((b = 7) AND (a = 2))
  Filter: (c = 4)
  -> BitmapAnd (cost=41.37..41.37 rows=1 width=0)
       -> Bitmap Index Scan on iostab_b_idx (cost=0.00..20.55 rows=960 width=0)
            Index Cond: (b = 7)
       -> Bitmap Index Scan on iostab_a_idx (cost=0.00..20.57 rows=963 width=0)
            Index Cond: (a = 2)
(8 rows)

gavin=>
Re: [GENERAL] Shorthand syntax for triggers
On Wed, Oct 10, 2012 at 9:22 AM, Joe Van Dyk wrote:
> I wish I could do:
>
> create trigger some_trigger after insert on products
> execute procedure do $$ begin
>     insert into audits values (CHANGED.value);
> end $$ language plpgsql;

IF TG_OP = 'DELETE' THEN
    RENAME OLD TO myrow;
ELSE
    RENAME NEW TO myrow;
END IF;

and then use

insert into audits values (myrow.value);

> Changes/improvements:
>
> 1. Triggers default to 'for each row'
>
> 2. Triggers can use anonymous functions
>
> 3. Triggers can access a special CHANGED value that's either NEW for insert
> or updates, or OLD for deletes.
>
> 4. Default for 'after insert' triggers is to return null, as I believe it
> doesn't matter what you return here.
>
> 5. Way less repetitive typing.
>
> Thoughts? Is this a terrible idea?

--
Sergey Konoplev
a database and software architect
http://www.linkedin.com/in/grayhemp

Jabber: gray...@gmail.com
Skype: gray-hemp
Phone: +14158679984
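A conventional way to get the "NEW for INSERT/UPDATE, OLD for DELETE" behaviour today is to branch on TG_OP into a record variable inside the trigger function. A sketch (the audits table and value column are illustrative):

```sql
CREATE OR REPLACE FUNCTION audit_row() RETURNS trigger AS $$
DECLARE
    myrow record;
BEGIN
    -- Pick whichever row image exists for this operation.
    IF TG_OP = 'DELETE' THEN
        myrow := OLD;
    ELSE
        myrow := NEW;
    END IF;

    INSERT INTO audits VALUES (myrow.value);

    -- The return value of an AFTER trigger is ignored, so NULL is fine.
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER products_audit
    AFTER INSERT OR UPDATE OR DELETE ON products
    FOR EACH ROW EXECUTE PROCEDURE audit_row();
```

One generic function like this can be attached to many tables, which addresses the "repetitive typing" complaint without any syntax change.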
Re: [GENERAL] Planner chooses multi-column index in 9.2 when maybe it should not
On Wed, Oct 10, 2012 at 01:31:29PM -0400, Tom Lane wrote: > The above doesn't seem like a regression to me. You told it not to use > a seqscan, and it didn't. (The reason it now considers the index is > that an index-only scan is possible; before 9.2 there was no reason to > consider an indexscan at all given this query, so you got the seqscan > despite the attempted disable.) Ah...index-only scans. Now it makes sense. ... > It's not obvious that this is a worse plan than a seqscan --- the > index-only scans will only have to read the index not the heap, at least > if the heap is all-visible. If it's coming out slower, then that's a > question of whether the cost estimates match reality. I'd wonder how > many heap fetches occur anyway, and also whether you've tweaked the > planner cost parameters. We've lowered random_page_cost, but raising it back to the default does not help. > You should be able to force it back to the seqscan based plan by turning > off enable_indexscan or enable_indexonlyscan. It would be useful to > see EXPLAIN ANALYZE (not just EXPLAIN) results for both this plan and > the seqscan plan in 9.2. Thanks, I will play around with both a better test case and getting some explain analyzes (they were taking too long to run; thought I should get the email out first in case it was something obvious). -- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8 pgpn2mbk5zwEr.pgp Description: PGP signature
Re: [GENERAL] pymssql Connection to the database failed for an unknown reason
On 10 Oct 2012, at 3:17, thomson...@sourcephotonics.com wrote: > On Saturday, January 29, 2011 9:08:22 PM UTC+8, orgilhp wrote: >> Hello >> I am using pymssql to connect to MSSQL2008 database. But an error >> occurs: >> -- > import pymssql > conn = pymssql.connect(host='orgilhpnb\mssql2008', user='erp', > password='123', database='eoffice_clone') >> Traceback (most recent call last): >> File "", line 1, in >> File "/usr/lib/pymodules/python2.6/pymssql.py", line 609, in connect >>raise InterfaceError, e[0] >> pymssql.InterfaceError: Connection to the database failed for an >> unknown reason. >> -- >> >> The host name, user, password and database name are all correct. >> Why I get an error? Please help me! >> >> Any suggestion would be highly appreciated! >> >> Best regards, >> Orgil > > Do you solve this issue? I encounted the same issue > would you like to reply the issue? It looks like you're accidentally using an MS-SQL interface to connect to a Postgres database? Or you're asking the wrong mailing list ;) Alban Hertroys -- Religion is: Volunteering your children for brainwashing. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_upgrade not detecting version properly
On Wed, Oct 10, 2012 at 10:35:06AM -0600, Chris Ernst wrote:
> On 10/10/2012 09:56 AM, Bruce Momjian wrote:
> > Can you show me what is in the PG_VERSION file in the old cluster?  It
> > should be "9.1".
>
> Hi Bruce,
>
> Thank you for the reply.  Indeed it is "9.1":
>
> # cat /postgresql/9.1/main/PG_VERSION
> 9.1
>
> And just for good measure:
>
> cat /postgresql/9.2/main/PG_VERSION
> 9.2
>
> And there are no other PostgreSQL versions on this machine.
>
> Hmm... I was just about to send this when something else occurred to me.
> I had initially tried to run pg_upgrade as root and it said it couldn't
> be run as root.  So I've been running it as my own user (which is in the
> postgres group).  However, everything in /postgresql/9.1/main is owned
> by postgres with 700 permissions.
>
> I switched to the postgres user and now pg_upgrade is running.  Perhaps
> just a more informative error message is in order.
>
> Thank you for the shove in the right direction =)

Oops, that code was returning zero if it couldn't open the file.  The
attached patch, applied to head and 9.2, issues a proper error message.
Seems this "zero return" has been in the code since the beginning.  :-(

--
  Bruce Momjian  http://momjian.us        EnterpriseDB  http://enterprisedb.com
  + It's impossible for everything to be true. +

diff --git a/contrib/pg_upgrade/server.c b/contrib/pg_upgrade/server.c
new file mode 100644
index 11e7e75..a9f9d85
*** a/contrib/pg_upgrade/server.c
--- b/contrib/pg_upgrade/server.c
*** get_major_server_version(ClusterInfo *cluster)
*** 149,155 ****
  	snprintf(ver_filename, sizeof(ver_filename), "%s/PG_VERSION",
  			 cluster->pgdata);
  	if ((version_fd = fopen(ver_filename, "r")) == NULL)
! 		return 0;
  
  	if (fscanf(version_fd, "%63s", cluster->major_version_str) == 0 ||
  		sscanf(cluster->major_version_str, "%d.%d", &integer_version,
--- 149,155 ----
  	snprintf(ver_filename, sizeof(ver_filename), "%s/PG_VERSION",
  			 cluster->pgdata);
  	if ((version_fd = fopen(ver_filename, "r")) == NULL)
! 		pg_log(PG_FATAL, "could not open version file: %s\n", ver_filename);
  
  	if (fscanf(version_fd, "%63s", cluster->major_version_str) == 0 ||
  		sscanf(cluster->major_version_str, "%d.%d", &integer_version,

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Planner chooses multi-column index in 9.2 when maybe it should not
Greg Sabino Mullane writes: > We are seeing a performance regression when moving to 9.2. There is a > complex query that is doing a self-join, but long story short, it > is choosing to use a multi-column index when it really ought not to be. > I was not able to develop a standalone test case without resorting > to changing enable_seqscan, but this does show the difference: > CREATE TABLE gregtest (a TEXT PRIMARY KEY, b TEXT); > CREATE INDEX gregtest_i ON gregtest(b,a); > SET enable_seqscan = off; > EXPLAIN SELECT 1 FROM gregtest WHERE a <> 'panda' AND b <> 'yak'; > On versions 8.2, 8.3, 8.4, 9.0, and 9.1, this gives: > Seq Scan on gregtest (cost=100.00..122.90 rows=855 width=0) >Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text)) > Which makes sense - I would imagine that b = 'yak' would use the index, > but the negation means the index is not very useful? > However, on 9.2, this gives: > Bitmap Heap Scan on gregtest (cost=8.76..31.66 rows=855 width=0) >Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text)) >-> Bitmap Index Scan on gregtest_i (cost=0.00..8.55 rows=860 width=0) The above doesn't seem like a regression to me. You told it not to use a seqscan, and it didn't. (The reason it now considers the index is that an index-only scan is possible; before 9.2 there was no reason to consider an indexscan at all given this query, so you got the seqscan despite the attempted disable.) > The above was tested on stock versions of Postgres, with no changes > made to postgresql.conf. 
In the actual query, the result is something like > this on 9.2 (columns explained below): > Nested Loop (cost=0.00..6050226723847.12 rows=282638194054762 width=8) >Join Filter: ((o.foo)::text <> ((m.foo)::text || 'panda'::text)) >-> Index Only Scan using index_i on foobar m (cost=0.00..889187.83 > rows=16998032 width=8) >-> Materialize (cost=0.00..1079773.42 rows=16627702 width=8) > -> Index Only Scan using index_i on foobar o (cost=0.00..931682.91 > rows=16627702 width=8) >Filter: ((status)::text <> 'split'::text) It's not obvious that this is a worse plan than a seqscan --- the index-only scans will only have to read the index not the heap, at least if the heap is all-visible. If it's coming out slower, then that's a question of whether the cost estimates match reality. I'd wonder how many heap fetches occur anyway, and also whether you've tweaked the planner cost parameters. You should be able to force it back to the seqscan based plan by turning off enable_indexscan or enable_indexonlyscan. It would be useful to see EXPLAIN ANALYZE (not just EXPLAIN) results for both this plan and the seqscan plan in 9.2. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Planner chooses multi-column index in 9.2 when maybe it should not
We are seeing a performance regression when moving to 9.2. There is a complex query that is doing a self-join, but long story short, it is choosing to use a multi-column index when it really ought not to be. I was not able to develop a standalone test case without resorting to changing enable_seqscan, but this does show the difference: CREATE TABLE gregtest (a TEXT PRIMARY KEY, b TEXT); CREATE INDEX gregtest_i ON gregtest(b,a); SET enable_seqscan = off; EXPLAIN SELECT 1 FROM gregtest WHERE a <> 'panda' AND b <> 'yak'; On versions 8.2, 8.3, 8.4, 9.0, and 9.1, this gives: Seq Scan on gregtest (cost=100.00..122.90 rows=855 width=0) Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text)) Which makes sense - I would imagine that b = 'yak' would use the index, but the negation means the index is not very useful? However, on 9.2, this gives: Bitmap Heap Scan on gregtest (cost=8.76..31.66 rows=855 width=0) Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text)) -> Bitmap Index Scan on gregtest_i (cost=0.00..8.55 rows=860 width=0) The above was tested on stock versions of Postgres, with no changes made to postgresql.conf. 
In the actual query, the result is something like this on 9.2 (columns explained below): Nested Loop (cost=0.00..6050226723847.12 rows=282638194054762 width=8) Join Filter: ((o.foo)::text <> ((m.foo)::text || 'panda'::text)) -> Index Only Scan using index_i on foobar m (cost=0.00..889187.83 rows=16998032 width=8) -> Materialize (cost=0.00..1079773.42 rows=16627702 width=8) -> Index Only Scan using index_i on foobar o (cost=0.00..931682.91 rows=16627702 width=8) Filter: ((status)::text <> 'split'::text) But like this on 8.3: Nested Loop (cost=1003294.60..8207409555713.15 rows=283931552087940 width=8) Join Filter: ((o.foo)::text <> ((m.foo)::text || 'panda'::text)) -> Seq Scan on foobar o (cost=0.00..962314.95 rows=16672204 width=8) Filter: ((status)::text <> 'split'::text) -> Materialize (cost=1003294.60..1240121.96 rows=17030236 width=8) -> Seq Scan on foobar m (cost=0.00..919739.36 rows=17030236 width=8) In the above, foobar has a primary key on foo, and an index named index_i on foobar(status, foo). In another variation of the query, 8.3 uses foobar_pkey as well, rather than index_i, and filters that. Matter of fact, index_i is never used. At any rate, the change causes the original query to run much, much slower. Problem on 9.2? Something wrong with our system and/or query? More information needed from me? -- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8 pgpELPYaKfY4x.pgp Description: PGP signature
Re: [GENERAL] pg_upgrade not detecting version properly
On 10/10/2012 09:56 AM, Bruce Momjian wrote: > Can you show me what is in the PG_VERSION file in the old cluster? It > should be "9.1". Hi Bruce, Thank you for the reply. Indeed it is "9.1": # cat /postgresql/9.1/main/PG_VERSION 9.1 And just for good measure: cat /postgresql/9.2/main/PG_VERSION 9.2 And there are no other PostgreSQL versions on this machine. Hmm... I was just about to send this when something else occurred to me. I had initially tried to run pg_upgrade as root and it said it couldn't be run as root. So I've been running it as my own user (which is in the postgres group). However, everything in /postgresql/9.1/main is owned by postgres with 700 permissions. I switched to the postgres user and now pg_upgrade is running. Perhaps just a more informative error message is in order. Thank you for the shove in the right direction =) Cheers! - Chris -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] FTS for a controlled vocab
Hello,

I am trying to identify how best to handle the situation where a controlled vocabulary needs to be searched using full text search. I have a list of brand names that contain what FTS deems blank characters, e.g. (+)people, D&G, 100% Design. These particular combinations are proving difficult, so I would like to do a replacement at index and query time:

(+)people, +people -> pluspeople
100% Design -> 100percent Design
D&G, D & G, DG -> DandG

Running these through the default parser means I get much-reduced lexemes that won't be exact enough:

   alias   |   description   | token |  dictionaries  |  dictionary  | lexemes
-----------+-----------------+-------+----------------+--------------+---------
 asciiword | Word, all ASCII | d     | {english_stem} | english_stem | {d}
 blank     | Space symbols   | &     | {}             |              |
 asciiword | Word, all ASCII | g     | {english_stem} | english_stem | {g}
(3 rows)

   alias   |   description   | token  |  dictionaries  |  dictionary  | lexemes
-----------+-----------------+--------+----------------+--------------+---------
 blank     | Space symbols   | (      | {}             |              |
 blank     | Space symbols   | +)     | {}             |              |
 asciiword | Word, all ASCII | people | {english_stem} | english_stem | {peopl}

Can I achieve this with FTS and dictionaries, or would I need a custom parser? Any other ideas on how a search like this could work?

I have considered using the actual text column in the query to try and match the exact term using ilike and pg_trgm. So for a user query of 'D&G dresses' the select could be:

select * from test where text_val @@ plainto_tsquery('english','d&g dresses') and lex ilike '%d&g%';

but there would be some horrible query mangling to find all words that have the blank tokens and use them in multiple ilike comparisons.

Thanks
Sumit
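One possible workaround, sketched here without testing against the poster's data: fold the known substitutions into an IMMUTABLE wrapper function applied at both index and query time, so the awkward tokens never reach the parser. The function and table names are invented for the example, and the bare `DG` alternation is deliberately naive (it would also match inside longer words):

```sql
-- Hypothetical normalizer for the controlled vocabulary; declared
-- IMMUTABLE so it can be used in an expression index.  One nested
-- regexp_replace per brand-name pattern.
CREATE OR REPLACE FUNCTION normalize_brands(t text) RETURNS text AS $$
    SELECT regexp_replace(
             regexp_replace(
               regexp_replace($1, '\(\+\)people', 'pluspeople', 'gi'),
               'D\s*&\s*G', 'DandG', 'gi'),
             '100%', '100percent', 'gi');
$$ LANGUAGE sql IMMUTABLE;

-- Index time: the expression index sees only the normalized text.
CREATE INDEX test_fts_idx ON test
    USING gin (to_tsvector('english', normalize_brands(text_val)));

-- Query time: run the user's query text through the same function.
SELECT *
  FROM test
 WHERE to_tsvector('english', normalize_brands(text_val))
       @@ plainto_tsquery('english', normalize_brands('D&G dresses'));
```

Because the same function wraps both sides, the 'DandG' lexeme matches exactly without any ilike mangling; adding a brand means adding one more regexp and reindexing.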
[GENERAL] Shorthand syntax for triggers
Instead of this: create function some_trigger() returns trigger as $$ begin if TG_OP = 'DELETE' then insert into audits values (OLD.value); else insert into audits values (NEW.value); end if; return NULL; end $$ language plpgsql; create trigger some_trigger after insert on products for each row execute procedure some_trigger(); I wish I could do: create trigger some_trigger after insert on products execute procedure do $$ begin insert into audits values (CHANGED.value); end $$ language plpgsql; Changes/improvements: 1. Triggers default to 'for each row' 2. Triggers can use anonymous functions 3. Triggers can access a special CHANGED value that's either NEW for insert or updates, or OLD for deletes. 4. Default for 'after insert' triggers is to return null, as I believe it doesn't matter what you return here. 5. Way less repetitive typing. Thoughts? Is this a terrible idea?
Re: [GENERAL] pg_upgrade not detecting version properly
On Tue, Oct 9, 2012 at 09:50:22PM -0600, Chris Ernst wrote:
> Hi all,
>
> I'm trying to test using pg_upgrade to go from 9.1.6 to 9.2.1 on Ubuntu
> server 10.04.  But when I run pg_upgrade, it tells me I can only run it
> on 8.3 or later.
>
> Old:
> postgres=# SELECT version();
> PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real
> (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3, 64-bit
> (1 row)
>
> New:
> postgres=# SELECT version();
> PostgreSQL 9.2.1 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real
> (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3, 64-bit
> (1 row)
>
> Yet when I try to run pg_upgrade:
>
> $ /usr/lib/postgresql/9.2/bin/pg_upgrade -b /usr/lib/postgresql/9.1/bin/
> -d /postgresql/9.1/main -B /usr/lib/postgresql/9.2/bin/ -D
> /postgresql/9.2/main -k -c -v
> Running in verbose mode
> Performing Consistency Checks
> -----------------------------
> Checking current, bin, and data directories                 ok
> Checking cluster versions
> This utility can only upgrade from PostgreSQL version 8.3 and later.
> Failure, exiting
>
> Any idea what could be going on here?
>
> Thank you in advance for your help.

That is certainly odd.
It is using this C code:

	if (GET_MAJOR_VERSION(old_cluster.major_version) < 803)
		pg_log(PG_FATAL, "This utility can only upgrade from PostgreSQL version 8.3 and later.\n");

which is assigned from this function:

get_major_server_version(ClusterInfo *cluster)
{
	FILE	   *version_fd;
	char		ver_filename[MAXPGPATH];
	int			integer_version = 0;
	int			fractional_version = 0;

	snprintf(ver_filename, sizeof(ver_filename), "%s/PG_VERSION",
			 cluster->pgdata);
	if ((version_fd = fopen(ver_filename, "r")) == NULL)
		return 0;

	if (fscanf(version_fd, "%63s", cluster->major_version_str) == 0 ||
		sscanf(cluster->major_version_str, "%d.%d", &integer_version,
			   &fractional_version) != 2)
		pg_log(PG_FATAL, "could not get version from %s\n", cluster->pgdata);

	fclose(version_fd);

	return (100 * integer_version + fractional_version) * 100;
}

Can you show me what is in the PG_VERSION file in the old cluster?  It should be "9.1".

--
  Bruce Momjian  http://momjian.us        EnterpriseDB  http://enterprisedb.com
  + It's impossible for everything to be true. +

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Re: Suggestions for the best strategy to emulate returning multiple sets of results
create a ramdrive On Wed, 2012-10-10 at 14:27 +0100, Seref Arikan wrote: > Hi Merlin, > Thanks for the response. At the moment, the main function is creating > two temp tables that drops on commit, and python functions fills > these. Not too bad, but I'd like to push these temp tables to ram, > which is a bit tricky due to not having a direct method of doing this > with postgresql. (a topic that has been discussed in the past in this > mail group) > > The global variable idea is interesting though. I have not encountered > this before, is it the global dictionary SD/GD mentioned here: > http://www.postgresql.org/docs/9.0/static/plpython-sharing.html ? > It may help perform the expensive transformations once and reuse the > results. > > Kind regards > Seref > > On Wed, Oct 10, 2012 at 2:06 PM, Merlin Moncure > wrote: > On Mon, Oct 8, 2012 at 4:59 PM, Seref Arikan > wrote: > > Thanks Merlin, > > I've tried arrays but plpython does not support returning > arrays of custom > > db types (which is what I'd need to do) > > > > hm -- yeah. can your custom types be broken down into plain > SQL types > (that is, composite types?). maybe stash the results in > global > variable and return it in two calls, or insert into into a > tempt > table that drops on commit? > > merlin > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pymssql Connection to the database failed for an unknown reason
On Saturday, January 29, 2011 9:08:22 PM UTC+8, orgilhp wrote:
> Hello
> I am using pymssql to connect to MSSQL2008 database. But an error
> occurs:
> --
> >>> import pymssql
> >>> conn = pymssql.connect(host='orgilhpnb\mssql2008', user='erp',
> >>> password='123', database='eoffice_clone')
> Traceback (most recent call last):
>   File "", line 1, in
>   File "/usr/lib/pymodules/python2.6/pymssql.py", line 609, in connect
>     raise InterfaceError, e[0]
> pymssql.InterfaceError: Connection to the database failed for an
> unknown reason.
> --
>
> The host name, user, password and database name are all correct.
> Why do I get an error? Please help me!
>
> Any suggestion would be highly appreciated!
>
> Best regards,
> Orgil

Did you solve this issue? I have encountered the same one; could you share how you resolved it?

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] libpq-how to add a schema to search path
Hi Laurenz, Thanks for quick reply. I hope it will help. Will check and revert. Best Regards, dpsmails --- On Wed, 10/10/12, Albe Laurenz wrote: From: Albe Laurenz Subject: RE: [GENERAL] libpq-how to add a schema to search path To: "Divakar Singh *EXTERN*" , pgsql-general@postgresql.org Date: Wednesday, October 10, 2012, 4:01 PM Divakar Singh wrote: > While making connection to PGSQL using libpq, is there any option to mention the schema name? > Something similar exists in java, however not sure about libpq. Use the "options" connection parameter, like this: psql "dbname=test user=laurenz port=5432 options='-c search_path=schema1,schema2'" Yours, Laurenz Albe
Re: [GENERAL] Re: Suggestions for the best strategy to emulate returning multiple sets of results
Thanks Bret,
I'm concerned about what happens when my functions, under high load, fill the ramdrive with the temporary tables I'm using. Telling postgres to use RAM with an option to fall back to disk would be significantly better in terms of uptime. However, I was thinking about some mechanism in the middle tier that watches the free space on the ram drive and redirects queries to functions that create temp tables on disk if the ram drive is close to full. That may help me accomplish what I'm trying to do.

Regards
Seref

On Wed, Oct 10, 2012 at 3:58 PM, Bret Stern <bret_st...@machinemanagement.com> wrote:
> create a ramdrive
> On Wed, 2012-10-10 at 14:27 +0100, Seref Arikan wrote:
> > Hi Merlin,
> > Thanks for the response. At the moment, the main function is creating
> > two temp tables that drops on commit, and python functions fills
> > these. Not too bad, but I'd like to push these temp tables to ram,
> > which is a bit tricky due to not having a direct method of doing this
> > with postgresql. (a topic that has been discussed in the past in this
> > mail group)
> >
> > The global variable idea is interesting though. I have not encountered
> > this before, is it the global dictionary SD/GD mentioned here:
> > http://www.postgresql.org/docs/9.0/static/plpython-sharing.html ?
> > It may help perform the expensive transformations once and reuse the
> > results.
> >
> > Kind regards
> > Seref
> >
> > On Wed, Oct 10, 2012 at 2:06 PM, Merlin Moncure wrote:
> > > On Mon, Oct 8, 2012 at 4:59 PM, Seref Arikan wrote:
> > > > Thanks Merlin,
> > > > I've tried arrays but plpython does not support returning arrays
> > > > of custom db types (which is what I'd need to do)
> > >
> > > hm -- yeah. can your custom types be broken down into plain SQL types
> > > (that is, composite types?). maybe stash the results in global
> > > variable and return it in two calls, or insert into into a tempt
> > > table that drops on commit?
> > >
> > > merlin
Re: [GENERAL] Compression
Where do I find more information about the PG fork you mentioned?

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Re: Suggestions for the best strategy to emulate returning multiple sets of results
Comments inline (sorry, did not cc the group in the other mail) On Wed, Oct 10, 2012 at 2:55 PM, Merlin Moncure wrote: > On Wed, Oct 10, 2012 at 8:27 AM, Seref Arikan > wrote: > > Hi Merlin, > > Thanks for the response. At the moment, the main function is creating two > > temp tables that drops on commit, and python functions fills these. Not > too > > bad, but I'd like to push these temp tables to ram, which is a bit tricky > > due to not having a direct method of doing this with postgresql. (a topic > > that has been discussed in the past in this mail group) > > > > The global variable idea is interesting though. I have not encountered > this > > before, is it the global dictionary SD/GD mentioned here: > > http://www.postgresql.org/docs/9.0/static/plpython-sharing.html ? > > It may help perform the expensive transformations once and reuse the > > results. > > yeah. maybe though you might find that the overhead of temp tables is > already pretty good -- they are mostly ram based in typical usage as > they aren't synced. I find actually the greatest overhead in terms of > using them is creation and dropping -- so for very low latency > transactions I use a unlogged permanent table with value returned by > txid_current() as the leading field in the key. > This is very interesting. The reason I've tried to avoid a shared temp table is that I'd have to have a session id for calls, which led to severe performance issues with the entity attribute value approach I'm using in the temp table. Your approach sounds to have been designed to overcome my problem, but I have no idea what an unlogged table does, and your use of txid_current. Could you explain a bit? Regards Seref > > merlin >
[GENERAL] Re: Suggestions for the best strategy to emulate returning multiple sets of results
On Wed, Oct 10, 2012 at 8:27 AM, Seref Arikan wrote:
> Hi Merlin,
> Thanks for the response. At the moment, the main function is creating two
> temp tables that drops on commit, and python functions fills these. Not too
> bad, but I'd like to push these temp tables to ram, which is a bit tricky
> due to not having a direct method of doing this with postgresql. (a topic
> that has been discussed in the past in this mail group)
>
> The global variable idea is interesting though. I have not encountered this
> before, is it the global dictionary SD/GD mentioned here:
> http://www.postgresql.org/docs/9.0/static/plpython-sharing.html ?
> It may help perform the expensive transformations once and reuse the
> results.

yeah.  Maybe, though you might find that the overhead of temp tables is already pretty good -- they are mostly ram-based in typical usage as they aren't synced. I find the greatest overhead in using them is actually creation and dropping -- so for very low latency transactions I use an unlogged permanent table with the value returned by txid_current() as the leading field in the key.

merlin

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
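A sketch of the unlogged-table pattern described above, with invented table and column names. The key point is that the leading key column is the current transaction id, so concurrent sessions never collide and each transaction can find (and clean up) only its own rows:

```sql
-- Shared scratch space: unlogged, so writes skip WAL, and permanent,
-- so there is no per-call CREATE/DROP overhead.
CREATE UNLOGGED TABLE scratch (
    txid    bigint NOT NULL,   -- txid_current() of the writing transaction
    seq     int    NOT NULL,
    payload text,
    PRIMARY KEY (txid, seq)
);

-- Inside the function(s), stash intermediate results keyed by our txid:
INSERT INTO scratch VALUES (txid_current(), 1, 'some intermediate result');

-- Later in the same transaction, read back only our own rows:
SELECT payload FROM scratch WHERE txid = txid_current();

-- And clear them before committing (or via a periodic sweep):
DELETE FROM scratch WHERE txid = txid_current();
```

The trade-off versus a true temp table is durability: unlogged tables are truncated after a crash, which is fine for scratch data, and rows must be deleted explicitly since nothing drops on commit.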
Re: [GENERAL] moving from MySQL to pgsql
On Wed, Oct 10, 2012 at 3:47 AM, Vineet Deodhar wrote:
> Hi !
> At present, I am using MySQL as backend for my work.
> Because of the licensing implications, I am considering to shift from
> MySQL to pgsql.
> Typically, my apps are multi-user, web based or LAN based.
>
> 1) Read over the internet that ---
> Postgres is not threaded, but every connection gets it's own process. The
> OS will distribute the processes across the processors. Basically a single
> connection will not be any faster with SMP, but multiple connections will
> be.
>
> MySQL is multi-threaded server so it can use many processors. A separate
> thread is created for each connection.
> source:

PostgreSQL is multi-threaded in that it has multiple execution threads. The only difference is that each thread has its own process, whereas in MySQL every thread runs in the same process. Each approach has various pros and cons that ultimately don't matter for most applications.

> In what way it might affect my app performance?

Basically, it doesn't -- at least not very much. There are many other things that are going to make a much bigger difference.

> 2) I run MySQL from a USB stick.
> There is no installation required (on WinXP.). (not tried on Ubuntu)
> Is it the same for pgsql?

Not sure what you mean there. MySQL has a windows installer, as does postgres. It's possible to bootstrap postgres without an installer if you know what you're doing, but generally a silent-mode install is the way to go. You've omitted some very important details, like the specific security model of the windows environments you'll install to.

> 3) Can I simulate MySQL's TINYINT data-type (using maybe the custom data
> type or something else)

You have a couple of options: PostgreSQL's explicitly double-quoted "char" type, which is a single byte. Another option is to use smallint plus a check constraint.
merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
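The smallint route is usually wrapped in a domain so the range check lives in one place; a quick sketch (note that smallint still occupies two bytes on disk, unlike MySQL's one-byte TINYINT):

```sql
-- An unsigned-TINYINT lookalike as a domain over smallint.
CREATE DOMAIN tinyint AS smallint
    CHECK (VALUE BETWEEN 0 AND 255);

CREATE TABLE demo (flags tinyint);

INSERT INTO demo VALUES (255);  -- accepted
INSERT INTO demo VALUES (256);  -- rejected by the domain's check constraint
```

Every column declared as tinyint then enforces the same range, and widening it later is a single ALTER DOMAIN rather than a per-table change.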
Re: [GENERAL] Storing large files in multiple schemas: BLOB or BYTEA
tigran2-postg...@riatest.com wrote: > Is there any other way to store large data in Postgres that allows streaming and correctly works with > multiple schemas per database? Large Objects and bytea are the only ways. If you want to pg_dump only certain large objects, that won't work as far as I know (maybe using permissions and a non-superuser can help). You absolutely need to pg_dump parts of the database regularly? Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Re: Suggestions for the best strategy to emulate returning multiple sets of results
Hi Merlin, Thanks for the response. At the moment, the main function is creating two temp tables that drops on commit, and python functions fills these. Not too bad, but I'd like to push these temp tables to ram, which is a bit tricky due to not having a direct method of doing this with postgresql. (a topic that has been discussed in the past in this mail group) The global variable idea is interesting though. I have not encountered this before, is it the global dictionary SD/GD mentioned here: http://www.postgresql.org/docs/9.0/static/plpython-sharing.html ? It may help perform the expensive transformations once and reuse the results. Kind regards Seref On Wed, Oct 10, 2012 at 2:06 PM, Merlin Moncure wrote: > On Mon, Oct 8, 2012 at 4:59 PM, Seref Arikan > wrote: > > Thanks Merlin, > > I've tried arrays but plpython does not support returning arrays of > custom > > db types (which is what I'd need to do) > > > hm -- yeah. can your custom types be broken down into plain SQL types > (that is, composite types?). maybe stash the results in global > variable and return it in two calls, or insert into into a tempt > table that drops on commit? > > merlin >
[GENERAL] Re: Suggestions for the best strategy to emulate returning multiple sets of results
On Mon, Oct 8, 2012 at 4:59 PM, Seref Arikan wrote:
> Thanks Merlin,
> I've tried arrays but plpython does not support returning arrays of custom
> db types (which is what I'd need to do)

hm -- yeah.  Can your custom types be broken down into plain SQL types (that is, composite types)?  Maybe stash the results in a global variable and return them in two calls, or insert into a temp table that drops on commit?

merlin

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Storing large files in multiple schemas: BLOB or BYTEA
On 10/10/2012 05:16 AM, tigran2-postg...@riatest.com wrote: I need to store large files (from several MB to 1GB) in Postgres database. The database has multiple schemas. It looks like Postgres has 2 options to store large objects: LOB and BYTEA. However we seem to hit problems with each of these options. I believe the general consensus around here is to not do that, if you can avoid it. File systems are much better equipped to handle files of that magnitude, especially when it comes to retrieving them, scanning through their contents, or really, any access pattern aside from simple storage. You're better off storing the blob on disk somewhere and storing a row that refers to its location. Either key pieces for a naming scheme or the full path. This is especially true if you mean to later access that data with PHP. -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-444-8534 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
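If you take the "store a reference, not the bytes" route suggested above, the metadata table can be quite small. A sketch with invented column names — the on-disk path would be built and managed by the application, and the checksum is optional but useful for spotting files that changed or vanished behind the database's back:

```sql
-- One row per stored file; the bytes themselves live on the filesystem.
CREATE TABLE attachments (
    id          serial PRIMARY KEY,
    file_name   text   NOT NULL,        -- original name, for downloads
    media_type  text,                   -- e.g. 'application/pdf'
    byte_size   bigint,
    sha1        text,                   -- checksum to detect drift/corruption
    stored_path text   NOT NULL UNIQUE, -- app-defined location on disk
    created_at  timestamptz NOT NULL DEFAULT now()
);
```

Since the table lives inside each schema like any other, pg_dump -n dumps it cleanly, and the file tree can be backed up incrementally by ordinary filesystem tools.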
[GENERAL] Index only scan
Hi all, first of all thanks for the great new "index only scan" feature in 9.2. We have managed to adapt our app (dhis2.org) to take advantage of it and it really speeds up several queries significantly. We are now in the process of designing a new component for analytics and this feature got me thinking we could utilize postgres over other alternatives like column-oriented databases. Basically we will have a wide, denormalized table with 20+ columns with relatively low cardinality. Typically we will have queries which sums a fact column based on where/group by clauses on several dimension columns (standard data warehouse stuff). An example would be "select a, b, c, sum(d) from analytics where a=1 and b=2 group by a,b,c"; Now my initial idea was to simply put a single index on all of those columns, in the hope that "index only scans" would kick in. It seems this is not the case, as strictly one (single or multi-column) index is required to enable index only scans for a query. My question is: Would it be feasible and/or possible to implement index only scans in a way that it could take advantage of several, single-column indexes? For example, a query spanning columns a, b, c could take advantage of 3 single-column indexes put on columns a, b, c. Finally, is there anyone else who are using postgres for this purpose and have some good tips to share in order to achieve good performance, including index strategies, beyond the standard config best practices? best regards, Lars Helge Øverland -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
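To make the constraint concrete: with the example table above, 9.2 only considers an index-only scan when a single index covers every column the query touches; separate single-column indexes can be BitmapAnd-ed together, but that always visits the heap. A sketch (plan shapes are indicative, not captured output):

```sql
CREATE TABLE analytics (a int, b int, c int, d int);

-- Several single-column indexes: these can feed bitmap scans, but an
-- index-only scan cannot be assembled from them.
CREATE INDEX ON analytics (a);
CREATE INDEX ON analytics (b);
CREATE INDEX ON analytics (c);
CREATE INDEX ON analytics (d);

-- One covering index: an index-only scan becomes possible once the
-- table is mostly all-visible (e.g. after VACUUM).
CREATE INDEX analytics_abcd ON analytics (a, b, c, d);

EXPLAIN SELECT a, b, c, sum(d)
  FROM analytics
 WHERE a = 1 AND b = 2
 GROUP BY a, b, c;
```

Putting the most selective filter columns first in the covering index keeps it useful for ordinary index scans as well.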
Re: [GENERAL] libpq-how to add a schema to search path
Divakar Singh wrote:
> While making a connection to PGSQL using libpq, is there any option to
> mention the schema name? Something similar exists in Java, however I am
> not sure about libpq.

Use the "options" connection parameter, like this:

psql "dbname=test user=laurenz port=5432 options='-c search_path=schema1,schema2'"

Yours,
Laurenz Albe
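The same effect can also be achieved server-side, without touching the connection string at all (a sketch assuming schemas schema1 and schema2 exist; the role name is from the example above):

```sql
-- For the current session only:
SET search_path TO schema1, schema2;

-- Persistently for a role; takes effect on new connections:
ALTER ROLE laurenz SET search_path TO schema1, schema2;

-- Or for every connection to one database:
ALTER DATABASE test SET search_path TO schema1, schema2;
```

The per-role and per-database settings are convenient when the application code cannot be changed to pass libpq options.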
[GENERAL] Storing large files in multiple schemas: BLOB or BYTEA
Hi,

I need to store large files (from several MB to 1GB) in a Postgres database. The database has multiple schemas. It looks like Postgres has 2 options to store large objects: LOB and BYTEA. However, we seem to hit problems with each of these options.

1. LOB. This works almost ideally: it can store up to 2GB and allows streaming, so that we do not hit memory limits in our PHP backend when reading the LOB. However, all blobs are stored in pg_catalog and are not part of any schema. This leads to a big problem when you try to use pg_dump with options -n and -b to dump just one schema with its blobs. It dumps the schema data correctly, but it then includes ALL blobs in the database, not just the blobs that belong to the particular schema. Is there a way to dump a single schema with its blobs using pg_dump or some other utility?

2. BYTEA. These are correctly stored per schema, so pg_dump -n works correctly, but I cannot seem to find a way to stream the data. This means that there is no way to access the data from PHP if it is larger than the memory limit.

Is there any other way to store large data in Postgres that allows streaming and works correctly with multiple schemas per database?

Thanks.

(Sorry if this double-posts on pgsql-php; I did not know which is the best list for this question.)
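One common workaround for the BYTEA streaming limitation (my suggestion, not from the thread) is to fetch the column in slices server-side, so the client never holds the whole value in memory. A sketch assuming a hypothetical table files(id, data bytea):

```sql
-- Fetch a bytea value in 1 MB slices; the client loops, advancing the
-- offset, until the returned slice is shorter than the chunk size.
SELECT substring(data FROM 1 FOR 1048576)      -- bytes 1 .. 1048576
FROM files
WHERE id = 42;

-- Next iteration: SELECT substring(data FROM 1048577 FOR 1048576) ...
```

This keeps per-query memory bounded on the PHP side, though each slice still rereads the toasted value, so it is slower than true large-object streaming.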
Re: [GENERAL] moving from MySQL to pgsql
On Wed, Oct 10, 2012 at 2:20 AM, Vineet Deodhar wrote:
> Thanks Chris. I didn't understand what you mean by "be prepared to
> rethink your indexing strategies."
>
> In MySQL, I have created indexes, unique indexes, complex or multi-field
> indexes, etc. In what way should I reconsider the indexing?

In InnoDB, your tables are basically primary key indexes with the rest of the row data attached. For this reason a sequential scan is *slow*, since it cannot traverse the table in physical order. In PostgreSQL, tables are paged heaps with separate indexes, and there is essentially no difference between a UNIQUE index on not-null columns and a primary key.

What this means is that in MySQL/InnoDB, more indexes are almost always better, because a sequential scan is always very slow. In PostgreSQL, sequential scans are pretty fast, but primary key lookups are a little slower. Consequently, on PostgreSQL you may want to reduce the number of non-unique indexes at first and add them back as necessary.
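When deciding which indexes to add back, one common approach (my suggestion, not from the thread) is to let the statistics collector tell you which indexes the planner never uses:

```sql
-- Indexes never scanned since the statistics were last reset;
-- candidates for dropping. Check rarely-run jobs and replicas
-- before actually removing anything.
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY schemaname, relname;
```

Running this after the application has seen representative load gives a data-driven starting point instead of porting the MySQL index set wholesale.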
Re: [GENERAL] moving from MySQL to pgsql
On Wed, Oct 10, 2012 at 2:38 PM, Chris Travers wrote:
> PostgreSQL has an excellent optimizer and the on-disk layout is
> completely different. This will dwarf any changes due to threads vs
> queries.
>
> However be prepared to rethink your indexing strategies.
>
> Best Wishes,
> Chris Travers

Thanks Chris. I didn't understand what you mean by "be prepared to rethink your indexing strategies."

In MySQL, I have created indexes, unique indexes, complex or multi-field indexes, etc. In what way should I reconsider the indexing?

Thanks,
Vineet
Re: [GENERAL] moving from MySQL to pgsql
On 10/10/2012 10:47 AM, Vineet Deodhar wrote:
> Hi!
> At present, I am using MySQL as the backend for my work. Because of the
> licensing implications, I am considering shifting from MySQL to pgsql.
> Typically, my apps are multi-user, web based or LAN based.
>
> 1) I read over the internet that Postgres is not threaded, but every
> connection gets its own process. The OS will distribute the processes
> across the processors. Basically a single connection will not be any
> faster with SMP, but multiple connections will be. MySQL is a
> multi-threaded server, so it can use many processors; a separate thread
> is created for each connection.
> source: http://dcdbappl1.cern.ch:8080/dcdb/archive/ttraczyk/db_compare/db_compare.html#Comparison+of+Oracle%2C+MySQL+and+Postgres+DBMS
> In what way might it affect my app performance?

Performance will not be affected negatively. MySQL only has one thread per connection, so a single query will never use multiple threads (scary concept to think about).

> 2) I run MySQL from a USB stick. There is no installation required (on
> WinXP; not tried on Ubuntu). Is it the same for pgsql?

To use Postgres on a USB stick, see http://www.postgresonline.com/journal/archives/172-Starting-PostgreSQL-in-windows-without-install.html

> 3) Can I simulate MySQL's TINYINT data type (using maybe a custom data
> type or something else)?

You can use either bool, or smallint with a constraint.

Sim
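A minimal sketch of the "smallint with a constraint" approach, combined with the DOMAIN idea raised elsewhere in the thread (the domain, table and column names are my own illustration):

```sql
-- Emulate MySQL's unsigned TINYINT (0..255) once, reuse everywhere:
CREATE DOMAIN tinyint_u AS smallint
    CHECK (VALUE BETWEEN 0 AND 255);

CREATE TABLE settings (
    level tinyint_u NOT NULL,
    -- or inline, per column, without a domain:
    prio  smallint  NOT NULL CHECK (prio BETWEEN 0 AND 255)
);
```

Note the storage is still 2 bytes per value; the constraint only enforces the TINYINT range, it does not shrink the column.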
Re: [GENERAL] moving from MySQL to pgsql
On Wed, Oct 10, 2012 at 1:47 AM, Vineet Deodhar wrote:
> Hi!
> At present, I am using MySQL as the backend for my work. Because of the
> licensing implications, I am considering shifting from MySQL to pgsql.
> Typically, my apps are multi-user, web based or LAN based.
>
> 1) I read over the internet that Postgres is not threaded, but every
> connection gets its own process. The OS will distribute the processes
> across the processors. Basically a single connection will not be any
> faster with SMP, but multiple connections will be. MySQL is a
> multi-threaded server, so it can use many processors; a separate thread
> is created for each connection.
> source: http://dcdbappl1.cern.ch:8080/dcdb/archive/ttraczyk/db_compare/db_compare.html#Comparison+of+Oracle%2C+MySQL+and+Postgres+DBMS
> In what way might it affect my app performance?

Not much. MySQL doesn't support intraquery parallelism to my knowledge. You will get extra robustness due to process isolation, however. There might be some slight costs due to shared memory management overhead, but these are probably insignificant compared to other factors.

PostgreSQL has an excellent optimizer and the on-disk layout is completely different. This will dwarf any changes due to threads vs queries. However, be prepared to rethink your indexing strategies.

> 2) I run MySQL from a USB stick. There is no installation required (on
> WinXP; not tried on Ubuntu). Is it the same for pgsql?

You would want a user account created, because PostgreSQL won't run as an administrator, but otherwise, yes.

> 3) Can I simulate MySQL's TINYINT data type (using maybe a custom data
> type or something else)?

I would use smallint (2-byte int), but if you have to have a single byte, "char" (with quotes) should give you a one-byte field. I don't know about the UTF-8 rules on it, though.

Best Wishes,
Chris Travers
Re: [GENERAL] something better than pgtrgm?
Thanks, but no, we do need the performance. And we have admins (not users) enter the names and codes, but we can't make that overly complicated for them. I thought you meant that they see to it that the names end up in the database under the correct encoding (which is a logical thing to do).

Thanks anyway :)!

WBL

On Tue, Oct 9, 2012 at 5:16 PM, Andrew Sullivan wrote:
> On Tue, Oct 09, 2012 at 03:54:35PM +0200, Willy-Bas Loos wrote:
> > > If so, I can almost imagine a way this could work
> >
> > Great! How?
>
> Well, it involves very large tables. But basically, you work out a
> "variant" table for any language you like, and then query across it
> with subsets of the trigrams you were just working with. It probably
> sucks in performance, but at least you're likely to get valid
> sequences this way.
>
> For inspiration on this (and why I have so much depressing news on the
> subject of internationalization in a multi-script and multi-lingual
> environment), see RFC 3743 and RFC 4290. These are related (among
> other things) to how to make "variants" of different DNS labels
> somehow hang together. The problem is not directly related to what
> you're working on, but it's a similar sort of problem: people have
> rough ideas of what they're entering, and they need an exact match.
> You have the good fortune of being able to provide them with a hint!
> I wish I were in your shoes.
>
> A
>
> --
> Andrew Sullivan
> a...@crankycanuck.ca

--
"Quality comes from focus and clarity of purpose" -- Mark Shuttleworth
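For readers landing on this thread: the pg_trgm machinery under discussion looks roughly like this (a sketch of standard pg_trgm usage, not code from the thread; the names table and the query string are assumptions):

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- A trigram GIN index makes similarity searches indexable:
CREATE INDEX names_trgm_idx ON names USING gin (name gin_trgm_ops);

-- Fuzzy match: % uses the pg_trgm similarity threshold, and the
-- similarity() score lets us rank candidates as hints for the admins.
SELECT name, similarity(name, 'Smith') AS sim
FROM names
WHERE name % 'Smith'
ORDER BY sim DESC;
```

The "variant table" idea above would sit on top of this, mapping equivalent spellings across scripts before the trigram comparison runs.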