[GENERAL] Re: Dynamic use of RAISE with USING to generate and catch non-hardcoded custom exceptions

2017-09-22 Thread mike davis
>This works for me:
>
>DO $$
>DECLARE
>  v_msg TEXT := 'SOMETHING IS WRONG';
>  v_sqlstate TEXT := 'E0001';
>BEGIN
>  RAISE EXCEPTION USING message = v_msg, errcode = v_sqlstate;
>EXCEPTION
>  WHEN SQLSTATE 'E0001' THEN
> RAISE NOTICE '%','Error E0001 raised - going to do something about it';
>  WHEN OTHERS THEN
> RAISE NOTICE 'OTHER ERRORS: %,%', sqlstate,sqlerrm;
>END$$;

>NOTICE:  Error E0001 raised - going to do something about it
>
>Or you could do
>  RAISE EXCEPTION SQLSTATE v_sqlstate USING message = v_msg;

That does indeed work!

The second possible way:
RAISE EXCEPTION SQLSTATE v_sqlstate USING message = v_msg;

however doesn't! I think that was the format I had also tried, and why I went 
down the dynamic route.

So it seems variables can be used in the USING subclause but not outside it. 
The manual does seem to hint at this:
"after level if any, you can write a format (which must be a simple string 
literal, not an expression)"

Anyway, RAISE EXCEPTION USING message = v_msg, errcode = v_sqlstate; works a 
treat!
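
For anyone finding this thread later, here is a minimal sketch of the 
table-driven version I was after (app_errors and its columns are just 
illustrative names, not a real table):

DO $$
DECLARE
  v_msg TEXT;
  v_sqlstate TEXT;
BEGIN
  -- app_errors(code, message) is a hypothetical lookup table of custom errors
  SELECT e.message, e.code INTO v_msg, v_sqlstate
  FROM app_errors e
  WHERE e.code = 'E0001';

  RAISE EXCEPTION USING message = v_msg, errcode = v_sqlstate;
EXCEPTION
  WHEN SQLSTATE 'E0001' THEN
     RAISE NOTICE 'Error E0001 raised - going to do something about it';
  WHEN OTHERS THEN
     RAISE NOTICE 'OTHER ERRORS: %,%', sqlstate, sqlerrm;
END$$;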

Many thanks Tom & Pavel.

Mike




[GENERAL] Dynamic use of RAISE with USING to generate and catch non-hardcoded custom exceptions

2017-09-21 Thread mike davis
I’m trying to get a dynamic version of the RAISE command working so that I can 
use a table of custom application error messages and codes for use by all 
developed plpgsql functions. In this way the custom error codes and messages 
are not hard coded and are defined consistently in one place in the db.


However, I cannot get a dynamic/parameterised version of the RAISE command 
working with the USING syntax - I want to do this so that I can catch the 
raised error in an EXCEPTION block.


The following shows a (working) hardcoded example:


DO $$
DECLARE
BEGIN
  RAISE EXCEPTION 'Something is wrong' USING errcode = 'E0001';
EXCEPTION
  WHEN SQLSTATE 'E0001' THEN
     RAISE NOTICE '%','Error E0001 raised - going to do something about it';
  WHEN OTHERS THEN
     RAISE NOTICE 'OTHER ERRORS: %', sqlstate;
END
$$


Which raises and catches the custom error E0001 and returns (as expected)

NOTICE:  Error E0001 raised - going to do something about it.


Now what I am trying to achieve is as above but for the msg text and errcode to 
be retrieved from a table before issuing the RAISE EXCEPTION statement.


ie. Assume v_msg and v_sqlstate have been retrieved and contain:

v_msg = 'Something is wrong'

v_sqlstate = 'E0001'


Then what I want to raise dynamically is:


RAISE EXCEPTION v_msg USING errcode = v_sqlstate;


and be able to use the same exception block as above in the hard coded example.


I searched and found a couple of similar examples where

RAISE EXCEPTION '%', i_msg

is used and works but this does not allow a custom SQLSTATE to be raised and 
trapped.


ie. The following runs ok:


DO $$
DECLARE
  v1 TEXT;
BEGIN
  v1 := 'SOMETHING IS WRONG';
  RAISE NOTICE '%', v1;
  RAISE EXCEPTION '%', v1;
EXCEPTION
  WHEN SQLSTATE 'E0001' THEN
     RAISE NOTICE '%','Error E0001 raised - going to do something about it';
  WHEN OTHERS THEN
     RAISE NOTICE 'OTHER ERRORS: %', sqlstate;
END
$$


and returns:

NOTICE:  SOMETHING IS WRONG

NOTICE:  OTHER ERRORS: P0001


but obviously the WHEN OTHERS has caught the exception and the SQLSTATE is the 
default P0001.


So, then what I really want is similar to the above but with the USING keyword 
of RAISE being dynamic/parameterised.


So I tried the following:


DO $$
DECLARE
  v_msg TEXT := '''SOMETHING IS WRONG''';
  v_sqlstate TEXT := '''E0001''';
  v1 TEXT;
BEGIN
  v1 := v_msg || ' USING errcode = ' || v_sqlstate;
  RAISE NOTICE '%', v1;
  RAISE EXCEPTION '%', v1;
EXCEPTION
  WHEN SQLSTATE 'E0001' THEN
     RAISE NOTICE '%','Error E0001 raised - going to do something about it';
  WHEN OTHERS THEN
     RAISE NOTICE 'OTHER ERRORS: %,%', sqlstate,sqlerrm;
END
$$


which returns:

NOTICE:  'SOMETHING IS WRONG' USING errcode = 'E0001'

NOTICE:  OTHER ERRORS: P0001,'SOMETHING IS WRONG' USING errcode = 'E0001'


So clearly the whole of v1 (whilst syntactically correct) is treated as the 
message, and the default sqlstate of P0001 is still raised and caught by WHEN 
OTHERS.


I have tried a few other things but cannot find a way to get a custom 
errcode/sqlstate to be raised except by hardcoding it! I'm sure there must be 
a way to do this!


Any help or advice on how to achieve this is very much appreciated!


Re: [GENERAL] Surprising results from array concatenation

2017-04-25 Thread Mike Blackwell
On Tue, Apr 25, 2017 at 12:53 PM, Tom Lane  wrote:

> Yeah.  The core problem here is that the parser has to disambiguate the
> || operator: is it "anyarray || anyelement" or "anyarray || anyarray"?
>
<...>


> Peeking at the contents of the literal would make the behavior very
> unpredictable/data-dependent, so we don't.


Fair enough.

Would a note in that section of the docs pointing out this behavior be
worthwhile?


[GENERAL] Surprising results from array concatenation

2017-04-25 Thread Mike Blackwell
The docs (section 9.18 for PG 9.6) show as an example for array
concatenation

ARRAY[4,5,6] || 7

which works fine.  However, trying the same with an array of text doesn't
work:

# select array['a','b','c'] || 'd';
ERROR:  malformed array literal: "d"
LINE 1: select array['a','b','c'] || 'd';
 ^
DETAIL:  Array value must start with "{" or dimension information.

Casting the second value to TEXT works.

# select array['a','b','c'] || 'd'::TEXT;
 ?column?
----------
 {a,b,c,d}
(1 row)

The assumption that the second argument is an array constant seems
surprising.
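
For what it's worth, array_append() seems to sidestep the ambiguity, since it 
has only the (anyarray, anyelement) form, so I'd expect the unquoted literal to 
resolve to the array's element type:

-- array_append() takes anyelement as its second argument, so 'd' should be
-- read as text here rather than as an array literal
select array_append(array['a','b','c'], 'd');   -- expect {a,b,c,d}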

__
*Mike Blackwell | Technical Analyst, Distribution Services/Rollout
Management | RRD*
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com
http://www.rrdonnelley.com




Re: [GENERAL] Column Tetris Automatisation

2017-01-15 Thread Mike Sofen
|From: Nicolas Paris
|Hello,
|In postgresql the order of columns does have a non-negligible impact on table
|size[1].
|Tables are in many cases dynamic, and new fields can appear during the database
|life.
|I suspect re-ordering columns based on types would be an automatable task.
|
|Moreover, most client code should not be dependent on column order. Then a
|TETRIS option in VACUUM FULL would be useful for many users.
|Does that make sense?

For me it doesn't.  As a database architect/engineer, I always create a data
model first (using some software like Xcase, Embarcadero, Erwin), then use
that software to generate the DDL to create the physical database objects,
then start creating stored functions against those objects.

Since my code follows my model, and since I don't use "select *", and since
I prefer to have all of my identifiers at the "top" of a table, order does
matter...it all ties together nicely, making it easier for other developers
to follow an identical pattern across all of the database objects. 

All of that said, the notion of embedding Tetris functionality into a
codebase makes me smile, for some reason...
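
That said, for anyone who does want to hand-play the Tetris on an existing 
table, here is a rough sketch of the kind of catalog query involved (my_table 
is just a placeholder) - you'd reorder roughly by descending typlen/typalign:

SELECT a.attnum, a.attname, t.typname, t.typalign, t.typlen
FROM pg_attribute a
JOIN pg_type t ON t.oid = a.atttypid
WHERE a.attrelid = 'my_table'::regclass
  AND a.attnum > 0
  AND NOT a.attisdropped
ORDER BY a.attnum;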

Mike Sofen





Re: [GENERAL] LYDB: What advice about stored procedures and other server side code?

2016-12-28 Thread Mike Sofen
|From: Christoph Moench-Tegeder
|Initially, running code in your database can make life easier for the developers
|(use pgTap for testing, pl/profiler and pl/debugger, etc.). But once you have to
|change your schema, the hurt begins:
|you'll need downtime for that, or you'll have to deal with the possibility of
|changing the API of your "database side" code, and matching code and tables on
|the database.

I look at this from the opposite direction: with a stable database API (via 
stored procs), I can change the schema and logic within the procs without 
causing any app code breakage…the app tier is completely insulated from those 
changes – that’s worth a lot.  Yes, for deploying the schema change there must 
be an outage, but that’s true regardless of where the data is being manipulated 
– the data is still in a db…and there are ways to mitigate/eliminate the 
duration of the outage.
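
As a purely hypothetical sketch of what I mean by a stable database API (table 
and column names invented), the app tier only ever sees the signature, while 
the body is free to change along with the schema:

CREATE OR REPLACE FUNCTION api_get_customer(p_customer_id bigint)
RETURNS TABLE (customer_id bigint, customer_name text)
LANGUAGE plpgsql STABLE AS $$
BEGIN
  -- the internal query can be rewritten when tables change;
  -- the signature the app calls stays the same
  RETURN QUERY
  SELECT c.id, c.name
  FROM customers c          -- assumed table, for illustration only
  WHERE c.id = p_customer_id;
END;
$$;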

|The next pain point is scalability: running code on the database server puts your
|code on the most expensive and hardest to scale CPUs. You can (almost) always
|add another appserver to your setup (just spin up a VM with a tomcat or
|whatever-you-use). But if the bottleneck is your database CPUs, you'd have to
|move to a larger server

Our dedicated db servers have not yet shown any real CPU consumption during app 
use - memory, for us, is the only real limiting factor.  The only time CPU 
consumption spikes is during admin activities - reindexing, vacuuming, bulk 
data loads...that sort of thing.  Even the boxplot calculations barely cause a 
ripple.  To me that speaks to the efficiency of language and engine working 
together.  You are right of course on the scaling - if we do run out of CPU 
horsepower and cannot scale up any more, we'd have to scale out, and there are 
ways to do that too.  IOWs, the model doesn't have to change, just the solution 
to solve the scaling (solve the problem, not the symptom).

|TL;DR: database side code can be a great thing in a small application, but once
|the application and traffic grows, "code in the database"
|requires specialist attention and may become a burden.
|Unfortunately, most large applications started small...

In my opinion, having a database specialist work on database stuff is a GOOD 
thing.  Tables get designed properly, correct indexes are built, efficient 
query plans are created, etc.  ORMs are a shortcut to getting an app talking to 
data, but aren't a substitute for a proper, scalable data tier.  IMO...being a 
data specialist...  :-)

Mike Sofen (Synthetic Genomics)





Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-28 Thread Mike Sofen
From: Tim Uckun
I have seen various links on the internet which indicate that PLV8 is 
significantly faster than PL-PGSQL sometimes an order of magnitude faster. 

 

Is there any benefit to choosing PL-PGSQL?



I can’t speak to PLV8.  However, I can speak to plpgsql, and specifically 
stored functions (procs).  I use it exclusively to create a database API for 
real-time web applications to hit.  My API calls (procs) are hitting large 
tables, sometimes doing complex logic within the sproc.  It allows me to 
provide a simple, standardized interface to the web devs, allowing them to 
focus on the app code work.

 

Performance is superb and continues to surprise me (I came from the SQL Server 
world).  As others have mentioned, the natural lashup of plpgsql to postgres (I 
liked Alban’s term, “impedance”), is a key aspect.  Also:

 

- Stored procs provide another security layer against sql injection attacks.

- Caching SEEMS to be more efficient/effective with stored procs (that could be 
wishful thinking too).

- Stored procs allow skilled sql practitioners to provide far more sophisticated 
sql solutions than the typical python developer is capable of…my experience is 
that most web devs don’t really understand databases (or even care about them – 
they are a necessary evil), so providing a pure encapsulated sql solution (via 
stored procs) removes that mental impedance mismatch.

- Performance?  Simple “get” procs that return data for a specific indexed query 
against larger tables (50m+ rows) in a few milliseconds…I can live with that 
kind of performance.

- I’m also doing some heavy lifting in the sql, calculating histograms and 
boxplots for data visualizations.  This is an unusual scenario, but the other 
option is sending a massive chunk of data to another server for processing – 
just the transit time would kill the deal.  I am mindful that at a certain 
point, there won’t be enough memory and i/o to go around, but the web app is a 
low user count/high user task complexity app, so I’ve tailored the model to 
match.

 

Mike Sofen  (Synthetic Genomics)



Re: [GENERAL] Syntax error needs fresh eyeballs

2016-12-26 Thread Mike Sofen
From: Rich Shepard
   My schema includes three tables and psql throws an error I'm not seeing
when I try to read the schema into the database. I'd appreciate fresh eyes
looking at the table and learning what error I'm not seeing.

 

   The tables:

CREATE TABLE Weather (
   site_id INTEGER PRIMARY KEY,
   site_name TEXT,
   site_location TEXT
);

CREATE TABLE Weather_Params (
   site_id INTEGER
  REFERENCES Weather(site_id),
   param TEXT,
   param_unit TEXT,
   freq INTEGER,
   freq_unit TEXT,
   equip TEXT,
   PRIMARY KEY (site_id, param)
);

CREATE TABLE Weather_Data (
   site_id INTEGER
  REFERENCES Weather(site_id),
   monit_date DATE,
   monit_time TIME,
   read_by TEXT,  -- name of employee collecting data
   param TEXT
  REFERENCES Weather_Params(param),
   param_value REAL,
   PRIMARY KEY (site_id, monit_date, monit_time, param) );

 

   The error:  ERROR:  there is no unique constraint matching given keys for
referenced table "weather_params".

---

Just a guess...You've camel-cased the table names but aren't using double
quotes.  Try either lower-casing all table names and references to them, or
double-quoting all identifiers.  

 

Mike

 



Re: [GENERAL] Storing files: 2.3TBytes, 17M file count

2016-11-28 Thread Mike Sofen
From: Thomas Güttler   Sent: Monday, November 28, 2016 6:28 AM



...I have 2.3TBytes of files. File count is 17M

Since we already store our structured data in postgres, I think about storing 
the files in PostgreSQL, too.

Is it feasible to store file in PostgreSQL?

---

I am doing something similar, but in reverse.  The legacy mysql databases I’m 
converting into a modern Postgres data model, have very large genomic strings 
stored in 3 separate columns.  Out of the 25 TB of legacy data storage (in 800 
dbs across 4 servers, about 22b rows), those 3 columns consume 90% of the total 
space, and they are just used for reference, never used in searches or 
calculations.  They range from 1k to several MB.

 

Since I am collapsing all 800 dbs into a single PG db, being very smart about 
storage was critical.  Since we’re also migrating everything to AWS, we’re 
placing those 3 strings (per row) into a single json document and storing the 
document in S3 bins, with the pointer to the file being the globally unique PK 
for the row…super simple.  The app tier knows to fetch the data from the db and 
large string json from the S3 bins.  The retrieval time is surprisingly fast, 
this is all real time web app stuff.
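
A stripped-down, hypothetical version of that table shape (names invented) 
looks roughly like this - the PK doubles as the S3 object key:

CREATE TABLE sequence_record (
    id           bigserial PRIMARY KEY,   -- globally unique; also used as the S3 object key
    sample_name  text NOT NULL,
    created_at   timestamptz NOT NULL DEFAULT now()
    -- the three large genomic strings live in a json document in S3,
    -- under a key derived from id, e.g. 'genomic/<id>.json'
);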

 

This is a model that could work for anyone dealing with large objects (text or 
binary).  The nice part is, the original 25TB of data storage drops to 5TB – a 
much more manageable number, allowing for significant growth, which is on the 
horizon.

 

Mike Sofen  (Synthetic Genomics USA)



Re: [GENERAL] initdb createuser commands

2016-10-30 Thread Mike Sofen
From: Samuel Williams  Sent: Sunday, October 30, 2016 3:42 PM
As a community I'd think that having feedback from a new user would be valuable 
since as you say, sometimes when you get ingrained into the "way of doing 
things" that you don't see how they could be improved or different.

Samuel

   

I’d take a different tack.  I spent 20 years with SQL Server and easily (almost 
gleefully) hopped over to Postgres and especially plpgsql and PgAdmin III, from 
using SqlServer Management Studio (SSMS – their admin/coding app).

 

Sure, I had to learn the PG way of doing things, but really, it was a 
no-brainer.  I had to spend a few extra cycles learning the PG best practices 
and particular way of doing things but it was trivial…google and done.  The 
vast community has created massive amounts of examples for nearly everything 
imaginable – and some things I would never have imagined anyone would try to do 
– such that I don’t have to Lewis and Clark it but just dive right in and write 
code.

 

IMO, nothing major needs changing in the language or command syntax – it’s 
logical and easy for anyone skilled in sql.  If someone isn’t skilled in sql, 
the requests you’ve made won’t assist them at all.

 

Mike Sofen (Synthetic Genomics)



Re: [GENERAL] Immutable datastore library?

2016-10-18 Thread Mike Sofen
From: Guyren Howe   Sent: Monday, October 17, 2016 1:40 PM
I would like to use Postgres as an immutable data store. A subject table
would have a timestamp column, and that would be added to what would
otherwise be the primary key.  Trap updates and turn them into inserts. Have
an associated _deleted table. Trap deletes and turn them into inserts of the
primary key into that table.  Create a view that only shows the non-deleted
records with the most recent timestamps.

Stored procedure to do all that to a table. Event trigger to drop and
re-create the view on changes to the table columns.
--
A couple years ago at another firm, I designed and built a real time ODS
(operational data store) for a large healthcare company, on SQL Server 2012
that supported your exact requirements, since the ODS was the primary data
feed for their data warehouse.

My solution leveraged the Merge tsql function ("Upsert" in PG) to detect
inserts/updates/deletes.  We don't allow physical row deletes in medical
data, so these were logical deletes aka an update to an
"InactivatedDatetime" column making it not null.  I used a checksum function
in the Update branch to detect if the inbound data had changed at all, to
avoid creating dry updates (no change in the data but a new Update row would
be written otherwise).

Ok that's the internals for the write to the ODS.  I wrapped the entire
Merge statement inside of another insert statement using the equivalent of
the PG "Into" function, which took every column from the ODS write and wrote
the same data to a "History" table of the same name - those writes were
always inserts, creating a persistent, complete picture of every write to
the ODS.  Each row going into the History tables was marked with a "D"
(delete), "I" (insert) or "U" (update).  The History data was used for both
auditing and for nightly batches feeding the data warehouse, where row type
(D, I, or U) drove their Type 2 processing.  As you can imagine, the table
design was crucial to the success of this model.

This was ultra-efficient on the real time data flowing in from the hospital
and clinical EMRs (different systems) - in one transaction/one query, I was
able to double-write the data and ensure both writes completed or fail both
and error.  The "batches" were small - up to 100k rows or less, and
processed in under 100ms.  But even when there was a network outage and we
had to do a catch up load with millions of rows, it ran very quickly.  IOWs,
the double write overhead was very modest, especially with modern disk
performance.
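
In Postgres 9.5+ the same double-write idea can be sketched (very roughly, with 
invented table names and a two-column payload) as an INSERT ... ON CONFLICT 
wrapped in a data-modifying CTE, so both writes still succeed or fail together:

WITH upsert AS (
    INSERT INTO patient (patient_id, payload)     -- hypothetical tables/columns
    VALUES ($1, $2)
    ON CONFLICT (patient_id)
    DO UPDATE SET payload = EXCLUDED.payload
    WHERE patient.payload IS DISTINCT FROM EXCLUDED.payload   -- skip "dry" updates
    RETURNING patient_id, payload
)
INSERT INTO patient_history (patient_id, payload, changed_at)
SELECT patient_id, payload, now()
FROM upsert;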

Mike Sofen (Synthetic Genomics)





Re: [GENERAL] Graphical entity relation model

2016-09-28 Thread Mike Sofen
From: jotpe   Sent: Wednesday, September 28, 2016 1:17 PM



Does anybody know a Software for generating graphical entity relation models 
from existing postgresql databases?
Best regards Johannes

---

I don’t know any useful free ones, but my favorite less expensive one (and the 
one I’ve used for the last few years) is xcase (http://www.xcase.com/).  It 
can reverse engineer against most current popular databases including Postgres. 
 The current version has 2 postgres quirks: 1) you cannot (for some bizarre 
reason) declare a json/jsonb column, so I declare it as text and simply adjust 
the generated DDL as needed.  2) you cannot declare an index on a text column 
(as if they are thinking it is akin to a sql server text column).  Other than 
those issues, which can be easily worked around, it is by far the fastest 
modeling tool I’ve ever used (I’ve used the more famous ones), and fully 
comprehensive, with a very modern looking UI.

 

In contrast, there are the over-priced dinosaurs with old ugly UIs.  A while 
back I reviewed some of the modeling tools, and none did it for me, I went 
ahead and got another license to xcase.

 

Mike Sofen (Synthetic Genomics)



Re: [GENERAL] how to monitor the progress of really large bulk operations?

2016-09-28 Thread Mike Sofen
From: Pavel Stehule  Sent: Tuesday, September 27, 2016 9:18 PM
2016-09-28 6:13 GMT+02:00 Pavel Stehule <pavel.steh...@gmail.com>:

Hi

2016-09-27 23:03 GMT+02:00 Mike Sofen <mso...@runbox.com>:

Hi gang,

how to view the state of a transaction in flight, seeing how many rows have 
been read or inserted (possible for a transaction in flight?), memory 
allocations across the various PG processes, etc.

some years ago I used a trick 
http://okbob.blogspot.cz/2014/09/nice-unix-filter-pv.html#links

 

pltoolbox has counter function 
https://github.com/okbob/pltoolbox/blob/master/utils.c

pavel=# insert into omega2 select (x.xx).* 
   from (select pst.counter(omega,20, true) xx 
from omega
) x;
NOTICE:  processed 20 rows, current value is '(5,8)'
NOTICE:  processed 20 rows, current value is '(5,8)'

Regards

Pavel

 

 

Pavel - That’s a very interesting function and thanks for sharing your toolbox. 
 The big question of course is: what is the impact on performance, scalability 
and stability?  Would it work inside of a stored function that would allow me 
to write out the progress to a tracking table?
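
For comparison, the low-tech version I've fallen back on elsewhere (a sketch 
only, with invented table names) is a chunked plpgsql loop that raises a NOTICE 
every N rows - the notices stream back to the client before commit, though it 
obviously gives up the speed of a single set-based statement, which is really 
why I'm asking about the overhead of your counter approach:

CREATE OR REPLACE FUNCTION load_with_progress() RETURNS bigint
LANGUAGE plpgsql AS $$
DECLARE
    r record;
    n bigint := 0;
BEGIN
    -- staging_table/target_table and their columns are invented for illustration
    FOR r IN SELECT id, payload FROM staging_table LOOP
        INSERT INTO target_table (id, payload) VALUES (r.id, r.payload);
        n := n + 1;
        IF n % 1000000 = 0 THEN
            RAISE NOTICE '% rows processed at %', n, clock_timestamp();
        END IF;
    END LOOP;
    RETURN n;
END;
$$;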

 

Mike



[GENERAL] how to monitor the progress of really large bulk operations?

2016-09-27 Thread Mike Sofen
Hi gang,

 

On PG 9.5.1, linux, I'm running some large ETL operations, migrate data from
a legacy mysql system into PG, upwards of 250m rows in a transaction (it's
on a big box).  It's always a 2 step operation - extract raw mysql data and
pull it to the target big box into staging tables that match the source, the
second step being read the landed dataset and transform it into the final
formats, linking to newly generated ids, compressing big subsets into jsonb
documents, etc.

 

While I could break it into smaller chunks, it hasn't been necessary, and it
doesn't eliminate my need:  how to view the state of a transaction in
flight, seeing how many rows have been read or inserted (possible for a
transaction in flight?), memory allocations across the various PG processes,
etc.

 

Possible or a hallucination?

 

Mike Sofen (Synthetic Genomics)



Re: [GENERAL] IDE for function/stored proc development.

2016-09-03 Thread Mike Sofen
From: Tim Uckun Sent: Saturday, September 03, 2016 2:37 AM
Does anybody use an IDE for doing heavy duty stored proc development?  PGadmin 
is decent but I am looking for something better.

 

I have been using the Datagrip app (from Jetbrains), from its beta release up 
through now v 2016.2 and love it.  Full autocomplete, it has my object browser 
on the left, source code file browser on the right (I have it tied into our 
git), massively customizable to look and behave any way you want it.  It is not 
effective yet for admin tasks.  I really love the modern interface.  I have 
zero connection to the company, paid for a license ($200).  Worth every penny.

 

I’m a hardcore stored proc/func dev, building database api’s for the large 
systems I design/build…I write a LOT of code and datagrip is where I do it.  I 
also use the crash-prone PgAdminIII  for admin stuff like table mods, quick 
scripting of table def or insert columns, or backups, etc…the normal admin 
stuff.

 

MikeS



Re: [GENERAL] Rackspace to RDS using DMS (Postgres 9.2)

2016-09-01 Thread Mike Sofen
From: Joshua D. Drake  Sent: Thursday, September 01, 2016 9:29 AM
On 08/31/2016 03:41 PM, Patrick B wrote:

> Hi guys, I posted this question on the ADMIN list but will post here 

> as well so more people can comment...

> https://www.postgresql.org/message-id/CAJNY3it_AfxJhmwMHtpiAbHG47GS5rJOAUgfHw%2BGm5OXCbUm7w%40mail.gmail.com

> 

> I've got a 2.3TB Database running at Rackspace... We'll be migrating 

> it to RDS PostgreSQL 9.5 very soon...

> 

> We already have an EC2 Instance at Amazon running PostgreSQL 9.2 as 

> streaming replication from Rackspace.

> 

> I'll have to upgrade the version of Postgres on that instance before 

> start using DMS service.

> 

> *Question:*

> Has anybody ever used that service? I'm just trying to find out how 

> much time it will take to perform the migration...

 

It is 2.3TB, it is going to take a long time no matter what service you are 
running.

 

No, I have not used DMS. Frankly, with all respect to AWS/RDS the idea of 
running a 2.3TB instance that will get any level of performance sounds 
ridiculously expensive.

 

Sincerely,

 

JD

 

I currently have an EC2 instance in AWS – an m4.xlarge (4 cores, 16gb, 3tb 
SSDs) and it’s pretty cheap, about $620/mo ($210/mo for the compute, $410 for 
the storage).  The performance of this setup rivals in-house Cisco UCS server 
that we are demoing that costs ~$100k, as long as our batch sizes don’t exceed 
available memory – that’s where the larger Cisco pulls ahead.  The $620/mo is 
the on-demand price, btw…the reserved price is much lower.

 

$100k/ $620 = 161 months of operation before cost parity.

 

Mike S



Re: [GENERAL] UPDATE OR REPLACE?

2016-09-01 Thread Mike Sofen
On Thu, Sep 1, 2016 at 12:10 PM, dandl <da...@andl.org> wrote:
> Sqlite has options to handle an update that causes a duplicate key. Is 
> there anything similar in Postgres?
> This is not an UPSERT. The scenario is an UPDATE that changes some key 
> field so that there is now a duplicate key. In Sqlite this handled as:
> UPDATE OR IGNORE table SET 
> UPDATE OR REPLACE table SET 
>
> And so on
>
> See https://www.sqlite.org/lang_update.html.
>
> Can Postgres do this?

I would propose that this effectively violates referential integrity and 
shouldn't be a valid design pattern.

In my mind primary keys are supposed to be static, stable, non-volatile...aka 
predictable.  It feels like an alien invading my schema, to contemplate such an 
activity.  I hope PG never supports that.

Postgres allows developers incredible freedom to do really crazy things.  That 
doesn't mean that they should.

Mike Sofen (USA)





Re: [GENERAL] UUIDs & Clustered Indexes

2016-08-30 Thread Mike Sofen
From: George Neuner  Sent: Tuesday, August 30, 2016 5:54 PM



>Mike Sofen wrote: So in this scenario, I'm using 

>BOTH bigserials as the PK and uuids as AKs in the core tables.  I 

>reference the bigints for all joins and (have to) use the uuids for the 

>filters.  It's been working ok so far, lookup performance on a table 

>with a few million rows, using the uuid (indexed) is instantaneous.  

>I'll soon have a 100 million+ rows loaded into a single table and know a
bit more.

> 

>The uuids are also design insurance for me in case I need to shard, 

>since I'll need/want that uniqueness across servers.

 

FYI:  articles about sharding using bigint keys.

 

 
http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram

http://rob.conery.io/2014/05/29/a-better-id-generator-for-postgresql/

 

George

 

I remember reading these articles a long time ago, forgot about them...and
appreciate the reminder!  

 

I really liked the enhanced Instagram function from Rob Conery in the second
link, but so far haven't needed to deal with it.  However, an upcoming
project may require huge data storage - approaching hundreds of billions of
rows, and I'm sticking with Postgres - so this will be a great way to test
drive the function.  And I may try my hand at a further enhancement, time
permitting.  Thanks for the links!

 

Mike

 



Re: [GENERAL] UUIDs & Clustered Indexes

2016-08-30 Thread Mike Sofen
From: Tom Lane  Sent: Tuesday, August 30, 2016 7:16 AM



Luke Gordon < <mailto:gord...@gmail.com> gord...@gmail.com> writes:

> However, according to a message on this mailing list, Postgres doesn't 

> have clustered indexes:

> "But Postgres doesn't _have_ clustered indexes, so that article 

> doesn't apply at all. The other authors appear to have missed this
important point."

> https://www.postgresql.org/message-id/56798352.7060902%40uchicago.edu

 

> But, doing a quick check, it appears Postgres does indeed have a 

> mechanism for a clustered index:

> https://www.postgresql.org/docs/9.5/static/sql-cluster.html

 

CLUSTER just does a one-time sort to put the table into index order.

There is no mechanism that would cause subsequent insertions of new keys to
respect that ordering, so it's pretty much irrelevant to the argument about
whether new UUID keys need to be generated in some ordered fashion.

 

Do you actually *need* UUID keys, and if so why?  A plain old bigint column
is smaller, cheaper to index, and the natural mechanism for generating it
(ie a sequence) will tend to preserve ordering for free.

 

 regards, tom lane

 

I agree with Tom for a "normal" application - I would always use bigints
(bigserial) as a PK column.  The app I currently working on is a high
security web app for which the app coders require guids for all identifiers
flowing around the system.  So in this scenario, I'm using BOTH bigserials
as the PK and uuids as AKs in the core tables.  I reference the bigints for
all joins and (have to) use the uuids for the filters.  It's been working ok
so far, lookup performance on a table with a few million rows, using the
uuid (indexed) is instantaneous.  I'll soon have a 100 million+ rows loaded
into a single table and know a bit more.
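
In concrete (entirely hypothetical) terms, the core tables look roughly like
this - the bigserial for joins, the uuid for the app-facing identifier:

CREATE TABLE core_entity (
    entity_id    bigserial PRIMARY KEY,   -- used for every join / foreign key
    entity_uuid  uuid NOT NULL UNIQUE,    -- app-facing identifier, generated in the app tier
    payload      text
);

-- web-tier lookups arrive on the uuid:
-- SELECT ... FROM core_entity WHERE entity_uuid = $1;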

 

The uuids are also design insurance for me in case I need to shard, since
I'll need/want that uniqueness across servers.

 

Mike Sofen



Re: [GENERAL] Re: Clustered index to preserve data locality in a multitenant application?

2016-08-30 Thread Mike Sofen
From: Thomas Kellerer Sent: Tuesday, August 30, 2016 4:39 AM
Nicolas Grilly schrieb am 30.08.2016 um 13:12:

> We rely on clustered indexes to preserve data locality for each 

> tenant. Primary keys start with the tenant ID. This way, rows 

> belonging to the same tenant are stored next to each other. Because 

> all requests hit only one tenant, this is a great performance 

> improvement.

> 

 

What about partitioning by tenant? With a local index on each partition. 

Partitioning is currently a bit limited in Postgres (e.g. you can't have 
incoming foreign keys) but this would fit your requirements pretty much as I 
understand them. 

 

For Nicolas’s situation, that would require 10,000 partitions – not very 
useful, and each partition would be very small.

 

The premise of clustered indexes is that the row data is “in the index”, so no 
row lookups are needed – that’s very effective/performant and clearly works 
well for many scenarios.  

 

In Postgres, as you mentioned, clustering is a “one time” operation but only in 
the sense that after you add more rows, you’ll need to re-cluster the table.  
Depending on the activity model for that table, that may be feasible/ok.  For 
example, if you load it via regular batch scripts, then the clustering could be 
done after those loads.  If you add rows only rarely but then do lots of 
updates, then the clustering would work great.  If this is an active real time 
data table, then clustering would not be viable.
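
(For reference, the re-cluster step itself is just the following, with an 
illustrative table/index name - it takes an exclusive lock while the table is 
rewritten:)

CLUSTER big_table USING big_table_tenant_idx;
ANALYZE big_table;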

 

But you may be still be fine without them.  Here’s why:  Postgres is a very 
fast database engine.  I am constantly surprised at how performant it is.  I 
came from the SQL Server world where I always leveraged the automatic clustered 
indexes but in Postgres I am not using them and am running some big data with 
no table partitioning (yet) and performance is still very good.  I carefully 
optimize my data models, so that is part of it.  I also carefully optimize my 
stored functions/indexes.  I am migrating data from 500+ mysql databases/~8,000 
tables/~20 billion rows into a single Postgres db.  As my data grows, I may 
shard it.  The new parallel query support in PG v9.6 may also support your 
table model very nicely, depending on how you query into it.

 

So I’d say, just build a prototype PG db, build that one table, load your 
existing data into it (very easy via the mysql_fdw data wrapper – it’s exactly 
what I am doing…ping me off list if you need some ideas), put a regular index 
on it and run some queries.  If you have a decent size dev server to work on, 
you should see adequate performance.  And, what you’ll be incredibly pleased 
with is the remarkably rich and flexible plpgsql coding environment.  It’s 
heaven for sql devs.

 

BTW, I am putting together a rather detailed examination of Postgres ETL/bulk 
loading performance, on 2 different platforms: 4 cores/16gb ram/3tb SSD on AWS 
(Amazon cloud), and 48 cores/256gb ram/ 10tb SSD on a monster loaner Cisco UCS 
server.  Should have that posted to the Perform list later this week.

 

Mike Sofen (USA)

 



Re: [GENERAL] Stored procedure version control

2016-06-30 Thread Mike Sofen
-Original Message-
>From: Mark Morgan Lloyd Sent: Thursday, June 30, 2016 2:41 AM
>Neil Anderson wrote:
>> On 2016-06-29 12:37 PM, Mark Morgan Lloyd wrote:
>>> Elsewhere, somebody was asking how people implemented version control 
>>> for stored procedures on (MS) SQL Server.
>>>
>>> The consensus was that this is probably best managed by using scripts 
>>> or command files to generate stored procedures etc., but does anybody 
>>> have any comment on that from the POV of PostgreSQL?
>>>

> [etc.] Thanks everybody, summary passed on.
>Mark Morgan Lloyd
>markMLl .AT. telemetry.co .DOT. uk

A bit late to the thread, but here's some specific details on how I've
implemented version control in PG 9.5, in a small team environment deploying
to single database servers in each tier (dev, qa, stage, prod).  It's
working well so far, and allows my stored proc versions to be aligned with
the middle and upper tier code releases.  I'm the lead database
architect-engineer for a brand new genomics application (lots of data).
Details:
 - we're using git for version control, with a base name for each repo that
holds a single micro-service (like "JobManager") and a suffix for the data
tier code ("JobManagerDBMS") making it simple for devops to find the related
code for a micro-service deployment by repo.
 - within a DBMS repo, I've got subfolders like "scripts", "sprocs",
"documentation", where scripts holds ad hoc scripts that need to be run
during a deployment (might be adjusting DDL or seeding or cleaning up data),
sprocs for stored function files that must be compiled into PG, and
documentation holds notes, data models, etc.  We have a simple python script
that compiles/recompiles all stored proc files within a named folder -
deployment done with one call.
 - I only code using source code files, by cloning an existing suitable base
stored proc (akin to a template) to a new file name (like a "get" or "set"
stored proc) and then revising to match the requirement.  In a detailed
comment block within each stored proc, I list a version number (just for
reference, not used programmatically at this point), change history, author,
comments, and one or more sample calls that form the basis of my unit tests.
 - after I've finished the requested work and the stored procs are working
as expected, I update a Version file in the folder, push it into git and
merge it into the development branch.
 - let's say a new enhancement request comes in.  I create a new branch
(like "RequestForNewThingy"), version that, do the work and merge it back in
just like the above.  So we've got isolation and persistence of changes.
 - I happen to be using the new DataGrip code editor, which supports this
beautifully, since my git tree appears on the right side of editor window,
allowing me directly edit/clone without leaving the editor.

My coding efficiency using this model is quite high...the overhead of using
git is trivial.
For rollbacks, we can simply point to the prior stored proc version and
recompile those.  For DDL rollbacks, I have to code those scripts and supply
them...this is the one place I have to spend a bit more time creating a more
automated solution.  I'd love to hear how other folks have solved
programmatic rollbacks.

Mike Sofen  (San Diego, CA USA)







Re: [GENERAL] OT hardware recommend

2016-06-18 Thread Mike Sofen
From: Adam Brusselback
Sent: Saturday, June 18, 2016 1:26 PM
Subject: Re: [GENERAL] OT hardware recommend

 

It's really amazing how much solid state drives transferred the database 
bottleneck away from disk.

 

 

Adam – so very true.  We used to spend ungodly amounts of time/money/effort to 
mitigate disk performance limitations.  It is almost MAGIC what SSDs do now.

 

Real numbers don’t lie: 5400 rpm disks can muster no more than 65 IOPS (7200s 
get ~90-100, 10k get 140-150).  So:

 

15 x 65 = 975 IOPS (aka boohoo)

 

Using the AS SSD Benchmark, the Samsung 480gb m2 850 EVO in my core i7 laptop 
measures (IOPS then MB/s):

Random 4k blocks:  7,235 iops read, 14,012 iops write

Random 4K-64Threads:  97,743 iops read, 68,864 iops write

Random 512B:  14,380 iops read, 19,858 iops write (db comparison here)

MB/s:

Sequential: 500 MB/s read, 449 MB/s write

Random 4K:  28.26 MB/s read,  54.74 MB/s  write

4K-64Threads: 381.81 MB/s read, 269.00 MB/s write (this is closer to what db 
access looks like).

Access Times:  0.070 ms read, 0.050 ms write

 

Thusly,

1 x SSD = 14.75 times faster than a 15 drive array on reads, and 20 times 
faster on writes.

 

Like everyone else has said, just buy a 1 TB Samsung EVO 850 for $300 (USD) and 
call it a day.  :)

 

Mike

 



Re: [GENERAL] PostgresSQL and HIPAA compliance

2016-06-17 Thread Mike Sofen
-Original Message-
From: Alex John
Sent: Friday, June 17, 2016 3:04 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] PostgresSQL and HIPAA compliance

Hello, I have a few questions regarding the use of PostgreSQL and HIPAA
compliance. I work for a company that plans on storing protected health
information (PHI) on our servers. We have looked at various solutions for
doing so, and RDS is a prime candidate except for the fact that they have
explicitly stated that the Postgres engine is *not* HIPAA compliant.

Users on the IRC channel generally say that the guidelines are more catered
towards building better firewalls and a sane access policy, but I would like
to know if there is anything within the implementation of Postgres itself
that violates said compliance.

If anyone works at a similar company and utilizes postgresql to store PHI,
please let me know.

Thank you,
  Alex

-

HIPAA compliance does not specify (ever) the technical solution to meet the
requirements, so ANY datastore that can be properly managed within the
context of HIPAA compliance is legal and allowed.  Ignore IRCs and search on
securing PHI on relational databases, you'll find lots of details around
data access roles, documentation, processes, data obfuscation, etc.

Mike








Re: [GENERAL] regarding schema only migration from sqlserver to postgres with runmtk.sh

2016-06-15 Thread Mike Sofen
 

From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of David G. Johnston
Sent: Wednesday, June 15, 2016 1:31 PM
To: Durgamahesh Manne 
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] regarding schema only migration from sqlserver to 
postgres with runmtk.sh

 

On Wed, Jun 15, 2016 at 11:27 AM, Durgamahesh Manne wrote:

may i know  recommended approach to migrate the tables from sql server to 
postgresql instead of using 3rd party tools

 

 

I suppose SQL Server has some means to export in TSV/CSV or other row-oriented 
structured output.  The \copy psql meta-command or COPY SQL command could load 
that data.

 

David J.

 

 

There are MANY tools for sql server, many are built in (aka free) that will 
generate DDL, or extract data, or both, plus you can directly query the system 
catalogs, just like PG.  Utterly trivial stuff to use and Postgres can ingest 
all of it since it’s normally either flat file or ansi sql.  
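
For example, once a table has been dumped to CSV on the SQL Server side, 
loading it is a one-liner (paths and names below are made up):

-- server-side, if the file sits on the Postgres host:
COPY target_schema.target_table FROM '/data/export/target_table.csv' WITH CSV HEADER;

-- or client-side from psql, reading the file on the client machine:
-- \copy target_schema.target_table FROM 'target_table.csv' WITH CSV HEADER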

 

MikeS 



Re: [GENERAL] Alternate or Optimization for with hold cursor

2016-05-12 Thread Mike Sofen
>From: Sangeetha Sent: Thursday, May 12, 2016 1:58 AM

Currently , I am using "With hold" cursor. In our case , the With hold
cursor is used to fetch the next record of the given primary key . The
performance is very slow for large data set. Can you provide me some
alternative ways like having own copy of table , or optimization for With
hold cursor?

Thanks and Regards,
S.Sangeetha
==

Cursors are the last tool I would ever grab out of my sql toolbox (aka, I
never use one) - it converts the enormous power of a relational database
engine into "RBAR" (row by agonizing row).  For a large dataset in
particular, you are killing the server since the entire resultset must be
retained in working memory for the duration of the query as it peels off one
row at a time from that resultset OR if it's larger than your ram, you'll be
paging to disk constantly.  And since you're working on a single row at
time, it will take forever.

Convert the cursor into a normal query and you should see BIG (10-1000x)
gains in speed.  A cursor can always be converted to normal
sql...always...it's not always easy but it's always worth the effort. 
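
For the specific "fetch the next record after a given primary key" case, the 
set-based equivalent is just a keyset-style query (table/column names invented) 
that an index on the key makes near-instant:

SELECT *
FROM some_table
WHERE id > $1          -- $1 = the given primary key value
ORDER BY id
LIMIT 1;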
 
Mike Sofen





Re: [GENERAL] Thoughts on "Love Your Database"

2016-05-04 Thread Mike Sofen
From: dandl  Sent: Wednesday, May 04, 2016 5:05 PM
To: 'Pierre Chevalier Géologue' <pierrechevalierg...@free.fr>



> From: Pierre Chevalier Géologue [mailto:pierrechevalierg...@free.fr]

> ...

> > Then I think you've seriously misunderstood. Most people can indeed 

> >learn to write basic SQL queries, but those are

> >(obviously) not what I'm talking about.

> >

> > To write the business logic of a significant application entirely in 

> >SQL requires PLSQL (or in other dialects, whatever passes for SQL/PSM).

> >It means writing an entire data access layer as a set of stored 

> >procedures, with a substantial set of special functions, types, 

> >triggers and so on. No beginner and few experts have the skills 

> >required to do that in SQL, and then debug that code on the server.

> 

> All right, I understand better now.  I think I also totally missed 

> your point, sorry...

> I'll give a look at andl.

 

I hope you do. Please feel free to contact me with any comments, suggestions, 
etc.

 

I have not completed the Postgres implementation -- probably another couple of 
weeks – but in-memory and Sqlite are there.

 

Bonne chance!

 

Regards

David M Bennett FACS

===

 

I disagree.  I’ve worked as database architect/engineer at a number of large 
and small firms in various verticals (healthcare, financials, insurance, 
aerospace, telecom, etc), and created complete database APIs via stored 
procs/stored functions, some of which were quite complex.  I’ve found that a 
mid-level database developer, with modest coaching and good comments in the 
code, can pick up the code, support it and even enhance it.  So the notion that 
only experts can write and maintain quality code isn’t valid in my experience.

 

There is definitely a difference in capability/velocity/solution  solving 
between junior, mid-level and senior developers, but that isn’t a deal killer, 
it’s just something that needs to be managed and accounted for.  

 

One reason for a database api is that ORMs have proved themselves incapable of 
proper scaling and ACID compliance, where stored procs/functions are capable of 
leveraging the massive set-based relational power of the underlying engine, and 
leverage efficient functionalities like windowing functions.

 

So I guess you’d say I’m in the entirely opposite camp, since it’s proven to be 
such an effective solution architecture for many applications that leverage 
relational database engines.

 

Mike Sofen  (San Diego, CA  USA)



Re: [GENERAL] Thoughts on "Love Your Database"

2016-05-04 Thread Mike Sofen
>From: Vincent Veyron   Sent: Wednesday, May 04, 2016 3:35 PM

>I write management applications for businesses, and give _a lot_ of care to
the database structure. 
>I find that the number of lines of code that need to be written is strictly
inversely correlated to the
 >appropriateness of the database design (meaning that the better the
database structure, the lesser 
>code is needed).

>Knowing about the many fine functions Postgresql offers also helps, of
course.
>   Bien à vous, Vincent Veyron 


An inverse corollary is also true, imo: encountering demands for exotic,
wild functions, cursors and code constructs (like triggers calling triggers
or frankly, triggers in general), is nearly always an indication of poor
database design.  I'm not talking about windowing functions or json or CTEs,
btw.

Postgres and mysql have piles and piles of functions that I will never use
and can't even imagine scenarios in which to use them.  So I agree 100% -
it's all about the database (design).   BTW, I'm currently
designing/building OLTP databases for use in genomics research (using
Postgres)...that's big data...where there is zero tolerance for slack db
design that could cause scalability or performance issues.  My stored
functions are...relatively simple.

Mike Sofen  (San Diego, CA USA)






Re: [GENERAL] Function PostgreSQL 9.2

2016-05-03 Thread Mike Sofen
From: David G. Johnston  Sent: Tuesday, May 03, 2016 2:46 PM
To: drum.lu...@gmail.com



…The only other reasonable option is change your model and requirements to 
something less complex.

 

 

Seriously, get yourself the books I have recommended and study them  BEFORE you 
continue attempting to design your database.

You need a firm understanding of logical design & flow, otherwise you will be 
wasting your time. 

That's what I'm doing.. Studying.. asking for some help to get a better 
understand isn't this the purpose of this mail list?

 

​The purpose of this list is whatever people are willing to make of it - within 
reason.  I share the sentiment that your particular method of education is 
becoming burdensome to the people who volunteer their time on these lists to 
answer questions.  As the alternative is to simply stop replying to your emails 
be grateful that someone was at least willing to tell you to try other avenues 
of education.

 

David J.

 

+1   +1   +1   +1:  Lucas – David really nails it: you’ve stopped relying on 
yourself and are relying on the mercy and generosity of strangers to solve your 
problems.  It takes a lot of time to read, dissect, understand and then comment 
on your (or anyone’s) emails, and you’ve proven adept at consuming much of the 
available oxygen on the list.  Please be a bit more considerate - you’ve received 
an amazing amount of solid guidance and advice.  I can tell you that when I see 
your name as the author, I now normally delete the email – that’s NOT the 
response you want, right?

 

I’ve worked with a bunch of junior developers over the years, some of whom hit 
a wall and just sit there, waiting for someone to fix them.  Those junior 
developers “never” become senior developers because their peers soon lose 
interest in collaborating with them, if you catch my drift…

 

Mike Sofen



Re: [GENERAL] (VERY) Slow Query - PostgreSQL 9.2

2016-05-03 Thread Mike Sofen
 

From: drum.lu...@gmail.com   Sent: Tuesday, May 03, 2016 2:55 AM
I'm trying to get better performance from the query below... but just don't know 
what else I can do...

Please, have a look and let me know if you can help somehow... also, if you 
need some extra data just ask me please.

* Note that the gorfs.inode_segments table is 1.7TB size

I have the following Query:

explain analyze
SELECT split_part(full_path, '/', 4)::INT AS account_id,
   split_part(full_path, '/', 6)::INT AS note_id,
   split_part(full_path, '/', 9)::TEXT AS variation,
   st_size,
   segment_index,
   reverse(split_part(reverse(full_path), '/', 1)) as file_name,
   i.st_ino,
   full_path,
   (i.st_size / 100::FLOAT)::NUMERIC(5,2) || 'MB' AS size_mb
FROM gorfs.inodes i
JOIN gorfs.inode_segments s
  ON i.st_ino = s.st_ino_target
WHERE
  i.checksum_md5 IS NOT NULL
  AND s.full_path ~ '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+'
  AND i.st_size > 0
  AND split_part(s.full_path, '/', 4)::INT IN (
SELECT account.id
FROM public.ja_clients AS account
WHERE
NOT (
((account.last_sub_pay > EXTRACT('epoch' FROM 
(transaction_timestamp() - CAST('4 Months' AS INTERVAL)))) AND 
(account.price_model > 0)) OR
(account.regdate > EXTRACT('epoch' FROM 
(transaction_timestamp() - CAST('3 Month' AS INTERVAL)))) OR
(((account.price_model = 0) AND (account.jobcredits > 0)) AND 
(account.last_login > EXTRACT('epoch' FROM (transaction_timestamp() - CAST('4 
Month' AS INTERVAL)))))
) LIMIT 100
);

 

There is one obvious solution: restructure your data, since it is not in a 
“standard” form but you’re trying to query it as if it were…you are turning 
your long full_path string into columns…if performance is a concern, that 
overhead has to be eliminated.  

Your two choices would be to either restructure this table directly (requiring 
a change in app code that was filling it), or use it to fill a proper table 
that already has everything decomposed from the long full_path string via 
post-processing after the insert.  A third consideration would be to archive 
off older/unneeded rows to a history table to reduce row counts.  This is about 
proper structure.
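
As a rough, hypothetical illustration of what "decomposed" means here (names 
guessed from the split_part() calls, adjust to taste):

CREATE TABLE inode_segments_split (
    st_ino_target  bigint  NOT NULL,
    segment_index  integer NOT NULL,
    account_id     integer NOT NULL,   -- was split_part(full_path, '/', 4)
    note_id        integer,            -- was split_part(full_path, '/', 6)
    variation      text,               -- was split_part(full_path, '/', 9)
    file_name      text,
    PRIMARY KEY (st_ino_target, segment_index)
);

CREATE INDEX ON inode_segments_split (account_id);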

Mike Sofen



Re: [GENERAL] truncate table getting blocked

2016-04-26 Thread Mike Sofen
From: Jayadevan M  Sent: Tuesday, April 26, 2016 6:32 AM



Hello,

 

I have a python script. It opens a cursor…

 

Thanks,

Jayadevan

 

 

 



Re: [GENERAL] Function PostgreSQL 9.2

2016-04-19 Thread Mike Sofen
This is such a poorly designed, hacked together “thing” – it isn’t a database, 
it’s someone’s idea of how to store data when they don’t know how to store 
data, like they moved it from Access or Excel.  

 

Just start over and design a proper relational schema with best practices and 
you’ll save, oh, perhaps 10 years of wasted effort and 12 million emails.  This 
is as close to bandaids on bandaids on steroids that it comes.  Really – 
rethink your solution model.

 

Mike

 

From: drum.lu...@gmail.com <mailto:drum.lu...@gmail.com>Sent: Tuesday, 
April 19, 2016 7:40 PM
Just forgot to say:


Does increment_client_code relate to users or some other table, say clients?

nope.. there is no link between them

If the users.code is empty/null, then the trigger has to get the last number 
from client_code_increment and put on the users.code column



Re: [GENERAL] Re: Why is the comparison between timestamp and date so much slower then between two dates

2016-04-14 Thread Mike Sofen
|-Original Message-
|From: Thomas Kellerer   Sent: Wednesday, April 13, 2016 11:37 PM
|
|Alban Hertroys schrieb am 13.04.2016 um 16:39:
|>>> So my question is: why is comparing a timestamp to a date so much slower?
|>
|> The reason that the other way around is so much more expensive is that
|> the database needs to do that conversion twice for every row in the
|> table. When down-converting now(), the DB only needs to do that once
|> for all rows.
|
|Why does it do that for each row? The value of now() won't change while the
|statement is running, so this conversion could be done once at the start of the
|statement.

The general rule in the SQL Server world is that using a function in a Where 
clause or join will eliminate usage of an index that would have been leveraged 
if the function didn't exist.  The reason is that functions are 
non-deterministic, so the optimizer can't possibly tell in advance what the 
outcome will be and thus takes the safest route to completion.  I'm betting 
that the same logic holds in PG (I just haven't tested it enough to be 
absolutely sure).

In the case of now() in the Where clause, to avoid the conversion/loss of index 
usage, I always place (what should be a static value anyway) the output of 
now() into a local variable and then use that in the Where clause...and get my 
index back.

This is just a style of coding (no functions in where clauses/joins), but one 
that doesn't seem prevalent in PG...instead I see people using functions within 
functions within functions, the cascading impact of which becomes very hard to 
unravel.
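
To illustrate the local-variable pattern (a sketch only - the orders table and 
its columns are invented):

CREATE OR REPLACE FUNCTION recent_orders() RETURNS SETOF orders
LANGUAGE plpgsql STABLE AS $$
DECLARE
    v_today date := now()::date;   -- evaluate and down-convert once, up front
BEGIN
    -- order_date and v_today are now the same type, so an index on
    -- order_date can be used without any per-row conversion
    RETURN QUERY SELECT * FROM orders WHERE order_date = v_today;
END;
$$;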

Mike Sofen





Re: [GENERAL] I/O - Increase RAM

2016-04-13 Thread Mike Sofen
|From: John R Pierce  Sent: Wednesday, April 13, 2016 1:53 PM
|
|On 4/13/2016 1:43 PM, drum.lu...@gmail.com wrote:
|> At the moment I'm having 100% I/O during the day. My server has SATA
|> HDs, and it can't be changed now.
|> So, to solve the problem (or at least try) I was thinking about double
|> the RAM, and by doing that, increasing the cache.
|
|depends on if its read or write IO.   many of our database servers are
|nearly 100% write IO, only thing that will speed that up is faster disks and/or
|more disks in raid10.
|--
|john r pierce, recycling bits in santa cruz

Agree with John and also add that if your READ queries or data model are not 
optimized then you could throw a TB of ram at it and see no difference.  You 
need to analyze your queries and find out WHY there's so much i/o.  It sounds 
like thrash to me...so you need to do some homework and get the behaviors 
sorted out, then it should become obvious what needs fixing.

Mike





[GENERAL] vacuum - reclaiming disk space.

2016-03-19 Thread Mike Blackwell
I have a large table with numerous indexes which has approximately doubled
in size after adding a column - every row was rewritten and 50% of the
tuples are dead.  I'd like to reclaim this space, but VACUUM FULL cannot
seem to finish within the scheduled downtime.

Any suggestions for reclaiming the space without excessive downtime?




Re: [GENERAL] How to delete few elements from array beginning?

2016-03-09 Thread Mike Sofen
>>Alexander Farber wrote on  Wednesday, March 09, 2016 4:11 AM



Hello fellow PostgreSQL users,

what is please the most efficient way to delete a slice from the start of a 
longer array (after I have copied it to another array)?

Do I really have to copy a large slice of the array to itself, like in the last 
line here:

pile_array := pile_array || swap_array;

/* here I copy away swap_len elements */
new_hand := pile_array[1:swap_len];

/* here I don't know how to efficiently remove already copied elements 
*/
pile_array := pile_array[(swap_len + 1):ARRAY_LENGTH(pile_array, 1)];

or is there a better way?

Thank you

Alex

<< 

Have you considered a normal (relational), non-array-based data model for this 
app (2 or 3 tables in a 1:M/M:M) instead of the single table model you’ve 
shown?  That would then allow you to use normal sql set-based operations that 
are readable, understandable, maintainable and very fast/scalable.  

When I see row by row operations (looping or cursors) in what should be a real 
time query…that’s my alarm bell that perhaps the code has wandered off a valid 
solution path.
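
Purely as an illustration of the kind of model I mean (all names here are invented) - one row per tile instead of one array per game:

CREATE TABLE game_tiles (
    game_id  integer NOT NULL,
    tile_no  integer NOT NULL,                 -- position in the pile
    letter   text    NOT NULL,
    location text    NOT NULL DEFAULT 'pile',  -- 'pile', 'hand', 'board', ...
    PRIMARY KEY (game_id, tile_no)
);

-- "drawing" 7 tiles becomes a set-based UPDATE rather than array slicing:
UPDATE game_tiles
   SET location = 'hand'
 WHERE (game_id, tile_no) IN (SELECT game_id, tile_no
                                FROM game_tiles
                               WHERE game_id = 1 AND location = 'pile'
                               ORDER BY tile_no
                               LIMIT 7);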

Mike



[GENERAL] regexp_replace to remove sql comments

2015-10-28 Thread Mike

Hi,

I am trying to clean up the query field returned by the 
pg_stat_statements extension and remove all comments.


Some of the queries in the query field contain comments like '-- some 
comment' and also '/* c style comments */'


I have managed to strip off the '--' comments and also white space, but 
after trying numerous regexes for this via Google I am stuck.


WITH to_clean AS (
  SELECT
    regexp_replace(
      regexp_replace(trim(query), '--[^\r\n]*', '')  --clear up comments like this one  <-- this is ok
    , '\s+', ' ', 'g') as q                          --clear up white space             <-- this is ok
  FROM public.pg_stat_statements
  WHERE dbid IN (SELECT oid FROM pg_database WHERE datname = current_database())
)

SELECT regexp_replace(q, '/\*.*\*/', '') as q  /* strip off comments like this */  <-- cannot get a regex to do this

FROM to_clean ORDER BY q


I'm now thinking it may be better to do this in a plpgsql function, as I think 
comment markers that appear inside the queries themselves need to be ignored.


Has anyone done anything like this?

Thanks,

Mike.





Re: [GENERAL] regexp_replace to remove sql comments

2015-10-28 Thread Mike


Thanks, with a bit of moving stuff about I think that's sorted it - in 
case anyone ever needs it:


  SELECT
    query,
    trim(regexp_replace(
           regexp_replace(
             regexp_replace(query, '\/\*.+\*\/', '', 'g'),
             '--[^\r\n]*', ' ', 'g'),
           '\s+', ' ', 'g')) as q
  FROM public.pg_stat_statements
  WHERE dbid IN (SELECT oid FROM pg_database WHERE datname = current_database())

  ORDER BY query

Thanks again,

Mike.

On 28/10/2015 22:43, Marc Mamin wrote:

', '\/\*.+\*\/','','g'):






[GENERAL] Invalid memory alloc request size

2015-08-24 Thread Mike James
Over the weekend, Amazon did some maintenance that resulted in one of our
instances being restarted. Apparently this left the database in a bad
state. This particular instance functions as a slony replication target and
when I went to start up slony, I get the following error message. Here's
some relevant info. The last bit is from the slony log file and it shows
the invalid memory alloc request size error. Is there a way to recover
this or do I go ahead and restore from backup?

ubuntu@ip-10-90-23-122:~$ psql -U postgres -c "SELECT version();"
 PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2
20080704 (Red Hat 4.1.2-52), 64-bit
(1 row)

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04.4 LTS
Release: 12.04
Codename: precise

2015-08-24 06:50:03 UTC INFO   remoteWorkerThread_1: syncing set 1 with 102
table(s) from provider 1
2015-08-24 06:50:33 UTC ERROR  remoteWorkerThread_1_1: error at end of COPY
IN: ERROR:  invalid memory alloc request size 1970234207
CONTEXT:  COPY sl_log_1, line 97033:
Mike James | Manager of Infrastructure | Clutch Holdings, LLC


[GENERAL] inputs into query planner costing

2015-03-31 Thread Mike Roest
Hi there,
   I'm having an issue with query performance between 2 different pgsql
environments.

The first is our current production postgres server, which is running 9.3.5
on CentOS 5 x64.  The second system is Amazon's RDS postgres as a service.
On our local DB server we have a query that executes in a reasonable amount
of time (600 msec).  On RDS the query will run for more than 10 minutes on
a similarly CPU-specced system.  I've been working through this with Amazon
support and I'm looking for more suggestions on where to look (both for me
and to direct Amazon).  The RDS system does use a network filesystem while
our production server is a local RAID10 array; I can see that affecting the
actual performance of the query but not the query planner costing (unless
there's an input to query planner costing that I can't find).

The query plan costs generated by the 2 systems are vastly different, while
the plans themselves are basically identical other than the materialization
that RDS is doing (if I disable the materialization then they are almost
the same other than a seq scan/heap scan on one small 2000 row table).
All the tables in the query have been analyzed on each server without any
impact.

Current Production
Explain:
http://explain.depesz.com/s/Tkyc
Explain Analyze
http://explain.depesz.com/s/UnQt

RDS: (with enable_material=off)
http://explain.depesz.com/s/vDiV

(with enable_material=on)
http://explain.depesz.com/s/HUjx

I have validated that all the query planning configuration variables on
this page
http://www.postgresql.org/docs/9.3/static/runtime-config-query.html are the
same between the 2 environments.  If I modify the local production system
values for things like random_page_cost and seq_page_cost to absurd values
like 6 I can get it to generate a similar planner cost.  Similarly if I
lower the RDS values to absurdly low values like .0001 I can get it to
generate a similarly costed plan (while still performing horridly).

I've reached the end of things I can think about (I'm also working on
rewriting the query but it's a generated query out of an infrastructure
component so it's not a simple change).

Just looking for any ideas on additional things to look into.

The query is available here:
https://www.dropbox.com/s/m31ct6k0mod0576/simplifiedquery.sql?dl=0

-- 
Data's inconvienient when people have opinions.


Re: [GENERAL] inputs into query planner costing

2015-03-31 Thread Mike Roest
Thanks for the responses

For anyone searching in the future I'll answer Tom's questions and list the
boneheaded fix that it ended up actually being (really painful as I've been
fighting this for a week).

1) According to amazon they run stock postgres as far as the query planner
is concerned.
2) Yes sorry I forgot to note on our prod system the random_page_cost was 2
vs 4 on the RDS system.
3) I had run vacuum on all the tables in the query and the pg_relation_size
values on the tables aren't way out of whack.
4) Yep, default_statistics_target was the default of 100 on both systems.

I was concentrating completely in the wrong direction here.  What it turned
out to be was the RDS configuration of postgres, which we had modified
somewhat: I had missed configuring work_mem to something greater than their
default of 1 MB.  Once I brought work_mem up to the same value as our
production server, lo and behold, the query runs fast.
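
For anyone else chasing this, the check and the fix are one-liners (the 32MB below is purely illustrative, not a recommendation):

SHOW work_mem;           -- was '1MB' on the RDS parameter group
SET work_mem = '32MB';   -- session-level override to confirm the plans improve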

Sorry for wasting everyone's time.  Hopefully this will help someone else
down the line.


Re: [GENERAL] 9.4+ partial log-shipping possible?

2015-03-27 Thread Mike
No idea if this is an option for you or not, but if you make tables 
unlogged they won't generate WAL and therefore won't be replicated:


http://rhaas.blogspot.ca/2010/05/global-temporary-and-unlogged-tables.html

Of course unlogged tables have several drawbacks to carefully consider.
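
For example (a sketch - the table name is made up, and remember that unlogged tables are emptied after a crash and their contents don't exist on standbys):

CREATE UNLOGGED TABLE osm_replication_state (
    sequence_no bigint,
    imported_at timestamptz DEFAULT now()
);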


On 03/26/2015 08:32 AM, Sven Geggus wrote:

Hello,

I have a question regarding log-shipping replication, but let me first
explain what I currently do.

Usually Openstreetmap Tile Servers are currently set up using a
PostgreSQL/Postgis Database which stores the data need for rendering raster
map-tiles.

After the initial import of a complete OSM dataset it is possible to
replicate changes to this database using cyclic calls to openstreetmaps own
tool called osm2pgsql.

Unfortunately there is one major drawback with this approach:

We need to somehow save the state of the osm database to be able to apply
future changes.  This is currently done using a few tables in the target
database.  However these tables are not needed for map rendering and are
consuming by far the most disk space (still somewhat expensive on SSD)!

So here is my question:

Would it be possible to have a setup, where one master data database will
act in the above matter (still running osm2pgsl) but will also provide
publicly availabe data for log-shipping standby servers?

We would need to be able to explicitely specify the tables to be replicated
or the other way round explicitely exclude a couple of them. If this helps
it would be possible to separate them by tablespaces.

Regards

Sven



--
Mike




[GENERAL] ltree gist index errors and fill factor questions

2014-12-31 Thread Mike Broers
Hello,

We have an ltree column (tree_path) that has a gist index
(index_nodes_on_tree_path).  This is in a 9.3.5 database.  Recently errors
started occurring in the postgres log on some updates to this table:

fixing incomplete split in index index_nodes_on_tree_path, block 2358

STATEMENT:  UPDATE nodes SET parent_id = $1, contents_id =
$2, contents_type = $3, trough_code = $4, live_flag = $5, ordering
=$6, id = $7, tree_path = $8 WHERE nodes.id = 127695

ERROR:  failed to add item to index page in index_nodes_on_tree_path


Reindexing index_nodes_on_tree_path fixes the problem temporarily, but it
returns.

I'm wondering if this is a bug (googling the incomplete split error
returns results about a similar bug with b-tree indexes) or a problem with
an inadequate fill factor setting.  It doesn't look like there is a
specified fill factor for this index, and I'm not sure what the gist default
is.

CREATE INDEX index_nodes_on_tree_path ON nodes USING gist (tree_path)
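
(For reference, a sketch of how an explicit fill factor would be declared when rebuilding the index - the value here is arbitrary, not a recommendation:)

CREATE INDEX index_nodes_on_tree_path ON nodes USING gist (tree_path) WITH (fillfactor = 70);  -- after dropping/renaming the existing index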

The table in question has about 94k rows, an example of the widest
tree_path tuple is 69066.69090.69091.69094

Any advice is appreciated, happy new year!

Mike


Re: [GENERAL] ltree gist index errors and fill factor questions

2014-12-31 Thread Mike Broers
I will do my best to provide a reproducible test case.  Is there any more
information I can supply in the meantime that would help?


Re: [GENERAL] ltree gist index errors and fill factor questions

2014-12-31 Thread Mike Broers
The database is not crashing thankfully.  We are waiting for the errors to
come back to turn up logging in the hopes of creating the reproducible set.


[GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding

2014-12-29 Thread Mike Cardwell
I'd like to store hostnames in a postgres database and I want to fully support
IDNs (Internationalised Domain Names)

I want to be able to recover the original representation of the hostname, so I
can't just encode it with punycode and then store the ascii result. For example,
these two are the same hostnames thanks to unicode case folding [1]:

  tesst.ëxämplé.com
  teßt.ëxämplé.com

They both encode in punycode to the same thing:

  xn--tesst.xmpl.com-cib7f2a

Don't believe me, then try visiting any domain with two s's in, whilst replacing
the s's with ß's. E.g:

  ericßon.com
  nißan.com
  americanexpreß.com

So if I pull out xn--tesst.xmpl.com-cib7f2a from the database, I've no idea
which of those two hostnames was the original representation.

The trouble is, if I store the unicode representation of a hostname instead,
then when I run queries with conditions like:

  WHERE hostname='nißan.com'

that won't pull out rows where hostname='nissan.com'. Also, if I create a unique
index on the hostname field, or even on lower(hostname), that won't stop the
same hostname being inserted more than once, with slightly different
representations.

So the system I've settled with is storing both the originally supplied
representation, *and* the lower cased punycode encoded version in a separate
column for indexing/search. This seems really hackish to me though.
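
In sketch form, that setup looks something like this (names are made up; the punycode encoding and lower-casing are done in the application before the INSERT):

CREATE TABLE hostnames (
    hostname      text NOT NULL,   -- original, user-supplied representation
    hostname_puny text NOT NULL    -- lower-cased punycode form, maintained by the application
);

CREATE UNIQUE INDEX hostnames_puny_key ON hostnames (hostname_puny);

-- lookups then compare against the encoded column, with the search term
-- punycode-encoded and lower-cased in the application first.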

It seems to me that Postgres would benefit from a native hostname type and/or
a pair of punycode encode/decode functions. And perhaps even a simple unicode
case folding function. With the end result that these return TRUE:

  unicode_case_fold('ß') = 'ss'

  'xn--tesst.xmpl.com-cib7f2a' = punycode_encode('teßt.ëxämplé.com')

  punycode_decode('xn--tesst.xmpl.com-cib7f2a') = 'tesst.ëxämplé.com'

A native type would also be able to apply suitable constraints, e.g a maximum
length of 253 octets on a punycode-encoded trailing-dot-excluded hostname, a
limit of 1-63 octets on a punycode encoded label, no leading or trailing hyphens
on a label, etc.

I initially created a pair of functions for using punycode using plperl and the
IDNA::Punycode Perl module, but the performance wasn't good enough, so I settled
with duplicating the data to sacrifice disk space and complexity, for speed.

I'm new to Postgres, and to this list, so if there is a better way for me to
submit this suggestion or solve my problem, please point me in the right
direction.

[1] http://www.unicode.org/Public/UNIDATA/CaseFolding.txt

Regards,

-- 
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4




Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding

2014-12-29 Thread Mike Cardwell
* on the Mon, Dec 29, 2014 at 04:23:22PM -0700, David G Johnston wrote:

WHERE hostname='nißan.com'

 
 _IF_ Postgres had a punycode function, then you could use:
 WHERE punycode(hostname) = punycode('nißan.com')
 
 If the OP wraps what he is doing up into a function that is what you end up
 getting: a memoized punycode function.
 
 http://en.wikipedia.org/wiki/Memoization
 
 It has to be defined as volatile but basically write the function to check
 for the provided input on the indexed table and if it doesn't exist the
 function will calculate the punycode value and store it onto the table
 before returning the punycode value to the caller.

I'm not sure all that is necessary. It could be quite a simple function,
like the lower() function. So what I would do is this:

CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames 
(lower(punycode_encode(hostname)));

That would prevent adding more than one representation for the same hostname
to the column. And if I wanted to do a fast, indexed search where I could
supply any representation of the hostname as input, I would just do:

WHERE lower(punycode_encode(hostname)) = 
lower(punycode_encode('any-representation'))

There doesn't need to be any extra table storage for the punycode encoded
version.

-- 
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4




Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding

2014-12-29 Thread Mike Cardwell
* on the Mon, Dec 29, 2014 at 07:00:05PM -0500, Andrew Sullivan wrote:

 CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames 
 (lower(punycode_encode(hostname)));

 This wouldn't work to get the original back if you have any IDNA2003
 data, because punycode-encoding the UTF-8 under IDNA2003 and then
 punycode-decoding it doesn't always result in the same label.  See my
 other message.

The original is the thing that is stored in the database. I wouldn't need to
do any conversion to get the original back. In my example I am storing
the original and creating an index on the punycode version.

This is exactly the same method that we commonly use for performing case
insensitive text searches using lower() indexes.

-- 
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4




Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding

2014-12-29 Thread Mike Cardwell
* on the Mon, Dec 29, 2014 at 05:01:44PM -0700, David G Johnston wrote:

 CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames
 (lower(punycode_encode(hostname)));

 That would prevent adding more than one representation for the same
 hostname
 to the column.
 
 Except two different hostname can resolve to the same
 punycode_encode(hostname) value

That's exactly what I'm taking advantage of...

 so the unique index won't work.

I think you misunderstand what I'm suggesing. I'm suggesting storing the
unicode version in the database, and then creating a unique index on the
result of the punycode function to prevent more than one representation
of the same hostname being inserted.

 It was also mentioned that using the Perl encoding function was
 non-performant; which is why caching the data into a memoization table has
 value.

It was non-performant because it involved unnecessarily starting a Perl
interpreter, not because the algorithm itself is slow.
 
 WHERE lower(punycode_encode(hostname)) =
  lower(punycode_encode('any-representation'))
 
 
 I'm not for knowing the rules of punycode but I'm not seeing what value
 lower() provides here...

Case insensitive matching. So that EXAMPLE.COM = example.com

-- 
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4




Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding

2014-12-29 Thread Mike Cardwell
* on the Mon, Dec 29, 2014 at 07:22:21PM -0500, Andrew Sullivan wrote:

 can't just encode it with punycode and then store the ascii result. For 
 example,
 these two are the same hostnames thanks to unicode case folding [1]:
 
   tesst.ëxämplé.com
   teßt.ëxämplé.com
 
 Well, in IDNA2003 they're the same.  In IDNA2008 (RFC 5890 and suite),
 they're not the same.  In UTS46, they're kind of the same, because
 pre-lookup processing maps one of them to the other (it depends which
 mode you're in which way the mapping goes, which is just fantastic
 because you can't tell at the server which mode the client is in.
 IDNA is an unholy mess); but the lookup is still done using the
 IDNA2008 rules, approximately.

Heh. And I just thought I was finally starting to get to grips with this stuff.
 
 They both encode in punycode to the same thing:
 
   xn--tesst.xmpl.com-cib7f2a
 
 Under no circumstances should they encode to that.

Eurgh, you're right. The library I'm using does actually do it right, I just
forgot to split on the dot and encode each label separately when writing the
examples for this email. Sorry for confusing matters.

[snip lots of useful and interesting information]

 You seem to want a bunch of label constraints, not all of which are
 related to IDNA. I think it would be better to break these up into a
 small number of functions.  As it happens, I have a colleague at Dyn
 who I think has some need of some of this too, and so it might be
 worth spinning up a small project to try to get generic functions:
 to_idna2003, to_idna2008, check_ldh, split_labels, and so on.  If this
 seems possibly interesting for collaboration, let me know and I'll try
 to put together the relevant people.

Those functions would be very useful to me. I know a bit of C, but probably not
enough to produce an acceptable patch. If there are people who would also find
these functions useful, and people motivated to implement them, that would
be great...

-- 
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4




Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding

2014-12-29 Thread Mike Cardwell
* on the Mon, Dec 29, 2014 at 07:25:59PM -0500, Andrew Sullivan wrote:

 This is exactly the same method that we commonly use for performing case
 insensitive text searches using lower() indexes.

 Hmm.  How did you get the original, then?

The original in my case, is the hostname which the end user supplied.
Essentially, when I display it back to them, I want to make sure it is
displayed the same way that it was when they originally submitted it.

 If you have the original
 Unicode version, why don't you switch to IDNA2008 publication rules,
 which are way more reliable?  In that case, you do have a 1:1 lookup
 and you shouldn't have a problem.

I was unaware of the different versions of IDNA. I basically started using
the Perl module IDNA::Punycode in my project and assumed that this was the
only type. Seems like I need to do some more reading.

-- 
Mike Cardwell  https://grepular.com https://emailprivacytester.com
OpenPGP Key35BC AF1D 3AA2 1F84 3DC3   B0CF 70A5 F512 0018 461F
XMPP OTR Key   8924 B06A 7917 AAF3 DBB1   BF1B 295C 3C78 3EF1 46B4




[GENERAL] Problem with constraint exclusion on partitions

2014-12-19 Thread Mike Pultz
Hello,

I'm having an issue getting the query planner to skip child tables based on
my check constraints. I have

constraint_exclusion = partition

in my config file, and I think I have everything set up correctly.

The parent table shows the child tables:

cloud_test2=# \d+ engine_sessions
...
Child tables: data.engine_sessions_2008,
              data.engine_sessions_2009,
              data.engine_sessions_2010,
              ...
              data.engine_sessions_201411,
              data.engine_sessions_201412

And the check constraints look right on the child tables:

cloud_test2=# \d data.engine_sessions_2008
...
Check constraints:
    "engine_sessions_2008_check" CHECK (date_created >= '2008-01-01 00:00:00-05'::timestamp with time zone
                                        AND date_created < '2009-01-01 00:00:00-05'::timestamp with time zone)

(date_created is a timestamp with time zone not null default now())

Just looking at the explain output - when I select where date_created >
now() - interval '24 hours', the query planner does a sequential scan on
all the child tables:

cloud_test2=# explain analyze select * from engine_sessions where
date_created > now() - interval '24 hours';

 Append  (cost=0.00..59268.32 rows=354 width=97) (actual time=250.421..255.227 rows=42 loops=1)
   ->  Seq Scan on engine_sessions  (cost=0.00..0.00 rows=1 width=96) (actual time=0.000..0.000 rows=0 loops=1)
         Filter: (date_created > (now() - '24:00:00'::interval))
   ->  Seq Scan on engine_sessions_2008  (cost=0.00..3384.94 rows=11 width=96) (actual time=12.086..12.086 rows=0 loops=1)
         Filter: (date_created > (now() - '24:00:00'::interval))
         Rows Removed by Filter: 106568
...
   ->  Seq Scan on engine_sessions_201411  (cost=0.00..1607.85 rows=5 width=97) (actual time=5.586..5.586 rows=0 loops=1)
         Filter: (date_created > (now() - '24:00:00'::interval))
         Rows Removed by Filter: 46620
   ->  Seq Scan on engine_sessions_201412  (cost=0.00..1378.07 rows=180 width=97) (actual time=0.006..4.810 rows=42 loops=1)
         Filter: (date_created > (now() - '24:00:00'::interval))
         Rows Removed by Filter: 39915
 Total runtime: 255.322 ms
(58 rows)

But when I take the output of now() - interval '24 hours':

cloud_test2=# select now() - interval '24 hours';
           ?column?
-------------------------------
 2014-12-18 21:28:47.926603-05
(1 row)

And use that directly, it works fine:

cloud_test2=# explain analyze select * from engine_sessions where
date_created > '2014-12-18 21:28:47.926603-05';
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Append  (cost=0.00..1178.34 rows=181 width=97) (actual time=0.004..3.135 rows=42 loops=1)
   ->  Seq Scan on engine_sessions  (cost=0.00..0.00 rows=1 width=96) (actual time=0.000..0.000 rows=0 loops=1)
         Filter: (date_created > '2014-12-18 21:28:47.926603-05'::timestamp with time zone)
   ->  Seq Scan on engine_sessions_201412  (cost=0.00..1178.34 rows=180 width=97) (actual time=0.003..3.130 rows=42 loops=1)
         Filter: (date_created > '2014-12-18 21:28:47.926603-05'::timestamp with time zone)
         Rows Removed by Filter: 39915
 Total runtime: 3.151 ms
(7 rows)

The types match:

cloud_test2=# select pg_typeof(now() - interval '24 hours');
        pg_typeof
--------------------------
 timestamp with time zone

Is there something I'm missing?

Thanks!

Mike



Re: [GENERAL] Problem with constraint exclusion on partitions

2014-12-19 Thread Mike Pultz
Hey David,

 What version are you using?

Sorry, I'm on 9.3.4.

Now() is a volatile function so the planner cannot omit partitions.

Replace that with a constant and now it can.

I'm not sure what you mean - now() is a stable function:

cloud_test2=# \df+ now()
   Schema   | Name |     Result data type     | Argument data types |  Type  | Security | Volatility | Owner | Language | Source code |       Description
------------+------+--------------------------+---------------------+--------+----------+------------+-------+----------+-------------+--------------------------
 pg_catalog | now  | timestamp with time zone |                     | normal | invoker  | stable     | pgres | internal | now         | current transaction time

It's the timestamp at the start of the transaction - so the planner should
have a set value for all rows.

Am I missing something else?

Mike







[GENERAL] pl/pgsql trigger function - compare *most* columns in NEW vs. OLD

2014-12-18 Thread Mike Blackwell
Is there a simple notation for comparing most columns in the new and old
records in a pl/pgsql trigger function?  Something like

(new.b, new.c, new.d) = (old.b, old.c, old.d)

works to compare all the columns except 'a', but is fragile in that it
needs to be updated any time a column is added to the table, and is rather
messy looking when the actual table has many columns.
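
For what it's worth, this is the kind of thing I mean, as a minimal sketch (table and column names invented); IS DISTINCT FROM at least handles NULLs sanely:

CREATE OR REPLACE FUNCTION touch_if_changed() RETURNS trigger AS $$
BEGIN
    -- compare every column except "a"
    IF (NEW.b, NEW.c, NEW.d) IS DISTINCT FROM (OLD.b, OLD.c, OLD.d) THEN
        NEW.updated_at := now();   -- or whatever action the change should drive
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_touch BEFORE UPDATE ON some_table
    FOR EACH ROW EXECUTE PROCEDURE touch_if_changed();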

Is there a better approach to detecting a change in any column except a few
specific ones?


mike.blackw...@rrd.com


[GENERAL] Vacuum freeze

2014-12-04 Thread Mike Blackwell
check_postgres.pl (--action=autovac_freeze) recently complained that we
needed to run VACUUM FREEZE.  Doing so generated a boatload of WAL files -
perhaps on the order of the size of the database itself.

Is VACUUM FREEZE something that is normally handled by autovac?  If so, how
would we approach finding what caused it not to happen automatically?  Or
if it's not, what's the normal approach to scheduling it manually to
prevent this flood of WAL?
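
One way to watch how close each database is getting to the freeze threshold (a sketch using the standard catalogs):

SELECT datname,
       age(datfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age
  FROM pg_database
 ORDER BY 2 DESC;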


[GENERAL] Estimating WAL usage during pg_basebackup

2014-10-30 Thread Mike Blackwell
I need to get an idea of how much WAL space will be required during a long
(many hours) pg_basebackup over a relatively slow network connection.  This
is for a server that's not yet running PITR / streaming.
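
One rough way to measure it (a sketch, using the 9.x WAL-position functions): sample the current WAL insert position before and after a representative interval and diff the two values:

SELECT pg_current_xlog_insert_location();  -- note the value
-- ... after a representative interval ...
SELECT pg_xlog_location_diff(pg_current_xlog_insert_location(), '0/3000000');
-- result is in bytes; '0/3000000' stands in for the value noted above and is just an example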

Any thoughts?
mike.blackw...@rrd.com


[GENERAL] Any Postgres experts not afraid of the camera?

2014-09-03 Thread Mike Christensen
http://meta.stackoverflow.com/questions/270574/an-experiment-stack-overflow-tv?cb=1


[GENERAL] Regular expression question with Postgres

2014-07-24 Thread Mike Christensen
I'm curious why this query returns 0:

SELECT 'AAA' ~ '^A{,4}$'

Yet, this query returns 1:

SELECT 'AAA' ~ '^A{0,4}$'

Is this a bug with the regular expression engine?


Re: [GENERAL] Regular expression question with Postgres

2014-07-24 Thread Mike Christensen
Yea seems right.  I was testing the expression on Rubular (Which uses the
Ruby parser) and it worked.  I guess Ruby allows this non-standard
expression with the missing lower bounds.  Every reference I could find,
though, agrees only the upper bound is optional.


On Thu, Jul 24, 2014 at 1:42 PM, David G Johnston 
david.g.johns...@gmail.com wrote:

 Mike Christensen-2 wrote
  I'm curious why this query returns 0:
 
  SELECT 'AAA' ~ '^A{,4}$'
 
  Yet, this query returns 1:
 
  SELECT 'AAA' ~ '^A{0,4}$'
 
  Is this a bug with the regular expression engine?

 Apparently since {,#} is not a valid regexp expression the engine simply
 interprets it as a literal and says 'AAA' != 'A{,4}'


 http://www.postgresql.org/docs/9.3/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP

 Table 9-13. Regular Expression Quantifiers

 Note that all of the { } expressions have a lower bound (whether explicit or
 implied).

 David J.







Re: [GENERAL] Regular expression question with Postgres

2014-07-24 Thread Mike Christensen
Yea looks like Postgres has it right, well.. per POSIX standard anyway.
 JavaScript also has it right, as does Python and .NET.  Ruby is just weird.


On Thu, Jul 24, 2014 at 1:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Mike Christensen m...@kitchenpc.com writes:
  I'm curious why this query returns 0:
  SELECT 'AAA' ~ '^A{,4}$'

  Yet, this query returns 1:

  SELECT 'AAA' ~ '^A{0,4}$'

  Is this a bug with the regular expression engine?

 Our regex documentation lists the following variants of bounds syntax:
 {m}
 {m,}
 {m,n}
 Nothing about {,n}.  I rather imagine that the engine is deciding that
 that's just literal text and not a bounds constraint ...

 regression=# SELECT 'A{,4}' ~ '^A{,4}$';
  ?column?
 --
  t
 (1 row)

 ... yup, apparently so.

 A look at the POSIX standard says that it has the same idea of what
 is a valid bounds constraint:

 When an ERE matching a single character or an ERE enclosed in
 parentheses is followed by an interval expression of the format
 {m}, {m,}, or {m,n}, together with that interval expression
 it shall match what repeated consecutive occurrences of the ERE
 would match. The values of m and n are decimal integers in the
 range 0 <= m <= n <= {RE_DUP_MAX}, where m specifies the exact or
 minimum number of occurrences and n specifies the maximum number
 of occurrences. The expression {m} matches exactly m occurrences
 of the preceding ERE, {m,} matches at least m occurrences, and
 {m,n} matches any number of occurrences between m and n,
 inclusive.

 regards, tom lane



Re: [GENERAL] WARNING: database must be vacuumed within 8439472 transactions

2014-07-07 Thread Mike Christensen
Sounds like you just have to wait until it finishes..


On Mon, Jul 7, 2014 at 12:56 PM, Prabhjot Sheena 
prabhjot.she...@rivalwatch.com wrote:

 Hello
We are using postgresql 8.3 database for last 5 yrs for this
 production database and its running fine. This is our critical database
 which runs 24*7. This weekend we started getting these messages

 HINT:  To avoid a database shutdown, execute a full-database VACUUM.
 WARNING:  database  must be vacuumed within 8439472 transactions

 i am currently running this command

 vacuumdb --analyze db

 while this command is running i m still getting these messages

 WARNING:  database  must be vacuumed within 2645303 transactions.

 The value of number of transactions is going down every minute

 Can anyone tell me what is the best way to sort up this issue.

 Thanks
 Avi






[GENERAL] EBCDIC conversion

2014-03-27 Thread Mike Blackwell
We have a need to check certain text fields to be sure they'll convert
properly to EBCDIC.  A check constraint with a convert() was the initial
thought, but there doesn't seem to be a default conversion from UTF8 to
EBCDIC.  Does anyone have an implementation they'd care to share, or
suggestions on the best way to implement the conversion function?  Our
PostgreSQL servers currently only have the default languages installed.



http://www.rrdonnelley.com/
mike.blackw...@rrd.com


Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

2014-01-28 Thread Mike Christensen
I've had the same problem as well with NHibernate (On .NET) with Postgres
ENUM types.  Luckily, NHibernate is incredibly powerful and you *can* get
everything working flawlessly, however it takes some serious digging into
the source code and reading the docs to figure it out.  The main issue is
that NHibernate, out of the box, wants to map an ENUM as a number.  For
example:

INSERT INTO FOO (SomeEnumColumn) VALUES (1);

This will cause an error, because PG is looking for a string value (Even
though ENUMs are stored as numeric values under the covers).  It's pretty
easy to configure NHibernate to convert ENUMs to strings (there's tons of
blog posts on that)..  However, this causes NHibernate to write:

INSERT INTO FOO (SomeEnumColumn) VALUES ('EnumValue'::text);

Which will also cause an error.  I've found the only way around it is to
configure NHibernate to treat ENUMs as Objects which will simply generate:

INSERT INTO FOO (SomeEnumColumn) VALUES ('EnumValue'); -- No casting here,
yay!

This works.  However, to agree with the original poster's point, if
Postgres could be a little more forgiving about values that could be
interpreted as correct (like an implicit cast between numeric and enum and
string and enum) then we wouldn't have these issues..

Mike


On Tue, Jan 28, 2014 at 1:37 PM, John R Pierce pie...@hogranch.com wrote:

 On 1/28/2014 1:20 PM, Tom Lane wrote:

 I think you can fix it by explicitly casting your placeholders, eg
 ?::macaddr.


 that might work for a wrapper that lets you roll your own SQL, but I
 thought he said one of these autogenerated SQL, taking it out of his
 control.




 --
 john r pierce  37N 122W
 somewhere on the middle of the left coast






Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

2014-01-28 Thread Mike Christensen
Interesting!

I'd be curious as to what types of bugs were caused by these implicit
casts..

Note 8.3 was back in the days before ORMs became popular, so "just write
better SQL" was a perfectly decent solution to the problem back then.  Nowadays,
this requirement might make Postgres incompatible with certain ORMs
out there, which is a bummer.  I'm wondering if these ambiguities you speak
of could be solved in other ways, such as implicitly casting iff the
intention is not ambiguous, and otherwise raising some sort of ambiguity error
or defaulting to some behavior.

Mike


On Tue, Jan 28, 2014 at 2:46 PM, John R Pierce pie...@hogranch.com wrote:

 On 1/28/2014 2:35 PM, Mike Christensen wrote:

 This works.  However, to agree with the original poster's point, if
 Postgres could be a little more forgiving about values that could be
 interpreted as correct (like an implicit cast between numeric and enum and
 string and enum) then we wouldn't have these issues..


 it had more implicit casts prior to (I think) 8.3, but there were many
 ambiguities where things could be interpreted to mean radically different
 sorts of operations, so they tightened things up in 8.3+ (or was it 8.4+ ?)




 --
 john r pierce  37N 122W
 somewhere on the middle of the left coast






Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

2014-01-28 Thread Mike Christensen
How do you create casts in Postgres?


On Tue, Jan 28, 2014 at 3:24 PM, Andrew Sullivan a...@crankycanuck.cawrote:

 On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote:

  I'd be curious as to what types of bugs were caused by these implicit
  casts..

 Typically, they were cases when there was an ambiguity that the
 programmer didn't understand, causing applications to blow up in
 surprising and wonderful ways.

 There are things you can do if you're really prepared for the gun
 aimed at your feet.  Since you can create casts in Postgres, you can
 actually add back many of the implicit casts yourself.

  Such as implicitly cast iff the intention is not ambiguous

 I think if the developers could write code that read minds, they'd be
 working on more profitable enterprises ;-)

 Best,

 A

 --
 Andrew Sullivan
 a...@crankycanuck.ca





Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

2014-01-28 Thread Mike Christensen
Oh.  The CREATE CAST command.  Wow, I was totally unaware of this entire
feature!
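
For anyone who finds this later, the workaround usually looks something like the following (the type name is invented, and whether making the cast IMPLICIT is wise is a separate debate):

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');

-- accept parameters bound as varchar/text where the enum is expected
CREATE CAST (character varying AS mood) WITH INOUT AS IMPLICIT;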


On Tue, Jan 28, 2014 at 3:36 PM, Mike Christensen m...@kitchenpc.comwrote:

 How do you create casts in Postgres?


 On Tue, Jan 28, 2014 at 3:24 PM, Andrew Sullivan a...@crankycanuck.cawrote:

 On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote:

  I'd be curious as to what types of bugs were caused by these implicit
  casts..

 Typically, they were cases when there was an ambiguity that the
 programmer didn't understand, causing applications to blow up in
 surprising and wonderful ways.

 There are things you can do if you're really prepared for the gun
 aimed at your feet.  Since you can create casts in Postgres, you can
 actually add back many of the implicit casts yourself.

  Such as implicitly cast iff the intention is not ambiguous

 I think if the developers could write code that read minds, they'd be
 working on more profitable enterprises ;-)

 Best,

 A

 --
 Andrew Sullivan
 a...@crankycanuck.ca







Re: [GENERAL] question about checksum in 9.3

2014-01-14 Thread Mike Broers
Thanks for the responses, that clarifies the checksum feature for me.

FWIW, my pgbench tests between a 9.2 server and a 9.3 server with checksums
showed very similar performance characteristics and system resource
utilization.  I'm going to set up another load test with our actual
application to see if that reveals any noticeable performance difference.

Thanks again
Mike


On Mon, Jan 13, 2014 at 7:11 PM, Michael Paquier
michael.paqu...@gmail.comwrote:

 On Tue, Jan 14, 2014 at 1:50 AM, Mike Broers mbro...@gmail.com wrote:
  Hello, I am in the process of planning a 9.3 migration of postgres and I
 am
  curious about the checksum features available.  In my test 9.3 instance
 it
  seemed like this feature provides a log entry of the exact database/oid
 of
  the corrupt object when it is accessed, but not much else.  I can't find
  much documentation on anything else this feature provides.
 Few things:
 - The only way to know if a server is using data checksums is to use
 pg_controldata.
 - Be aware as well of the potential performance impact on your CPU,
 checksums are checked each time a page is read, and recalculated each
 time a page is updated.
 - ignore_checksum_failure can be used to ignore failures. Don't use
 that on a production system.

  Is there a built-in method of scanning the server to check for
 corruption or
  will I have to wait for a corrupt object to be accessed to see the log
  entry?
 You can also manually access tables with, for example, a
 sequential scan to check whether blocks are broken or not.

  Is there a relation that stores last checksum status or anyway of
  reporting on what objects are identified by postgres as corrupt or not
  corrupt?
 No, though you could build one with a background worker that scans
 relation pages and registers the failing blocks.

  Are there any other features of the checksum I am missing besides the log
  entry?
 9.4 has a new GUC parameter called data_checksums that allows you to
 check with a psql client whether checksums are used on a server.

 Regards,
 --
 Michael
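
As a follow-up to the sequential-scan suggestion above, a crude sketch of forcing heap reads for one table so that any checksum failure surfaces in the log (the table name is hypothetical):

SET enable_indexscan = off;
SET enable_bitmapscan = off;
SET enable_indexonlyscan = off;
SELECT count(*) FROM some_table;  -- repeat per table, or generate the list from pg_class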



[GENERAL] question about checksum in 9.3

2014-01-13 Thread Mike Broers
Hello, I am in the process of planning a 9.3 migration of postgres and I am
curious about the checksum features available.  In my test 9.3 instance it
seemed like this feature provides a log entry of the exact database/oid of
the corrupt object when it is accessed, but not much else.  I can't find
much documentation on anything else this feature provides.

Is there a built-in method of scanning the server to check for corruption
or will I have to wait for a corrupt object to be accessed to see the log
entry?  Is there a relation that stores last checksum status or anyway of
reporting on what objects are identified by postgres as corrupt or not
corrupt?

Are there any other features of the checksum I am missing besides the log
entry?

Thanks
Mike


Re: [GENERAL] Npgsql - Where can I find Npgsql.NpgsqlServices

2013-12-12 Thread Mike Christensen
Thanks!  Got it working after messing around for a while..

I decided to check out EF for a new project I'm working on.  So far, I
think I like NHibernate better.  A lot more control, and works flawlessly
with Postgres and Npgsql.

Mike




On Thu, Dec 12, 2013 at 8:34 AM, Francisco Figueiredo Jr. 
franci...@npgsql.org wrote:


 Hi, Mike.

 We are working to get EF6 properly supported and our master branch already
 has support for it.
 Unfortunately pgfoundry is down at the moment.

 Please, if you can compile Npgsql from source code, please get a current
 snapshot from github and give it a try: https://github.com/npgsql/Npgsql and
 directly https://github.com/npgsql/Npgsql/archive/master.zip
 Just open the project on visual studio and select the correct
 configuration (net-45) and you will get Npgsql.dll and
 Npgsql.EntityFramework.dll in the EntityFramework/bin folder.

 I hope it helps.

 Let me know if you have any question.



 On Thu, Dec 12, 2013 at 1:24 AM, Mike Christensen m...@kitchenpc.comwrote:

 It seems I need NpgsqlServices to use Npgsql with EF6, however I can't
 figure out where you get this thing!

 I've tried installing it through NuGet:

 PM Install-Package Npgsql -pre
 Installing 'Npgsql 2.0.14.1'.
 Successfully installed 'Npgsql 2.0.14.1'.
 Adding 'Npgsql 2.0.14.1' to EFTest.
 Successfully added 'Npgsql 2.0.14.1' to EFTest.


 However, this doesn't have it.  I've also tried installing the beta from:


 http://pgfoundry.org/frs/download.php/3494/Npgsql2.0.13.91-bin-ms.net4.5Ef6.zip

 No luck there.  Any ideas?

 Mike




 --
 Regards,

 Francisco Figueiredo Jr.
 Npgsql Lead Developer
 http://www.npgsql.org
 http://gplus.to/franciscojunior
 http://fxjr.blogspot.com
 http://twitter.com/franciscojunior



[GENERAL] Npgsql - Where can I find Npgsql.NpgsqlServices

2013-12-11 Thread Mike Christensen
It seems I need NpgsqlServices to use Npgsql with EF6, however I can't
figure out where you get this thing!

I've tried installing it through NuGet:

PM Install-Package Npgsql -pre
Installing 'Npgsql 2.0.14.1'.
Successfully installed 'Npgsql 2.0.14.1'.
Adding 'Npgsql 2.0.14.1' to EFTest.
Successfully added 'Npgsql 2.0.14.1' to EFTest.


However, this doesn't have it.  I've also tried installing the beta from:

http://pgfoundry.org/frs/download.php/3494/Npgsql2.0.13.91-bin-ms.net4.5Ef6.zip

No luck there.  Any ideas?

Mike


Re: [GENERAL] Re: corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-26 Thread Mike Broers
The restore of a post-crash production backup worked as hoped and the 2nd
replication slave is back into its happy hot standby state.

So if this problem replicated to our standby servers does that indicate
that the potential problematic fsync occurred during a pg_xlog write?
 Would breaking replication at the time of the crash have prevented this
from cascading or was it already too late at that point?

Thanks again for the input, its been very helpful!
Mike




On Mon, Nov 25, 2013 at 12:20 PM, Mike Broers mbro...@gmail.com wrote:

 Thanks Shaun,

 I'm planning to schedule a time to do the vacuum freeze suggested
 previously.  So far the extent of the problem seems limited to the one
 session table and the one session row that was being used by a heavy bot
 scan at the time of the crash.  Currently I'm testing a recovery of a
 production backup from today to rebase one of the replication targets that
 I was using to test fixes last week.  Hopefully that validates the current
 backups and I can proceed inquiring with our managed services provider
 about the false notification of the disk write and ways to prevent that
 going forward.

 I'll update the list if I uncover anything interesting in the process
 and/or need more advice, thanks again for your input - its much appreciated
 as always.  Nothing like a little crash corruption to get the blood flowing!

 Mike


 On Mon, Nov 25, 2013 at 10:29 AM, Shaun Thomas 
 stho...@optionshouse.comwrote:

  Update - I have two hot replication slaves of this db, both have the
 problem.
  I took one out of recovery and ran REINDEX table session_session and it
  fixed the errors about this row.  Now Im going to run vacuum and see if
  there are other tables that complain, but Im guessing if so I will need
 to see
  if there is a way to force vacuum to continue on error, worst case I
 might
  have to script a table by table vacuum script I guess..  If anyone has
 a better
  suggestion for determining the extent of the damage Id appreciate it.

 Oh man. I'm sorry, Mike.

 One of the cardinal rules I have is to disconnect any replication
 following a database crash. It's just too easy for damaged replicated rows
 to be propagated unless you're on 9.3 and have checksums enabled. If you
 want to perform a  table-by-table check, don't vacuum the database, but the
 individual tables. I'd go with a DO loop and have it raise notices into the
 log so you can investigate further:

 COPY (
 SELECT 'VACUUM ' || oid::regclass::text || ';'
   FROM pg_class
  WHERE relkind = 'r'
 ) to '/tmp/vac_all.sql';

 Run the /tmp/vac_all.sql through psql and pipe the contents into a log
 file. Any table that doesn't vacuum successfully will need to be repaired
 manually. One way you can do this if there are dupes, is by checking the
 ctid value after disabling index scans:

 SET enable_indexscan TO False;

 SELECT ctid, * FROM [broken_table] WHERE ...;

 Just construct the WHERE clause based on the error output, and you should
 get all rows if there are dupes. You'll need to figure out which row to
 keep, then delete the bad row based on the ctid. Do this as many times as
 it takes, then reindex to make sure the proper row versions are indexed.

 It's also a good idea to dump any table that came back with an error,
 just in case.

 After you've done all of that, you should re-base your replicas once
 you've determined your production system is usable. In the meantime, I
 highly recommend you set up a VIP you can assign to one of your replicas if
 your production system dies again, and remove any autostart code. If your
 production system crashes, switch the VIP immediately to a replica, and
 invalidate your old production system. Data corruption is insidious when
 streaming replication is involved.

 Look into tools like repmgr to handle managing your replicas as a cluster
 to make forced invalidation and re-basing easier.

 Good luck!

 --
 Shaun Thomas
 OptionsHouse | 141 W. Jackson Blvd | Suite 500 | Chicago IL, 60604
 312-676-8870
 stho...@optionshouse.com

 __

 See http://www.peak6.com/email_disclaimer/ for terms and conditions
 related to this email





Re: [GENERAL] Re: corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-25 Thread Mike Broers
Thanks Shaun,

I'm planning to schedule a time to do the vacuum freeze suggested
previously.  So far the extent of the problem seems limited to the one
session table and the one session row that was being used by a heavy bot
scan at the time of the crash.  Currently I'm testing a recovery of a
production backup from today to rebase one of the replication targets that
I was using to test fixes last week.  Hopefully that validates the current
backups and I can proceed inquiring with our managed services provider
about the false notification of the disk write and ways to prevent that
going forward.

I'll update the list if I uncover anything interesting in the process
and/or need more advice, thanks again for your input - its much appreciated
as always.  Nothing like a little crash corruption to get the blood flowing!

Mike


On Mon, Nov 25, 2013 at 10:29 AM, Shaun Thomas stho...@optionshouse.comwrote:

  Update - I have two hot replication slaves of this db, both have the
 problem.
  I took one out of recovery and ran REINDEX table session_session and it
  fixed the errors about this row.  Now Im going to run vacuum and see if
  there are other tables that complain, but Im guessing if so I will need
 to see
  if there is a way to force vacuum to continue on error, worst case I
 might
  have to script a table by table vacuum script I guess..  If anyone has a
 better
  suggestion for determining the extent of the damage Id appreciate it.

 Oh man. I'm sorry, Mike.

 One of the cardinal rules I have is to disconnect any replication
 following a database crash. It's just too easy for damaged replicated rows
 to be propagated unless you're on 9.3 and have checksums enabled. If you
 want to perform a  table-by-table check, don't vacuum the database, but the
 individual tables. I'd go with a DO loop and have it raise notices into the
 log so you can investigate further:

 COPY (
 SELECT 'VACUUM ' || oid::regclass::text || ';'
   FROM pg_class
  WHERE relkind = 'r'
 ) to '/tmp/vac_all.sql';

 Run the /tmp/vac_all.sql through psql and pipe the contents into a log
 file. Any table that doesn't vacuum successfully will need to be repaired
 manually. One way you can do this if there are dupes, is by checking the
 ctid value after disabling index scans:

 SET enable_indexscan TO False;

 SELECT ctid, * FROM [broken_table] WHERE ...;

 Just construct the WHERE clause based on the error output, and you should
 get all rows if there are dupes. You'll need to figure out which row to
 keep, then delete the bad row based on the ctid. Do this as many times as
 it takes, then reindex to make sure the proper row versions are indexed.

 It's also a good idea to dump any table that came back with an error, just
 in case.

 After you've done all of that, you should re-base your replicas once
 you've determined your production system is usable. In the meantime, I
 highly recommend you set up a VIP you can assign to one of your replicas if
 your production system dies again, and remove any autostart code. If your
 production system crashes, switch the VIP immediately to a replica, and
 invalidate your old production system. Data corruption is insidious when
 streaming replication is involved.

 Look into tools like repmgr to handle managing your replicas as a cluster
 to make forced invalidation and re-basing easier.

 Good luck!

 --
 Shaun Thomas
 OptionsHouse | 141 W. Jackson Blvd | Suite 500 | Chicago IL, 60604
 312-676-8870
 stho...@optionshouse.com

 __

 See http://www.peak6.com/email_disclaimer/ for terms and conditions
 related to this email



[GENERAL] Browsing postgres database using the Eclipse QuantumDB plugin

2013-11-22 Thread Mike Kienenberger
Has anyone successfully connected and browsed a postgres database
using the Eclipse QuantumDB plugin?

I can connect and execute sql, but the existing table list is always
empty as if no meta information is ever provided to the browser
plugin.   At first, I thought it might be a permission problem with
the database user I provided, so I also tried it with the postgres
user, but had the same results.

I've always had QuantumDB work with Oracle, hsqldb, h2database, and
other platforms.




Re: [GENERAL] Browsing postgres database using the Eclipse QuantumDB plugin

2013-11-22 Thread Mike Kienenberger
I wanted to make sure that it wasn't a permission configuration
problem in postgres first, since all of the other databases have
worked without a similar issue.


On Fri, Nov 22, 2013 at 9:54 AM, Adrian Klaver adrian.kla...@gmail.com wrote:
 On 11/22/2013 05:46 AM, Mike Kienenberger wrote:

 Has anyone successfully connected and browsed a postgres database
 using the Eclipse QuantumDB plugin?


 You might get a answer sooner here:

 http://sourceforge.net/p/quantum/discussion/24178/



 I can connect and execute sql, but the existing table list is always
 empty as if no meta information is ever provided to the browser
 plugin.   At first, I thought it might be a permission problem with
 the database user I provided, so I also tried it with the postgres
 user, but had the same results.

 I've always had QuantumDB work with Oracle, hsqldb, h2database, and
 other platforms.




 --
 Adrian Klaver
 adrian.kla...@gmail.com




Re: [GENERAL] Browsing postgres database using the Eclipse QuantumDB plugin

2013-11-22 Thread Mike Kienenberger
Adrian, Thanks!

Searching through those forum posts (I had tried doing general google
searches first before posting here, but didn't turn up that forum) I found this:

http://sourceforge.net/p/quantum/discussion/24178/thread/17724b28/?limit=25#1166

which seems to indicate that at least back in 2007, there was an issue
with the quantum support for postgres.  Perhaps that's still the case
with the version I am using.



On Fri, Nov 22, 2013 at 10:28 AM, Mike Kienenberger mkien...@gmail.com wrote:
 I wanted to make sure that it wasn't a permission configuration
 problem in postgres first, since all of the other databases have
 worked without a similar issue.


 On Fri, Nov 22, 2013 at 9:54 AM, Adrian Klaver adrian.kla...@gmail.com 
 wrote:
 On 11/22/2013 05:46 AM, Mike Kienenberger wrote:

 Has anyone successfully connected and browsed a postgres database
 using the Eclipse QuantumDB plugin?


 You might get a answer sooner here:

 http://sourceforge.net/p/quantum/discussion/24178/



 I can connect and execute sql, but the existing table list is always
 empty as if no meta information is ever provided to the browser
 plugin.   At first, I thought it might be a permission problem with
 the database user I provided, so I also tried it with the postgres
 user, but had the same results.

 I've always had QuantumDB work with Oracle, hsqldb, h2database, and
 other platforms.




 --
 Adrian Klaver
 adrian.kla...@gmail.com




Re: [GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-22 Thread Mike Broers
vacuumdb -avz, pg_dumpall, and vacuum freeze analyze on the former standby
database that received the corruption via replication all came back without
errors.  Is the vacuum freeze intended to potentially fix problems or just
reveal whether other tables may have corruption?  I'm trying to decide if this
needs to be run in production.


On Thu, Nov 21, 2013 at 5:09 PM, Kevin Grittner kgri...@ymail.com wrote:

 Mike Broers mbro...@gmail.com wrote:

  Is there anything I should look out for with vacuum freeze?

 Just check the logs and the vacuum output for errors and warnings.

 --
 Kevin Grittner
 EDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-21 Thread Mike Broers
Thanks for the response.  fsync and full_page_writes are both on.

Our database runs on a managed hosting provider's vmhost server/SAN; I can
possibly request that they provide some hardware test results - do you
have any specific diagnostics in mind?  The crash was apparently due to
our vmhost suddenly losing power.  The only row that it has complained about with
the chunk error also migrated into both standby servers, and as previously
stated was fixed with a reindex of the parent table in one of the standby
servers after taking it out of recovery.  The vacuumdb -avz on this test
copy didn't have any errors or warnings; I'm going to also run a pg_dumpall
on this host to see if any other rows are problematic.

Is there something else I can run to confirm we are more or less ok at the
database level after the pg_dumpall, or is there no way to be sure and a
fresh initdb is required?

I am planning on running the reindex in actual production tonight during
our maintenance window, but was hoping that if that worked we would be out of
the woods.



On Thu, Nov 21, 2013 at 3:56 PM, Kevin Grittner kgri...@ymail.com wrote:

 Mike Broers mbro...@gmail.com wrote:

  Hello, we are running postgres 9.2.5 on RHEL6.  Our production
  server crashed hard, and when it came back up our logs were
  flooded with:

  ERROR:  unexpected chunk number 0 (expected 1) for toast value 117927127
 in pg_toast_19122

 Your database is corrupted.  Unless you were running with fsync =
 off or full_page_writes = off, that should not happen.  It is
 likely to be caused by a hardware problem (bad RAM, a bad disk
 drive, or network problems if your storage is across a network).

 If it were me, I would stop the database service and copy the full
 data directory tree.

 http://wiki.postgresql.org/wiki/Corruption

 If fsync or full_page_writes were off, your best bet is probably to
 go to your backup.  If you don't go to a backup, you should try to
 get to a point where you can run pg_dump, and dump and load to a
 freshly initdb'd cluster.

 If fsync and full_page_writes were both on, you should run hardware
 diagnostics at your earliest opportunity.  When hardware starts to
 fail, the first episode is rarely the last or the most severe.

 --
 Kevin Grittner
 EDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-21 Thread Mike Broers
Thanks, after this pg_dumpall I am going to see what kind of impact I can
expect from running VACUUM FREEZE ANALYZE (normally I just run vacuumdb
-avz nightly via a cron job) and schedule time to run this in production
against all the tables in the database.  Is there anything I should look
out for with vacuum freeze?

Much appreciated,
Mike


On Thu, Nov 21, 2013 at 4:51 PM, Kevin Grittner kgri...@ymail.com wrote:

 Mike Broers mbro...@gmail.com wrote:

  Thanks for the response.  fsync and full_page_writes are both on.

  [ corruption appeared following power loss on the machine hosting
  the VM running PostgreSQL ]

 That leaves three possibilities:
   (1)  fsync doesn't actually guarantee persistence in your stack.
   (2)  There is a hardware problem which has not been recognized.
   (3)  There is a so-far unrecognized bug in PostgreSQL.

 Based on my personal experience, those are listed in descending
 order of probability.  I seem to recall reports of some VM for
 which an fsync did not force data all the way to persistent
 storage, but I don't recall which one.  You might want to talk to
 your service provider about what guarantees they make in this
 regard.

  Is there something else I can run to confirm we are more or less
  ok at the database level after the pg_dumpall or is there no way
  to be sure and a fresh initdb is required.

 Given that you had persistence options in their default state of
 on, and the corruption appeared after a power failure in a VM
 environment, I would guess that the damage is probably limited.
 That said, damage from this sort of event can remain hidden and
 cause data loss later.  Unfortunately we do not yet have a
 consistency checker that can root out such problems.  If you can
 arrange a maintenance window to dump and load to a fresh initdb,
 that would eliminate the possibility that some hidden corruption is
 lurking.  If that is not possible, running VACUUM FREEZE ANALYZE
 will reduce the number of things that can go wrong, without
 requiring down time.

 --
 Kevin Grittner
 EDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



[GENERAL] Re: corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-21 Thread Mike Broers
Update - I have two hot replication slaves of this db, and both have the
problem.  I took one out of recovery and ran REINDEX TABLE session_session,
and it fixed the errors about this row.  Now I'm going to run vacuum and see
if there are other tables that complain; if so, I'm guessing I will need to
see if there is a way to force vacuum to continue on error - worst case I
might have to script a table-by-table vacuum, I guess.  If anyone has a
better suggestion for determining the extent of the damage, I'd
appreciate it.
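
A rough sketch of that table-by-table approach from psql (the output file path
is just a placeholder; VACUUM cannot run inside a DO block, so the statements
are written to a file and then executed - with psql's default ON_ERROR_STOP
off, an error on one table does not stop the rest):

-- generate one VACUUM per user table, write the statements to a file, run the file
\t on
\o /tmp/vacuum_each_table.sql
SELECT 'VACUUM (FREEZE, ANALYZE, VERBOSE) '
       || quote_ident(schemaname) || '.' || quote_ident(tablename) || ';'
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema');
\o
\t off
\i /tmp/vacuum_each_table.sql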


On Thu, Nov 21, 2013 at 2:10 PM, Mike Broers mbro...@gmail.com wrote:

 Hello, we are running postgres 9.2.5 on RHEL6.  Our production server
 crashed hard, and when it came back up our logs were flooded with:

 STATEMENT:  SELECT session_session.session_key,
 session_session.session_data, session_session.expire_date,
 session_session.nonce FROM session_session WHERE
 (session_session.session_key = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3'  AND
 session_session.expire_date > '2013-11-21 13:27:33.107913' )

 ERROR:  unexpected chunk number 0 (expected 1) for toast value 117927127
 in pg_toast_19122

 We restarted the application and whatever session was constantly hitting
 that row stopped, but I'm concerned about remediation.  When I attempt to
 read from that row the error occurs.

 select * from session_session where session_key =
 'gk9aap5d7btp6tzquh0kf73gpfmik5w3';
 ERROR:  unexpected chunk number 0 (expected 1) for toast value 117927127
 in pg_toast_19122

 When I attempt to delete this row I get this error:
 delete from session_session where session_key =
 'gk9aap5d7btp6tzquh0kf73gpfmik5w3';
 ERROR:  tuple concurrently updated

 We happen to have a maintenance window tonight so I will have some time
 when the app is down to run some database fixes.  I saw other threads
 suggesting a reindex of the toast table, but this is a 14GB table and I'm
 not sure how long that will take or if it will even be successful.  We also
 have a full db vacuum/analyze scheduled nightly for 2am, so I am expecting
 to learn if there are other impacted tables, but it's troubling that I don't
 know what the remediation would be.  This particular table could be truncated
 if necessary, if that is an option, but I'm not sure about other tables.

 Any suggestions for how to handle the "tuple concurrently updated" error? Or
 if a reindex is likely to help with the "unexpected chunk" error?

 Thanks
 Mike



[GENERAL] corruption issue after server crash - ERROR: unexpected chunk number 0

2013-11-21 Thread Mike Broers
Hello, we are running postgres 9.2.5 on RHEL6.  Our production server crashed
hard, and when it came back up our logs were flooded with:

STATEMENT:  SELECT session_session.session_key,
session_session.session_data, session_session.expire_date,
session_session.nonce FROM session_session WHERE
(session_session.session_key = 'gk9aap5d7btp6tzquh0kf73gpfmik5w3'  AND
session_session.expire_date > '2013-11-21 13:27:33.107913' )

ERROR:  unexpected chunk number 0 (expected 1) for toast value 117927127 in
pg_toast_19122

We restarted the application and whatever session was constantly hitting
that row stopped, but I'm concerned about remediation.  When I attempt to
read from that row the error occurs.

select * from session_session where session_key =
'gk9aap5d7btp6tzquh0kf73gpfmik5w3';
ERROR:  unexpected chunk number 0 (expected 1) for toast value 117927127 in
pg_toast_19122

When I attempt to delete this row I get this error:
delete from session_session where session_key =
'gk9aap5d7btp6tzquh0kf73gpfmik5w3';
ERROR:  tuple concurrently updated

We happen to have a maintenance window tonight so I will have some time
when the app is down to run some database fixes.  I saw other threads
suggesting a reindex of the toast table, but this is a 14GB table and I'm
not sure how long that will take or if it will even be successful.  We also
have a full db vacuum/analyze scheduled nightly for 2am, so I am expecting
to learn if there are other impacted tables, but it's troubling that I don't
know what the remediation would be.  This particular table could be truncated
if necessary, if that is an option, but I'm not sure about other tables.

Any suggestions for how to handle the "tuple concurrently updated" error? Or
if a reindex is likely to help with the "unexpected chunk" error?
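
A sketch of the reindex route being asked about (the follow-up in this thread
reports that REINDEX on the parent table did clear the error for this row; the
toast relation name is the one from the log above, and the catalog query is
just one way to map a toast relation back to its parent table):

REINDEX TABLE session_session;            -- also rebuilds the table's TOAST index
REINDEX TABLE pg_toast.pg_toast_19122;    -- or reindex the toast table directly

SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.reltoastrelid = 'pg_toast.pg_toast_19122'::regclass;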

Thanks
Mike


Re: [GENERAL] Call for design: PostgreSQL mugs

2013-09-12 Thread Mike Christensen
Oooh can we make the handle an elephant trunk? (Ok, now I'm sure I'm adding
all sorts of expense - but hey you'll save so much money using Postgres you
can afford an expensive coffee mug!)


On Thu, Sep 12, 2013 at 5:30 AM, Andreas 'ads' Scherbaum 
adsm...@wars-nicht.de wrote:

 On 09/10/2013 10:15 PM, Kevin Grittner wrote:

 patrick keshishian pkesh...@gmail.com wrote:

  One more cute idea that came to me last night. Here is a very
  poor attempt at it by yours truly; keep in mind I'm not a graphics
  artist. This image is for illustration purposes only!

  
  http://sidster.com/gallery/2013/09/10/elephant_paw.sml.jpg

 Can you picture a bunch of these on a meeting table? If that image
  does not scream "Stampede!", I don't know what does. =)

 Again, a great conversation starter.


 I think I like this idea best so far!

 Some supporting documentation for a final version:

  http://www.asknature.org/strategy/29c12a353dab52ad8d4eb5d4337cefb9


 I agree, that's a good idea. Instead of the original PG logo near the top,
  the mug can show a real elephant face along with the feet on both sides.
  And then have PostgreSQL written somewhere - maybe turned 90° near the
 handle, on both sides?

 How does this sound? And next question: who can design this? ;-)


 --
 Andreas 'ads' Scherbaum
 German PostgreSQL User Group
 European PostgreSQL User Group - Board of Directors
 Volunteer Regional Contact, Germany - PostgreSQL Project



 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-general



[GENERAL] EF / npgsql and VIEWs

2013-09-05 Thread Mike Blackwell
I understand from one of our developers there may be issues using VIEWs
with Entity Framework and npgsql.  Can anyone with some experience using
PostgreSQL in a .NET environment comment?

__
*Mike Blackwell | Technical Analyst, Distribution Services/Rollout
Management | RR Donnelley*
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com
http://www.rrdonnelley.com




Re: [GENERAL] function with unknown params

2013-07-09 Thread Mike Christensen
You passed in:

22/1/2013

Which is 22 divided by 1, divided by 2013 - and that evaluates to an integer.
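
To illustrate, the first statement below is plain integer arithmetic, while a
date has to arrive as a quoted literal or an explicit cast (the last example
only parses as a date if DateStyle is set to DMY):

SELECT 22/1/2013;            -- integer division: 22 / 1 / 2013 = 0
SELECT DATE '2013-01-22';    -- ISO date literal, independent of DateStyle
SELECT '22/1/2013'::date;    -- parses as 22 January 2013 only when DateStyle is DMY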


On Tue, Jul 9, 2013 at 10:17 AM, giozh gi...@yahoo.it wrote:

 ok, it works. But why did the error message show those two unknown data types? If
 it was an error on the date type, why doesn't it signal that?



 --
 View this message in context:
 http://postgresql.1045698.n5.nabble.com/function-with-unknown-params-tp5763215p5763224.html
 Sent from the PostgreSQL - general mailing list archive at Nabble.com.


 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



[GENERAL] PERFORM statement

2013-07-08 Thread Mike Christensen
I was reading about Postgres stored procs in the FAQ:

https://wiki.postgresql.org/wiki/FAQ#Does_PostgreSQL_have_stored_procedures.3F

It claims that an alternative syntax to:

SELECT theNameOfTheFunction(arg1, arg2);

Is:

PERFORM theNameOfTheFunction(arg1, arg2);

However, when I try the following:

CREATE TABLE app_for_leave
(
  sno integer NOT NULL,
  eid integer,
  ename varchar(20),
  sd date,
  ed date,
  sid integer,
  status boolean DEFAULT false,
  CONSTRAINT pk_snoa PRIMARY KEY (sno)
);

CREATE FUNCTION MyInsert(_sno integer, _eid integer, _sd date, _ed date,
_sid integer, _status boolean)
  RETURNS void AS
  $BODY$
  BEGIN
INSERT INTO app_for_leave(sno, eid, sd, ed, sid, status)
VALUES(_sno, _eid, _sd, _ed, _sid, _status);
  END;
  $BODY$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100;

PERFORM MyInsert(1,101,'2013-04-04','2013-04-04',2,'f' );

I get the error:

ERROR: syntax error at or near "PERFORM"
SQL state: 42601
Character: 1

Is the FAQ out of date or was this feature removed?  I'm using 9.2.1.
 Thanks!

Mike


Re: [GENERAL] PERFORM statement

2013-07-08 Thread Mike Christensen
Ah, OK, that makes sense.  The FAQ wasn't exactly clear about that.
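
A minimal sketch of the difference, using the MyInsert function from earlier in
the thread - PERFORM only exists inside PL/pgSQL, so it has to sit in a function
or a DO block, while plain SQL just uses SELECT and discards the result:

DO $$
BEGIN
  -- inside PL/pgSQL, PERFORM calls the function and throws away its result
  PERFORM MyInsert(1, 101, '2013-04-04', '2013-04-04', 2, false);
END
$$;

-- from plain SQL, the equivalent is simply a SELECT
-- (different key values here only to avoid a duplicate primary key)
SELECT MyInsert(2, 102, '2013-04-05', '2013-04-05', 2, false);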


On Mon, Jul 8, 2013 at 9:38 PM, Tony Theodore tony.theod...@gmail.comwrote:


 On 09/07/2013, at 2:20 PM, Mike Christensen m...@kitchenpc.com wrote:


 PERFORM MyInsert(1,101,'2013-04-04','2013-04-04',2,'f' );

 I get the error:

  ERROR: syntax error at or near "PERFORM"
 SQL state: 42601
 Character: 1

 Is the FAQ out of date or was this feature removed?  I'm using 9.2.1.
  Thanks!


 I believe PERFORM is a PL/pgSQL construct, not an SQL one. You'd need to
 execute it inside a function.

 Cheers,

 Tony





Re: [GENERAL] View's plan not taking advantage of WHERE?

2013-06-05 Thread Mike Summers
It appears that the culprit is a cached query plan: the tables in the UNION
have changed and no longer match; however, the View does not throw an "each
UNION query must have the same number of columns" error.

Is there a way to force the View's query plan to be updated on each access?


Re: [GENERAL] View's plan not taking advantage of WHERE?

2013-06-05 Thread Mike Summers
From what I'm reading, the View is frozen when it's created, including its
plan, and the usual solution is to use a set-returning function... is this
not true?

I've double checked all schemas and the view is only defined once.

Thanks.


Re: [GENERAL] View's plan not taking advantage of WHERE?

2013-06-05 Thread Mike Summers
Thanks Scott, interesting.

Other than the tests in the original post do you have any suggestions?

Thanks for your time.


Re: [GENERAL] View's plan not taking advantage of WHERE?

2013-06-05 Thread Mike Summers
Thanks Tom,

I've found other discussion of this - aggregates foul up the planner
with views.

GROUP BY & DISTINCT don't work; we're trying to grab a subset of records
and backfill any nulls to present a complete, single record...  we're stuck
with a view as this is used by a Rails app.

We'll see what else we can come-up with.

Thanks again.
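
For reference, a rough sketch of the kind of rewrite Tom hints at below, using
DISTINCT ON and the table/column names visible in the plans in this thread (the
view name is made up so it doesn't collide with the existing one). Note that it
keeps the single best-ranked row per address instead of backfilling nulls column
by column, so it may not satisfy the requirement described above, but it does
give the planner a chance to push an address_id qual down:

CREATE VIEW best_for_sale_layouts_distinct AS
  SELECT DISTINCT ON (b.address_id) b.*, sources.for_sale_layout_rank
  FROM (SELECT * FROM for_sale_layouts
        UNION ALL
        SELECT * FROM assessor_records) AS b
  JOIN sources ON b.source_id = sources.id
  ORDER BY b.address_id, sources.for_sale_layout_rank;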


On Wed, Jun 5, 2013 at 9:16 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Mike Summers msummer...@gmail.com writes:
  Other than the tests in the original post do you have any suggestions?

 If you're speaking of

 http://www.postgresql.org/message-id/CAJGeMG89QbDxMab7-aPD_yXVsGx7Q=auxym9ufvaq06crz4...@mail.gmail.com
 that has nothing to do with cached plans, obsolete or otherwise.
 You seem to be wishing that the planner would deduce x = constant from
 aggregate_function(x) = constant, which is wrong on its face.
 AFAICS it's not even correct for the special case that the aggregate
 function is first_not_null(), since you have multiple occurrences of
 that in the view and there's no certainty that they all choose to return
 values from the same row.

 Even if the optimization is valid given some additional assumptions that
 you've not told us, it's going to be a sufficiently narrow case that
 I doubt we'd ever be willing to expend planner cycles on checking for it.

 If you want WHERE clauses to be pushed down into this query you need to
 think of some other way to define the query.  Perhaps something
 involving GROUP BY or DISTINCT instead of first_not_null() would be more
 amenable to optimization.

 regards, tom lane



[GENERAL] View's plan not taking advantage of WHERE?

2013-06-04 Thread Mike Summers
I have a VIEW that does not appear to take advantage of the WHERE when
given the opportunity:

db=# explain select * from best_for_sale_layouts;
   QUERY PLAN


 Aggregate  (cost=1977.21..1977.22 rows=1 width=118)
   ->  Sort  (cost=1938.18..1940.05 rows=748 width=122)
 Sort Key: sources.for_sale_layout_rank
 ->  Hash Join  (cost=1.04..1902.48 rows=748 width=122)
   Hash Cond: (for_sale_layouts.source_id = sources.id)
   ->  Append  (cost=0.00..1613.60 rows=74760 width=118)
 ->  Seq Scan on for_sale_layouts  (cost=0.00..806.74
rows=37374 width=118)
 ->  Seq Scan on assessor_records  (cost=0.00..806.86
rows=37386 width=118)
   ->  Hash  (cost=1.02..1.02 rows=2 width=8)
 ->  Seq Scan on sources  (cost=0.00..1.02 rows=2
width=8)
(10 rows)

db=# explain analyze select * from best_for_sale_layouts where address_id =
2871034;;
  QUERY
PLAN
---
 Aggregate  (cost=1979.33..1979.34 rows=1 width=118) (actual
time=93569.509..93569.510 rows=1 loops=1)
   Filter: (first_not_null(for_sale_layouts.address_id) = 2871034)
   ->  Sort  (cost=1938.18..1940.05 rows=748 width=122) (actual
time=320.652..464.523 rows=74748 loops=1)
 Sort Key: sources.for_sale_layout_rank
 Sort Method: external sort  Disk: 5840kB
 ->  Hash Join  (cost=1.04..1902.48 rows=748 width=122) (actual
time=0.057..198.500 rows=74748 loops=1)
   Hash Cond: (for_sale_layouts.source_id = sources.id)
   ->  Append  (cost=0.00..1613.60 rows=74760 width=118)
(actual time=0.022..94.871 rows=74748 loops=1)
 ->  Seq Scan on for_sale_layouts  (cost=0.00..806.74
rows=37374 width=118) (actual time=0.021..22.361 rows=37374 loops=1)
 ->  Seq Scan on assessor_records  (cost=0.00..806.86
rows=37386 width=118) (actual time=0.011..23.383 rows=37374 loops=1)
   ->  Hash  (cost=1.02..1.02 rows=2 width=8) (actual
time=0.015..0.015 rows=2 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 1kB
 ->  Seq Scan on sources  (cost=0.00..1.02 rows=2
width=8) (actual time=0.007..0.010 rows=2 loops=1)
 Total runtime: 93573.390 ms
(14 rows)

If I run the View's select with the WHERE in psql I get what I expect
(first_not_null is an aggregate function):

db=# explain analyze SELECT

first_not_null(a.id) as id,
first_not_null(a.address_id) as address_id,

first_not_null(a.created_at) as created_at,
first_not_null(a.updated_at) as updated_at
  FROM (SELECT b.*, for_sale_layout_rank
FROM ((SELECT *
   FROM for_sale_layouts
   UNION ALL SELECT *
 FROM assessor_records) AS b INNER JOIN sources ON
b.source_id = sources.id)
  ORDER BY for_sale_layout_rank) AS a
where address_id = 2871034;

   QUERY PLAN

---
 Aggregate  (cost=21.23..21.24 rows=1 width=118) (actual time=0.571..0.571
rows=1 loops=1)
  ->  Sort  (cost=17.64..17.64 rows=2 width=122) (actual time=0.272..0.274
rows=2 loops=1)
 Sort Key: sources.for_sale_layout_rank
 Sort Method: quicksort  Memory: 25kB
 ->  Nested Loop  (cost=0.00..17.63 rows=2 width=122) (actual
time=0.199..0.253 rows=2 loops=1)
   Join Filter: (for_sale_layouts.source_id = sources.id)
   Rows Removed by Join Filter: 2
   ->  Append  (cost=0.00..16.54 rows=2 width=118) (actual
time=0.140..0.185 rows=2 loops=1)
 ->  Index Scan using
index_for_sale_layouts_on_address_id on for_sale_layouts  (cost=0.00..8.27
rows=1 width=118) (actual time=0.139..0.142 rows=1 loops=1)
  Index Cond: (address_id = 2871034)
 ->  Index Scan using
index_assessor_layouts_on_address_id on assessor_records  (cost=0.00..8.27
rows=1 width=118) (actual time=0.038..0.039 rows=1 loops=1)
  Index Cond: (address_id = 2871034)
  ->  Materialize  (cost=0.00..1.03 rows=2 width=8) (actual
time=0.022..0.025 rows=2 loops=2)
 ->  Seq Scan on sources  (cost=0.00..1.02 rows=2
width=8) (actual time=0.020..0.023 rows=2 loops=1)
 Total runtime: 0.802 ms
(15 rows)

Is there anything I can do to get the View to update its plan?

Improvements are welcome, although for other reasons (Rails' ActiveRecord)
the View is a must.

Thanks in advance.


Re: [GENERAL] Success stories of PostgreSQL implementations in different companies

2013-05-23 Thread Mike Christensen
On Thu, May 23, 2013 at 2:51 PM, Steve Crawford 
scrawf...@pinpointresearch.com wrote:

 On 05/23/2013 02:36 PM, Oscar Calderon wrote:

 Hi, this question isn't technical, but is very important for me to know.
  Currently, here in El Salvador our company provides PostgreSQL support, but
 Oracle and SQL Server are more popular here.

 Even with that, some clients are being encouraged to change to PostgreSQL
  to lower their companies' technology costs, but very often they ask if
 there are success stories of PostgreSQL implementations in companies in our
 region or around the world, success stories (if is possible) with some
 information like number of concurrent users, some hardware specs or storage
 size.

  I think that in my country it is more common to hear success stories like
  that about other databases like Oracle, because it is more widespread here,
  but I would like to know if there's a place where I can find them, or if you
  can share with me some real experiences or success stories that you have
  heard of successful implementations of PostgreSQL in companies, to talk
  about with people when they ask that kind of thing.

  Start with the web-site - especially:
  http://www.postgresql.org/about/users/
  http://www.postgresql.org/about/quotesarchive/

 I don't know about name-recognition in El Salvador but Etsy, Wisconsin
 Courts, Skype, Affilias, FlightAware, NTT are quite recognizable here.


And don't forget about everyone's favorite recipe search engine,
www.kitchenpc.com - powered by Postgres 9.1..

/ShamelessPlug


Re: [GENERAL] dblink does not resolve DNS, but works with IP

2013-05-15 Thread Mike Christensen
Ah, gotcha!  I guess whatever sample I was originally copying from used
hostaddr for some reason..  Thanks for the clarification, Tom!
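
For completeness, the working form with a DNS name is just the host keyword
(same view as in the original post; the column definition list here is made up,
since the original elided it):

CREATE OR REPLACE VIEW Link.Foo AS
  SELECT *
  FROM dblink(
    'host=db.domain.com dbname=KitchenPC user=Website password=secret',  -- host= is resolved via DNS
    'select * from Foo'
  ) AS t(id integer, name text);   -- hypothetical column list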


On Wed, May 15, 2013 at 6:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Mike Christensen m...@kitchenpc.com writes:
  Though I'm a bit curious why there's a host and hostaddr.  Why can't it
  just resolve whatever you give it?

 Well, it will ... if you use the host parameter.  The whole point of
 hostaddr is that for that parameter, it will not try a DNS lookup.
 You'd only use that if you had issues with the speed or reliability
 of your DNS service.

 regards, tom lane



[GENERAL] dblink does not resolve DNS, but works with IP

2013-05-14 Thread Mike Christensen
If I have this:

CREATE OR REPLACE VIEW Link.Foo AS
  select * from dblink(
'hostaddr=123.123.123.123 dbname=KitchenPC user=Website
password=secret',
'select * from Foo') as ...

Then it works.  However, if I do:

CREATE OR REPLACE VIEW Link.Foo AS
  select * from dblink(
'hostaddr=db.domain.com dbname=KitchenPC user=Website password=secret',
'select * from Foo') as ...

Then I get:

ERROR:  could not establish connection
DETAIL:  could not translate host name db.domain.com to address: Unknown
host

However, from a command prompt I can ping db.domain.com and get
123.123.123.123.

Does dblink just not support DNS resolution?  I really don't want to hard
code IP addresses in my scripts.  Thanks!


Re: [GENERAL] dblink does not resolve DNS, but works with IP

2013-05-14 Thread Mike Christensen
Excellent!  Thanks so much.


On Tue, May 14, 2013 at 9:25 PM, Adrian Klaver adrian.kla...@gmail.comwrote:

 On 05/14/2013 09:17 PM, Mike Christensen wrote:

 If I have this:

 CREATE OR REPLACE VIEW Link.Foo AS
select * from dblink(
  'hostaddr=123.123.123.123 dbname=KitchenPC user=Website
 password=secret',
  'select * from Foo') as ...

 Then it works.  However, if I do:

 CREATE OR REPLACE VIEW Link.Foo AS
select * from dblink(
  'hostaddr=db.domain.com dbname=KitchenPC

 user=Website password=secret',
  'select * from Foo') as ...

 Then I get:

 ERROR:  could not establish connection
  DETAIL:  could not translate host name db.domain.com to address: Unknown host


  However, from a command prompt I can ping db.domain.com and get 123.123.123.123.


 Does dblink just not support DNS resolution?  I really don't want to
 hard code IP addresses in my scripts.  Thanks!


 See below for explanation of hostaddr and host. Short version, you are
 looking for host:

  http://www.postgresql.org/docs/9.2/interactive/libpq-connect.html#LIBPQ-PARAMKEYWORDS


 --
 Adrian Klaver
 adrian.kla...@gmail.com



Re: [GENERAL] dblink does not resolve DNS, but works with IP

2013-05-14 Thread Mike Christensen
Though I'm a bit curious why there's a host and hostaddr.  Why can't it
just resolve whatever you give it?


On Tue, May 14, 2013 at 9:31 PM, Mike Christensen m...@kitchenpc.comwrote:

 Excellent!  Thanks so much.


 On Tue, May 14, 2013 at 9:25 PM, Adrian Klaver adrian.kla...@gmail.comwrote:

 On 05/14/2013 09:17 PM, Mike Christensen wrote:

 If I have this:

 CREATE OR REPLACE VIEW Link.Foo AS
select * from dblink(
  'hostaddr=123.123.123.123 dbname=KitchenPC user=Website
 password=secret',
  'select * from Foo') as ...

 Then it works.  However, if I do:

 CREATE OR REPLACE VIEW Link.Foo AS
select * from dblink(
  'hostaddr=db.domain.com dbname=KitchenPC

 user=Website password=secret',
  'select * from Foo') as ...

 Then I get:

 ERROR:  could not establish connection
  DETAIL:  could not translate host name db.domain.com to address: Unknown host


  However, from a command prompt I can ping db.domain.com and get 123.123.123.123.


 Does dblink just not support DNS resolution?  I really don't want to
 hard code IP addresses in my scripts.  Thanks!


 See below for explanation of hostaddr and host. Short version, you are
 looking for host:

  http://www.postgresql.org/docs/9.2/interactive/libpq-connect.html#LIBPQ-PARAMKEYWORDS


 --
 Adrian Klaver
 adrian.kla...@gmail.com





[GENERAL] Where in the source code does postgres write to disk?

2013-04-20 Thread Mike Levine
According to
http://www.postgresql.org/docs/9.2/static/storage-file-layout.html

When a table or index exceeds 1 GB, it is divided into gigabyte-sized
segments. The first segment's file name is the same as the filenode;
subsequent segments are named filenode.1, filenode.2, etc.

I was wondering where in the source code this is dealt with. I have been
searching for the last few hours but have had no luck.


Any help guiding me to the location in the source code where postgres
writes the buffer to disk would be greatly appreciated.


Thanks



[GENERAL] Fetching Server configured port from C Module

2013-04-17 Thread Mike Roest
Hi There,
   I'm having a bit of an issue finding a C function to fetch the
configured server port from a C module.

We have written a C module to allow for remote clients to call a function
to run pg_dump/pg_restore remotely but create files locally on the db
server.

Currently it works fine if the client connects through a network socket as
we're using inet_server_port to get the port to pass on to pg_dump/restore.
 But if the client is connected through a unix socket (actually a remote
client connecting to pgbouncer, which is connecting to postgres through the
unix socket) inet_server_port is null.  I've looked for a function that we
can use to get the configured server port but haven't had any luck.

I could hard code the port in the module when we build it but it would be
nice to be able to change the configured postgres port and not have to
rebuild the module.

Anyone have any suggestions?

I've posted the code for our backup module here:

http://pastebin.com/wQ6VidWn


Re: [GENERAL] Fetching Server configured port from C Module

2013-04-17 Thread Mike Roest
 SHOW PORT;

 test=> SELECT setting FROM pg_settings WHERE name = 'port';
  setting
 ---------
  5432

Both of these are from a query context.  This is in a C module; I suppose I
could run a query, but there has to be a direct C function to get this data.


Re: [GENERAL] Fetching Server configured port from C Module

2013-04-17 Thread Mike Roest
Perfect, thanks Bruce, that worked.

I just extern'd PostPortNumber in my module and everything seems to be
working.

--Mike


Re: [GENERAL] how to create materialized view in postgresql 8.3

2013-04-08 Thread Mike Christensen
This is the number one requested feature on Uservoice:

http://postgresql.uservoice.com/forums/21853-general/suggestions/247548-materialized-views


On Mon, Apr 8, 2013 at 9:27 AM, John R Pierce pie...@hogranch.com wrote:

  On 4/7/2013 11:58 PM, Zahid Quadri wrote:


  is it possible to create a materialized view in postgresql 8.3? if yes,
  please provide some sample.


 in older versions, the best you could do was to create a table and
 populate it with your 'view', then drop it when you're done (or truncate
 and repopulate it to update the 'view').



 --
 john r pierce  37N 122W
 somewhere on the middle of the left coast



