[GENERAL] Getting getsockopt(TCP_KEEPALIVE) failed LOG message in PG Logs on Solaris 10
Hi, I am getting a "getsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol" log message in the PG logs whenever I run a query referencing pg_catalog.pg_settings on Solaris 10 (both SPARC and x86-64). You can reproduce this by running a query like: SELECT name, setting, unit FROM pg_catalog.pg_settings; This happens only the first time you execute the query, though. The issue was reproducible for me on both PostgreSQL 9.0.4 and PostgreSQL 9.2.4. It happens when I run the query via psql or pgAdminIII by connecting to the database server over a TCP socket (i.e. by providing host details with -h). The same behavior is not reproducible if I connect to the database server over a Unix socket. I am not sure if this has been reported before, but I thought I should bring this issue to your notice. regards, Dhiraj Chawla Senior Software Engineer EnterpriseDB Corporation The Enterprise PostgreSQL Company Phone: +91-20-30589522
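For reference, the values pg_settings has to report for the keepalive GUCs come from the connection's socket, which is presumably where the getsockopt() call happens (an assumption on my part; I have not traced the Solaris code path):

SELECT name, setting
FROM pg_settings
WHERE name LIKE 'tcp_keepalives%';

Over a Unix-domain socket these simply report 0, which would be consistent with the problem only appearing on TCP connections.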
[GENERAL] Incorrect index being used
Hi PostgreSQL community,

I have the following query, run immediately after executing VACUUM in the database. There is only one connection to the database. The query runs for much longer than I expect it to, and I think this is due to it using the incorrect subplan. As you can see, subplans 1 and 3 make use of an index, but these subplans are not used. Subplans 2 and 4 are seqscans, and they are used. How can I get PostgreSQL to use subplans 1 and 3?

Thanks,
Jesse

testdb=> explain analyse
SELECT *
FROM ARCHIVE_DOCUMENT AS r0
WHERE r0.NODE_ID = 29
  AND r0.ARCHIVE_DATE <= '2013-07-08 18:28:00'
  AND (EXISTS (SELECT r1.*
               FROM ARCHIVE_DOCUMENT_INDEX AS r1
               WHERE r1.ARCHIVE_ID = r0.ID
                 AND r1.NODE_ID = r0.NODE_ID
                 AND r1.VALUE = 'BSH70002152')
       OR EXISTS (SELECT r2.*
                  FROM ARCHIVE_DOCUMENT_INDEX AS r2
                  WHERE r2.ARCHIVE_ID = r0.ID
                    AND r2.NODE_ID = r0.NODE_ID
                    AND r2.VALUE = 'TC212592'))
ORDER BY r0.ARCHIVE_DATE DESC
LIMIT 10;

                                 QUERY PLAN
 Limit  (cost=0.56..151.79 rows=10 width=122) (actual time=44601.350..97649.196 rows=2 loops=1)
   ->  Index Scan Backward using idx_archive_document_x_node_id_archive_date on archive_document r0  (cost=0.56..8258406.24 rows=546105 width=122) (actual time=44601.33
         Index Cond: ((node_id = 29) AND (archive_date <= '2013-07-08 18:28:00'::timestamp without time zone))
         Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed SubPlan 4))
         Rows Removed by Filter: 710851
         SubPlan 1
           ->  Index Only Scan using archive_document_index_x_archive_id_node_id_value on archive_document_index r1  (cost=0.57..4.59 rows=1 width=0) (never executed)
                 Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value = 'BSH70002152'::text))
                 Heap Fetches: 0
         SubPlan 2
           ->  Seq Scan on archive_document_index r1_1  (cost=0.00..1958104.00 rows=1520 width=16) (actual time=44418.383..44558.293 rows=4 loops=1)
                 Filter: ((value)::text = 'BSH70002152'::text)
                 Rows Removed by Filter: 95009919
         SubPlan 3
           ->  Index Only Scan using archive_document_index_x_archive_id_node_id_value on archive_document_index r2  (cost=0.57..4.59 rows=1 width=0) (never executed)
                 Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value = 'TC212592'::text))
                 Heap Fetches: 0
         SubPlan 4
           ->  Seq Scan on archive_document_index r2_1  (cost=0.00..1958104.00 rows=1520 width=16) (actual time=41659.464..41663.342 rows=1 loops=1)
                 Filter: ((value)::text = 'TC212592'::text)
                 Rows Removed by Filter: 95009922
 Total runtime: 97683.836 ms
(22 rows)

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Can checkpoint creation be parallel?
高健 wrote: "The background writer and ordinary backends might write data (for their own reasons) that the checkpointer would have otherwise needed to write anyway." And do the ordinary backends refer to the ones created when a client makes a connection to PG?

Yes.

Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Hi, Friends, are there any ETL tools (free or commercial) available for PostgreSQL?
On Mon, Oct 7, 2013 at 8:32 PM, sunpeng blueva...@gmail.com wrote: Hi, Friends, are there any ETL tools (free or commercial) available for PostgreSQL?

ETL tools for PostgreSQL:

Definition: an ETL process loads data into the database from a flat file in three steps: A. Extract, B. Transform, C. Load.

1. PGLoader - loads .csv files
2. Benetl 1.8 - loads .txt, .csv or .xls files
3. Talend Open Studio = http://www.talend.com/index.php
4. Pentaho Kettle = http://kettle.pentaho.com/
5. Visual Importer = data can be imported from flat files, Excel, MS Access, Oracle, MySQL, Interbase, PostgreSQL, OLE DB, ODBC and DBF files.
6. CloverETL = http://en.wikipedia.org/wiki/CloverETL

Both [3 and 4] of these ETL solutions have a lot of the same types of features:
- Both are Java based and run on Linux and Windows (Talend is Eclipse-based)
- Visual designers for creating the transformations
- Connectivity for a myriad of databases, including all the big DBs, text formats, etc.
- Support for distributing jobs across multiple servers if you are doing serious lifting
- Excellent error handling and error notification systems
- Active and helpful forums (Kettle is older and seems to have a larger community, however)
- Free and open source
- They are complex enough to handle a lot of tasks, but not so much as to kill you
- There are versions of both (GeoKettle and Spatial Data Integrator) that are tailored for GIS

Thanks & Regards, Raghu Ram
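For a simple flat-file load it is also worth remembering that no external tool is needed at all; a minimal sketch, assuming a hypothetical table and a server-readable CSV file:

-- hypothetical staging table matching the CSV layout
CREATE TABLE sales_staging (sold_at date, sku text, amount numeric);

-- server-side load; the path is as seen by the database server
COPY sales_staging FROM '/var/tmp/sales.csv' WITH (FORMAT csv, HEADER true);

From psql, \copy does the same using a client-side file.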
Re: [GENERAL] Many, many materialised views - Performance?
On Oct 9, 2013, at 4:08, Kevin Grittner kgri...@ymail.com wrote: Toby Corkindale toby.corkind...@strategicdata.com.au wrote: In this instance, we have a lot of queries that build certain aggregate results, which are very slow. The queries were initially all implemented as views, but then we started doing a type of materialising of our own, turning them into tables with CREATE TABLE AS SELECT This does make the results very fast to access now, but the side effect is a vast number of (very small) tables. If you have multiple tables with identical layout but different subsets of the data, you will probably get better performance by putting them into a single table with indexes which allow you to quickly search the smaller sets within the table.

I was thinking just that while reading Toby's message. For example, you could put the results of several related aggregations into a single materialized view, if they share the same key columns (year, month, factory or something similar). I'm not sure the new built-in materialized views can be updated like that though, unless you manage to combine those aggregations into a single monster-query, but that will probably not perform well... What we tend to do at work (no PostgreSQL, unfortunately) is to use external tools to combine those aggregated results and store that back into the database (which we often need to do anyway, as we deal with several databases on several servers).

Additionally, if you have that many tables, it sounds like you partitioned your data. With aggregated results, the need for partitioning is much less (or perhaps it isn't even needed at all). And perhaps you don't even need the data from all partitions; say if you have monthly partitions of data, do you really need aggregated results from 5 years ago? That said, users excel in finding data to request that you thought they wouldn't need.

Which brings me to another question: Do your users really need the data from all those views or do they only think they need that? Frequently, users create elaborate Excel sheets and then request tons of data to fill them, while what they're really interested in is the _result_ of that Excel sheet. If you can provide them with that, they're happy and you can rest assured that they're at least using correct results. Plus, it removes some of _this_ burden from your database. I've seen users who're busy creating sheets like that for 2 weeks, twice a year, to create data that I can prepare for them in a couple of days into a report that takes 2 minutes to load (which is long, but not compared to their 2 weeks).

Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll find there is no forest. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
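As a sketch of that idea (all table and column names here are made up for illustration), several related aggregates can share one materialized view keyed on the common columns:

CREATE MATERIALIZED VIEW sales_agg_monthly AS
SELECT factory,
       date_trunc('month', sold_at) AS month,
       count(*)    AS n_sales,
       sum(amount) AS total_amount,
       avg(amount) AS avg_amount
FROM sales
GROUP BY factory, date_trunc('month', sold_at);

-- rebuilt on demand with 9.3's built-in materialized views
REFRESH MATERIALIZED VIEW sales_agg_monthly;

On earlier releases the same pattern works with CREATE TABLE AS plus a manual refresh (drop and re-create, or delete and re-insert).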
Re: [GENERAL] Incorrect index being used
Jesse Long wrote: I have the following query, run immediately after executing VACUUM in the database. There is only one connection to the database.

You should run ANALYZE, not VACUUM.

The query runs for much longer than I expect it to run for, and I think this is due to it using the incorrect subplan. As you can see, subplans 1 and 3 make use of an index, but these subplans are not used. Subplans 2 and 4 are seqscans, and they are used. How can I get PostgreSQL to use subplans 1 and 3?

They are only possible if an Index Only Scan is possible, which can only be used if the respective table entries are visible for all transactions.

testdb=> explain analyse SELECT * FROM ARCHIVE_DOCUMENT AS r0 WHERE r0.NODE_ID = 29 AND r0.ARCHIVE_DATE <= '2013-07-08 18:28:00' AND (EXISTS (SELECT r1.* FROM ARCHIVE_DOCUMENT_INDEX AS r1 WHERE r1.ARCHIVE_ID = r0.ID AND r1.NODE_ID = r0.NODE_ID AND r1.VALUE = 'BSH70002152') OR EXISTS (SELECT r2.* FROM ARCHIVE_DOCUMENT_INDEX AS r2 WHERE r2.ARCHIVE_ID = r0.ID AND r2.NODE_ID = r0.NODE_ID AND r2.VALUE = 'TC212592')) ORDER BY r0.ARCHIVE_DATE DESC LIMIT 10;
[...]
Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed SubPlan 4))
Rows Removed by Filter: 710851
[...]
SubPlan 2
  ->  Seq Scan on archive_document_index r1_1  (cost=0.00..1958104.00 rows=1520 width=16) (actual time=44418.383..44558.293 rows=4 loops=1)
        Filter: ((value)::text = 'BSH70002152'::text)
        Rows Removed by Filter: 95009919
[...]
SubPlan 4
  ->  Seq Scan on archive_document_index r2_1  (cost=0.00..1958104.00 rows=1520 width=16) (actual time=41659.464..41663.342 rows=1 loops=1)
        Filter: ((value)::text = 'TC212592'::text)
        Rows Removed by Filter: 95009922

The estimates are quite off. Does ANALYZE archive_document, possibly after increasing default_statistics_target, make a difference?

Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
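A minimal sketch of that suggestion (the target of 500 is only an example value):

-- raise the target for this session, then gather fresh statistics
SET default_statistics_target = 500;
ANALYZE archive_document;
ANALYZE archive_document_index;

-- or persistently, for just the skewed column
ALTER TABLE archive_document_index ALTER COLUMN value SET STATISTICS 500;
ANALYZE archive_document_index;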
Re: [GENERAL] Incorrect index being used
On 09/10/2013 12:10, Albe Laurenz wrote: You should run ANALYZE, not VACUUM. [...] They are only possible if an Index Only Scan is possible, which can only be used if the respective table entries are visible for all transactions. [...] The estimates are quite off. Does ANALYZE archive_document, possibly after increasing default_statistics_target, make a difference? Yours, Laurenz Albe

Hi Laurenz,

Thank you for the feedback. There is no problem with row visibility; there is only one connection to the database - the connection I am using to do these selects.

Thank you for the advice regarding ANALYZE.
I ran ANALYZE on both tables concerned, but not much changed:

                                 QUERY PLAN
 Limit  (cost=0.56..151.85 rows=10 width=122) (actual time=40841.984..85668.213 rows=2 loops=1)
   ->  Index Scan Backward using idx_archive_document_x_node_id_archive_date on archive_document r0  (cost=0.56..7627640.20 rows=504186 width=122) (actual time=40841.98
         Index Cond: ((node_id = 29) AND (archive_date <= '2013-07-08 18:28:00'::timestamp without time zone))
         Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed SubPlan 4))
         Rows Removed by Filter: 710851
         SubPlan 1
           ->  Index Only Scan using archive_document_index_x_archive_id_node_id_value on archive_document_index r1  (cost=0.57..4.59 rows=1 width=0) (never executed)
                 Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value = 'BSH70002152'::text))
                 Heap Fetches: 0
         SubPlan 2
           ->  Seq Scan on archive_document_index r1_1  (cost=0.00..1958101.80 rows=1568 width=16) (actual time=36633.365..40841.909 rows=4 loops=1)
                 Filter: ((value)::text = 'BSH70002152'::text)
                 Rows Removed by Filter: 95009919
         SubPlan 3
           ->  Index Only Scan using archive_document_index_x_archive_id_node_id_value on archive_document_index r2  (cost=0.57..4.59 rows=1 width=0) (never executed)
                 Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value = 'TC212592'::text))
                 Heap Fetches: 0
         SubPlan 4
           ->  Seq Scan on archive_document_index r2_1  (cost=0.00..1958101.80 rows=1568 width=16) (actual time=40241.599..44462.485 rows=1 loops=1)
Re: [GENERAL] Many, many materialised views - Performance?
On Tue, 8 Oct 2013 19:08:45 -0700 (PDT) Kevin Grittner kgri...@ymail.com wrote: In this instance, we have a lot of queries that build certain aggregate results, which are very slow. The queries were initially all implemented as views, but then we started doing a type of materialising of our own, turning them into tables with CREATE TABLE AS SELECT This does make the results very fast to access now, but the side effect is a vast number of (very small) tables. I missed the start of this thread, so apologies if my suggestion is off-base. When there are lots of tables, I've seen performance improvements from distributing the tables through schemas. It seems to improve name resolution performance. -- Bill Moran wmo...@potentialtech.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
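A sketch of what that can look like, with invented names; ALTER TABLE ... SET SCHEMA moves an existing table without rewriting it:

-- group result tables into per-period schemas instead of one flat namespace
CREATE SCHEMA results_2013_10;
ALTER TABLE foo_result_12345 SET SCHEMA results_2013_10;

-- keep search_path short so lookups only scan the relevant schemas
SET search_path = results_2013_10, public;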
Re: [GENERAL] pg_similarity
Now the steps:

$ USE_PGXS=1 make
$ USE_PGXS=1 make install

are working. But I still don't have the file 'SHAREDIR/contrib/pg_similarity.sql'.

Janek Sendrowski -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Hi, Friends, are there any ETL tools (free or commercial) available for PostgreSQL?
On 09/10/2013, at 8:39 PM, raghu ram raghuchenn...@gmail.com wrote: On Mon, Oct 7, 2013 at 8:32 PM, sunpeng blueva...@gmail.com wrote: Hi, Friends, are there any ETL tools (free or commercial) available for PostgreSQL? ETL tools for PostgreSQL: [...]

Not exactly an ETL tool, but I've recently felt a sense of liberation since switching to make/python/psql/wget... for my ETL needs. Make is probably the cornerstone, as it handles dependencies (think foreign key ordering with schema introspection) and parallel loading in a very intuitive way.

Cheers, Tony
Re: [GENERAL] Hi, Friends, are there any ETL tools (free or commercial) available for PostgreSQL?
On 09/10/2013, at 11:03 PM, Tony Theodore tony.theod...@gmail.com wrote: Not exactly an ETL tool, but I've recently felt a sense of liberation since switching to make/python/psql/wget... for my ETL needs. [...]

Forgot to mention that I do something more like ELT. Postgres itself is the best Transform tool.

Cheers, Tony
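In that ELT spirit, the Transform step reduces to plain SQL once the raw data is loaded; a small sketch with hypothetical tables, assuming the flat file has already been COPYed into staging_orders as text columns:

INSERT INTO orders (order_date, customer_id, amount)
SELECT to_date(order_date_txt, 'YYYY-MM-DD'),
       nullif(customer_id_txt, '')::int,
       amount_txt::numeric
FROM staging_orders;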
[GENERAL] no syntax error on limit1
Hi, Postgres 9.1.9 gives me no syntax error on this, but all the records: with a as (values (1),(2),(3)) select * from a limit1 Cheers, WBL -- Quality comes from focus and clarity of purpose -- Mark Shuttleworth
Re: [GENERAL] no syntax error on limit1
On 9 October 2013 14:24, Willy-Bas Loos willy...@gmail.com wrote: Hi, Postgres 9.1.9 gives me no syntax error on this, but all the records: with a as (values (1),(2),(3)) select * from a limit1 Hi, that's quite OK. The limit1 is just an alias for the table a Szymon
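In other words, since the AS keyword is optional for table aliases, the query is parsed as:

with a as (values (1),(2),(3))
select * from a AS limit1;

which is why all the rows come back: there is no LIMIT clause in it at all.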
Re: [GENERAL] no syntax error on limit1
duh! thx. On Wed, Oct 9, 2013 at 2:30 PM, Szymon Guz mabew...@gmail.com wrote: On 9 October 2013 14:24, Willy-Bas Loos willy...@gmail.com wrote: Hi, Postgres 9.1.9 gives me no syntax error on this, but all the records: with a as (values (1),(2),(3)) select * from a limit1 Hi, that's quite OK. The limit1 is just an alias for the table a Szymon -- Quality comes from focus and clarity of purpose -- Mark Shuttleworth
Re: [GENERAL] no syntax error on limit1
Here the limit1 is a table alias, i.e. the same as writing AS limit1.

jov

On 2013-10-9 8:27 PM, Willy-Bas Loos willy...@gmail.com wrote: Hi, Postgres 9.1.9 gives me no syntax error on this, but all the records: with a as (values (1),(2),(3)) select * from a limit1 Cheers, WBL -- Quality comes from focus and clarity of purpose -- Mark Shuttleworth
Re: [GENERAL] Incorrect index being used
Jesse Long wrote: There is no problem with row visibility; there is only one connection to the database - the connection I am using to do these selects.

No idea why the plans cannot be used. It might be helpful to see the table and index definitions.

Thank you for the advice regarding ANALYZE. I ran ANALYZE on both tables concerned, but not much changed:

Did you try increasing default_statistics_target before ANALYZE?

Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_similarity
Thanks a lot! Now it's working :) Janek Sendrowski -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] streaming replication timeout error
On 10/08/2013 07:58 PM, 高健 wrote:

Hello: My customer encountered some connection timeouts while using one-primary, one-standby streaming replication. The original log is Japanese; because there are no error codes like Oracle's ORA-xxx, I tried to translate the Japanese information into English, but that might not be correct English for PG. The most important part is:

2013-09-22 09:52:47 JST[28297][51d1fbcb.6e89-2][0][XX000]FATAL: Could not receive data from WAL stream: could not receive data from server: connection timeout
scp: /opt/PostgresPlus/9.2AS/data/arch/00AC01F1004A: No such file or directory

I was asked: in what situation will the above fatal error occur? I looked into the postgresql.conf file for the primary and standby server and ran some experiments. I found:

Scenario I: If the WAL file wanted is removed manually, both on primary and standby the log will be like this:

FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 00010011 has already been removed

But I haven't found a good explanation fitting the logs' FATAL error. Can anybody give me some info?

Would seem to me the interesting part is:

scp: /opt/PostgresPlus/9.2AS/data/arch/00AC01F1004A: No such file or directory

Are you using scp to move WAL files to an archive directory? If so, it seems scp is having issues: either a network interruption, or the file is disappearing under it.

Thanks in advance, jian gao

-- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] I need more specific instructions for switching to digest mode for this list
Trying to switch to the digest didn't work. How do I find more specific details about switching?

On Wed, Oct 9, 2013 at 12:27 AM, Bob Futrelle bob.futre...@gmail.com wrote: set pgsql-general digest
Re: [GENERAL] I need more specific instructions for switching to digest mode for this list
On 09/10/2013 16:47, Bob Futrelle wrote: Trying to switch to the digest didn't work. How do I find more specific details about switching? On Wed, Oct 9, 2013 at 12:27 AM, Bob Futrelle bob.futre...@gmail.com wrote: set pgsql-general digest

You sent your command to the list, rather than to the list server. I can't remember the correct address off the top of my head, but it might be something like pgsql-general-requ...@postgresql.org.

Ray.

-- Raymond O'Donnell :: Galway :: Ireland r...@iol.ie -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] I need more specific instructions for switching to digest mode for this list
On Wednesday, October 09, 2013 11:54 AM, Raymond O'Donnell wrote: On 09/10/2013 16:47, Bob Futrelle wrote: Trying to switch to the digest didn't work. How do I find more specific details about switching? [...]

It says it all right at the bottom:

To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Incorrect index being used
Jesse Long j...@unknown.za.net writes: The query runs for much longer than I expect it to run for, and I think this is due to it using the incorrect subplan. As you can see, subplans 1 and 3 make use of an index, but these subplans are not used. Subplans 2 and 4 are seqscans, and they are used. How can I get PostgreSQL to use subplans 1 and 3?

You can't, and you would not like the results if you did.

The construct that's being described (perhaps not very intelligibly) by this EXPLAIN output is an alternative pair of subplans. Actually there are two such alternative pairs in this example. The indexscan variants are subplans that would be fast if executed only once or twice. The seqscan variants, if used, are used to load a hashtable that is then probed for each row of the outer plan. If there are a lot of rows to be considered in the outer plan, then it's better to pay the price of loading the hashtable, because each hashtable probe will be a lot cheaper than doing a fresh indexscan with the comparison value from the current outer row.

In this example, we can see that the outer scan that the subplans are attached to eliminated 710851 rows by means of the subplan filters, meaning that the subplans were probed 710851+2 times. If each of those probes had been done with a separate indexscan, you'd likely still be waiting for the result. Using the seqscan+hashtable was definitely the right choice here.

BTW, the reason it looks like this rather than just hard-wiring the seqscan choice is a planner implementation artifact --- at the time that the subplan plans are created, we don't know how many rows are expected to pass through the outer plan level. So we plan it both ways and leave the choice to be made during executor startup.

What I'd suggest is that you see if you can't get rid of the EXISTS() OR EXISTS() construction in favor of a single EXISTS clause --- I'm too lazy to work out the details but it looks like you could do the OR in the WHERE clause of a single EXISTS sub-select. That would allow the planner to convert the EXISTS into a semi-join, which might work better than what you've got. As is, you're dealing with fairly generic sub-select logic that isn't going to be terribly well optimized.

regards, tom lane

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
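Spelling out that last suggestion as a sketch (untested; an IN list is one way to express the OR inside a single sub-select):

SELECT *
FROM archive_document AS r0
WHERE r0.node_id = 29
  AND r0.archive_date <= '2013-07-08 18:28:00'
  AND EXISTS (SELECT 1
              FROM archive_document_index AS r1
              WHERE r1.archive_id = r0.id
                AND r1.node_id = r0.node_id
                AND r1.value IN ('BSH70002152', 'TC212592'))
ORDER BY r0.archive_date DESC
LIMIT 10;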
Re: [GENERAL] String reverse function?
ginkgo36 wrote Hello everyone I have to reverse a string like EA;BX;CA to CA;BX;EA. or EA,BX,CA to CA,BX,EA Is there any function to do this? Thanks all! No. You will have to write your own. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/String-reverse-funtion-tp5773871p5773887.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] String reverse function?
On 10/9/2013 11:52 AM, David Johnston wrote: ginkgo36 wrote Hello everyone I have to reverse a string like EA;BX;CA to CA;BX;EA. or EA,BX,CA to CA,BX,EA Is there any function to do this? Thanks all! No. You will have to write your own. David J. Based upon the example, it's probably very easy to use a split/explode in your language of choice (VB.NET, perl, python, etc). -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] String reverse function?
2013/10/9 John Meyer johnme...@pueblocomputing.com: On 10/9/2013 11:52 AM, David Johnston wrote: ginkgo36 wrote: Hello everyone I have to reverse a string like EA;BX;CA to CA;BX;EA. or EA,BX,CA to CA,BX,EA Is there any function to do this? Thanks all! No. You will have to write your own. David J. Based upon the example, it's probably very easy to use a split/explode in your language of choice (VB.NET, perl, python, etc).

or SQL

select string_agg(u, ';' order by r desc)
from (select row_number() over () r, u
      from unnest(string_to_array('EA;BX;CA',';')) u) x;

 string_agg
------------
 CA;BX;EA
(1 row)

Regards Pavel

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] String reverse function?
That works too. Point being ginkgo36 has his solution.

On 10/9/2013 12:07 PM, Pavel Stehule wrote: [...] or SQL: select string_agg(u, ';' order by r desc) from (select row_number() over () r, u from unnest(string_to_array('EA;BX;CA',';')) u) x; [...]
Re: [GENERAL] [HACKERS] Urgent Help Required
On 10/8/13 5:55 AM, shailesh singh wrote: HINT: To avoid a database shutdown, execute a full-database VACUUM in "debug". ERROR: could not access status of transaction 449971277 DETAIL: could not open file pg_clog/01AD: No such file or directory

Unless I'm mistaken, that missing CLOG file is a bad sign...?

-- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] declare constraint as valid
Hi,

assuming a constraint is added to a table as NOT VALID. Now I know it IS valid. Can I simply declare it as valid by

update pg_constraint
set convalidated = 't'
where conrelid = (select c.oid
                  from pg_class c
                  join pg_namespace n on (n.oid = c.relnamespace)
                  where c.relname = 'tablename'
                    and n.nspname = 'schemaname')
  and conname = 'constraintname';

instead of

alter table tablename validate constraint ...

Or does the latter have other side effects? I am asking because I want to avoid the ACCESS EXCLUSIVE lock required by the ALTER TABLE. I am sure there are no rows violating the constraint.

Thanks, Torsten -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] streaming replication timeout error
Hello:

Thanks for replying. The recovery.conf file on the standby (DB2) is like this:

standby_mode = 'on'
primary_conninfo = 'host=DB1 port=5432 application_name=testpg user=postgres connect_timeout=10 keepalives_idle=5 keepalives_interval=1'
recovery_target_timeline = 'latest'
restore_command = 'scp -o ConnectTimeout 5 -i /opt/PostgresPlus/9.2AS/.ssh/id_edb DB1:/opt/PostgresPlus/9.2AS/data/arch/%f %p'

I am not familiar with the scp command; I think that here scp is used to copy archived WAL files from the primary to the standby... Maybe the ConnectTimeout is too small, and sometimes when the network is not very good the restore_command will fail and return a FATAL error?

In fact I am a little confused about restore_command: we are using streaming replication, but why is restore_command still needed to copy archived WAL? Isn't that the old warm standby (file shipping)?

Best Regards, jian gao

2013/10/9 Adrian Klaver adrian.kla...@gmail.com: [...]
Re: [GENERAL] ERROR: invalid value ???? for YYYY
But from a user's perspective, why would it ever make sense that by adding an additional where clause, it actually brings more data into the picture? If I have a query returning 100 rows, adding an additional where clause should only cut down the number of rows, not increase it. And the extra data that's showing up is being added to the result set because, without the additional where clause, the result set did not contain any of those rows like pg_statistic etc.

Brian

On Tue, Oct 8, 2013 at 4:10 PM, Steve Crawford scrawf...@pinpointresearch.com wrote:

On 10/08/2013 01:50 PM, Brian Wong wrote: I'm posting this question to pgsql-general (http://www.postgresql.org/list/pgsql-general/). Hopefully someone can share some insights with me. I have a bunch of tables in the database and in a separate schema. The tables' names are in this format: ???_???_???_YYYYMMDD, where the last 8 characters are a date.

When I query either information_schema.tables or pg_tables, extracting the last 8 characters out and converting them to a date works:

select table_name, to_date(right(table_name, 8), 'YYYYMMDD') blah
from information_schema.tables
where table_schema = '' and table_catalog = '';

But as soon as I reference it in the where clause, it gives a weird error:

select table_name, to_date(right(table_name, 8), 'YYYYMMDD') blah
from information_schema.tables
where table_schema = '' and table_catalog = ''
and to_date(right(table_name, 8), 'YYYYMMDD') is not null;

ERROR: invalid value "tati" for "YYYY"
DETAIL: Value must be an integer.

It seems like some strange values were passed into the to_date function, but I'm seeing that the rightmost 8 characters of all the table names are dates. So only date strings are passed to the to_date function. Absolutely nothing containing the string "tati" is passed to the to_date function. What is going on? Is that a bug?

As Rowan said, you can't necessarily rely on the order of testing in the where clause. In part, this is because one of the first things the planner does is to take the whole shebang and rewrite it into a consolidated statement that it can then refine and optimize. The tests that take place in the view can ultimately happen before, after, or intermixed with the tests in your where-clause as long as they are logically equivalent.

In your example, you are querying information_schema.tables, which is not a table - it is a view that references, among other things, a subset of the pg_catalog.pg_class table. When the planner gets through with its first steps you won't be calling information_schema.tables, you will be calling pg_catalog.pg_class and doing some where-clause tests that logically combine your where-clause with those in the view.

Why "tati"? When I query pg_class directly, the first row has a relname of pg_statistic - it's not in the schema/catalog you seek, but the executor hasn't checked for that yet. The right eight characters of that relname are "tatistic", thus the characters in the "YYYY" position are "tati", so based on the plan and testing order this just happens to be the first thing upon which the execution chokes.

Cheers, Steve
Re: [GENERAL] ERROR: invalid value ???? for YYYY
Brian Wong-2 wrote: But from a user's perspective, why would it ever make sense that by adding an additional where clause, it actually brings more data into the picture? If I have a query returning 100 rows, adding an additional where clause should only cut down the number of rows, not increase it. And the extra data that's showing up is being added to the result set because, without the additional where clause, the result set did not contain any of those rows like pg_statistic etc.

No it does not. Your general case is flawed in that adding an OR condition will indeed cause more rows to be added. In this case you are adding an AND clause, but since it is at the same level as the existing conditions, all possible records must be evaluated against this new clause even if one of the other clauses returns false. There is no short-circuiting. This may be confusing but that is part of the education process.

If that expression (the one in the where clause) did not cause an error, there would be at most the same number of records output as the original query. And the same number of rows are being processed at the WHERE clause in both cases. Since one of the conditions only makes sense on a subset of valid rows, there must be two levels of WHERE clauses in the query - or use CASE WHEN regexp-match THEN test ELSE false END, so the substring test is only performed against valid table names.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/ERROR-invalid-value-for--tp5773787p5773944.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
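A sketch of that CASE guard applied to the original query (the schema name is a placeholder, and the trailing-digits regexp is just one way to identify the valid names):

select table_name, to_date(right(table_name, 8), 'YYYYMMDD') blah
from information_schema.tables
where table_schema = 'myschema'
  and case when table_name ~ '[0-9]{8}$'
           then to_date(right(table_name, 8), 'YYYYMMDD') is not null
           else false
      end;

The CASE keeps to_date() from ever seeing a name like pg_statistic, because a CASE condition is evaluated before its THEN branch.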
Re: [GENERAL] String reverse function?
Thanks a lot everyone. Actually, I have just started learning about programming languages, so each of your solutions is a good suggestion for me to learn from. Thanks so much.

Please help me with one user-defined function that solves this, when my data looks like:

EA;BX;CA
CA;EA
BX;EA

and when the UDF runs, the output is:

CA;BX;EA
EA;CA
EA;BX

Once again, thanks so much.

Pavel Stehule wrote: [...] select string_agg(u, ';' order by r desc) from (select row_number() over () r, u from unnest(string_to_array('EA;BX;CA',';')) u) x; [...]

-- View this message in context: http://postgresql.1045698.n5.nabble.com/String-reverse-funtion-tp5773871p5773950.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
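One way to wrap Pavel's query into a function - a sketch only, reusing the row_number()/unnest trick from this thread and taking the delimiter as a second argument:

CREATE OR REPLACE FUNCTION reverse_tokens(text, text)
RETURNS text
LANGUAGE sql IMMUTABLE AS $$
  -- $1 = input string, $2 = delimiter (';' or ',')
  SELECT string_agg(u, $2 ORDER BY r DESC)
  FROM (SELECT row_number() OVER () AS r, u
        FROM unnest(string_to_array($1, $2)) AS u) s
$$;

-- usage:
SELECT reverse_tokens('EA;BX;CA', ';');  -- CA;BX;EA
SELECT reverse_tokens('EA,BX,CA', ',');  -- CA,BX,EA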
Re: [HACKERS] [GENERAL] Urgent Help Required
On 10/8/2013 8:35 AM, Chris Travers wrote: First, while vacuum is usually preferred to vacuum full, in this case, I usually find that vacuum full clears up enough cruft to be worth it (not always, but especially if you are also having performance issues).

IIRC, vacuum full was pretty broken in 8.1, which the output the original poster showed indicated they were running.

-- john r pierce 37N 122W somewhere on the middle of the left coast -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Forms for entering data into postgresql
I have developed an application using MS SQL. I have used MS Access for creating forms to enter data into the database. I am thinking of changing over to PostgreSQL and would also like to use any other available open source tool for creating forms. Are there any free applications available for creating forms similar to the ones I have made in MS Access? Any alternative suggestions will be appreciated.
[GENERAL] Re: [GENERAL] Forms for entering data into postgresql
On Wed, Oct 9, 2013 at 7:05 PM, Sudhir P.B. pb_sud...@hotmail.com wrote: I have developed an application using MS SQL. I have used MS Access for creating forms to enter data into the database. I am thinking of changing over to postgresql and would also like to use any other available open source tool for creating forms. Are there any free applications available for creating forms similar to the ones I have made in MS Access?. Any alternative suggestions will be appreciated. Here is a list of tools that has been more or less maintained over the past several years: http://wiki.postgresql.org/wiki/Community_Guide_to_PostgreSQL_GUI_Tools I have no idea how many of them are still actively developed or used though.
Re: [GENERAL] streaming replication timeout error
On 10/09/2013 05:51 PM, 高健 wrote: [...] In fact I am a little confused about restore_command: we are using streaming replication, but why is restore_command still needed to copy archived WAL? Isn't that the old warm standby (file shipping)?

The best explanation is in the docs: http://www.postgresql.org/docs/9.3/static/warm-standby.html

"At startup, the standby begins by restoring all WAL available in the archive location, calling restore_command. Once it reaches the end of WAL available there and restore_command fails, it tries to restore any WAL available in the pg_xlog directory. If that fails, and streaming replication has been configured, the standby tries to connect to the primary server and start streaming WAL from the last valid record found in archive or pg_xlog. If that fails or streaming replication is not configured, or if the connection is later disconnected, the standby goes back to step 1 and tries to restore the file from the archive again. This loop of retries from the archive, pg_xlog, and via streaming replication goes on until the server is stopped or failover is triggered by a trigger file."

Basically, by having both a restore_command and a primary_conninfo you are telling the standby to do both, following the sequence described above.

FYI, ConnectTimeout is an SSH option passed to scp; man ssh_config will get you more information.

It would seem both your streaming and archiving are using the same network, is that correct? If so you have a single point of failure: the network.

-- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Re: [GENERAL] Forms for entering data into postgresql
On 10/09/2013 07:05 PM, Sudhir P.B. wrote: [...] Are there any free applications available for creating forms similar to the ones I have made in MS Access? Any alternative suggestions will be appreciated.

There are tools available, as mentioned in a previous post. There are good tools in the list and I have tried quite a few of them. Just know they will require a bit more effort than Access to get a finished form. I am not the biggest fan of Access, but it does shine in getting fairly involved forms up quickly. FYI, you can use it with Postgres, which may be the way to go until you have completed the changeover to Postgres.

Welcome to the Postgres community.

-- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Many, many materialised views - Performance?
On 09/10/13 21:05, Alban Hertroys wrote: On Oct 9, 2013, at 4:08, Kevin Grittner kgri...@ymail.com wrote: [...] If you have multiple tables with identical layout but different subsets of the data, you will probably get better performance by putting them into a single table with indexes which allow you to quickly search the smaller sets within the table. [...] For example, you could put the results of several related aggregations into a single materialized view, if they share the same key columns (year, month, factory or something similar). [...]

Thanks for the suggestions, all. As I noted in an earlier email - we're aware that the schema could be better designed, but making large changes is tricky in production systems.

Many of the tables are actually unique, but only in the sense that you have various (common) identifier fields, and then a few (unique) aggregate-result columns per table, e.g.:

int id_key_1, int id_key_2, .., float FooBarXResult

I suspect the correct way to handle this would actually be a table that looked like:

int id_key_1, int id_key_2, .., text result_name, float result_value

Although that would in turn make other queries more verbose. For example, currently one can do:

select *
from FooResult
join BarResult using (id_key_1, id_key_2)
where FooResultX > 0.9 and BarResultY < 0.1;

I guess that turns into something like this:

select id_key_1, id_key_2,
       a.result_value as FooResultX,
       b.result_value as BarResultY
from AllResults a
join AllResults b using (id_key_1, id_key_2)
where a.result_name = 'FooResultX' and a.result_value > 0.9
  and b.result_name = 'BarResultY' and b.result_value < 0.1;

So it's all do-able, but it does look nicer to separate things into their own tables with named columns.

Additionally, if you have that many tables, it sounds like you partitioned your data. With aggregated results, the need for partitioning is much less (or perhaps it isn't even needed at all). And perhaps you don't even need the data from all partitions; say if you have monthly partitions of data, do you really need aggregated results from 5 years ago?

You're correct, we do have partitioned tables due to the amount of data in the system, but that's for just the non-aggregated data. Those tables perform just fine! It's the hundreds of thousands of views and tables with just a few rows in them that worry me.. :)

That said, users excel in finding data to request that you thought they wouldn't need. Which brings me to another question: Do your users really need the data from all those views or do they only think they need that?
Ah, indeed, users have not individually requested each of these many thousands of tables and views. They are part of a large application, and the results from all of those are required by it. If I rewrote the application today, I'd be looking at doing things very differently, knowing how it would eventually scale. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Many thousands of partitions
Hello,

Thanks for sharing your experiences with the problem.

On 2013-10-09 00:47, Gabriel E. Sánchez Martínez wrote:

Partitioning seems to be a good idea if a single table would be too big to fit in your server's file cache, and also for management, since you can drop partitions of old data and create new ones without having to reindex and lock. Does your data partition nicely by date, for example? If most of the inserts are new data and old data is read-mostly, then partitioning may make sense because you would not have to reindex old partitions. In fact, you could very well not have an index on the hot, write-mostly partition of, say, the current month, until the write activity on that table diminishes, which would make inserts faster. If, on the other hand, your writes are scattered across many partitions, a single large table with an index may be a better solution.

Changes are scattered, so a single large table already is not a good solution. I like the idea of a hot, write-mostly partition, because I might as well use only two partitions and merge changes from the small table to the large one once the processing is done. Rows are grouped by some key, and when I start processing a group I could move all its rows from the large table in a batch (INSERT INTO ... SELECT ... WHERE group = x; DELETE FROM ... WHERE group = x). This way the read-only part of the system will continue to work without problems, and processing should be much faster. Although this will not solve the problem of never-ending vacuums on the large table, postgres could easily become the bottleneck.

I am using 8.3 for this, but I will make an upgrade at some point; however, I don't think it will change the design.

Thanks -- Regards, Grzegorz -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
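A sketch of that batch move between a hot table and the big read-mostly one (table and column names invented; 8.3 has no data-modifying CTEs, so the two statements are simply wrapped in one transaction):

BEGIN;

-- copy one group's finished rows into the large table
INSERT INTO archive_rows
SELECT * FROM hot_rows WHERE group_id = 42;

-- and remove them from the hot table in the same transaction
DELETE FROM hot_rows WHERE group_id = 42;

COMMIT;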