Re: [PERFORM] Indexes not used in DELETE

2009-05-08 Thread Viktor Rosenfeld

Hi Tom,

I should have looked at the analyzed plan first. The culprit for the
slow query was the trigger function calls on foreign keys.
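
For anyone hitting the same thing, a minimal sketch of how this shows up
and the usual fix (the referencing table and column names here are made up):

  -- EXPLAIN ANALYZE on a DELETE lists the time spent in each foreign-key
  -- enforcement trigger as "Trigger for constraint ...: time=... calls=...".
  EXPLAIN ANALYZE DELETE FROM corpus WHERE id = 25;

  -- If a referencing column has no index, every deleted row forces a scan
  -- of the referencing table; indexing that column is the usual cure.
  CREATE INDEX annotation_corpus_ref_idx ON annotation (corpus_ref);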


Ciao,
Viktor

Am 08.05.2009 um 01:06 schrieb Tom Lane:


Viktor Rosenfeld  writes:
   ->  Seq Scan on corpus toplevel  (cost=0.00..1.39 rows=1 width=54)
         Filter: (top_level AND (id = 25::numeric))



Specifically, I'm wondering why the innermost scan on corpus
(toplevel) does not use the index idx_corpus__toplevel


The cost estimate indicates that there are so few rows in corpus
that an indexscan would be a waste of time.
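
One quick way to double-check that, purely as a sanity test, is to price
the index plan anyway by disabling seqscans for the session:

  SET enable_seqscan = off;
  EXPLAIN SELECT * FROM corpus WHERE top_level AND id = 25::numeric;
  RESET enable_seqscan;

If the forced indexscan comes out with a higher estimated cost than the
1.39 above, the planner is simply picking the cheaper plan.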


and why the
join between corpus (toplevel) and corpus (child) is not a merge join
using the index corpus_pre_key to access the child table.


Same answer.  Populate the table and the plan will change.

regards, tom lane






[PERFORM] Statistics use with functions

2009-05-08 Thread Matthew Wakeling


I'm running a rather complex query and noticed a peculiarity in the usage 
of statistics that seriously affects the plan generated. I can extract the 
relevant bit:


modmine-r9=# select * from pg_stats where tablename = 'geneflankingregion' AND 
attname IN ('distance', 'direction');
 schemaname |     tablename      |  attname  | null_frac | avg_width | n_distinct |         most_common_vals         |               most_common_freqs                | histogram_bounds | correlation
------------+--------------------+-----------+-----------+-----------+------------+----------------------------------+------------------------------------------------+------------------+-------------
 public     | geneflankingregion | distance  |         0 |         6 |          5 | {5.0kb,0.5kb,1.0kb,2.0kb,10.0kb} | {0.201051,0.200798,0.200479,0.199088,0.198583} |                  |    0.197736
 public     | geneflankingregion | direction |         0 |        10 |          2 | {downstream,upstream}            | {0.500719,0.499281}                            |                  |    0.495437
(2 rows)

modmine-r9=# SELECT COUNT(*) FROM geneflankingregion;
 count
--------
 455020
(1 row)

modmine-r9=# explain analyse SELECT * FROM geneflankingregion WHERE distance = 
'10.0kb' AND direction = 'upstream';
                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Seq Scan on geneflankingregion  (cost=0.00..15507.30 rows=45115 width=213) (actual time=0.053..181.764 rows=45502 loops=1)
   Filter: ((distance = '10.0kb'::text) AND (direction = 'upstream'::text))
 Total runtime: 227.245 ms
(3 rows)

modmine-r9=# explain analyse SELECT * FROM geneflankingregion WHERE 
LOWER(distance) = '10.0kb' AND LOWER(direction) = 'upstream';
                              QUERY PLAN
----------------------------------------------------------------------
 Bitmap Heap Scan on geneflankingregion
     (cost=66.95..88.77 rows=11 width=213)
     (actual time=207.555..357.359 rows=45502 loops=1)
   Recheck Cond: ((lower(distance) = '10.0kb'::text) AND (lower(direction) = 'upstream'::text))
   ->  BitmapAnd
         (cost=66.95..66.95 rows=11 width=0)
         (actual time=205.978..205.978 rows=0 loops=1)
         ->  Bitmap Index Scan on geneflankingregion__distance_equals
               (cost=0.00..31.34 rows=2275 width=0)
               (actual time=79.380..79.380 rows=91004 loops=1)
               Index Cond: (lower(distance) = '10.0kb'::text)
         ->  Bitmap Index Scan on geneflankingregion__direction_equals
               (cost=0.00..35.35 rows=2275 width=0)
               (actual time=124.639..124.639 rows=227510 loops=1)
               Index Cond: (lower(direction) = 'upstream'::text)
 Total runtime: 401.740 ms
(8 rows)

When I wrap the fields in the constraints in a LOWER() function, the 
planner stops looking at the statistics and makes a wild guess, even 
though it is obvious just from looking at the data what the result should be. 
Embedded in a much larger query, the inaccuracy in the number of rows (11 
instead of 45502) causes major planning problems. Also, why does the 
BitmapAnd say zero actual rows?
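
For what it's worth, those guessed row counts look consistent with the
planner falling back to its default equality selectivity of 0.005 for an
expression it has no statistics for:

  -- 455020 * 0.005 ~ 2275 (each bitmap index scan's estimate), and
  -- 455020 * 0.005 * 0.005 ~ 11 (the combined BitmapAnd estimate).
  SELECT 455020 * 0.005 AS per_condition_guess,
         455020 * 0.005 * 0.005 AS combined_guess;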


I understand this probably isn't Priority No. 1, and there are some 
interesting corner cases when n_distinct is higher than the histogram 
width, but would it be possible to fix this one up?


Matthew

--
I would like to think that in this day and age people would know better than
to open executables in an e-mail. I'd also like to be able to flap my arms
and fly to the moon.    -- Tim Mullen



Re: [PERFORM] Statistics use with functions

2009-05-08 Thread Tom Lane
Matthew Wakeling  writes:
> When I wrap the fields in the constraints in a LOWER() function, the 
> planner stops looking at the statistics and makes a wild guess, even 
> though it is very obvious from just looking what the result should be. 

Well, in general the planner can't assume anything about the statistics
of a function result, since it doesn't know how the function behaves.
In this case, however, you evidently have an index on lower(distance)
which should have caused ANALYZE to gather stats on the values of that
functional expression.  It looks like there might be something wrong
there --- can you look into pg_stats and see if there is such an entry
and if it looks sane?
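
For an expression index the gathered statistics are stored under the
index's own name rather than the table's, so a query along these lines
(index name taken from the plan upthread) is one way to check:

  SELECT tablename, attname, n_distinct, most_common_vals
  FROM pg_stats
  WHERE tablename = 'geneflankingregion__distance_equals';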

> Also, why does the BitmapAnd say zero actual rows?

There isn't any reasonably-inexpensive way for EXPLAIN ANALYZE to
determine how many rows are represented by a bitmap result, so it
doesn't try.

regards, tom lane



[PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-08 Thread Paolo Rizzi

Hi all,
recently I came across a question from a customer of mine, asking me if 
it would be feasible to run PostgreSQL along with PostGIS on embedded hardware.
They didn't give me complete information, but it should be some kind of 
industrial PC with a 600MHz CPU. Memory should be neither huge nor small, 
maybe a couple of GBytes, and the hard disk should be some type of industrial 
Compact Flash of maybe 16 GBytes.


They are thinking about using this setup on board public buses and 
trams, along with a GPS receiver, for self-localization, so that when 
the bus or tram enters defined zones or passes near defined points, 
events are triggered.
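
For the zone and proximity tests themselves, a minimal PostGIS sketch of
the kind of query involved (table and column names are made up, and the
geometries are assumed to be stored in a projected SRID whose units are
metres, e.g. the local UTM zone):

  -- Zones the vehicle is currently inside:
  SELECT zone_id
  FROM zones
  WHERE ST_Contains(geom, ST_SetSRID(ST_MakePoint(:x, :y), 32632));

  -- Points of interest within 50 metres of the current position:
  SELECT poi_id
  FROM points_of_interest
  WHERE ST_DWithin(geom, ST_SetSRID(ST_MakePoint(:x, :y), 32632), 50);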

The database could probably be used completely read-only or almost that.

What performances do you think would be possible for PostgreSQL+PostGIS 
on such hardware???


Bye
Paolo



Re: [PERFORM] Statistics use with functions

2009-05-08 Thread Tom Lane
Matthew Wakeling  writes:
> Ah, now I see it - I re-analysed, and found entries in pg_stats where 
> tablename is the name of the index. Now the query plans correctly and has 
> the right estimates. So, one needs to analyse AFTER creating indexes - 
> didn't know that.

Yes, for functional indexes it's helpful to do that.  Doesn't matter
for plain-old-plain-old indexes.
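
In other words, the workflow for an expression index looks roughly like
this (sketched with the names from this thread):

  CREATE INDEX geneflankingregion__distance_equals
      ON geneflankingregion (lower(distance));

  -- Until the table is (auto)analyzed again, the planner has no
  -- statistics for the lower(distance) expression:
  ANALYZE geneflankingregion;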

regards, tom lane



Re: [PERFORM] Statistics use with functions

2009-05-08 Thread Matthew Wakeling

On Fri, 8 May 2009, Tom Lane wrote:

In this case, however, you evidently have an index on lower(distance)
which should have caused ANALYZE to gather stats on the values of that
functional expression.  It looks like there might be something wrong
there --- can you look into pg_stats and see if there is such an entry
and if it looks sane?


What should I be looking for? I don't see anything obvious from this:

modmine-r9=# select attname from pg_stats where tablename = 
'geneflankingregion';

Ah, now I see it - I re-analysed, and found entries in pg_stats where 
tablename is the name of the index. Now the query plans correctly and has 
the right estimates. So, one needs to analyse AFTER creating indexes - 
didn't know that.


modmine-r9=# explain analyse SELECT * FROM geneflankingregion WHERE 
LOWER(distance) = '10.0kb' AND LOWER(direction) = 'upstream';

QUERY PLAN
-
 Bitmap Heap Scan on geneflankingregion
   (cost=1197.19..11701.87 rows=45614 width=212)
   (actual time=18.336..153.825 rows=45502 loops=1)
   Recheck Cond: (lower(distance) = '10.0kb'::text)
   Filter: (lower(direction) = 'upstream'::text)
   ->  Bitmap Index Scan on geneflankingregion__distance_equals
 (cost=0.00..1185.78 rows=91134 width=0)
 (actual time=16.565..16.565 rows=91004 loops=1)
 Index Cond: (lower(distance) = '10.0kb'::text)
 Total runtime: 199.282 ms
(6 rows)

Matthew

--
It is better to keep your mouth closed and let people think you are a fool
than to open it and remove all doubt.  -- Mark Twain



Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-08 Thread Robert Haas
On Thu, May 7, 2009 at 10:52 PM,   wrote:
>>> Hopefully, notions of partitioning won't be directly tied to chunking of
>>> data for parallel query access. Most queries access recent data and
>>> hence only a single partition (or stripe), so partitioning and
>>> parallelism are frequently exactly orthogonal.
>>
>> Yes, I think those things are unrelated.
>
> I'm not so sure (warning, I am relatively inexperienced in this area)
>
> it sounds like you can take two basic approaches to partition a database
>
> 1. The Isolation Plan
[...]
> 2. The Load Balancing Plan

Well, even if the table is not partitioned at all, I don't see that it
should preclude parallel query access.  If I've got a 1 GB table that
needs to be sequentially scanned for rows meeting some restriction
clause, and I have two CPUs and plenty of I/O bandwidth, ISTM it
should be possible to have them each scan half of the table and
combine the results.  Now, this is not easy and there are probably
substantial planner and executor changes required to make it work, but
I don't know that it would be particularly easier if I had two 500 MB
partitions instead of a single 1 GB table.

IOW, I don't think you should need to partition if all you want is
load balancing.  Partitioning should be for isolation, and load
balancing should happen when appropriate, whether there is
partitioning involved or not.

...Robert



Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-08 Thread Joshua D. Drake
On Fri, 2009-05-08 at 18:06 +0200, Paolo Rizzi wrote:
> Hi all,
> recently I came across a question from a customer of mine, asking me if 
> it would be feasible to run PostgreSQL along with PostGIS on embedded hardware.
> They didn't give me complete information, but it should be some kind of 
> industrial PC with a 600MHz CPU. Memory should be not huge nor small, 
> maybe a couple of GBytes, hard disk should be some type of industrial 
> Compact Flash of maybe 16 GBytes.
> 

Well, the CPU is slow but the rest isn't so bad.

> They are thinking about using this setup on-board of public buses and 
> trams, along with a GPS receiver, for self-localization. So that when 
> the bus or tram enters defined zones or passes near defined points, 
> events are triggered.
> The database could probably be used completely read-only or almost that.
> 
> What performances do you think would be possible for PostgreSQL+PostGIS 
> on such hardware???
> 

If you aren't doing a lot of writing I don't see a huge barrier to this.

Sincerely,

Joshua D. Drake


-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997




Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-08 Thread Scott Carey

On 5/7/09 7:52 PM, "da...@lang.hm"  wrote:

> 
> 
> I believe that the isolation plan is probably more common than the load
> balancing plan, but I don't see them as being that different from the
> database engine's point of view. To tune a system that can handle the
> isolation plan for load balancing, the key thing to do would be to have a
> knob to disable the partition planning, and just blindly send the search
> out to every partition.

Lots of good points.  However, implicit in the above is that the process of
identifying which partitions contain the data is expensive.
Right now it is (1.5 sec with 6000 partitions and the simplest possible
constraint, column = CONSTANT).

But identifying which partitions can contain a value is really nothing more
than an index.  If you constrain the possible partitioning functions to
those where a single partition key can only exist in one partition, then
this index and its lookup should be very fast even for large partition
counts.  From what I can tell empirically, the current system does this in
more of a sequential scan, running the constraint checks for each
possibility.
Furthermore, the actual tables don't have to contain the data if the key is
a column identity function (date = X) rather than a range or hash.

At the core, partitioning is really just a form of 'chunky' indexing that
doesn't fragment, or need re-indexing, or have much MVCC complexity.
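
For reference, a minimal sketch of the constraint-exclusion partitioning
being discussed (8.x inheritance style; table names are illustrative):

  CREATE TABLE measurements (day date, val int);
  CREATE TABLE measurements_2009_05_08
      (CHECK (day = DATE '2009-05-08')) INHERITS (measurements);
  CREATE TABLE measurements_2009_05_09
      (CHECK (day = DATE '2009-05-09')) INHERITS (measurements);

  SET constraint_exclusion = on;
  -- The planner proves each child's CHECK constraint against the WHERE
  -- clause one child at a time, which is the linear cost noted above:
  EXPLAIN SELECT * FROM measurements WHERE day = DATE '2009-05-08';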

> 
> David Lang
> 




Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-08 Thread Paolo Rizzi

Joshua D. Drake wrote:

On Fri, 2009-05-08 at 18:06 +0200, Paolo Rizzi wrote:

Hi all,
recently I came across a question from a customer of mine, asking me if 
it would be feasible to run PostgreSQL along with PostGIS on embedded hardware.
They didn't give me complete information, but it should be some kind of 
industrial PC with a 600MHz CPU. Memory should be not huge nor small, 
maybe a couple of GBytes, hard disk should be some type of industrial 
Compact Flash of maybe 16 GBytes.




Well, the CPU is slow but the rest isn't so bad.

They are thinking about using this setup on-board of public buses and 
trams, along with a GPS receiver, for self-localization. So that when 
the bus or tram enters defined zones or passes near defined points, 
events are triggered.

The database could probably be used completely read-only or almost that.

What performances do you think would be possible for PostgreSQL+PostGIS 
on such hardware???




If you aren't doing a lot of writing I don't see a huge barrier to this.

Sincerely,

Joshua D. Drake

Thank you!!!
Indeed I also think it could be done, but I searched the Web and found 
no previous experience of this kind, so maybe it's just too weird putting 
a spatially-enabled RDBMS on board buses...!?!


Anyway I found the TurnKey PostgreSQL appliance. It's a small 
Ubuntu-based live-cd with PostgreSQL and PostGIS preconfigured.
I could suggest that these people try it out on hardware similar to what 
they intend to use, to get a feel for how it behaves.


Bye
Paolo






Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-08 Thread david

On Fri, 8 May 2009, Robert Haas wrote:


On Thu, May 7, 2009 at 10:52 PM,   wrote:

Hopefully, notions of partitioning won't be directly tied to chunking of
data for parallel query access. Most queries access recent data and
hence only a single partition (or stripe), so partitioning and
parallelism are frequently exactly orthogonal.


Yes, I think those things are unrelated.


I'm not so sure (warning, I am relatively inexperienced in this area)

it sounds like you can take two basic approaches to partition a database

1. The Isolation Plan

[...]

2. The Load Balancing Plan


Well, even if the table is not partitioned at all, I don't see that it
should preclude parallel query access.  If I've got a 1 GB table that
needs to be sequentially scanned for rows meeting some restriction
clause, and I have two CPUs and plenty of I/O bandwidth, ISTM it
should be possible to have them each scan half of the table and
combine the results.  Now, this is not easy and there are probably
substantial planner and executor changes required to make it work, but
I don't know that it would be particularly easier if I had two 500 MB
partitions instead of a single 1 GB table.

IOW, I don't think you should need to partition if all you want is
load balancing.  Partitioning should be for isolation, and load
balancing should happen when appropriate, whether there is
partitioning involved or not.


actually, I will contradict myself slightly.

with the Isolation Plan there is not necessarily a need to run the query 
on each partition in parallel.


  if parallel queries are possible, it will benefit Isolation Plan 
partitioning, but the biggest win with this plan is just reducing the 
number of partitions that need to be queried.


with the Load Balancing Plan there is no benefit in partitioning unless 
you have the ability to run queries on each partition in parallel



using a separate back-end process to do a query on a separate partition is 
a fairly straightforward, but not trivial, thing to do (there are 
complications in merging the result sets, including the need to be able to 
do part of a query, merge the results, then use those results for the next 
step in the query)


  I would also note that there does not seem to be a huge conceptual 
difference between doing these parallel queries on one computer and 
shipping the queries off to other computers.



however, trying to split the work on a single table runs into all sorts of 
'interesting' issues with things needing to be shared between the multiple 
processes (they both need to use the same indexes, for example)


so I think that it is much easier for the database engine to efficiently 
search two 500G tables instead of one 1T table.


David Lang



Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-08 Thread Fernando Hevia
 

> -Original Message-
> From: Paolo Rizzi
> 
> Hi all,
> recently I came across a question from a customer of mine, 
> asking me if it would be feasible to run PostgreSQL along with 
> PostGIS on embedded hardware.
> They didn't give me complete information, but it should be 
> some kind of industrial PC with a 600MHz CPU. Memory should 
> be not huge nor small, maybe a couple of GBytes, hard disk 
> should be some type of industrial Compact Flash of maybe 16 GBytes.
> 
> They are thinking about using this setup on-board of public 
> buses and trams, along with a GPS receiver, for 
> self-localization. So that when the bus or tram enters 
> defined zones or passes near defined points, events are triggered.
> The database could probably be used completely read-only or 
> almost that.
> 

Hi Paolo,

I'm not really responding to your question. It happens that I collaborated
on a postgres/postgis based solution for public transportation, and the
reason you are trying to put the database on the embedded hardware is
puzzling to me. In this solution we used a centralized PG database; the
devices in buses captured geographical position and other business related
data and sent it over the cellular network to the central server.
Calculations on position were made on the server and the related events
were sent back accordingly.

If possible, I would like to know what drives you to put a database on each
device. Don't you have a wireless link on each unit?


> What performances do you think would be possible for 
> PostgreSQL+PostGIS on such hardware???

We never considered that solution so I couldn't say.

> 
> Bye
> Paolo
> 

Regards,
Fernando.




Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-08 Thread Scott Carey

On 5/8/09 11:20 AM, "da...@lang.hm"  wrote:
> 
> with the Load Balancing Plan there is no benefit in partitioning unless
> you have the ability to run queries on each partition in parallel
> 

I think there is a benefit to partitioning in this case.  If the statistics
on other columns are highly skewed WRT the column(s) partitioned, the
planner statistics will be better.  It may have to access every partition,
but it doesn't have to access every partition in the same way.

Perhaps something like:  user_id = 'FOO' is one of the most common vals in
date partition A, and one of the  least common vals in B, so a where clause
with user_id = 'FOO' will sequential scan one and index scan another.

For really large tables with data correlation that varies significantly,
this can be a huge performance gain even if all partitions are accessed.
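
A quick way to see that effect, sketched with made-up partition and column
names, is to compare the per-child statistics directly:

  SELECT tablename, most_common_vals, most_common_freqs
  FROM pg_stats
  WHERE attname = 'user_id'
    AND tablename IN ('events_2009_04', 'events_2009_05');

If 'FOO' shows up as frequent in one child and rare in the other, the
planner can cost each child's scan differently, as described above.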




Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-08 Thread Robert Haas
>> IOW, I don't think you should need to partition if all you want is
>> load balancing.  Partitioning should be for isolation, and load
>> balancing should happen when appropriate, whether there is
>> partitioning involved or not.
>
> actually, I will contradict myself slightly.
>
[...]
> however, trying to split the work on a single table runs into all sorts of
> 'interesting' issues with things needing to be shared between the multiple
> processes (they both need to use the same indexes, for example)

I disagree with this part of your email.  It is already the case that
tables and indexes need to support concurrent access by multiple
Postgres processes.  I don't see why that part of the problem would be
any more difficult for parallel query execution than it would be for
executing two different and unrelated queries on the same table.

> so I think that it is much easier for the database engine to efficiently
> search two 500G tables instead of one 1T table.

And that leads me to the opposite conclusion on this point.

...Robert



Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-08 Thread Tom Lane
Robert Haas  writes:
>> so I think that it is much easier for the database engine to efficiently
>> search two 500G tables instead of one 1T table.

> And that leads me to the opposite conclusion on this point.

I don't think there would be any difference on that score, either.

regards, tom lane



Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-08 Thread Paolo Rizzi
 


-Original Message-
From: Paolo Rizzi

Hi all,
recently I came across a question from a customer of mine, 
asking me if it would be feasible to run PostgreSQL along with 
PostGIS on embedded hardware.
They didn't give me complete information, but it should be 
some kind of industrial PC with a 600MHz CPU. Memory should 
be not huge nor small, maybe a couple of GBytes, hard disk 
should be some type of industrial Compact Flash of maybe 16 GBytes.


They are thinking about using this setup on-board of public 
buses and trams, along with a GPS receiver, for 
self-localization. So that when the bus or tram enters 
defined zones or passes near defined points, events are triggered.
The database could probably be used completely read-only or 
almost that.




Hi Paolo,

I'm not really responding to your question. It happens that I collaborated
on a postgres/postgis based solution for public transportation, and the
reason you are trying to put the database on the embedded hardware is
puzzling to me. In this solution we used a centralized PG database; the
devices in buses captured geographical position and other business related
data and sent it over the cellular network to the central server.
Calculations on position were made on the server and the related events
were sent back accordingly.

If possible, I would like to know what drives you to put a database on each
device. Don't you have a wireless link on each unit?

Indeed I was as puzzled as you when they described their idea to me, but I 
think it makes sense. The buses and trams have to be independent of the 
radio link because there are certain operations that have to be performed 
at the right moment in the right place (like oiling wheels or letting 
down sand or salt or some other action).
However they _are_ going to use a centralized server, and putting the 
same technology (PostgreSQL/PostGIS) both on board and at the center would 
let them simplify development, configuration and maintenance.
Now that hardware is continuously getting cheaper and more powerful, 
moving "intelligence" on-board may be a smart move...





What performances do you think would be possible for 
PostgreSQL+PostGIS on such hardware???


We never considered that solution so I couldn't say.

In fact I searched the Web and found nobody who has done that before :-)




Bye
Paolo



Regards,
Fernando.


Bye
Paolo






Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-08 Thread PFC


They didn't give me complete information, but it should be some kind of  
industrial PC with a 600MHz CPU. Memory should be not huge nor small,  
maybe a couple of GBytes, hard disk should be some type of industrial  
Compact Flash of maybe 16 GBytes.


It should work perfectly OK.

	Remember that you need a fast CPU if you have a database server that  
processes many queries from many users simultaneously.
	Since your "server" will process very few queries (maybe one per second,  
something like that) even a slow (by modern standards) 600 MHz CPU will be  
more than enough...
	I'd say for such an application, your hardware is way overkill (it would  
work on a smartphone...) but since hardware is so cheap...




Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-08 Thread Paolo Rizzi


They didn't give me complete information, but it should be some kind 
of industrial PC with a 600MHz CPU. Memory should be not huge nor 
small, maybe a couple of GBytes, hard disk should be some type of 
industrial Compact Flash of maybe 16 GBytes.


It should work perfectly OK.

Remember that you need a fast CPU if you have a database server that 
processes many queries from many users simultaneously.
Since your "server" will process very few queries (maybe one per 
second, something like that) even a slow (by modern standards) 600 MHz 
CPU will be more than enough...
I'd say for such an application, your hardware is way overkill (it 
would work on a smartphone...) but since hardware is so cheap...
A smartphone... you're right, I didn't think of that, but the hardware I 
described is very much like that of a modern smartphone!!!
Are you saying that PostgreSQL+PostGIS can actually run on a 
smartphone??? Intriguing...

Did anyone ever actually try that???

Bye
Paolo



Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-05-08 Thread Tom Lane
David Blewett  writes:
> On Thu, May 7, 2009 at 6:44 PM, Tom Lane  wrote:
>> Look into pg_stats for the rows concerning the columns used in the
>> query's WHERE and JOIN/ON clauses.

> Okay, here you go:
> http://rafb.net/p/20y8Oh72.html

I got some time to poke into this, but didn't get very far --- the
joins that seem to be the main problem involve
canvas_textresponse.submission_id which you didn't include stats for.

regards, tom lane



Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware

2009-05-08 Thread Tom Lane
Paolo Rizzi  writes:
> Are you saying that PostgreSQL+PostGIS can actually run on a 
> smartphone??? Intriguing...
> Did anyone ever actually try that???

If it's a supported CPU type and you've got a suitable build toolchain,
sure.  Seven or eight years ago we were getting a good laugh out of the
fact that you could run PG on a PlayStation 2.

The real issue with the kind of hardware you're describing is going to
be the finite write lifetime of a flash device.  For a low-update
application it'll probably be okay, but PG could very easily destroy a
flash in no time if you aren't careful to minimize updates.
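
As a rough illustration of the knobs involved (8.3-era postgresql.conf
settings; treat the values as starting points, not recommendations):

  # Trade a little durability for far fewer flash writes; the database
  # stays consistent after a crash, but the last few transactions may be
  # lost:
  synchronous_commit = off

  # Spread checkpoints out so repeatedly-dirtied pages are written less
  # often:
  checkpoint_segments = 16
  checkpoint_timeout = 30min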

regards, tom lane



Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-08 Thread Craig Ringer
Robert Haas wrote:

> Well, even if the table is not partitioned at all, I don't see that it
> should preclude parallel query access.  If I've got a 1 GB table that
> needs to be sequentially scanned for rows meeting some restriction
> clause, and I have two CPUs and plenty of I/O bandwidth, ISTM it
> should be possible to have them each scan half of the table and
> combine the results.  Now, this is not easy and there are probably
> substantial planner and executor changes required to make it work, but
> I don't know that it would be particularly easier if I had two 500 MB
> partitions instead of a single 1 GB table.

The point of partitioning in this scenario is primarily that you can put
the different partitions in different tablespaces, most likely on
independent disk devices. You therefore get more I/O bandwidth.
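
For example, a minimal sketch of spreading partitions across spindles
(paths and table names are illustrative, and an "events" parent table
with a "day" column is assumed):

  CREATE TABLESPACE disk1 LOCATION '/mnt/disk1/pgdata';
  CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/pgdata';

  CREATE TABLE events_2009_04 (CHECK (day < DATE '2009-05-01'))
      INHERITS (events) TABLESPACE disk1;
  CREATE TABLE events_2009_05 (CHECK (day >= DATE '2009-05-01'))
      INHERITS (events) TABLESPACE disk2;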

--
Craig Ringer



Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-05-08 Thread Tom Lane
David Blewett  writes:
> Apparently there was a typo in the query that I didn't notice that
> excluded that table's columns. Here is the new output including it:
> http://pastesite.com/7017

Thanks.  Could I trouble you for one other data point --- about how many
rows are in each of these tables?

regards, tom lane
