Re: [SQL] Invalid message format Exception

2010-05-13 Thread Gnanakumar
Hi Rob,

> I'm sure most will urge you to move to UTF-8 encoding asap.
Did you mean that the database encoding should be changed from SQL_ASCII to UTF-8?

> Have you tracked down the "offending" insert statement?  Perhaps it's a
> trigger trying to generate a log message?
No, I don't have any trigger on this table for INS/UPD/DEL.




Re: [SQL] best paging strategies for large datasets?

2010-05-13 Thread silly sad

On 05/12/10 09:41, Louis-David Mitterrand wrote:

> Hi,
> 
> I have a large dataset (page 1 at http://www.cruisefish.net/stat.md) and
> am in the process of developing a pager to let users leaf through it
> (30K rows).
> 
> Ideally I'd like to know when requesting any 'page' of data where I am
> within the dataset: how many pages are available each way, etc.
> 
> Of course that can be done by doing a count(*) query before requesting a
> limit/offset subset. But the main query is already quite slow, so I'd
> like to minimize them.


Nowadays I tend to bet on AJAX; in other words, I propose moving some of
the calculations to the client side entirely.

This particular situation might look like the following:

First you count(*) the rows and select the requested page, returning the
count result to the client bundled with the page of rows. Then:

(1) the client renders the acquired rows,
(2) memorizes what part of the data it just got, and
(3) stores the count result to calculate "the pager div".

Subsequent clicks on "the pager div" should then not immediately generate
requests; the client first decides whether a request is actually needed.
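
For the server side, a minimal sketch of bundling the count with a page of
rows in one query, using a window function (8.4 or later); the table and
sort column here are placeholders, not the actual schema:

 SELECT *,
        count(*) OVER () AS total_rows  -- same total repeated on every row
 FROM   stat
 ORDER  BY departure_date
 LIMIT  30 OFFSET 0;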









Re: [SQL] best paging strategies for large datasets?

2010-05-13 Thread Louis-David Mitterrand
On Wed, May 12, 2010 at 12:26:17PM -0400, Justin Graf wrote:
> oops typos
> On 5/12/2010 1:41 AM, Louis-David Mitterrand wrote:
> > Hi,
> >
> > I have a large dataset (page 1 at http://www.cruisefish.net/stat.md) and
> > am in the process of developing a pager to let users leaf through it
> > (30K rows).
> 
> That's not that big of a record set.

Well for me it's a big one :) But then again it's my first serious web
app.

> > Ideally I'd like to know when requesting any 'page' of data where I am
> > within the dataset: how many pages are available each way, etc.
> >
> > Of course that can be done by doing a count(*) query before requesting a
> > limit/offset subset. But the main query is already quite slow, so I'd
> > like to minimize them.
> >
> 
> What do you mean by quite slow??

Like several seconds. I have to cache the results.

> On a 30K record table count() and query speed should not be a problem..

This query is a large multi-join of time series data, not a single
table. And it's not (prematurely :) optimized.

I'm planning a materialized view for it.

> > But I am intrigued by window functions, especially the row_number() and
> > ntile(int) ones.
> >
> > Adding "row_number() over (order by)" to my query will
> > return the total number of rows in the first row, letting my deduce the
> > number of pages remaining, etc. row_number() apparently adds very little
> > cost to the main query.
> 
> That will get a sequential number,  but you still don't know how many 
> records are in the table,  limit and offset block that value.
> I don't see how this helps?
> 
>   Limit and Offset with Total Record count tell us where we are in the 
> record set and which page we are on.

Hmm, good to know. I hadn't tried that yet.

> RecordCount/Limit = Number of pages
> CurrentPage = (offset%RecordCount)/Limit

These simple formulas will be handy.
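
As a minimal SQL sketch of that arithmetic, with made-up numbers (30000
total records, 30 rows per page, a current OFFSET of 60):

 SELECT ceil(30000 / 30.0)::int AS number_of_pages,  -- 1000
        (60 / 30) + 1           AS current_page;     -- 3, counting pages from 1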

> to complicate things further what if the site allows user to change the 
> number of records displayed per page.  The pager logic needs to figure 
> out how many records need to be return per page, and what the next and 
> previous iterations are. Without the total record count  I don't see how 
> that is even possible.
> 
> I have written pagers in ASP and PHP

Thanks for your input. I now realize I'll have to get a total count in a
separate (cached) query, or else I'll only be able to provide a basic
"previous/next" pager.

Cheers,



Re: [SQL] best paging strategies for large datasets?

2010-05-13 Thread Louis-David Mitterrand
On Thu, May 13, 2010 at 11:36:53AM +0400, silly sad wrote:
> On 05/12/10 09:41, Louis-David Mitterrand wrote:
> >Hi,
> >
> >I have a large dataset (page 1 at http://www.cruisefish.net/stat.md) and
> >am in the process of developing a pager to let users leaf through it
> >(30K rows).
> >
> >Ideally I'd like to know when requesting any 'page' of data where I am
> >within the dataset: how many pages are available each way, etc.
> >
> >Of course that can be done by doing a count(*) query before requesting a
> >limit/offset subset. But the main query is already quite slow, so I'd
> >like to minimize them.
> 
> Nowadays I tend to bet on AJAX; in other words, I propose moving some of
> the calculations to the client side entirely.
> 
> This particular situation might look like the following:
> 
> First you count(*) the rows and select the requested page, returning the
> count result to the client bundled with the page of rows. Then:
> 
> (1) the client renders the acquired rows,
> (2) memorizes what part of the data it just got, and
> (3) stores the count result to calculate "the pager div".
> 
> Subsequent clicks on "the pager div" should then not immediately generate
> requests; the client first decides whether a request is actually needed.

Yes, rendering the results through AJAX is a good idea, but one has to be
careful to expose only the "page" number to the client, not the LIMIT and
OFFSET themselves. Otherwise the client could query the whole data set. A
lot of "professional" web sites have that hole.



Re: [SQL] best paging strategies for large datasets?

2010-05-13 Thread silly sad



>> First you count(*) the rows and select the requested page, returning the
>> count result to the client bundled with the page of rows. Then:
>>
>> (1) the client renders the acquired rows,
>> (2) memorizes what part of the data it just got, and
>> (3) stores the count result to calculate "the pager div".
>>
>> Subsequent clicks on "the pager div" should then not immediately generate
>> requests; the client first decides whether a request is actually needed.
>
> Yes, rendering the results through AJAX is a good idea, but one has to be
> careful to expose only the "page" number to the client, not the LIMIT and
> OFFSET themselves. Otherwise the client could query the whole data set. A
> lot of "professional" web sites have that hole.



This is not a hole; it is only a matter of aesthetics.



Re: [SQL] Greetings folks, dumb question maybe?

2010-05-13 Thread Jasen Betts
On 2010-05-12, Josh  wrote:
> Hello, I'm a little new at this so please bear with me.
>
> I am trying to create a function that loads 100M test records into a 
> database,  however I am having a hard time building the function that 
> does so.
>
> I'm trying to do this in PGAdmin III for Ubuntu.  Is there something 
> that I have wrong with this?  I know that this works in MySQL (and yes I 
> know that MySQL bends the SQL Standards), but I am not sure what I am 
> doing wrong exactly.  I am coming up with the error that says there's an 
> error in my syntax near the v INTEGER := 0 line.  I get the same error 
> in psql as I do in the PGAdmin III.
>
> I have the following so far:
>
> DECLARE
> v INTEGER := 0;
> BEGIN
> while v < 1
> DO
> INSERT INTO unpart_tbl_test VALUES
> (v, 'test string data', adddate('1995-01-01', (rand(v)*36520) mod 3652));
> v := v + 1;
> END WHILE;
> END;
>
> Any insight would be greatly appreciated.
>
> - J

don't need a function for that one.

INSERT INTO unpart_tbl_test 
 SELECT  GENERATE_SERIES(0,999),
 'teststring data',
 date '1995-01-01' +(floor(random()*36520)::int % 3652);
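
If you do want the loop form the original poster attempted, note that the
PL/pgSQL syntax is WHILE ... LOOP ... END LOOP rather than MySQL's
WHILE ... DO ... END WHILE.  A sketch (the function name and row count
are made up for illustration):

 CREATE OR REPLACE FUNCTION load_test_rows(n integer) RETURNS void AS $$
 DECLARE
     v integer := 0;
 BEGIN
     WHILE v < n LOOP
         INSERT INTO unpart_tbl_test
         VALUES (v, 'test string data',
                 date '1995-01-01' + (floor(random() * 36520)::int % 3652));
         v := v + 1;
     END LOOP;
 END;
 $$ LANGUAGE plpgsql;

 SELECT load_test_rows(1000);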



Re: [SQL] [Trigger] Help needed with NEW.* and TG_TABLE_NAME

2010-05-13 Thread Jasen Betts
On 2010-05-11, Torsten Zühlsdorff  wrote:
> Hello,
>
> i have a problem with a trigger written in pl/pgsql.
>
> It looks like this:
>
> CREATE OR REPLACE FUNCTION versionize()
> RETURNS TRIGGER
> AS $$
> BEGIN
>
>NEW.revision := addContentRevision (OLD.content_id, OLD.revision);
>
>/* not working line, just a stub:
>EXECUTE 'INSERT INTO ' || TG_TABLE_NAME  || ' SELECT $1 ' USING NEW;
>*/
>
>RETURN NULL;
>
> END;
> $$ LANGUAGE 'plpgsql' VOLATILE;
>
> The function should be used at different tables and is invoked before
> UPDATEs. Everything what happens is the function call of
> addContentRevision. After this call all data (with the updated revision
> column) should be stored in the table as a new row.

What many people have missed is that you want to INSERT when the DML
command UPDATE is used.

For things like that a rule is usually used instead, but I can see where
that may be unsuitable for your needs.  I found the following
to work on a simple test case.


The problem is that INSERT in PL/pgSQL needs a fixed table name, that
EXECUTE can't reference PL/pgSQL variables by name inside the query string,
and further that quote_literal doesn't convert ROW variables into something
that can be used in a VALUES clause.

So, here's what I did.

 CREATE OR REPLACE FUNCTION versionize()
 RETURNS TRIGGER
 AS $$
 BEGIN

-- Not having a definition for addContentRevision
-- I had this line commented out during testing.
NEW.revision := addContentRevision (OLD.content_id, OLD.revision);

EXECUTE 'INSERT INTO '||TG_TABLE_NAME||' SELECT (' || 
   QUOTE_LITERAL(NEW) || '::' || TG_TABLE_NAME ||').*' ;

RETURN NULL;

 END;
 $$ LANGUAGE PLPGSQL VOLATILE;

I take NEW, convert it to a quoted literal so I can use it in EXECUTE, cast it
to the appropriate row type, and split it into columns using SELECT
and .*. That gets inserted.

You should probably use QUOTE_IDENT on TG_TABLE_NAME, and possibly also
prefix it with a similarly quoted TG_TABLE_SCHEMA.
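
For example, the EXECUTE above could be written roughly like this (just a
sketch; TG_TABLE_SCHEMA is the trigger variable carrying the table's schema):

 EXECUTE 'INSERT INTO '
     || QUOTE_IDENT(TG_TABLE_SCHEMA) || '.' || QUOTE_IDENT(TG_TABLE_NAME)
     || ' SELECT (' || QUOTE_LITERAL(NEW) || '::'
     || QUOTE_IDENT(TG_TABLE_SCHEMA) || '.' || QUOTE_IDENT(TG_TABLE_NAME)
     || ').*';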




Re: [SQL] Greetings folks, dumb question maybe?

2010-05-13 Thread Joshua Tolley
On Thu, May 13, 2010 at 10:09:51AM +, Jasen Betts wrote:
> On 2010-05-12, Josh  wrote:
> don't need a function for that one.
> 
> INSERT INTO unpart_tbl_test 
>  SELECT  GENERATE_SERIES(0,999),
>  'teststring data',
>  date '1995-01-01' +(floor(random()*36520)::int % 3652);

...and if you'd rather it were a function anyway, do this:

CREATE FUNCTION populate_table() RETURNS VOID AS $$
INSERT INTO unpart_tbl_test 
 SELECT  GENERATE_SERIES(0,999),
 'teststring data',
 date '1995-01-01' +(floor(random()*36520)::int % 3652);
$$ LANGUAGE SQL;
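
Populating the table is then just:

 SELECT populate_table();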

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com




Re: [SQL] best paging strategies for large datasets?

2010-05-13 Thread Justin Graf
On 5/13/2010 4:41 AM, silly sad wrote:
>
>>> First you count(*) the rows and select the requested page, returning the
>>> count result to the client bundled with the page of rows. Then:
>>>
>>> (1) the client renders the acquired rows,
>>> (2) memorizes what part of the data it just got, and
>>> (3) stores the count result to calculate "the pager div".
>>>
>>> Subsequent clicks on "the pager div" should then not immediately generate
>>> requests; the client first decides whether a request is actually needed.
>>
>> Yes, rendering the results through AJAX is a good idea, but one has to be
>> careful to expose only the "page" number to the client, not the LIMIT and
>> OFFSET themselves. Otherwise the client could query the whole data set. A
>> lot of "professional" web sites have that hole.
>>
>
> This is not a hole; it is only a matter of aesthetics.
>
Silly Sad is right: this is not a hole but a matter of aesthetics.  To
keep the code simple and limit the amount of state that has to be
tracked in the client's session on the server, I pass the limit and offset
to the client, normally in a URL/link.  This also avoids the problem where,
if the user's session expires, the information is lost and the user has
to restart.  Very annoying.

So the URLs look something like this:
www.mywebsit.com/index.php?module=getthedata&limit=10&offset=30&orderby=5

On the server, cast these three parameters to integer to block SQL
injection.
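
On the Postgres side the same idea can be expressed with a prepared
statement; a sketch, where the table and sort column are placeholders:

 -- Declaring the parameters as int means a non-numeric limit or offset
 -- is rejected before it can ever reach the query text.
 PREPARE get_page(int, int) AS
     SELECT *
     FROM   thedata            -- placeholder table
     ORDER  BY id              -- placeholder sort column
     LIMIT  $1 OFFSET $2;

 EXECUTE get_page(10, 30);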

I really don't care if the user sends a command to get all 10,000
records.  If you block that, all that has been accomplished is slowing down
data harvesting and eating up even more resources, as the client/user
will simply send more GETs to harvest the data.  Nothing has been gained.

To keep the casual onlooker from screwing with the URL, encode it in
base64.  It keeps honest people honest, but hackers will pull that
apart in a second.





Re: [SQL] best paging strategies for large datasets?

2010-05-13 Thread Justin Graf
On 5/13/2010 3:43 AM, Louis-David Mitterrand wrote:
**snip***
>
>> What do you mean by quite slow??
>>  
> Like several seconds. I have to cache the results.
>

Well then I suggest posting the queries to the Performance list, or here,
and letting us take a look at them.
Don't forget to include the EXPLAIN ANALYZE output and the number of
records in each table.
>> On a 30K record table count() and query speed should not be a problem..
>>  
> This query is a large multi-join of times series data, not a single
> table. And it's not (prematurely :) optimized.
>
> I'm planning a materialized view for it.
>

Here, read this:
http://www.pgcon.org/2008/schedule/events/69.en.html
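
For the materialized-view idea, a minimal hand-rolled sketch: a snapshot
table plus a refresh function.  All table and column names here are
invented for illustration, not the poster's schema:

 CREATE TABLE cruise_stats_mv AS
     SELECT c.cruise_id,
            date_trunc('day', p.observed_at) AS day,
            min(p.price) AS min_price,
            count(*)     AS n_prices
     FROM   cruise c
     JOIN   price  p ON p.cruise_id = c.cruise_id
     GROUP  BY c.cruise_id, date_trunc('day', p.observed_at);

 CREATE OR REPLACE FUNCTION refresh_cruise_stats_mv() RETURNS void AS $$
 BEGIN
     -- Rebuild the snapshot; call this from cron or after bulk loads.
     TRUNCATE cruise_stats_mv;
     INSERT INTO cruise_stats_mv
         SELECT c.cruise_id,
                date_trunc('day', p.observed_at),
                min(p.price),
                count(*)
         FROM   cruise c
         JOIN   price  p ON p.cruise_id = c.cruise_id
         GROUP  BY c.cruise_id, date_trunc('day', p.observed_at);
 END;
 $$ LANGUAGE plpgsql;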

The question that begs to be asked is: how big are the tables the query is
accessing?

If it's not hundreds of thousands to millions of records, or a stone-age
server, my guess is the query can be improved.


> **snip**
>> RecordCount/Limit = Number of pages
>> CurrentPage = (offset%RecordCount)/Limit
>>  
> These simple formulas will be handy.
>

Don't forget to use floor() on these in whatever language you're
programming in.








Re: [SQL] Invalid message format Exception

2010-05-13 Thread Rob Sargent


On 05/13/2010 01:04 AM, Gnanakumar wrote:
> Hi Rob,
> 
>> I'm sure most will urge you to move to UTF-8 encoding asap.
> Did you mean that the database encoding should be changed from SQL_ASCII to UTF-8?
> 
Yes.  That's pretty much the standard now.  I think it's the Postgres
installation default now (but don't quote me on that).
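
Note that an existing database's encoding can't be changed in place; the
usual route is to create a UTF-8 database and restore a pg_dump of the old
one into it.  A rough sketch (the new database name is made up):

 -- Check what each database currently uses:
 SELECT datname, pg_encoding_to_char(encoding) FROM pg_database;

 -- Then create the target database and restore a dump into it:
 CREATE DATABASE mydb_utf8 ENCODING 'UTF8' TEMPLATE template0;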

>> Have you tracked down the "offending" insert statement?  Perhaps it's a
>> trigger trying to generate a log message?
> No, I don't have any trigger on this table for INS/UPD/DEL.
> 
I think you'll need to track down examples of the inserts which are
causing the problem.
