On 03/10/2014 04:51 AM, matshyeq wrote:
Albe Laurenz wrote:
I would believe the stackoverflow
(http://stackoverflow.com/questions/21960121/perl-script-fails-when-selecting-data-from-big-postgresql-table)
question referred to explains the issue well.
You can retrieve the full result set,
not an option because of client memory limitations (in this case it's
poor client spec but there always are some, especially when you want to
pull 1e7 rows)
you can retrieve it row by row,
not an option because of performance (db calls/network roundtrips)
you can use a LIMIT clause to retrieve it in batches.
you pointed the best why it's not a feasible option (complexity,
isolation levels, not always possible ie. when custom query and last but
not least: far from being elegant)
CURSOR option
As already explained at stackoverflow - I'm using it as a workaround. My
general point is it forces developers to use lower level communication
with DB (cursors) therefore not as elegant as just setting RowCacheSize
parameter as specified by DBI. According to DBD::Pg maintainer this
hasn't and can't be implemented for PostgreSQL due to the lack of
support in its own libpq library.
So again.., I'm really surprised this functionality is not yet supported
in PostgreSQL. Does that mean everybody have been implementing this
through cursors?
To recap what's on stackoverflow - The functionality I'm talking about
would be an equivalent of JDBC setFetchSize()
<http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#setFetchSize(int)>
function
to optimize the load from (any) database in batches, like in the example
below:
Statement st =
conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
java.sql.ResultSet.CONCUR_READ_ONLY);
// Set the fetch size to 1000.
st.setFetchSize(1000);
// Execute the given sql query
String sql = "select * from bigtable";
ResultSet rs = statement.executeQuery(sql);
while (rs.next()) {
⋮
}
where underneath ResultSet.next() doesn't actually fetch one row at a
time from the RESULT-SET. It returns that from the (local) ROW-SET and
fetches ROW-SET (transparently) whenever it becomes exhausted on the
local client.
Actually, curious now if this functionality has been implemented in
PostgreSQL JDBC drivers...?
Yes, using a cursor.
http://jdbc.postgresql.org/documentation/92/query.html
By default the driver collects all the results for the query at once.
This can be inconvenient for large data sets so the JDBC driver provides
a means of basing a ResultSet on a database cursor and only fetching a
small number of rows.
.....
Anyway, according to one of the DBD::Pg developers it's impossible to
bring this functionality as the problem lies deeper, within libpq library:
"Unfortunately, this is a limitation in the underlying driver (libpq)
rather than DBD::Pg itself. There have been talks over the years of
supporting this, but nothing concrete yet."
So probably the best is to ask Greg to speak to details if still unclear.
Kind Regards,
Maciek
On Mon, Mar 10, 2014 at 9:42 AM, Albe Laurenz <laurenz.a...@wien.gv.at
<mailto:laurenz.a...@wien.gv.at>> wrote:
matshyeq wrote:
> Postgresql is there for a good while perceived as one of the best
(or just simply the best!?)
> available open source DB solution, so I'm really surprised this
functionality is not yet supported...
You can retrieve the full result set,
you can retrieve it row by row,
you can use a LIMIT clause to retrieve it in batches.
Can you explain how exactly the functionality would look that
you are missing?
Yours,
Laurenz Albe
--
Thank you,
Kind Regards
~Maciek
--
Adrian Klaver
adrian.kla...@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general