Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-21 Thread Robert Haas
On Tue, Apr 20, 2010 at 5:05 PM, Kris Jurka bo...@ejurka.com wrote:
 The reason this is not done is that the mechanism used for fetching a piece
 of the results at a time can change the query plan used if using a
 PreparedStatement.  There are three ways to plan a PreparedStatement:

 a) Using the exact parameter values by substituting them directly into the
 query.  This isn't really planned as you can't re-use it at all.  This is
 only available using the V2 protocol.

 b) Using the parameter values for statistics, but not making any stronger
 guarantees about them.  So the parameters will be used for evaluating the
 selectivity, but not to perform other optimizations like contraint_exclusion
 or transforming a LIKE operation to a range query. This is the default plan
 type the JDBC driver uses.

Hmm.  I didn't think this was possible.  How are you doing this?

 c) Planning the query with no regard for the parameters passed to it. This
 is the plan type the JDBC driver uses when it sees the same
 PreparedStatement being re-used multiple times or when it is respecting
 setFetchSize and allowing for partial results.

 We must use (c) for partial results instead of (b) because of some
 limitations of the server.  Currently you cannot have two statements of type
 (b) open on the same connection.  So since the driver can't know if the user
 will issue another query before fetching the remainder of the first query's
 results, it must setup the first query to be of type (c) so that multiple
 statements can exist simultaneously.

 Switching the default plan type to (c) will cause a significant number of
 complaints as performance on some queries will go into the tank.  Perhaps we
 could have a default fetchSize for plain Statements as it won't affect the
 plan.  I could also see making this a URL parameter though so it could be
 set as the default with only a configuration, not a code change.

...Robert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-21 Thread Nikolas Everett
On Wed, Apr 21, 2010 at 10:41 AM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Apr 20, 2010 at 5:05 PM, Kris Jurka bo...@ejurka.com wrote:
  The reason this is not done is that the mechanism used for fetching a
 piece
  of the results at a time can change the query plan used if using a
  PreparedStatement.  There are three ways to plan a PreparedStatement:
 
  a) Using the exact parameter values by substituting them directly into
 the
  query.  This isn't really planned as you can't re-use it at all.  This
 is
  only available using the V2 protocol.
 
  b) Using the parameter values for statistics, but not making any stronger
  guarantees about them.  So the parameters will be used for evaluating the
  selectivity, but not to perform other optimizations like
 contraint_exclusion
  or transforming a LIKE operation to a range query. This is the default
 plan
  type the JDBC driver uses.

 Hmm.  I didn't think this was possible.  How are you doing this?


More to the point is there some option that can shift you into method a?
 I'm thinking of warehousing type applications where you want to re-plan a
good portion of your queries.


Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-21 Thread Tom Lane
 On Tue, Apr 20, 2010 at 5:05 PM, Kris Jurka bo...@ejurka.com wrote:
 ... There are three ways to plan a PreparedStatement:

FWIW, I think there is some consensus to experiment (in the 9.1 cycle)
with making the server automatically try replanning of parameterized
queries with the actual parameter values substituted.  It'll keep doing
so if it finds that that produces a significantly better plan than the
generic parameterized plan; which is what you'd expect if there's a
chance to optimize a LIKE search, eliminate partitions, etc.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-21 Thread Nikolas Everett
On Wed, Apr 21, 2010 at 11:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:

  On Tue, Apr 20, 2010 at 5:05 PM, Kris Jurka bo...@ejurka.com wrote:
  ... There are three ways to plan a PreparedStatement:

 FWIW, I think there is some consensus to experiment (in the 9.1 cycle)
 with making the server automatically try replanning of parameterized
 queries with the actual parameter values substituted.  It'll keep doing
 so if it finds that that produces a significantly better plan than the
 generic parameterized plan; which is what you'd expect if there's a
 chance to optimize a LIKE search, eliminate partitions, etc.

regards, tom lane


That'd be wonderful.


Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-21 Thread Kris Jurka



On Wed, 21 Apr 2010, Robert Haas wrote:


On Tue, Apr 20, 2010 at 5:05 PM, Kris Jurka bo...@ejurka.com wrote:


b) Using the parameter values for statistics, but not making any stronger
guarantees about them.  So the parameters will be used for evaluating the
selectivity, but not to perform other optimizations like contraint_exclusion
or transforming a LIKE operation to a range query. This is the default plan
type the JDBC driver uses.


Hmm.  I didn't think this was possible.  How are you doing this?


This is only possible at the protocol level, it's not available using SQL 
commands only.  You do this by creating an unnamed instead of a named 
statement:


http://www.postgresql.org/docs/8.4/static/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

Query planning for named prepared-statement objects occurs when
the Parse message is processed. If a query will be repeatedly
executed with different parameters, it might be beneficial to send
a single Parse message containing a parameterized query, followed
by multiple Bind and Execute messages. This will avoid replanning
the query on each execution.

The unnamed prepared statement is likewise planned during Parse
processing if the Parse message defines no parameters. But if
there are parameters, query planning occurs during Bind processing
instead. This allows the planner to make use of the actual values
of the parameters provided in the Bind message when planning the
query.


Kris Jurka
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-21 Thread Kris Jurka



On Wed, 21 Apr 2010, Nikolas Everett wrote:

More to the point is there some option that can shift you into method a? 
 I'm thinking of warehousing type applications where you want to re-plan 
a good portion of your queries.




This can be done by connecting to the database using the V2 protocol (use 
URL option protocolVersion=2).  This does remove some functionality of 
the driver that is only available for V3 protocol, but will work just 
fine for query execution.


Kris Jurka

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-20 Thread Dave Crooke
AFAICT from the Java end, ResultSet.close() is supposed to be final. There
is no way I know of in JDBC to get a handle back to the cursor on the server
side once you have made this call - in fact, its sole purpose is to inform
the server in a timely fashion that this cursor is no longer required, since
the ResultSet itself is a Java object and thus subject to garbage collection
and finalizer hooks.

At a pragmatic level, the PGSQL JDBC driver has a lot of odd behaviours
which, while they may or may not be in strict compliance with the letter of
the standard, are very different from any other mainstream database that I
have accessed from Java  what I'd consider as normative behaviour, using
regular JDBC calls without the need to jump through all these weird hoops,
is exhibited by all of the following: Oracle, SQL Server, DB2, MySQL, Apache
Derby and JET (MS-Access file-based back end, the .mdb format)

In practce, this places PGSQL as the odd one out, which is a bit of a
turn-off to expereinced Java people who are PG newbies for what is otherwise
an excellent database.

At my current post, I came into a shop that had PG as the only real
database, so I have learned to love it, and de-supported Derby and the other
toy back ends we used to use. And to be fair, from a back end perspective,
PG is better than MySQL in terms of manageability  I am running 250GB
databases on small systems with no issues.

At my previous shop, we built a couple of database-backed apps from scratch,
and despite a desire to use PG due to there being more certainty over its
future licensing (it was just after Sun had bought MySQL AG), I ended up
switching from PG to MySQL 5.0.47 (last open source version) because of the
difficulties I was having with the PG driver.

I consider part of the acme of great FOSS is to make it easy to use for
newbies and thus attract a larger user base, but that is just my $0.02
worth.

Cheers
Dave

On Tue, Apr 20, 2010 at 9:28 AM, Kevin Grittner kevin.gritt...@wicourts.gov
 wrote:

 Dave Crooke dcro...@gmail.com wrote:

  I'd consider the fact that ResultSet.close() does not release the
  implicit cursor to be something of a bug

 What's your reasoning on that?  The definitions of cursors in the
 spec, if memory serves, allow a cursor to be closed and re-opened;
 why would this be treated differently?

 -Kevin



Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-20 Thread Kris Jurka



On Mon, 19 Apr 2010, Dave Crooke wrote:


Statement.close() appears to get the job done (in my envrionment, PG's
driver never sees a Connection.close() because of DBCP).

I'd consider the fact that ResultSet.close() does not release the implicit
cursor to be something of a bug, but it may well have been fixed already.


PG doesn't release the locks acquired by the query until transaction end. 
So closing a cursor will release some backend memory, but it won't release 
the locks.  The way the driver implements ResultSet.close() is to put 
the close message into a queue so that the next time a message is sent to 
the backend we'll also send the cursor close message.  This avoids an 
extra network roundtrip for the close action.


In any case Statement.close isn't helping you here either.  It's really 
Connection.commit/rollback that's releasing the locks.


Kris Jurka

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-20 Thread Kevin Grittner
Dave Crooke dcro...@gmail.com wrote:
 
 AFAICT from the Java end, ResultSet.close() is supposed to be
 final.
 
For that ResultSet.  That doesn't mean a ResultSet defines a cursor.
Such methods as setCursorName, setFetchSize, and setFetchDirection
are associated with a Statement.  Think of the ResultSet as the
result of a cursor *scan* generated by opening the cursor defined by
the Statement.
 
http://java.sun.com/javase/6/docs/api/java/sql/ResultSet.html#close%28%29
 
Notice that the ResultSet is automatically closed if the Statement
that generated it is re-executed.  That is very much consistent with
Statement as the equivalent of a cursor, and not very consistent
with a ResultSet as the equivalent of a cursor.
 
 There is no way I know of in JDBC to get a handle back to the
 cursor on the server side once you have made this call - in fact,
 its sole purpose is to inform the server in a timely fashion that
 this cursor is no longer required, since the ResultSet itself is a
 Java object and thus subject to garbage collection and finalizer
 hooks.
 
Again, you're talking about the *results* from *opening* the cursor.
 
 At a pragmatic level, the PGSQL JDBC driver has a lot of odd
 behaviours which, while they may or may not be in strict
 compliance with the letter of the standard, are very different
 from any other mainstream database that I have accessed from Java
  what I'd consider as normative behaviour, using regular JDBC
 calls without the need to jump through all these weird hoops, is
 exhibited by all of the following: Oracle, SQL Server, DB2, MySQL,
 Apache Derby and JET (MS-Access file-based back end, the .mdb
 format)
 
Are you talking about treating the Statement object as representing
a cursor and the ResultSet representing the results from opening
the cursor, or are you thinking of something else here?
 
 In practce, this places PGSQL as the odd one out, which is a bit
 of a turn-off to expereinced Java people who are PG newbies for
 what is otherwise an excellent database.
 
Huh.  I dropped PostgreSQL into an environment with hundreds of
databases, and the applications pretty much just worked for us.
Of course, we were careful to write to the SQL standard and the JDBC
API, not to some other product's implementation of them. 
 
There were a few bugs we managed to hit which hadn't previously been
noticed, but those were promptly fixed.  As I recall, about the only
other things which caused me problems were:
 
(1)  Needing to setFetchSize to avoid materializing the entire
result set in RAM on the client.
 
(2)  Fixing a race condition in our software which was benign in
other products, but clearly my own bug.
 
(3)  Working around the fact that COALESCE(NULL, NULL) can't be used
everywhere NULL can.
 
 At my previous shop, we built a couple of database-backed apps
 from scratch, and despite a desire to use PG due to there being
 more certainty over its future licensing (it was just after Sun
 had bought MySQL AG), I ended up switching from PG to MySQL 5.0.47
 (last open source version) because of the difficulties I was
 having with the PG driver.
 
Just out of curiosity, did you discuss that on the PostgreSQL lists?
Can you reference the thread(s)?
 
 I consider part of the acme of great FOSS is to make it easy to
 use for newbies and thus attract a larger user base, but that is
 just my $0.02 worth.
 
Sure, but I would consider it a step away from that to follow
MySQL's interpretation of cursors rather than the standard's.
YMMV, of course.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-20 Thread Dave Crooke
I don't want to get into a big debate about standards, but I will clarify a
couple of things inline below.

My key point is that the PG JDBC driver resets people's expecations who have
used JDBC with other databases, and that is going to reflect negatively on
Postgres if Postgres is in the minority, standards nothwithstanding, and I
feel badly about that, because PG rocks!

Cheers
Dave

On Tue, Apr 20, 2010 at 11:32 AM, Kevin Grittner 
kevin.gritt...@wicourts.gov wrote:

 Dave Crooke dcro...@gmail.com wrote:

  AFAICT from the Java end, ResultSet.close() is supposed to be
  final.

 For that ResultSet.  That doesn't mean a ResultSet defines a cursor.
 Such methods as setCursorName, setFetchSize, and setFetchDirection
 are associated with a Statement.  Think of the ResultSet as the
 result of a cursor *scan* generated by opening the cursor defined by
 the Statement.

 http://java.sun.com/javase/6/docs/api/java/sql/ResultSet.html#close%28%29

 Notice that the ResultSet is automatically closed if the Statement
 that generated it is re-executed.  That is very much consistent with
 Statement as the equivalent of a cursor, and not very consistent
 with a ResultSet as the equivalent of a cursor.


True, but mechanically there is no other choice - the ResultSet is created
by Statement.executeQuery() and by then it's already in motion  in the
case of Postgres with default settings, the JVM blows out before that call
returns.

I am not explicitly creating any cursors, all I'm doing is running a query
with a very large ResultSet.


 Again, you're talking about the *results* from *opening* the cursor.

  At a pragmatic level, the PGSQL JDBC driver has a lot of odd
  behaviours which, while they may or may not be in strict
  compliance with the letter of the standard, are very different
  from any other mainstream database that I have accessed from Java
   what I'd consider as normative behaviour, using regular JDBC
  calls without the need to jump through all these weird hoops, is
  exhibited by all of the following: Oracle, SQL Server, DB2, MySQL,
  Apache Derby and JET (MS-Access file-based back end, the .mdb
  format)

 Are you talking about treating the Statement object as representing
 a cursor and the ResultSet representing the results from opening
 the cursor, or are you thinking of something else here?


Specific examples:

a. the fact that Statement.executeQuery(select * from huge_table) works
out of the box with every one of those databases, but results in
java.langOutOfMemory with PG without special setup. Again, this is to the
letter of the standard, it's just not very user friendly.

b. The fact that with enterprise grade commercital databases, you can mix
reads and writes on the same Connection, whereas with PG Connection.commit()
kills open cursors.

The fact that I've been using JDBC for 12 years with half a dozen database
products, in blissful ignorance of these fine distinctions in the standard
until I had to deal with them with PG, is kinda what my point is :-)

I understand the reasons for some of these limitations, but by no means all
of them.


 Huh.  I dropped PostgreSQL into an environment with hundreds of
 databases, and the applications pretty much just worked for us.
 Of course, we were careful to write to the SQL standard and the JDBC
 API, not to some other product's implementation of them.


True, but not everyone can hire every developer to be a JDBC / SQL language
lawyer. All of our SQL is either ANSI or created by the Hibernate PGSQL
adapter, with the exception of a daily VACUUM ANALYSE which I added ;-)

I do believe that when there are two ways to implement a standard, the it
just works way is far preferable to the well, I know you probably think
this is a bug, because 90% of the client code out there chokes on it, but
actually we are standards compliant, it's everyone else who is doing it
wrong way.

I used to work at a storage startup that did exactly the latter, using an
obscure HTTP/1.1 standard feature that absolutely none of the current
browsers or HTTP libraries supports, and so it was a constant source of
frustration for customers and tech support alike. I no longer work there ;-)

It's kinda like making stuff that has to work with Windows - you know
Microsoft doesn't follow it's own standards, but you gotta make our code
work with theirs, so you play ball with their rules.


 (1)  Needing to setFetchSize to avoid materializing the entire
 result set in RAM on the client.


I don't understand the rationale for why PG, unlike every other database,
doesn't make this a sensible default, e.g, 10,000 rows ... maybe because the
locks stay in place until you call Connection.close() or Connection.commit()
? ;-)



 (2)  Fixing a race condition in our software which was benign in
 other products, but clearly my own bug.


Been there and done that with code developed on single-threaded DB's (JET /
Derby) ... not what I'm griping about here though, the base code with no
extra 

Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-20 Thread Nikolas Everett
On Tue, Apr 20, 2010 at 3:29 PM, Dave Crooke dcro...@gmail.com wrote:

 I wouldn't hold MySQL up to be a particularly good implmentation of
 anything, other than speed (MyISAM) and usability (the CLI)  I find
 Oracle's JDBC implmentation to be both user friendly and (largely) standards
 compliant.


Dave,

I've been following along at home and agree with you right up until you
mention the MySQL CLI being usable.  I work with the thing every day.  The
plain, vanilla install on my Ubuntu laptop lacks proper readline support.
 Hitting ctrl-c will sometimes kill the running query and sometimes kill the
CLI.  Its far from a paragon of usability.  That last time I used psql it
didn't have any of those issues.

Full disclosure:  mysql does have proper readline support on a Centos
machine I have access to.  ctrl-c still kills the shell.

Your other points are good though.

--Nik


Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-20 Thread Kevin Grittner
Dave Crooke dcro...@gmail.com wrote:
 
 a. the fact that Statement.executeQuery(select * from
 huge_table) works out of the box with every one of those
 databases, but results in java.langOutOfMemory with PG without
 special setup. Again, this is to the letter of the standard, it's
 just not very user friendly.
 
The way I read it, it's *allowed* by the standard, but not
*required* by the standard.  I agree it's not very friendly
behavior.  I made some noise about it early in my use of PostgreSQL,
but let it go once I had it covered for my own shop.  I agree it's a
barrier to conversion -- it often comes up here with new PostgreSQL
users, and who knows how many people give up on PostgreSQL without
coming here when they hit it?
 
It's not just an issue in JDBC, either; it's generally the default
in PostgreSQL interfaces.  That seems to be by design, with the
rationale that it prevents returning some part of a result set and
then throwing an error.  Anyone coming from another database
probably already handles that, so they won't tend to be impressed by
that argument, but it would be hard to change that as a default
behavior in PostgreSQL without breaking a lot of existing code for
PostgreSQL users at this point.  :-(
 
 b. The fact that with enterprise grade commercital databases, you
 can mix reads and writes on the same Connection, whereas with PG
 Connection.commit() kills open cursors.
 
Well, I know that with Sybase ASE (and therefore it's probably also
true of Microsoft SQL Server, since last I saw they both use TDS
protocol), unless you're using a cursor, if you execute another
statement through JDBC on the same connection which has a pending
ResultSet, it reads the rest of the ResultSet into RAM (the behavior
you don't like), before executing the new statement.  So at least
for those databases you can't really claim *both* a and b as points.
 
Oops -- I just noticed you said enterprise grade.  ;-)
 
 The fact that I've been using JDBC for 12 years with half a dozen
 database products, in blissful ignorance of these fine
 distinctions in the standard until I had to deal with them with
 PG, is kinda what my point is :-)
 
OK, point taken.
 
 I understand the reasons for some of these limitations, but by no
 means all of them.
 
Well, one of the cool things about open source is that users have
the opportunity to scratch their own itches.  The JDBC
implementation is 100% Java, so if changing something there would be
helpful to you, you can do so.  If you're careful about it, you may
be able to contribute it back to the community to save others the
pain.  If you want to take a shot at some of this, I'd be willing to
help a bit.  If nothing else, the attempt may give you better
perspective on the reasons for some of the limitations.  ;-)
 
 (1)  Needing to setFetchSize to avoid materializing the entire
 result set in RAM on the client.
 
 I don't understand the rationale for why PG, unlike every other
 database, doesn't make this a sensible default, e.g, 10,000 rows
 
I took a bit of a look at this, years ago.  My recollection is that,
based on the nature of the data stream, you would need to do
something similar to databases using TDS -- you could read as you go
as long as no other statement is executed on the connection; but
you'd need to add code to recognize the exceptional circumstance and
suck the rest of the result set down the wire to RAM should it be
necessary to clear the way for another statement.
 
If you give it a shot, you might want to see whether it's possible
to avoid an irritating implementation artifact of the TDS JDBC
drivers: if you close a ResultSet or a Statement with an open
ResultSet without first invoking Statement.cancel, they would suck
back the rest of the results (and ignore them) -- making for a big
delay sometimes on a close invocation.  As I recall, the
justification was that for executions involving multiple result
sets, they needed to do this to get at the next one cleanly;
although some forms of execute don't support multiple results, and
it doesn't do you a lot of good on Statement close, so you'd think
these could have been optimized.
 
 I find Oracle's JDBC implmentation to be both user friendly and
 (largely) standards compliant.
 
Where there are issues with usability or standards compliance with
PostgreSQL, especially for something which works well for you in
other products, I hope you raise them on these lists.  Perhaps there
are already ways to deal with them, perhaps we need to better
document something, and perhaps some change can be made to
accommodate the issue.  Even if no action is taken at the time it is
helpful to the project, because the number of people raising an
issue is often taken into consideration when deciding whether to
change something.  Also, someone running into the issue later may
find the discussion on a search and gain helpful information.
 
 I hope this can be taken in the amicable spirit of gentlemanly
 debate in which it is offered, 

Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-20 Thread Dave Crooke
I digest this down to this is the best that can be achieved on a connection
that's single threaded

I think the big difference with Oracle is this:

i. in Oracle, a SELECT does not have to be a transaction, in the sense that
PG's SELECT does ... but in Oracle, a SELECT can fail mid-stream if you wait
too long and the UNDO tablespace wraps (ORA-600), i.e. Oracle does not lock
on SELECT. Oracle is optimized for lots of small transactions that typically
commit, PG supports arbitrary transaction mixes of any size, but is less
efficient at the workload for which Oracle is specialized.

ii. SELECT always creates an implicit cursor in Oracle, but access to these
cursors can be interleaved arbitrarily on one connection both with each
other and transactions (writes)

After consiering the context you offered, I'd recommend the following two
minor changes to the PG driver 

a. Make setFetchSize(1) the default

b. If someone does call rs.close() before the end of the ResultSet, and has
not created an explicit cursor at the JDBC level, flag the query / lock /
virtual transaction in some way in the JDBC driver that tells it that it can
just dump the cursor on a subsequent stmt.close(), conn.commit() or
conn.close() call without sucking down the rest of the data.

AFAICT, this will make the behaviour more like other DB's without sacrifcing
anything, but I don't know what default behaviour expectations might be out
there in PG land.

Cheers
Dave

On Tue, Apr 20, 2010 at 3:22 PM, Kevin Grittner kevin.gritt...@wicourts.gov
 wrote:
(Lots of good explanatory stuff)


Re: [JDBC] SOLVED ... Re: Getting rid of a cursor from JDBC .... Re: [PERFORM] Re: HELP: How to tame the 8.3.x JDBC driver with a biq guery result set

2010-04-20 Thread Kris Jurka



On Tue, 20 Apr 2010, Dave Crooke wrote:


a. Make setFetchSize(1) the default


The reason this is not done is that the mechanism used for fetching a 
piece of the results at a time can change the query plan used if using a 
PreparedStatement.  There are three ways to plan a PreparedStatement:


a) Using the exact parameter values by substituting them directly into the 
query.  This isn't really planned as you can't re-use it at all.  This 
is only available using the V2 protocol.


b) Using the parameter values for statistics, but not making any stronger
guarantees about them.  So the parameters will be used for evaluating the 
selectivity, but not to perform other optimizations like 
contraint_exclusion or transforming a LIKE operation to a range query. 
This is the default plan type the JDBC driver uses.


c) Planning the query with no regard for the parameters passed to it. 
This is the plan type the JDBC driver uses when it sees the same 
PreparedStatement being re-used multiple times or when it is respecting 
setFetchSize and allowing for partial results.


We must use (c) for partial results instead of (b) because of some 
limitations of the server.  Currently you cannot have two statements of 
type (b) open on the same connection.  So since the driver can't know if 
the user will issue another query before fetching the remainder of the 
first query's results, it must setup the first query to be of type (c) so 
that multiple statements can exist simultaneously.


Switching the default plan type to (c) will cause a significant number of 
complaints as performance on some queries will go into the tank.  Perhaps 
we could have a default fetchSize for plain Statements as it won't affect 
the plan.  I could also see making this a URL parameter though so it could 
be set as the default with only a configuration, not a code change.



b. If someone does call rs.close() before the end of the ResultSet, and has
not created an explicit cursor at the JDBC level, flag the query / lock /
virtual transaction in some way in the JDBC driver that tells it that it can
just dump the cursor on a subsequent stmt.close(), conn.commit() or
conn.close() call without sucking down the rest of the data.


This is already true.  The JDBC driver only asks the server for more of 
the ResultSet when a next() call requires it.  So the server isn't 
constantly spewing out rows that the driver must deal with, the driver 
only gets the rows it asks for.  Once the ResultSet is closed, it won't 
ask for any more.


Kris Jurka


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance