Re: [dba-dev] Base Performance

Frank Schönheit - Sun Microsystems Germany Tue, 11 Sep 2007 12:23:26 -0700

Hello Andrew,

even if you're probably already used to my delays in answering your
longer mails :) - sorry for the delay ...


> What exactly is the target scenario for the embedded database?

well ...
> ( treat that as rhetorical for the moment, please )

... okay.

>>Here, we probably should distinguish between a "cold start" (which
>>requires starting the JVM) and a hot start.
>>Nonetheless, numbers and an issue are helpful.
> 
> [CHUCKLING] - well, yes that is true. IF you are running quickstarter
> under windows then opening the second and subsequent databases is faster
> than the very first one.

Ah, interesting. Yes, if you're not running quickstarter (which in my
very personal opinion is preferable), then probably nearly every opening
of a database is a first opening - since you closed the first one and
thus the complete OOo.
My personal work style is to always have a document open all day long,
so I hit a "first open" relatively seldom ...
Your comment suggests this might not be the only way to work :), so in
fact the "first open" might deserve more attention than I originally
thought.

> Also,  this is about embedded databases only, file based or server ( 
> external processes from OOo main process ) based databases open 
> acceptably quickly, IMO.

I'd say so, too.

> For example, the File based HSQL driver opens databases much more
> quickly than the current embedded model. Speaking of which, I seem to
>  recall that there was a Google SOC project proposal for a change to
> the file structure of the HSQLdb to
> address this problem ( among others ), if memory serves correctly it
> was not picked up. Have you heard anything further from the HSQLdb
> folks about this change?

Unfortunately not. This would be the only change which *really* allows
us to address a number of performance issues with the embedded HSQLDB.
Amongst others, closing data views or forms becomes unacceptably slow
(IMO) if the .odb exceeds a certain (relatively small) size limit. Also,
opening the connection becomes slower as the database and thus the .odb
grows. The only change to overcome this would be the single-file
backend, but there has been no progress on this.

(Well, not to mention we will have a hell of a lot of arguing to do with
our ODF/standardization folks. They do not like the idea of one of our
applications having a proprietary format, and it will be pretty
difficult to convince them that this is the only acceptable way to make
embedded HSQLDB *really* usable, performance-wise. But that's another
story.)

>>>Copy / Paste as a means to transfer data into Base is horrendous
>>>    
>>
>>Ehm - yes. Not sure if there exists an issue for this, yet.
> 
> There is, Issue 76606.

You know we introduced the "performance" keyword in IZ a few days ago.
For the moment, I have at least added it to the issue.

>>Not strictly, ATM. You can see at
>>http://eis.services.openoffice.org/EIS2/cws.ShowCWS?logon=true&Id=3218
>>that there was a time where addressing those was considered a high
>>priority, but we somehow lost it from the radar with a lot of other
>>important changes happening ...
> 
> I see that it is targeted for 3.0

Which doesn't mean much at the moment, it is effectively stalled. But
since I agree performance is worth addressing, I'll try to get
cycles for this topic in the medium term.

> So then, back to the rhetorical question above.
> 
> What is (are) the target database(s) for performance testing? I suppose 
> that should be phrased as -
> What are the use cases to be tested for performance purposes?
> Reading over the design specifications I am at a loss to answer that 
> question, or at least at a loss to
> finding anything in the specifications regarding this. Perhaps once 
> again I have failed to read carefully
> enough however.

There's nothing like this in the documents - not in those I know, at least.

In general, embedded HSQLDB is our main focus. Probably not everybody
will subscribe to this, but hey, we introduced this "all in one file"
feature with a big bang in 2.0, precisely because everybody said that
this is the killer feature we need if we want to be taken seriously as
a desktop database. In my understanding, nothing changed since we accepted
this premise, so "all in one file" - i.e. embedded HSQL - is the main focus.

Still, and as always, this means we should not forget the external
databases where Base is used as front-end. That's a difficult balancing
act, here as with any other feature/bug in Base. We cannot afford to
lose either of the two user groups. So, though I think I'd put a little
more weight on embedded HSQL, external DBs are also worth investigation.

Okay, this boils down to: Everything is important :)
(Well, if you come with a very exotic external DB which in a very
esoteric scenario has a performance problem, and this does not apply to
one of the bigger real-life databases, then be sure the issue will be
targeted to "OOo Whene^WLater".)

> For example - As i mentioned above I have the JDBC logging support 
> turned on now. Are there any use case
> scenarios you would like to see numbers on? What kind of triage would 
> you like to see done on those raw numbers?

Basically - everything which is known to have a performance problem can
(partly) be backed up by those numbers.

Connecting to the DB is interesting. While the log does not give us the
complete time as experienced in the UI, it will at least show us that
there are a few seconds needed for retrieving (nearly) the same
information again and again. I *know* we have a problem here - the
performance decreases with every additional table in the DB - but I
haven't yet dared to submit an issue for this.
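
To illustrate the kind of triage meant here, a minimal sketch in Python
that lists the requests which show up repeatedly in a log. Note that the
line layout ("<timestamp> <statement>"), the statement names in the sample
and the function name are all assumptions made for illustration only; the
actual layout of the JDBC log is not shown in this thread and will differ.

```python
from collections import Counter


def repeated_statements(log_lines, min_count=2):
    """Return (statement, count) pairs seen at least min_count times,
    most frequent first."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        # ASSUMPTION: hypothetical layout "<timestamp> <statement text>".
        _, _, statement = line.partition(" ")
        if not statement:
            continue
        counts[statement] += 1
    return [(s, n) for s, n in counts.most_common() if n >= min_count]


# Made-up sample data, not taken from a real log.
sample = [
    "0.00 getTables",
    "0.80 getColumns T1",
    "1.60 getTables",
    "2.40 getColumns T2",
    "3.20 getTables",
]

for statement, count in repeated_statements(sample):
    print(f"{count}x {statement}")
```

A list like this would at least make the "same information again and
again" pattern visible, even though it says nothing about the time
experienced in the UI.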

Of course, the log can also be used to give some numbers on copying records.

Not sure if the log is useful for opening forms. I know there's a
problem here, especially with multiple list and combo boxes filled from
tables/queries. I suppose a rough proof could be given with the logs.


Other than that: If you do some action and think this takes too long,
see whether the log allows you to prove this. If not (for instance the
searching problem you mentioned might not be visible in the log), ask us
to introduce new logs. I'm very open to it. Especially high-level logs
(like "search started" and "search ended with <result>", or "copying
record 1", "copying record 2", and the like) are easy to add, and
inexpensive when actually used. So, if we need logs to back up certain
performance claims, we can add them.
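
As an illustration of what such high-level logs could look like, here is
a minimal sketch in Python. It is not how logging is implemented in OOo;
the logger name "base.perf" and the helper timed() are hypothetical, and
only the standard library modules logging, time and contextlib are used.
The sketch brackets an action with "started" / "ended" entries and skips
the timing altogether when the log level is switched off.

```python
import logging
import time
from contextlib import contextmanager

# ASSUMPTION: hypothetical logger name, chosen for this example only.
log = logging.getLogger("base.perf")


@contextmanager
def timed(action):
    """Log '<action> started' and '<action> ended after <t> s'."""
    if not log.isEnabledFor(logging.DEBUG):
        # Logging is off: do no timing and no formatting at all.
        yield
        return
    log.debug("%s started", action)
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        log.debug("%s ended after %.3f s", action, elapsed)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, format="%(message)s")
    with timed("copying record 1"):
        time.sleep(0.01)  # stands in for the real work
```

With entries like these, the difference between two timestamps can be
read directly from the log for whatever action is wrapped.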

Ciao
Frank

-- 
- Frank Schönheit, Software Engineer         [EMAIL PROTECTED] -
- Sun Microsystems                      http://www.sun.com/staroffice -
- OpenOffice.org Base                       http://dba.openoffice.org -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
