Re: Derby in-memory back end - where to go next?

Lily Wei Fri, 25 Sep 2009 16:27:00 -0700

Thank you so much for the information.
With more smartphones on the market, I think people will use the in-memory 
feature more and appreciate the performance improvement.


+1

Thanks again,
Lily




________________________________
From: Kristian Waagan <[email protected]>
To: Derby Discussion <[email protected]>
Sent: Friday, September 25, 2009 2:45:53 AM
Subject: Re: Derby in-memory back end - where to go next?

Rick Hillegas wrote:
> Hi Lily,
> 
> Some responses inline...
> 
> Lily Wei wrote:
>> 
>> Hi Rick:
>> 
>>      I have some follow-up questions.
>> Middle-tier caching, monitoring transient data streams and test rigs totally 
>> make sense.
>> 
>> Do you see any benchmarks in terms of how Derby helps these applications?
>> 
> I don't think we've published any figures on the performance boost you get 
> from running in memory. My anecdotal recollection is that you see a 
> significant boost once you've gotten past database creation. Kristian has 
> done the most extensive testing and may have some figures that he can share. 
> Unfortunately, he suffered an accident earlier this week and is up on the 
> blocks for a while.

Hello Rick and Lily,

The performance benefit you'll see with the in-memory back end is highly 
dependent on the load and the underlying disk subsystem.
For write intensive loads the boost can be orders of magnitude.
For read intensive loads the boost can be close to zero.

If you have a read-only database, it may be better in some cases to keep the 
database on disk, maximize the page cache size and then prime the cache 
(pulling all pages into the cache).
The downside of using the in-memory back end in such a scenario is that some 
of the data will be stored twice: once in the "virtual in-memory file system" 
and once in the page cache. For the same reason, you should tune the page 
cache size according to your amount of data and heap size. Minimizing the 
page cache (i.e. allowing only 40 pages) to avoid the "data duplication" 
problem is not a good idea for optimal performance...
For some more information about the effects of page cache size and page size, 
see [1]. It is really a comparison between two implementations of an in-memory 
back end, but closer to the end of the document there are some relevant 
experiments.
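
To make the page cache tuning above a bit more concrete, here is a minimal 
sketch of deriving a value for derby.storage.pageCacheSize from the heap size. 
The class PageCacheSizing, the method pagesFor and the 25% heap share are 
hypothetical choices made for illustration, not part of Derby and not a 
recommendation; the 4096 byte page size is assumed to match your tables. The 
property has to be set before the Derby engine boots.

```java
/** Hypothetical helper for illustration only; not part of Derby. */
final class PageCacheSizing {
    /** Smallest page cache Derby accepts (the "only 40 pages" case above). */
    static final int MIN_PAGES = 40;

    /**
     * Number of pages that fit into the given share of the heap.
     * The share (fraction) is an assumption you have to pick yourself.
     */
    static int pagesFor(long heapBytes, double fraction, int pageSizeBytes) {
        long pages = (long) (heapBytes * fraction) / pageSizeBytes;
        // Clamp to the minimum and to the int range.
        return (int) Math.max(MIN_PAGES, Math.min(pages, Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        long heap = Runtime.getRuntime().maxMemory();
        // Assumption: a quarter of the heap for the page cache, 4 KB pages.
        int pages = pagesFor(heap, 0.25, 4096);
        // Must happen before the first connection boots the engine.
        System.setProperty("derby.storage.pageCacheSize", Integer.toString(pages));
        System.out.println("derby.storage.pageCacheSize=" + pages);
    }
}
```

With the in-memory back end the data itself also lives on the heap, so the 
share given to the page cache has to leave room for it.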

Unfortunately I'm unable to find the numbers I had comparing the disk based 
back end with the in-memory back end.
If anyone wants some hard numbers, they can try running the various performance 
clients found in the source code repository (under 
trunk/testing/.../perf/clients). The simplest ones are the single record 
operation clients and the bank_tx load.

In my opinion, the primary use cases for the current in-memory back end are 
testing and development. In the next release it may be better suited for 
storing purely transient data in a production environment as well (with a 
proper delete mechanism and maybe a size limit feature).
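
For the testing and development case, a test rig only has to point its JDBC 
URL at the memory subsubprotocol. The following is a rough sketch: the class 
InMemoryUrls, the method inMemoryUrl and the database name "scratch" are made 
up for the example, and it assumes derby.jar (10.5 or later) is on the 
classpath. Without it the connection attempt fails and the error is printed.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper for illustration only; not part of Derby. */
final class InMemoryUrls {
    /** Builds a connection URL for a named in-memory database. */
    static String inMemoryUrl(String dbName, boolean create) {
        String url = "jdbc:derby:memory:" + dbName;
        return create ? url + ";create=true" : url;
    }

    public static void main(String[] args) {
        String url = inMemoryUrl("scratch", true);
        // Assumption: the embedded driver is found via derby.jar on the classpath.
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected to " + url);
        } catch (SQLException e) {
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```

The database lives until the JVM exits, so a rig that creates many of them 
should use distinct names or restart between runs.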


-- Kristian

[1] 
https://issues.apache.org/jira/secure/attachment/12400859/derby-646-performance_comparison_1a.txt
>> 
>> In aspects such as performance, total memory consumption, or reduced hardware 
>> cost?
>> 
>>  
>>      Do you know of other embedded databases that also provide a solution on the 
>> stripped-down CDC VM?
>> 
> I don't think that H2 or HSQLDB run on CDC.
> 
> Regards,
> -Rick
>> 
>> Do you have any data point for Derby?
>> 
>>  
>> Thank you so much for shedding some light on this for people like me,
>> 
>> Lily
>> 
>> 
>> 
>> *From:* Rick Hillegas <[email protected]>
>> *To:* Derby Discussion <[email protected]>
>> *Sent:* Wednesday, September 9, 2009 2:01:01 PM
>> *Subject:* Re: Derby in-memory back end - where to go next?
>> 
>> Hi Lily,
>> 
>> Some comments inline...
>> 
>> Lily Wei wrote:
>> >
>> > Hi Rick:
>> >
>> >      Thank you so much for sharing the information with the group.
>> >
>> > >* It would be great to be able to bound the growth of the in-memory db
>> >
>> > Is there a trend toward needing an in-memory db in the Java world?
>> >
>> I find that this consistently generates a lot of discussion whenever I talk 
>> about 10.5 features with users.
>> >
>> > Is it mainly for applications, i.e. ERP, CRM, SRM?
>> >
>> The top use-cases which keep coming up are:
>> 
>> o Middle-tier caching -- here people use Derby in the middle tier in order 
>> to scale out access to a big back end like Oracle or DB2. Running in memory 
>> makes this perform even better.
>> 
>> o Monitoring transient data streams - here you slice and dice the data while 
>> the monitoring application is up but you don't necessarily need to keep the 
>> data after the monitoring session ends.
>> 
>> o Test rigs -- here you can use Derby on your laptop to run regression tests 
>> against an application which will run in production on a big back end like 
>> Oracle or DB2; the rig is lightweight and cleans up after itself.
>> >
>> > What kind of solution can Java provide for smart devices like the iPhone, RIM 
>> > or Palm? i.e. Will Java play well with Windows Mobile or Android?
>> >
>> Our small device story is our ability to run on the stripped-down CDC VM. 
>> Being able to run completely in memory gives this story extra appeal too.
>> 
>> Thanks,
>> -Rick
>> >
>> > Thank you in advance for shedding some light on this for us,
>> >
>> > Lily
>> >
>> >
>> > *From:* Rick Hillegas <[email protected] 
>> > <mailto:[email protected]>>
>> > *To:* Derby Discussion <[email protected] 
>> > <mailto:[email protected]>>
>> > *Sent:* Wednesday, September 9, 2009 11:13:05 AM
>> > *Subject:* Re: Derby in-memory back end - where to go next?
>> >
>> > Hi Kristian,
>> >
>> > Here's another piece of feedback: Last night I gave an overview of Derby 
>> > to the San Francisco Java User's Group. A developer asked whether the 
>> > growth of the in-memory database could be bounded. He had a use case which 
>> > we didn't explore in depth but which involved periodically truncating the 
>> > database. I asked him to bring his requirements to the Derby user list so 
>> > that we could feed them into your spec effort. Here are my takeaways:
>> >
>> > * It would be great to be able to bound the growth of the in-memory db
>> >
>> > * It would be great if the memory occupied by deleted records could be 
>> > released
>> >
>> > Thanks,
>> > -Rick
>> >
>> > Kristian Waagan wrote:
>> > > Hello,
>> > >
>> > > In Derby 10.5 an in-memory back end, or storage engine, was included. It 
>> > > stores all the data in main memory, with the exception of derby.log. If 
>> > > this is news to you, and you want a quick intro to it, see [1] and [2].
>> > >
>> > > I'm trying to gather some feedback on whether the current implementation 
>> > > is found acceptable, or if there are additional features people would 
>> > > like to see. I expect some wishes to emerge, and I plan to record these 
>> > > on the wiki page [1]. The page can then be used to guide further work in 
>> > > this area.
>> > >
>> > > To start the discussion, I'll list some potential features and tasks. 
>> > > Feel free to comment on any one of them either by replying to this 
>> > > thread, or by adding your comments to [1]. It can be a +1 or -1 on the 
>> > > feature itself, a suggestion for a new feature, or details on what a 
>> > > feature should look like.
>> > >
>> > >
>> > > * Documentation
>> > > Must at least document the JDBC subsubprotocol, and also explain how to 
>> > > delete in-memory databases.
>> > > If new features are added, these must be documented as well.
>> > >
>> > > * Deletion of in-memory databases
>> > > Currently the only ways to delete an in-memory database are to restart 
>> > > the JVM or use a static method that isn't part of Derby's public API. A 
>> > > proper mechanism for deletion should be added.
>> > >
>> > > * Automatic deletion on database shutdown (or when last connection 
>> > > disconnects)
>> > >
>> > > * "Anonymous in-memory databases"
>> > > A database which only the connection creating it can access, and when 
>> > > the connection goes away the database goes away.
>> > >
>> > > * Automatic persistence
>> > > The database could be persisted to disk automatically based on certain 
>> > > criteria. The most obvious ones are perhaps on a fixed interval and on 
>> > > JVM shutdown.
>> > >
>> > > * Monitoring
>> > > The most basic information is how many in-memory databases exist in the 
>> > > current JVM, and how big they are. How should this information be 
>> > > presented? Should it be available to anyone having a connection to the 
>> > > current JVM?
>> > >
>> > > * No derby.log
>> > > Include a class in Derby that will discard everything written to 
>> > > derby.log.
>> > >
>> > >
>> > > Thank you for your feedback,
>> >
>> >
>> 
>> 
>
