RE: initial page allocation versus incremental allocation

2009-03-03 Thread Brian Peterson
Won't this still limit the setting to the MAX_PRE_ALLOC_SIZE?  Why have the 
maximum setting? Or, at least, why limit it to just 8 pages? If it is 
configurable, then applications that need the option have it available, and 
those that don't can continue using the default settings. I'm surprised there's 
an option to configure the initial allocation but not the increment size.

What I've been trying to do is use Derby in a client application that needs to 
download more data than will fit into heap. The objects get serialized into 
rows and are indexed by the row number and an additional java.util.UUID that 
identifies the object. The client GUI can analyze the data and display the
results and the raw data as required.
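
For context, the serialize-to-row step Brian describes might look something like the sketch below. All names here (the table, its columns) are my own hypothetical choices; nothing in the thread specifies the actual schema.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch of the serialize/deserialize step for spilling objects into rows.
// Hypothetical backing table, e.g.:
//   CREATE TABLE spill (row_num BIGINT, obj_id CHAR(36), data BLOB(1M))
// where obj_id holds java.util.UUID.randomUUID().toString().
public class RowCodec {

    // Serialize an object into the byte[] that would go into the BLOB column.
    public static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Read an object back out of the stored bytes.
    public static Object fromBytes(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }
}
```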

The loading works great up to the initial allocation of pages, then it crawls, 
which cascades through and disrupts the whole application. It looks like I 
could solve this by configuring the page-increment size just like the initial 
allocation setting. 

I had hoped to use Derby as a kind of general purpose heap management solution. 
It has been doing a great job of limiting the amount of heap it uses, and it 
integrates well with the rest of the application, but the slowdown of the table 
loading is killing me. 

Brian

-Original Message-
From: knut.hat...@sun.com [mailto:knut.hat...@sun.com] 
Sent: Tuesday, March 03, 2009 3:28 AM
To: Derby Discussion
Subject: Re: initial page allocation versus incremental allocation

Brian Peterson  writes:

> I see that there’s a property to allow configuring the number of pages
> to initially allocate to a table, derby.storage.initialPages, but
> there isn’t a property to allow for setting the number of pages to
> allocate when incrementally expanding the file container. It looks
> like RawStoreFactory might’ve allowed for this with
>
> public static final String PRE_ALLOCATE_PAGE =
> "derby.storage.pagePerAllocation";
>
> but this isn’t referenced by anything I can find.

I haven't tested that it actually works, but it appears to be referenced
in FileContainer.createInfoFromProp():

PreAllocSize =
    PropertyUtil.getServiceInt(tc, createArgs,
                               RawStoreFactory.PRE_ALLOCATE_PAGE,
                               MIN_PRE_ALLOC_SIZE,
                               MAX_PRE_ALLOC_SIZE,
                               DEFAULT_PRE_ALLOC_SIZE /* default */);

If it turns out that setting the property works, we should probably try
to get it into the documentation, as it looks like it could be useful.

-- 
Knut Anders




Re: COPY command

2009-03-03 Thread Walter Rogura
Hi Francois,

That is good stuff and what I need.

Thank you very much,
Walter

On Tue, 3 Mar 2009 10:29:39 -0800
Francois Orsini  wrote:

> Hi Walter,
> 
> You can also import data from a text file with comma-separated fields
> using Derby's SYSCS_UTIL.SYSCS_IMPORT_DATA() built-in procedure...
> 
> CREATE TABLE TAB1 (c1 VARCHAR(30), c2 INT)
> CALL SYSCS_UTIL.SYSCS_IMPORT_DATA (NULL, 'TAB1', null, null,
> 'myfile.del',null, null, null, 0)
> 
> With myfile.del containing data such as:
> Robert,1
> Mike,2
> Leo,3
> 
> More info at:
> http://db.apache.org/derby/docs/10.4/tools/tools-single.html#ctoolsimport16245
> 
> You can also create a table out of another existing table and specify
> which (if not all) columns you want to have:
> CREATE TABLE TAB1 AS SELECT c1, c2 FROM TAB3 WITH NO DATA;
> 
> You can find a list of Derby's SQL commands in the Reference guide:
> http://db.apache.org/derby/docs/10.4/ref/ref-single.html
> 
> Cheers
> 
> --francois
> 
> On Mon, Mar 2, 2009 at 3:40 PM, Walter Rogura  wrote:
> 
> > Hi,
> >
> > is there a COPY command applicable in derby? I know this from
> > PostgreSQL, e.g.
> > COPY table (column1, column2) FROM stdin;
> > c11 c12
> > c21 c22
> > c31 c32
> > \.
> >
> > I tried it with SQuirreL but had no success. Btw. where do I find a
> > reference of supported SQL commands in derby?
> >
> > Thank you very much,
> > Walter
> >


Re: COPY command

2009-03-03 Thread Walter Rogura
Thank you very much Rick! I'll look into that.

On Tue, 03 Mar 2009 06:14:53 -0800
Rick Hillegas  wrote:

> Hi Walter,
> 
> Derby's SQL dialect is documented in the Reference Guide: 
> http://db.apache.org/derby/docs/10.4/ref/
> 
> If what you need to do is populate a table from stdin, you can do
> this by writing a table function which reads from stdin and presents
> the input as rows. Then you populate your table like this:
> 
> insert into mytable
> select * from table( stdinReadingTableFunction() ) s
> 
> Table functions are described in the Developer's Guide in a section 
> titled "Programming Derby-style table functions": 
> http://db.apache.org/derby/docs/10.4/devguide/ Table functions are 
> discussed at length in the white paper posted here: 
> http://developers.sun.com/javadb/reference/whitepapers/index.jsp
> 
> Hope this helps,
> -Rick
> 
> 
> Walter Rogura wrote:
> > Hi,
> >
> > is there a COPY command applicable in derby? I know this from
> > PostgreSQL, e.g.
> > COPY table (column1, column2) FROM stdin;
> > c11 c12
> > c21 c22
> > c31 c32
> > \.
> >
> > I tried it with SQuirreL but had no success. Btw. where do I find a
> > reference of supported SQL commands in derby?
> >
> > Thank you very much,
> > Walter
> >   
> 


Re: COPY command

2009-03-03 Thread Francois Orsini
Hi Walter,

You can also import data from a text file with comma-separated fields using
Derby's SYSCS_UTIL.SYSCS_IMPORT_DATA() built-in procedure...

CREATE TABLE TAB1 (c1 VARCHAR(30), c2 INT)
CALL SYSCS_UTIL.SYSCS_IMPORT_DATA (NULL, 'TAB1', null, null,
'myfile.del',null, null, null, 0)

With myfile.del containing data such as:
Robert,1
Mike,2
Leo,3

More info at:
http://db.apache.org/derby/docs/10.4/tools/tools-single.html#ctoolsimport16245

You can also create a table out of another existing table and specify which
(if not all) columns you want to have:
CREATE TABLE TAB1 AS SELECT c1, c2 FROM TAB3 WITH NO DATA;

You can find a list of Derby's SQL commands in the Reference guide:
http://db.apache.org/derby/docs/10.4/ref/ref-single.html

Cheers

--francois

On Mon, Mar 2, 2009 at 3:40 PM, Walter Rogura  wrote:

> Hi,
>
> is there a COPY command applicable in derby? I know this from
> PostgreSQL, e.g.
> COPY table (column1, column2) FROM stdin;
> c11 c12
> c21 c22
> c31 c32
> \.
>
> I tried it with SQuirreL but had no success. Btw. where do I find a
> reference of supported SQL commands in derby?
>
> Thank you very much,
> Walter
>


Re: COPY command

2009-03-03 Thread Rick Hillegas

Hi Walter,

Derby's SQL dialect is documented in the Reference Guide: 
http://db.apache.org/derby/docs/10.4/ref/


If what you need to do is populate a table from stdin, you can do this 
by writing a table function which reads from stdin and presents the 
input as rows. Then you populate your table like this:


insert into mytable
select * from table( stdinReadingTableFunction() ) s

Table functions are described in the Developer's Guide in a section 
titled "Programming Derby-style table functions": 
http://db.apache.org/derby/docs/10.4/devguide/ Table functions are 
discussed at length in the white paper posted here: 
http://developers.sun.com/javadb/reference/whitepapers/index.jsp
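
For illustration, here is a minimal, untested sketch of such a table function, built on the JDK's in-memory CachedRowSet rather than any Derby-specific class. The class name, column names, and types are my own assumptions, not taken from the white paper; Derby only requires that the function return a ResultSet it can call next() and getXXX() on.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.sql.ResultSet;
import java.sql.Types;
import javax.sql.rowset.CachedRowSet;
import javax.sql.rowset.RowSetMetaDataImpl;
import javax.sql.rowset.RowSetProvider;

// Hypothetical implementation of stdinReadingTableFunction(). A matching
// (untested) Derby declaration might be:
//   CREATE FUNCTION stdinReadingTableFunction()
//   RETURNS TABLE (c1 VARCHAR(100), c2 VARCHAR(100))
//   LANGUAGE JAVA PARAMETER STYLE DERBY_JDBC_RESULT_SET
//   READS SQL DATA
//   EXTERNAL NAME 'StdinTableFunction.readStdin'
public class StdinTableFunction {

    // Reads whitespace-separated two-column rows from stdin until EOF or a
    // line containing only ".", and exposes them as an in-memory ResultSet.
    public static ResultSet readStdin() throws Exception {
        RowSetMetaDataImpl md = new RowSetMetaDataImpl();
        md.setColumnCount(2);
        md.setColumnName(1, "C1");
        md.setColumnType(1, Types.VARCHAR);
        md.setColumnName(2, "C2");
        md.setColumnType(2, Types.VARCHAR);

        CachedRowSet rows = RowSetProvider.newFactory().createCachedRowSet();
        rows.setMetaData(md);

        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null && !line.trim().equals(".")) {
            String[] cols = line.trim().split("\\s+", 2);
            rows.moveToInsertRow();
            rows.updateString(1, cols[0]);
            rows.updateString(2, cols.length > 1 ? cols[1] : "");
            rows.insertRow();
        }
        rows.moveToCurrentRow();
        rows.beforeFirst();      // rewind so Derby can scan from the start
        return rows;
    }

    public static void main(String[] args) throws Exception {
        ResultSet rs = readStdin();
        while (rs.next()) {
            System.out.println(rs.getString(1) + "," + rs.getString(2));
        }
    }
}
```

Buffering everything in a CachedRowSet is the simplest sketch; for very large inputs a streaming ResultSet implementation would be preferable, as the white paper discusses.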


Hope this helps,
-Rick


Walter Rogura wrote:

Hi,

is there a COPY command applicable in derby? I know this from
PostgreSQL, e.g.
COPY table (column1, column2) FROM stdin;
c11 c12
c21 c22
c31 c32
\.

I tried it with SQuirreL but had no success. Btw. where do I find a
reference of supported SQL commands in derby?

Thank you very much,
Walter
  




RE: inserts slowing down after 2.5m rows

2009-03-03 Thread derby
> Another question is why the maximum is set to 1000 pages.
> Any takers?

Because it was a nice big round number?

And when I say big, I mean it was big enough at the time.

Remember that Cloudscape was designed to fill a niche. It was designed to be
a small, embeddable engine that was 100% Java.

It was not meant as a competitor to general-purpose RDBMS engines.

This is why I've asked that those pushing the Cloudscape development to
consider what they want this engine to be. Adding more features creates a
larger footprint which has a negative impact on some users who want to embed
the engine in a small downloadable app. 

This is why I recommended that those behind Derby consider building a more
modular approach to the engine. Sort of a plug and play for deployment of
features. (For example, if you're going to use the engine in an embedded
format, you don't load up a container class that has chunks, page spaces,
table spaces, etc. You don't allow containers that utilize raw disks. You
don't do partitioning. However, if someone is going to use this in a more
traditional role, you do create the engine and load those classes.)

HTH

-Mike

> -Original Message-
> From: kristian.waa...@sun.com [mailto:kristian.waa...@sun.com]
> Sent: Tuesday, March 03, 2009 6:19 AM
> To: Derby Discussion
> Subject: Re: inserts slowing down after 2.5m rows
> 
> Brian Peterson wrote:
> > I thought I read in the documentation that 1000 was the max initial
> > pages you could allocate, and after that, Derby allocates a page at a
> > time. Is there some other setting for getting it to allocate more at a
> time?
> 
> Another question is why the maximum is set to 1000 pages.
> Any takers?
> 
> If the property can be set higher and controlled on a per conglomerate
> (table or index) basis, it can be a nice tool for those who require or
> want to use such tuning.
> 
> 
> --
> Kristian
> 
> >
> >
> >
> > Brian
> >
> >
> >
> > *From:* Michael Segel [mailto:mse...@segel.com] *On Behalf Of
> > *de...@segel.com
> > *Sent:* Friday, February 27, 2009 9:59 PM
> > *To:* 'Derby Discussion'
> > *Subject:* RE: inserts slowing down after 2.5m rows
> >
> >
> >
> > Ok,
> >
> >
> >
> > For testing, if you allocate 2000 pages, then if my thinking is right,
> > you'll fly along until you get to 2100 pages.
> >
> >
> >
> > It sounds like you're hitting a bit of a snag where after your initial
> > allocation of pages, Derby is only allocating a smaller number of pages
> > at a time.
> >
> >
> >
> > I would hope that you could configure the number of pages to be
> > allocated in blocks as the table grows.
> >
> >
> >
> >
> >
> > 
> >
> > *From:* publicay...@verizon.net [mailto:publicay...@verizon.net]
> > *Sent:* Friday, February 27, 2009 8:48 PM
> > *To:* Derby Discussion
> > *Subject:* Re: inserts slowing down after 2.5m rows
> >
> >
> >
> >  I've increased the log size and the checkpoint interval, but it doesn't
> > seem to help.
> >
> >
> >
> > It looks like the inserts begin to dramatically slow down once the table
> > reaches the initial allocation of pages. Things just fly along until it
> > gets to about 1100 pages (I've allocated an initial 1000 pages, pages
> > are 32k).
> >
> >
> >
> > Any suggestions on how to keep the inserts moving quickly at this point?
> >
> >
> >
> > Brian
> >
> >
> >
> > On Fri, Feb 27, 2009 at  3:41 PM, publicay...@verizon.net wrote:
> >
> >
> >
> >  The application is running on a client machine. I'm not sure how to
> > tell if there's a different disk available that I could log to.
> >
> >
> >
> > If checkpoint is causing this delay, how do I manage that? Can I turn
> > checkpointing off? I already have durability set to test; I'm not
> > concerned about recovering from a crashed db.
> >
> >
> >
> > Brian
> >
> >
> >
> > On Fri, Feb 27, 2009 at  9:34 AM, Peter Ondruška wrote:
> >
> >
> >
> >>  Could be checkpoint.. BTW to speed up bulk load you may want to use
> >
> > large log files located separately from data disks.
> >
> >
> >
> > 2009/2/27, Brian Peterson < dianeay...@verizon.net
> > >:
> >
> >>  I have a big table that gets a lot of inserts. Rows are inserted 10k
> at a
> >
> >>  time with a table function. At around 2.5 million rows, inserts slow
> down
> >
> >>  from 2-7s to around 15-20s. The table's dat file is around 800-900M.
> >
> >>
> >
> >>
> >
> >>
> >
> >>  I have durability set to "test", table-level locks, a primary key
> > index and
> >
> >>  another 2-column index on the table. Page size is at the max and page
> > cache
> >
> >>  set to 4500 pages. The table gets compressed (inplace) every 500,000
> > rows.
> >
> >>  I'm using Derby 10.4 with JDK 1.6.0_07, running on Windows XP. I've
> ruled
> >
> >>  out anything from the rest of the application, including GC (memory
> usage
> >
> >>  follows a consistent pattern during the whole load). It is a local
> file
> >
> >>  syste

Re: inserts slowing down after 2.5m rows

2009-03-03 Thread Kristian Waagan

Brian Peterson wrote:
I thought I read in the documentation that 1000 was the max initial 
pages you could allocate, and after that, Derby allocates a page at a 
time. Is there some other setting for getting it to allocate more at a time?


Another question is why the maximum is set to 1000 pages.
Any takers?

If the property can be set higher and controlled on a per conglomerate 
(table or index) basis, it can be a nice tool for those who require or 
want to use such tuning.



--
Kristian



 


Brian

 

*From:* Michael Segel [mailto:mse...@segel.com] *On Behalf Of 
*de...@segel.com

*Sent:* Friday, February 27, 2009 9:59 PM
*To:* 'Derby Discussion'
*Subject:* RE: inserts slowing down after 2.5m rows

 


Ok,

 

For testing, if you allocate 2000 pages, then if my thinking is right,
you'll fly along until you get to 2100 pages.


 

It sounds like you're hitting a bit of a snag where after your initial 
allocation of pages, Derby is only allocating a smaller number of pages 
at a time.


 

I would hope that you could configure the number of pages to be 
allocated in blocks as the table grows.


 

 




*From:* publicay...@verizon.net [mailto:publicay...@verizon.net]
*Sent:* Friday, February 27, 2009 8:48 PM
*To:* Derby Discussion
*Subject:* Re: inserts slowing down after 2.5m rows

 

 I've increased the log size and the checkpoint interval, but it doesn't 
seem to help.


 

It looks like the inserts begin to dramatically slow down once the table 
reaches the initial allocation of pages. Things just fly along until it 
gets to about 1100 pages (I've allocated an initial 1000 pages, pages 
are 32k).


 


Any suggestions on how to keep the inserts moving quickly at this point?

 


Brian

 


On Fri, Feb 27, 2009 at  3:41 PM, publicay...@verizon.net wrote:

 

 The application is running on a client machine. I'm not sure how to 
tell if there's a different disk available that I could log to.


 

If checkpoint is causing this delay, how do I manage that? Can I turn
checkpointing off? I already have durability set to test; I'm not 
concerned about recovering from a crashed db.


 


Brian

 


On Fri, Feb 27, 2009 at  9:34 AM, Peter Ondruška wrote:

 


 Could be checkpoint.. BTW to speed up bulk load you may want to use


large log files located separately from data disks.

 

2009/2/27, Brian Peterson < dianeay...@verizon.net  
>:



 I have a big table that gets a lot of inserts. Rows are inserted 10k at a



 time with a table function. At around 2.5 million rows, inserts slow down



 from 2-7s to around 15-20s. The table's dat file is around 800-900M.

 I have durability set to "test", table-level locks, a primary key 

index and

 another 2-column index on the table. Page size is at the max and page 

cache

 set to 4500 pages. The table gets compressed (inplace) every 500,000 

rows.


 I'm using Derby 10.4 with JDK 1.6.0_07, running on Windows XP. I've ruled



 out anything from the rest of the application, including GC (memory usage



 follows a consistent pattern during the whole load). It is a local file


 system. The database has a fixed number of tables (so there's a fixed 

number

 of dat files in the database directory the whole time). The logs are 

getting


 cleaned up, so there's only a few dat files in the log directory as well.

 Any ideas what might be causing the big slowdown after so many loads?

 Brian



Re: initial page allocation versus incremental allocation

2009-03-03 Thread Knut Anders Hatlen
Brian Peterson  writes:

> I see that there’s a property to allow configuring the number of pages
> to initially allocate to a table, derby.storage.initialPages, but
> there isn’t a property to allow for setting the number of pages to
> allocate when incrementally expanding the file container. It looks
> like RawStoreFactory might’ve allowed for this with
>
> public static final String PRE_ALLOCATE_PAGE =
> "derby.storage.pagePerAllocation";
>
> but this isn’t referenced by anything I can find.

I haven't tested that it actually works, but it appears to be referenced
in FileContainer.createInfoFromProp():

PreAllocSize =
    PropertyUtil.getServiceInt(tc, createArgs,
                               RawStoreFactory.PRE_ALLOCATE_PAGE,
                               MIN_PRE_ALLOC_SIZE,
                               MAX_PRE_ALLOC_SIZE,
                               DEFAULT_PRE_ALLOC_SIZE /* default */);

If it turns out that setting the property works, we should probably try
to get it into the documentation, as it looks like it could be useful.
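
If the property does work, it presumably has to be in effect when the table is created, since the snippet above reads it from the container-creation arguments. An untested sketch of what derby.properties might then look like (the value 100 is an arbitrary guess, and the result would still be clamped between MIN_PRE_ALLOC_SIZE and MAX_PRE_ALLOC_SIZE):

```properties
# Untested: pre-allocate pages in larger batches as a container grows.
# Apparently read at container (table/index) creation time.
derby.storage.pagePerAllocation=100
# Documented companion property for the initial allocation:
derby.storage.initialPages=1000
```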

-- 
Knut Anders