[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-11-14 Thread Suresh Thalamati (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12449848 ] 

Suresh Thalamati commented on DERBY-801:


Hi Anders, 

I agree with Knut,  this issue can be marked as fixed,   Thanks a lot for  
working on this improvement. 

-suresh



 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Fix For: 10.3.0.0

 Attachments: DERBY-801-6.patch, DERBY-801-7.patch, 
 DERBY-801-v2.patch, DERBY-801-v3.patch, DERBY-801-v4.patch, 
 DERBY-801-v5.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-11-02 Thread Knut Anders Hatlen (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12446607 ] 

Knut Anders Hatlen commented on DERBY-801:
--

I couldn't build the jar files with this patch because the build system 
expected cloudscape.config.rawStore.data.generic to have a J1/J4 suffix. When I 
removed the old cloudscape.config line from modules.properties and inserted one 
with J1 and one with J4, the jars were built just fine. I verified that 
RAFContainer4 was loaded when using JDK 1.5. Derbyall and the JUnit tests ran 
cleanly. Committed the modified patch with revision 470362. Thanks Anders!

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, DERBY-801-v5.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-11-02 Thread Knut Anders Hatlen (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12446609 ] 

Knut Anders Hatlen commented on DERBY-801:
--

 if we agree that closing a container that is being dropped while IO is still 
 active on it is sane, it shouldn't be a source of noise.

I think that sounds OK. You have already code in writePage() to take care of 
this situation. Should readPage() have the same logic (that is, catching 
IOExceptions and checking committed-drop state)?

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, DERBY-801-v5.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-11-02 Thread Knut Anders Hatlen (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12446767 ] 

Knut Anders Hatlen commented on DERBY-801:
--

Thanks Anders. I committed DERBY-801-6.patch with revision 470573.

 I'm not sure checking for the committed drop state in the readPage
 method is something we want to do - flushing the cache into a black
 hole is one thing, trying to read data that is supposed to be gone
 is another

You are of course right about the error checking in readPage().

 I've sort of got a feeling that maybe we are masking a problem here
 - should anyone (even the cache?) write to a dropped container?

Actually, I think it's the other way around. We are not writing to a
dropped container, but dropping a container which we happen to be
writing to. Since the dropping of the container has been committed, we
know that we'll never need those pages, so there's no need to complete
the write operations. The alternative would be to let closeContainer()
wait until all write operations on the container have finished.

 Maybe we should remove pages destined for a dying container from the
 cache when the container is dropped?

After the container has been dropped, it shouldn't be a problem
because getCommittedDropState() returns true, so writePage() will
return immediately without trying to write the page. Also, the pages
will be marked as invalid so that the cache space can be reclaimed.

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-6.patch, DERBY-801-v2.patch, 
 DERBY-801-v3.patch, DERBY-801-v4.patch, DERBY-801-v5.patch, 
 NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-11-01 Thread Anders Morken (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12446493 ] 

Anders Morken commented on DERBY-801:
-

 Maybe moving the committedDropState check to the start of writePage and 
 skipping the write altogether if it is in that state is a good approach?

Never mind my ramblings, we already do that. OK, maybe we could check the 
committedDropState in the (iosInProgress == 0) assertion in closePage as well? 
All of this mess (iosInProgress and the associated assertions) is intended as a 
sanity checking aid, and if we agree that closing a container that is being 
dropped while IO is still active on it is sane, it shouldn't be a source of 
noise.

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, DERBY-801-v5.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-10-18 Thread Knut Anders Hatlen (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12443179 ] 

Knut Anders Hatlen commented on DERBY-801:
--

Army Brown reported an intermittent assert failure in a comment to DERBY-1976. 
It is probably related to this issue. The assert error is thrown when a DROP 
TABLE is committed. Could it be that a checkpoint is writing pages in that 
table's container while the container is dropped/closed? Only guessing...

 I ran the JUnit suite suites.All against ibm142, jdk142, ibm15, jdk15, and 
 jdk16 on a Windows 2000 machine.  The only failure I saw was an intermittent 
 failure in LobLengthTest on jdk15:

 There was 1 error:
 1) 
 testLongLobLengths(org.apache.derbyTesting.functionTests.tests.jdbcapi.LobLengthTest)
java.sql.SQL Exception: DERBY SQL error: SQLCODE: -1, SQLSTATE: XJ001, 
 SQLERRMC:
org.apache.derby.shared.common.sanity.AssertFailure#ASSERT FAILED 
 Container closed
while IO operations are in progress.  This should not happen.#XJ001.U
 at 
 org.apache.derby.client.am.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:46)
 at 
 org.apache.derby.client.am.SqlException.getSQLException(SqlException.java:345)
 at org.apache.derby.client.am.Connection.commit(Connection.java:555)
 at 
 org.apache.derbyTesting.junit.BaseJDBCTestCase.commit(BaseJDBCTestCase.java:159)
 at 
 org.apache.derbyTesting.functionTests.tests.jdbcapi.LobLengthTest.tearDown(LobLengthTest.java:87)
 at 
 org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:76)
 at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
 at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
 at junit.extensions.TestSetup.run(TestSetup.java:23)
 at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
 at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
 at junit.extensions.TestSetup.run(TestSetup.java:23)
 at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
 at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
 at junit.extensions.TestSetup.run(TestSetup.java:23)
Caused by: org.apache.derby.client.am.SqlException: DERBY SQL error: 
 SQLCODE: -1,
SQLSTATE: XJ001, SQLERRMC: 
 org.apache.derby.shared.common.sanity.AssertFailure#ASSERT
FAILED Container closed while IO operations are in progress.  This should 
 not happen.#XJ001.U
 at 
 org.apache.derby.client.am.Connection.completeSqlca(Connection.java:1920)
 at 
 org.apache.derby.client.net.NetConnectionReply.parseRDBCMMreply(NetConnectionReply.java:215)
 at 
 org.apache.derby.client.net.NetConnectionReply.readLocalCommit(NetConnectionReply.java:147)
 at 
 org.apache.derby.client.net.ConnectionReply.readLocalCommit(ConnectionReply.java:43)
 at 
 org.apache.derby.client.net.NetConnection.readLocalCommit_(NetConnection.java:1574)
 at 
 org.apache.derby.client.am.Connection.readCommit(Connection.java:639)
 at 
 org.apache.derby.client.am.Connection.flowCommit(Connection.java:588)
 at org.apache.derby.client.am.Connection.commit(Connection.java:551)
 ... 33 more

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is 

[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-10-18 Thread JIRA
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12443260 ] 

Øystein Grøvlen commented on DERBY-801:
---

I saw the above error when developing the LobLengthTest, but it went away after 
I made some changes to the test.  One of the changes I made was that the test 
was run earlier in the suite.  Knut Anders' guess may explain this since my 
changes may have changed the possibility of running the test concurrently with 
checkpointing.  The reason the problem hits this test may be that the table has 
probably allocated more blocks that any of the other tests in the suite (It 
inserts a16MB blob).  

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-10-18 Thread Anders Morken (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12443338 ] 

Anders Morken commented on DERBY-801:
-

Yeah, looks like I've let in some critters. I'll give it a bit of a kickin' and 
see if I can figure it out. =)

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-10-17 Thread Knut Anders Hatlen (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12443040 ] 

Knut Anders Hatlen commented on DERBY-801:
--

I'm wondering whether we should change the DEBUG_PRINT calls to 
ASSERT/THROWASSERT. Falling back to the old implementation on errors is a good 
approach in insane mode, but I think the errors should be exposed in sane mode 
so we can see them and fix them.

After running the JUnit tests (java junit.textui.TestRunner 
org.apache.derbyTesting.functionTests.suites.All) I noticed that derby.log 
contained this message (but no tests failed):

DEBUG RAFContainerFactory OUTPUT: Caught exception when setting up 
rafContainerConstructor

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-09-19 Thread Anders Morken (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12435677 ] 

Anders Morken commented on DERBY-801:
-

Suresh, Knut Anders, thanks for your advice. There's no need to rush this patch 
for my part, I'll be happy to provide a new one incorporating Suresh' 
suggestions. They probably end up making the whole patch smaller, as most of it 
is about removing redundant code (such as my homegrown RAFContainerFactory). 
I've already fixed 2) and 3) in my working copy, and I plan to take a long, 
hard look at 1) tonight. =)

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 
 10.1.2.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-09-19 Thread Knut Anders Hatlen (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12435752 ] 

Knut Anders Hatlen commented on DERBY-801:
--

Committed v4 into trunk with revision 447815. Fixed Suresh's comments 1-3 in 
RAFContainer4 before committing. Now that RAFContainer is no longer used in jdk 
1.5, I will back out the patch for DERBY-733.

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 
 10.1.2.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-09-19 Thread Knut Anders Hatlen (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12435798 ] 

Knut Anders Hatlen commented on DERBY-801:
--

Backed out the changes for DERBY-733 with the following commands:
  svn merge -r 357275:357274 .
  svn merge -r 356884:356883 .
Derbyall ran cleanly. Committed revision 447856.

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 
 10.1.2.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-09-18 Thread Anders Morken
Suresh Thalamati (JIRA):
 [ 
 http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12434613 
 ] 
 
 Suresh Thalamati commented on DERBY-801:
 
 
 Thanks for working on this issue, Anders. I really like your solution to solve
 this issue. Patch is very good, I have only few minor comments/questions.
  I am really sorry  for not reviewing it sooner. 

Heh, my progress hasn't exactly been fast either. =)

 RAFContainerFactory.java
 
 
 Logic in this new class seems to be deciding whether to load 
 RafContainer.java or
 the RafContainer4.java based on the JVM.  I am not sure, if  this logic is
 necessary here. Did you consider using basic services to load the java 
 classes 
 specific to a JVM ?

Can you do that? Never occured to me, I'll check it out. Thanks for the
tip. Any pointers and tips would be welcome, but given enough time and
source code perusal I'll probably figure it out on my own as well. From
a quick look at modules.properties it looks pretty straightforward, at
least if adding J4 to a module name means load this in Java 1.4 and
up. =)

 I think basic services has support to boot a specific factory implementation
 based on the JVM using modules.properties. For example in the current
 scenario, one can extend BaseDataFileFactory.java class to  
 implement newContainerObject(), which will return the RafContainer4( ..).  add
 the new class to modules.properties to boot only on versions =jdk14.
 
 
 
 In RafContainer4.java :
 -
 
 1) I think following import is not needed. 
 
 +import java.io.*; 

I've narrowed it down to java.io.IOException. =)

 
 2) Is it really necessary to rewind() the buffers in readFull/writeFull ? 
 From what I understood, 
there is a new ByteBuffer object being created on  both read/write page
methods.  
 
 +dstBuffer.rewind(); // Reset buffer position before we start read
 and 
 +srcBuffer.rewind(); // Reset buffer position before we start writing.

Naah, not really. Just unnecessary paranoia. =)

 3) do we really need the following method ? 
 
 +final protected FileChannel getChannel() {
 +return ourChannel;
 +}

Naah, not really. It acted as a synchronization point at one time in the
life of the patch, but the synchronization has been made excplicit in
readPage() and writePage() instead at a later stage. =)

 4) I noticed, there is new encryption buffer created on every writePage() 
 call,
if the database is encrypted. This may cause jvm peak memory usage 
 increase,
when there is a checkpoint, if there are lot of dirty pages in the cache 
 and
if garbage collection is not happening fast enough.  I hope this does not
lead to out of memory errors!
 
We may need to implement some kind of scheme, that will help in reuse of
the encryption buffers. 

This is a (classical) choice between concurrency and resource
consumption.  On one hand, object creation and garbage collection from
the eden space is pretty cheap on 1.4 and newer VMs, but we'll force
additional GC runs... This is a complex subject, with many variables. =)

I suppose it could be interesting to do a little throughput testing with
this patch and an encrypted database and see if there is any impact. I
can't really say that I have any idea if recycling a single buffer or
creating lots of buffers is the cheaper way...

 5) I am ok with readPage() and writePage() routines in RafContainer4.java. 
just curious , if you considered implementing  new read/write..etc  calls 
 in 
the RafContainer4.java using file channel and just wrapper methods in the
RafContainer.java using the existing random access file,  instead of 
overriding readPage()/writePage() ...etc. 

Uhm, I'm sorry, I'm not sure I understand what you're thinking of here?
Could you elaborate a bit? =)

When I started work on this patch, readPage() and writePage() just
seemed like the smallest pieces of RAFContainer that could be replaced
easily to retrofit NIO support. I agree if you think the current
solution is a bit of a bolt-on fix, but it was relatively non-invasive
in the original code (which appealed to me as a newbie to Derby) and
hopefully efficient enough on new JVMs anyway.

If we wanted to go even further we could consider using memory mapped IO
- then the parts of Derby that modify data pages could access the memory
mapped file directly and save some overhead, but that probably requires
a bit of a rework of some interfaces and the code that modifies data
pages. And then we have the case of large databases and 32-bit address
spaces... =)

 6) Please file a JIRA  to enhance StorageFactory interfaces to support NIO. 

There's already DERBY-262 in Jira, but it's a bit unspecific... Anything
in particular you think needs to be added to the interfaces?

Thanks for your advice,
-- 
Anders Morken

My opinions may have changed, but not the fact that I am right!


Re: [jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-09-18 Thread Suresh Thalamati

Anders Morken wrote:

Suresh Thalamati (JIRA):

   [ http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12434613 ] 
   


snip ..



RAFContainerFactory.java


Logic in this new class seems to be deciding whether to load RafContainer.java 
or
the RafContainer4.java based on the JVM.  I am not sure, if  this logic is
necessary here. Did you consider using basic services to load the java classes 
specific to a JVM ?



Can you do that? Never occured to me, I'll check it out. Thanks for the
tip. Any pointers and tips would be welcome, but given enough time and
source code perusal I'll probably figure it out on my own as well. From
a quick look at modules.properties it looks pretty straightforward, at
least if adding J4 to a module name means load this in Java 1.4 and
up. =)


you are are the right track. I have used this mechanism long time ago, you
might need to  something like:

Replace the current data factory module entry
(:derby.module.rawStore.data.generic) with the following entries :

derby.module.rawStore.dataJ1=rg.apache.derby.impl.store.raw.data.BaseDataFileFactory
derby.env.jdk.dataJ1=1
derby.config.rawStore.dataJ1=derby

derby.module.rawStore.dataJ4=org.apache.derby.impl.store.raw.data.DataFileFactoryJava4
derby.env.jdk.dataJ4=4
derby.config.rawStore.dataJ4=derby

DataFileFactoryJava4 will be the new class extended from 
BaseDataFileFactory.









I think basic services has support to boot a specific factory implementation
based on the JVM using modules.properties. For example in the current
scenario, one can extend BaseDataFileFactory.java class to  
implement newContainerObject(), which will return the RafContainer4( ..).  add

the new class to modules.properties to boot only on versions =jdk14.








4) I noticed, there is new encryption buffer created on every writePage() call,
  if the database is encrypted. This may cause jvm peak memory usage increase,
  when there is a checkpoint, if there are lot of dirty pages in the cache and
  if garbage collection is not happening fast enough.  I hope this does not
  lead to out of memory errors!

  We may need to implement some kind of scheme, that will help in reuse of
  the encryption buffers. 



This is a (classical) choice between concurrency and resource
consumption.  On one hand, object creation and garbage collection from
the eden space is pretty cheap on 1.4 and newer VMs, but we'll force
additional GC runs... This is a complex subject, with many variables. =)

I suppose it could be interesting to do a little throughput testing with
this patch and an encrypted database and see if there is any impact. I
can't really say that I have any idea if recycling a single buffer or
creating lots of buffers is the cheaper way...



I agree, new jvms are smarter. This problem can be addressed later, if 
needed. On older jvms I used to run into out of memory problem when I 
do some thing like that.






5) I am ok with readPage() and writePage() routines in RafContainer4.java. 
  just curious , if you considered implementing  new read/write..etc  calls in 
  the RafContainer4.java using file channel and just wrapper methods in the
  RafContainer.java using the existing random access file,  instead of 
  overriding readPage()/writePage() ...etc. 



Uhm, I'm sorry, I'm not sure I understand what you're thinking of here?
Could you elaborate a bit? =)


To avoid dealing with changes to two copies of readPage/writePage 
routines in
future fixes; I would have tried to abstract out just the code that is 
really

different in jdk14 vs jdk13. For example :


in RafContainer.java , implement :

writeBuffer( .) {
  synchronized(this) {
 fileData.seek( ...)
 fileData.write( ..)
  }
}


and in RafContainer4.java, override that routine :

writeBuffer() {

 iochannel.write( ...)

}

similarly any other functionality you need.  I undestand, it adds an 
extra level to the call stack; my guess is that should not have major 
performance impact.


For all I know , it may not be that simple ; Different exceptions 
..etc. Please feel free to ignore this suggestion, if you think 
current approach is good enough.





If we wanted to go even further we could consider using memory mapped IO
- then the parts of Derby that modify data pages could access the memory
mapped file directly and save some overhead, but that probably requires
a bit of a rework of some interfaces and the code that modifies data
pages. And then we have the case of large databases and 32-bit address
spaces... =)



It would be intersting to find, if that improves the performance further.




6) Please file a JIRA  to enhance StorageFactory interfaces to support NIO. 



There's already DERBY-262 in Jira, but it's a bit unspecific... Anything
in particular you think needs to be added to the interfaces?


DERBY-262 is a very generic NIO entry. I think it would be better to 
file a separate JIRA entry that will describe the changes needed 

[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-09-18 Thread Suresh Thalamati (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12435610 ] 

Suresh Thalamati commented on DERBY-801:


Hi Knut , 

Patch is good. It  is  ok with  me ,  if you would like to commit  the current 
patch and let Anders address, other enhancements in the followup patches.  

Thanks
-suersh


 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 
 10.1.2.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-09-15 Thread Knut Anders Hatlen (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12434965 ] 

Knut Anders Hatlen commented on DERBY-801:
--

I will hold the commit until Anders has responded to the questions from Suresh.

1, 2 and 3 seem like trivial changes which should be easy to fix. Suresh, if 
these were fixed, do you think the patch could be committed and the other 
issues addressed in followup patches? I think it sounds reasonable to make 
these changes incrementally (if only to make it easier to review the next 
revision of the patch). The current patch clearly is an improvement, and the 
optimizations/improvements suggested by Suresh would improve it further. 

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 
 10.1.2.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-09-14 Thread Suresh Thalamati (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12434613 ] 

Suresh Thalamati commented on DERBY-801:


Thanks for working on this issue, Anders. I really like your solution to solve
this issue. Patch is very good, I have only few minor comments/questions.
 I am really sorry  for not reviewing it sooner. 


RAFContainerFactory.java


Logic in this new class seems to be deciding whether to load RafContainer.java 
or
the RafContainer4.java based on the JVM.  I am not sure, if  this logic is
necessary here. Did you consider using basic services to load the java classes 
specific to a JVM ?

I think basic services has support to boot a specific factory implementation
based on the JVM using modules.properties. For example in the current
scenario, one can extend BaseDataFileFactory.java class to  
implement newContainerObject(), which will return the RafContainer4( ..).  add
the new class to modules.properties to boot only on versions =jdk14.



In RafContainer4.java :
-

1) I think following import is not needed. 

+import java.io.*; 

2) Is it really necessary to rewind() the buffers in readFull/writeFull ? From 
what I understood, 
   there is a new ByteBuffer object being created on  both read/write page
   methods.  

+dstBuffer.rewind(); // Reset buffer position before we start read
and 
+srcBuffer.rewind(); // Reset buffer position before we start writing.


3) do we really need the following method ? 

+final protected FileChannel getChannel() {
+return ourChannel;
+}

4) I noticed, there is new encryption buffer created on every writePage() call,
   if the database is encrypted. This may cause jvm peak memory usage increase,
   when there is a checkpoint, if there are lot of dirty pages in the cache and
   if garbage collection is not happening fast enough.  I hope this does not
   lead to out of memory errors!

   We may need to implement some kind of scheme, that will help in reuse of
   the encryption buffers. 
 
5) I am ok with readPage() and writePage() routines in RafContainer4.java. 
   just curious , if you considered implementing  new read/write..etc  calls in 
   the RafContainer4.java using file channel and just wrapper methods in the
   RafContainer.java using the existing random access file,  instead of 
   overriding readPage()/writePage() ...etc. 

6) Please file a JIRA  to enhance StorageFactory interfaces to support NIO. 


/suresh

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 
 10.1.2.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 DERBY-801-v4.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-09-14 Thread Knut Anders Hatlen
Suresh Thalamati (JIRA) derby-dev@db.apache.org writes:

 4) I noticed, there is new encryption buffer created on every
 writePage() call, if the database is encrypted. This may cause jvm
 peak memory usage increase, when there is a checkpoint, if there are
 lot of dirty pages in the cache and if garbage collection is not
 happening fast enough.  I hope this does not lead to out of memory
 errors!

This shouldn't cause out of memory errors. If the buffers are not
gc'ed fast enough, the checkpoint thread will be stalled until there
is enough memory for it to continue. Since the checkpoint thread only
writes one page at a time, the gc thread should be able to free all of
the old buffers.

-- 
Knut Anders


[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-08-24 Thread Anders Morken (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12430262 ] 

Anders Morken commented on DERBY-801:
-

Oh, and yes, it still passes tests. storeall ran flawlessly, derbyall failed in 
the derbynetmats suite's jdbcapi/checkDriver.java test because it got a 
connection refused error when connecting. Hopefully unrelated. =)

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 
 10.1.2.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, DERBY-801-v3.patch, 
 NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-08-18 Thread Knut Anders Hatlen (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12429015 ] 

Knut Anders Hatlen commented on DERBY-801:
--

Hi Anders,

Thank you for working on this issue! I have looked at your patch, and
my impression is that your approach looks good and complete. A couple
of comments and questions follows.

First to your questions in JIRA and in the comments:

1) I don't think casting dataFile to RandomAccessFile is such an ugly
   hack. You check that it is an instance of RandomAccessFile before
   you cast it, and fall back to the old behaviour if it is not. Seems
   perfectly safe to me.

   If you're really determined to clean up the interfaces, I would
   suggest that the implementations of StorageRandomAccessFile
   contained a RandomAccessFile instance instead of extending the
   class. Then the StorageRandomAccessFile interface could be changed
   to provide a FileChannel-like abstraction. In that case, there
   would be only one implementation of RAFContainer (and it would look
   very much like your RAFContainer4), but the implementations of
   StorageRandomAccessFile would have to differ between different VMs
   (that is, on jvm=1.4 they would use FileChannel under the hood, on
   jvm 1.3 they would pretend that they did, but actually serialize
   reads and writes).

2) In a comment to a try/finally statement where the finally clause
   only contains debug code, you write that you hope the compiler will
   optimize it away. I think this is a reasonable expectation.

Then to my own comments:

3) I'm not sure the access to needsSync is thread safe even though you
   have declared it as volatile. All accesses to it in RAFContainer
   are synchronized on the RAFContainer instance, but the one in
   RAFContainer4 is not. I think this can lead to race conditions,
   such as:

 Thread 1 is invoking RAFContainer.clean() which calls
 writeRAFHeader() and clearDirty() in a synchronized block.

 At the same time, thread 2 is executing RAFContainer4.writePage()
 which contains the assignment needsSync = true without any
 synchronization.

   It is possible that needsSync is assigned to true after thread 1
   invokes writeRAFHeader() but before it invokes clearDirty(). Since
   clearDirty() sets needsSync to false, the second thread's request
   for syncing disappears.

   If the assignment were changed to
  synchronized (this) { needsSync = true; }
   I think it would be thread safe (there would be no way to change
   the value of needsSync between writeRAFHeader() and clearDirty()).

4) Is the synchronization in RAFContainer4.updatePageArray() needed?
   There is no synchronization in RAFContainer.updatePageArray(). To
   me, it seems like RAFContainer.updatePageArray() could be used
   directly (but you'll have to check whether the database is
   encrypted and allocate a new encryption buffer if it is). Is there
   a particular reason why RAFContainer4 needs its own
   updatePageArray()?

5) Should there have been iosInProgress++ and iosInProgress-- in the
   if (syncPage) part of RAFContainer4.writePage() too?

6) The two new java files added by your patch use a mix of tabs and
   spaces as indentation character. It would be better if they
   consistently used spaces. And extra bonus if none of the lines
   exceed 80 characters. :)

7) Apache has decided that we should use another licence notice in the
   file headers. You could just copy the new text from another file
   (you might need to run 'svn up' since this was changed recently).

8) The RAFContainer4 class and the RAFContainerFactory class could be
   package protected, not public.

9) In RAFContainer4, ourChannel and iosInProgress could be private.

10) In RAFContainerFactory, the comment to rafContainerConstructor
says Immutable, initialized by constructor. Since this is the
case, I would prefer that it was made explicit in the code by
declaring it as final. And it could be made private (and maybe
static).

11) RAFContainer.padFile() was changed from private to protected, but
that is not needed since padFile() isn't used in RAFContainer4.

12) Javadoc for RAFContainer4.writeFully() says readFully instead of
writeFully.

13) The javadocs for the classes look great! If you could add p
between the paragraphs, they would look great after they have been
transformed into HTML files as well. :)

14) RAFContainerFactory.newRAFContainer() could be written more
compactly if return new RAFContainer(factory); were moved out of
the catch clause and the else clause. :)

With the exception of 3 and 4, these comments are really minor
nits. (I was in a picky mood today, sorry! ;) ). You have done a great
job! Thank you very much!

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby

[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-08-18 Thread Knut Anders Hatlen (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12429018 ] 

Knut Anders Hatlen commented on DERBY-801:
--

When this fix gets into the codeline, the fix for DERBY-733 could be backed out.

 Allow parallel access to data files.
 

 Key: DERBY-801
 URL: http://issues.apache.org/jira/browse/DERBY-801
 Project: Derby
  Issue Type: Improvement
  Components: Performance, Store
Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 
 10.1.2.0, 10.1.2.1
 Environment: Any
Reporter: Øystein Grøvlen
 Assigned To: Anders Morken
 Attachments: DERBY-801-v2.patch, NIO-RAFContainer-v1.patch


 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-05-30 Thread Oystein Grovlen - Sun Norway

Anders Morken (JIRA) wrote:
[ http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12413541 ] 


Anders Morken commented on DERBY-801:
-

I've found a benchmark to test this patch with, and quite frankly I'm not seeing any difference in throughput between my patch and the original at all. 
I've tested on a 4x400MHz Sun Enterprise 450 with a 10-disk ZFS raid0 of old disks for data and logs and my own single-cpu 2,4GHz Athlon64 with a two disk raid0 for data and a single disk for logs.


How large was your database?  Note that if the database is not 
substantially larger that the amount of physical memory, the file system 
may able to cache most of your data.  In that case, you will not gain 
much extra from concurrent I/O.


A metric I have found useful to is the average and maximum times for I/O 
requests.  That can give some indications on the queuing effects of I/O. 
 I have just measured the time by calls to System.currentTimeMillis() 
in readPage/writePage.


--
Øystein


[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-05-30 Thread JIRA
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12413841 ] 

Øystein Grøvlen commented on DERBY-801:
---

I have run our tpc-b like benchmark on with and without your patch on a 2-CPU 
opteron box with 2 local disks running Linux. 

Results:

Without patch:  Throughput: 70 tps.  Max response time: 47.8 seconds
With patch: Throughput: 106 tps. Max response time: 4.5 seconds

All tests were run on the same database (17.5 GB) with a 500 MB page cache.   
For each code version I ran 2 tests each lasting 1 hour.

I think this looks very good.  It increases throughput with 50% and reduces the 
max response time with 90%.  I think this is comparable to the results I saw 
with multiple file descriptors per file.  However, your solution is much 
cleaner.  Do think much work is remaining on this patch to make it ready for 
production?



 Allow parallel access to data files.
 

  Key: DERBY-801
  URL: http://issues.apache.org/jira/browse/DERBY-801
  Project: Derby
 Type: Improvement

   Components: Performance, Store
 Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 10.1.2.0, 
 10.1.2.1
  Environment: Any
 Reporter: Øystein Grøvlen
  Attachments: NIO-RAFContainer-v1.patch

 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira



[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-05-30 Thread Anders Morken (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12413925 ] 

Anders Morken commented on DERBY-801:
-

Ooh, very cool to see those numbers. Thanks for testing it despite my
pessimism. I probably haven't been running the benchmark I found
floating around the net on a big enough scale - while I have generated a
16.5 GB database, I haven't been patient enough to run the tests
properly. =)

Off the top of my head, things that need fixing are:

1) The class loading/initialization tricks in BaseDataFileFactory -
there's no need to do all that reflection every time we open a
container. Could probably be done in a static initializer, the boot
method or a constructor.  (Which one is appropriate? The boot method?)

2) A couple of hackish casts in the wrapping methods that retrieve the
FileChannel object when the Container's identity is set. Dunno if these
should be left in or we should change the StorageRandomAccessFile
interface to extend java.io.RandomAccessFile? Both the two
implementations are extensions of java.io.RandomAccessFile, so the this
cast works assumption is pretty safe (as well as defensively
implemented) now, but assumption is the mother of all **ck-ups? =)

3) Handling exceptions from FileChannel properly. The current code
handles IOExceptions by padding the file and trying again. I have no
idea if the pad-the-file trick is of any use at all with FileChannel -
it was simply retained from the original implementation. Maybe padFile
should be refitted for FileChannel as well?

3.5) There's probably a bug in the original implementation of
RAFContainer#writePage(): If the catch(IOException e) {...try again...}
path is executed, updatePageArray() is not called, so modifications such
as adding the container header to the first page will be done (unless it
was done before the IOException was thrown) - and perhaps a security
issue: the page written will not be encrypted. The fact that this hasn't
been discovered by encryption tests is probably an indicator that this
codepath doesn't succeed where the first attempt failed very often.
Anyway, I'll make a separate Jira issue for this.

4) Skip more synchronization? Low priority, but I think there's a few
cases where one or more synchronizations could be merged into one block
or removed altogether - but thread safety is a delicate matter.

5) And last but not least, anything else code review turns up, of
course. =)

I'll see if I have some time to work on this later this week. Thanks for
the help, Øystein. =)

 Allow parallel access to data files.
 

  Key: DERBY-801
  URL: http://issues.apache.org/jira/browse/DERBY-801
  Project: Derby
 Type: Improvement

   Components: Performance, Store
 Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 10.1.2.0, 
 10.1.2.1
  Environment: Any
 Reporter: Øystein Grøvlen
  Attachments: NIO-RAFContainer-v1.patch

 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira



[jira] Commented: (DERBY-801) Allow parallel access to data files.

2006-05-26 Thread Anders Morken (JIRA)
[ 
http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12413541 ] 

Anders Morken commented on DERBY-801:
-

I've found a benchmark to test this patch with, and quite frankly I'm not 
seeing any difference in throughput between my patch and the original at all. 
I've tested on a 4x400MHz Sun Enterprise 450 with a 10-disk ZFS raid0 of old 
disks for data and logs and my own single-cpu 2,4GHz Athlon64 with a two disk 
raid0 for data and a single disk for logs.

Anyway, I think I'll need to work a bit harder on making more of the 
RAFContainer methods called by readPage and writePage thread safe, so we can 
ditch more of the synchronization still left in them. Øystein, don't waste too 
much time on testing this patch yet, it needs more work. =)

 Allow parallel access to data files.
 

  Key: DERBY-801
  URL: http://issues.apache.org/jira/browse/DERBY-801
  Project: Derby
 Type: Improvement

   Components: Performance, Store
 Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.1.1, 10.1.1.2, 10.1.2.0, 
 10.1.2.1
  Environment: Any
 Reporter: Øystein Grøvlen
  Attachments: NIO-RAFContainer-v1.patch

 Derby currently serializes accesses to a data file.  For example, the
 implementation of RAFContainer.readPage is as follows:
 synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
 fileData.seek(pageOffset);  // fileData is a RandomAccessFile
 fileData.readFully(pageData, 0, pageSize);
 }
 I have experiemented with a patch where I have introduced several file
 descriptors (RandomAccessFile objects) per RAFContainer.  These are
 used for reading.  The principle is that when all readers are busy, a
 readPage request will create a new reader.  (There is a maximum number
 of readers.)  With this patch, throughput was improved by 50% on
 linux.  For more discussion on this, see
 http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
 The challenge with the suggested approach is to make a mechanism to
 limit the number of open file descpriptors.  Mike Matrigali has
 suggested to use the existing CacheManager infrastructure for this
 purpose.  For a discussion on that, see:
 http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira