"Mike Matrigali" wrote: > I am happy to answer any specific questions about the store module you > have, or can find someone to answer them if I don't know. I > think questions should be directed to the mail list so that everyone > can share in the discussion. If I miss something feel free to ping > me to get an answer. I am an IBM employee still working on the derby > codeline and was one of the original developers at Cloudscape when it > first began as a startup.
Thank you very much, this will really help me to get started. > Is there any model/standard in apache or opensource that we could follow > or at least look at for a format for this kind of information. I have some thoughts on how I want to do this - these are given below. > I personally really like design information embedded in the actual code > it relates to, rather than having separate documents. I think that well documented code is always a good idea. And there is an argument that documentation closer to the code is more likely to be kept up-to-date. It is also a great help to anyone fixing the code. However, separate documents are also very useful. Specially because you can illustrate concepts and design with nice diagrams and flow charts. It also enables people to grasp the design more easily, as things can be explained top-down rather than jumping straight to implementation details. > Is there any more specific area you are looking at, rather than just > store. > I usually divide the store into the following, some overlap: > o store module interface (java protocol used by other modules to > interact with store, this is actually fairly well documented in the > javadoc already available) > o access methods (btree/heap existing, future might be gist/rtree/...) > o recovery (both restart and crash recovery, logging, xa recovery, ...) > o transactions (locking, isolation levels, xa transactions) > o data (actual physical storage of rows/columns on pages, page > management, cache - the format of rows/columns/pages is reasonably > documented in the code - I think in the javadoc, but might just be in > comments) > o misc (sorting, encryption, ...) Well, I shall be documenting the whole of it., but will start with a particular item, and then move on to other items. I think that documentation should be at two levels. The more useful level is where ideas and concepts are explained. I am sure that Cloudscape was built upon ideas and concepts that had been invented before by others, and perhaps built its own innovations upon those foundations. So I want to explore the origins of those ideas, where did the ideas come from? How did Cloudscape implementation use those ideas? I'd like to give some examples to illustrate what I mean. If we take the transaction manager for a start. As far as I understand, Derby uses the ARIES algorithm for write-ahead logging. What I'd like to understand and document is how the Derby implementation of ARIES is different from the published algorithms. Does Derby use nested top-level actions or sub transactions to manage internal atomic operations? I think I saw a comment in the code somehere that Derby does not yet implement the transaction list in checkpoint records. I assume that Derby does page-oriented redo during recovery redo pass, and then page-oriented or logical undo during the undo pass. For the BTREE implementation, I'd like to understand whether Derby implements ARIES/IM or some other algorithm. How does it handle structural consistency of the tree during concurrent updates? What are the locking/latching implications of the algorithm used? Are accesses to the entire tree or subtree serialised during structural modifications? How is repeatable read ensured? Does Derby use data locking row or index row locking? Does Derby delete rows logically first, and physically afterwards? How does it determine that a deleted row can be physically removed, ie, the transaction that deleted the row has committed? I understand that Derby uses Log Actions to perform both normal operations and recovery operations. Thus a key insert is first encoded as a log operation, logged, and then executed. This is very neat. I also get the impression by reading the code that a BTREE is implemented on top of a regular table like structure. This is also very neat as it resuses all the code for handling rows, locking, etc. These are a few examples, and I could list several more. What would be very useful is if you and other IBM developers could identify the algorithms that Derby uses. Perhaps you could provide references to published research papers, books, and patents. I could start by writing a high level overview of the concepts used by Derby, and gradually build up the detail by reading the code, and by asking questions in this forum. The second level of documentation can be built from the comments in the code. This is the implementation level where I can describe how Derby actually implements something, what the file formats are, how the write ahead log is implementated, what a conglomerate is, what a record id and how it is different from slot id, etc. In terms of starting this level of documentation, I would first like to concentrate on the "raw" module, which is I suspect is at the heart of all other modules. Thanks and Regards Dibyendu
