Oh! Wow, I'm happy that the video was useful to someone :)

Tiago
----- Original Message ----
From: Jayaram Subramanian <[email protected]>
To: [email protected]
Sent: Fri, 9 April, 2010 2:56:53
Subject: Re: DRDA - An abridged tutorial

Thanks Tiago. Also special thanks for your Derby installation video, through which I installed Derby on my PC.

Wiy

On Thu, Apr 8, 2010 at 2:49 PM, Tiago Espinha <[email protected]> wrote:
> Dear all,
>
> Earlier today Kathey gave me a crash course on DRDA for my GSoC project.
> Jayaram asked Kathey for a transcript of our conversation and Kathey
> suggested that I send it to the list, so that other contributors could spot
> any mistakes. If you find any, please feel free to chip in with your
> knowledge.
>
> Here goes:
> ---------------------8<----------------------
> <kmarsden> DRDA - Distributed Relational Database Architecture.
> <kmarsden> Basically it is a protocol that shuttles database requests from a
> client over the network to a server. They call the client an Application
> Requester and the server an Application Server.
> <kmarsden> The Application Server term predates what we think of as an
> Application Server and has nothing to do with it.
> <etiago> ok
> <kmarsden> So as we discussed, the derby client JDBC Driver (our Application
> Requester) converts JDBC calls into DRDA, sends the DRDA across the network
> to Network Server (our Application Server) which converts them back into JDBC
> which it sends to the embedded driver.
> <kmarsden> So the whole thing is a JDBC to DRDA to JDBC converter. This way
> we meet the requirement of having multiple jvms on multiple machines
> accessing a single embedded database, because everything gets routed through
> the network server process.
> <kmarsden> Make sense?
> <etiago> it does
> <kmarsden> DRDA is mostly associated historically with DB2 but there are
> actually a lot of licensed DRDA vendors, including Microsoft
> <etiago> so let me try to establish a comparison here
> <etiago> DRDA is sort of a platform-agnostic way of transferring database
> requests over the network, is this right?
> <etiago> something like XML
> <kmarsden> yes. That's right. It can go over TCP/IP or something called
> SNA, but yes it is platform-agnostic. As you may guess I think it started on
> the mainframe with EBCDIC encoding.
> <kmarsden> Having a standard protocol has even allowed a single client to
> be shared amongst several databases.
> <kmarsden> For example the IBM DB2 Universal JDBC Driver (JCC) used to be
> used with Derby and is still used with Informix as well as of course many
> flavors of DB2.
> <kmarsden> DRDA is also used with the DB2 C client. Not just JDBC
> <etiago> I see
> <kmarsden> In practice it is not as portable as you would like. It is
> generally a lot of work to get one product's DRDA client working with a
> different server.
> <etiago> ok, is DRDA just plaintext? I was looking at the trace and I saw
> that there are hexadecimal, ASCII and EBCDIC representations of the data
> <kmarsden> The encoding for data can be described in the protocol flow. Up
> until now the DDM commands and parameters (which we will discuss in a bit)
> were all in EBCDIC.
> <etiago> ok
> <kmarsden> UNICODEMGR allows them (except for the earliest) to be in UTF-8
> <kmarsden> So there are three volumes to the manuals. Volume 1 DRDA describes
> the protocol flow.
> <kmarsden> It shows the commands and what order they flow in from AR to AS
> and back.
> <kmarsden> But to read it you need to look at Volume 2, which describes the
> commands and objects that are being flowed in detail
> <kmarsden> The DDM manual shows a 2 byte "Codepoint" for each one of these
> commands or objects and that is how they are identified.
> <etiago> ah, kind of counterintuitive then #:)
> <kmarsden> You will see Codepoint.java class in both client and server that
> lists the codepoints in Derby
> <etiago> I did look at that and all it reminded me of was pointers
> <kmarsden> Vol 3 describes the format for the data types.
> <etiago> so DRDA also has data types of its own?
> <kmarsden> I can't say I have ever actually used Vol 3 or understood it.
> That work was done pretty early before even my time. I spend my time mostly
> in Vol 2 and Vol 1
> <etiago> oh alright
> <kmarsden> Yes, it does and I think at two levels. For example DRDA itself
> in Volume 1 will talk about Varchar and Char types but Vol 3 just talks about
> strings in general. But we'll move on because this is not where you will be
> working.
> <etiago> ok
> * kasun ([email protected]) has joined #derby
> * kasun is now known as Guest50421
> <kmarsden> Ok. All this protocol flow and DDM objects are wrapped in
> something called a Data Stream Structure (DSS) which is just really a packet
> wrapper and has no real content except describing the length and chaining
> one DSS to another. You will see these in the traces starting with a D0; the
> DSS also identifies the packet as a request, a reply or an object.
> <kmarsden> I only mention that really because you will see DSS referred to
> all over the place. I don't think you will have to change that code either.
> <etiago> yes, I've seen DSS stuff around
> <kmarsden> Now let's look at the actual protocol. When a connection is made
> by the AR the first thing that is sent is an EXCSAT
> <kmarsden> Exchange Server Attributes Request
> <etiago> yes, I see that on the trace
> <kmarsden> This even with UNICODEMGR will be sent EBCDIC and is really
> important because it is where the protocol levels and now encoding are
> negotiated.
> <etiago> as I understand, we keep this as EBCDIC to allow clients that don't
> support UTF-8 to ask for normal EBCDIC rather than UTF-8
> <kmarsden> Right.
> If you look at the DDM manual, which is alphabetical, at
> EXCSAT, you will find out about it.
> <etiago> ok
> <kmarsden> Generally with EXCSAT you send an external name identifying the
> client, a version and a manager level list.
> <kmarsden> It is the manager level list or MGRLVLLS where the protocol
> negotiation and now encoding takes place.
> <kmarsden> oh no, I switched computers and don't have my spec to look at so I
> will go from memory.
> <kmarsden> Most of the Manager levels like SQLAM for instance are just a
> number from 1 to whatever to show the protocol level, but UNICODEMGR, the one
> being introduced for ACR7007, is a bit different.
> <kmarsden> For that one the client will send 0 if it wants to continue with
> EBCDIC or 1208 if it wants UTF-8. I am not sure exactly why it was set up
> this way, maybe to allow different encodings in the future? I just don't know
> <etiago> ok yeah, that sort of makes sense
> <kmarsden> The next things that flow from the AR to the AS are ACCSEC and
> SECCHK to negotiate security. These are chained before receiving the EXCSATRM
> so these are in EBCDIC too.
> <kmarsden> SECCHK (or maybe ACCSEC) has an optional RDBNAM parameter which
> Derby client used to send.
> <etiago> that's the database name if I recall correctly
> <kmarsden> Some time ago I took that out in preparation for DERBY-728. We now
> do not send the RDBNAM until it is required in ACCRDB
> <kmarsden> right.
> <etiago> oh by removing it from the SECCHK (which is EBCDIC-only) we can
> actually support database names with Chinese and Japanese characters
> <kmarsden> Anyway after the AR flows EXCSAT, ACCSEC and SECCHK then the
> server sends back an EXCSATRM with its manager level list and then the AR
> and AS settle on the level.
> <kmarsden> right.
> <etiago> ok what exactly is this level that they settle on?
> <kmarsden> For each level, for instance UNICODEMGR, if the AR sent 1208 and
> the AS sent back 0, they would settle on 0, all EBCDIC.
> <kmarsden> I suppose it is actually the server that takes what the client has
> provided and returns the actual level that it will be.
> <etiago> oh ok
> <kmarsden> If the server could handle 1208 but the client said 0, it would be
> 0.
> <kmarsden> At least I think that's how it works. You'd better check
> <etiago> that makes sense but I will check
> <kmarsden> Also if the client knows nothing about this new manager level the
> server will dumb things down.
> <kmarsden> For example if a 10.3 client tries to connect to a 10.6 server
> after DERBY-728 has been implemented, EBCDIC it will be, UNICODEMGR level 0
> <kmarsden> make sense?
> <etiago> yep we have to maintain compatibility
> <etiago> by using what's common to everyone
> <kmarsden> right
> <kmarsden> So let's say you have implemented DERBY-728 and are using new
> client and server. AR sent 1208, AS responded 1208 for UNICODEMGR. It is
> with the next request ACCRDB that the encoding changes.
> <kmarsden> It is with this request that we now finally send the RDBNAM in
> UTF-8 so we can send Chinese. Hooray!
> <etiago> yay :-)
> <kmarsden> But there is a gotcha of course. The DRDA character strings have
> limits and not just character limits but byte limits.
> <etiago> meaning that the arguments of ACCRDB, etc have certain byte length
> limitations, right?
> <kmarsden> right. It is really very sad because from a Derby perspective,
> these are just arbitrary limits and since the database name is often a full
> file system path and now characters can take up to four bytes I am thinking
> folks are going to run out.
> <etiago> yeah, I suppose these limits are defined somewhere in those DRDA
> documents?
> <kmarsden> but I understand extending or eliminating 1) would require a new
> opengroup ACR which seems to take years and 2) will be a total no go in the
> C world.
> <kmarsden> right. I think RDBNAM is 255 bytes
> <etiago> actually, about that, is this ACR actually approved?
> <kmarsden> It is not.
> It is undergoing the "fast track" approval process at
> opengroup with some other ACRs. Work on it within IBM completed over a year
> ago, but they were waiting for a bunch of others to be ready to submit
> <etiago> gotta love bureaucracy :-)
> <kmarsden> But I think it is pretty solid. The work you do should be for
> 10.7, not 10.6
> <kmarsden> so we'll be safe just in case.
> <etiago> alright
> <kmarsden> I should mention that this ACR was not originally for the purpose
> we are using it for. It was originally meant as some sort of performance
> improvement to avoid converting to EBCDIC all the time.
> <kmarsden> At least 6 times that I can remember someone suggested moving the
> switch to UTF-8 later after ACCRDB, so I have scouts at opengroup that
> promise they will send up a big red flag if somebody tries it again.
> <etiago> hahaha
> <kmarsden> anyway moving along.
> <etiago> that's good
> <kmarsden> So what I have covered is the part of the protocol that you will
> be dealing with mostly. I will move on with some other protocol stuff but
> mostly for general knowledge. Then some other time we can have a talk about
> the Derby code in relation to the protocol
> <etiago> ok sounds good
> <kmarsden> So when Network server receives the ACCRDB it makes an embedded
> connection to the database which it will use for all the requests on that
> client connection. The simplest thing that might come through is an
> EXCSQLIMM which is just a simple statement execution with no result set and
> not a prepared statement, maybe just an update or delete.
> <etiago> yep I got that in my trace: [derby] SEND BUFFER: EXCSQLIMM
> (ASCII) (EBCDIC)
> <kmarsden> Most statements need to be prepared. Even if they are not a JDBC
> prepared statement, if they return a result set, they have to be prepared
> with a PRPSQLSTT
> <kmarsden> The PRPSQLSTT command takes a package name, consistency token and
> section number.
> The package name and section number are used to identify the
> statement later when it is executed.
> <kmarsden> In DB2 these are actual things and there are packages related to
> holdability and other things I can't remember and the section is the
> statement within that category.
> <etiago> hmm ok
> <kmarsden> In Derby, we just sort of pretend, because we don't actually have
> packages. But these are how the statements are identified and Network Server
> keeps a hash table of the prepared statements keyed on package and section
> number for retrieval.
> <kmarsden> An interesting way to look at this is to start network server and
> then go into ij and prepare a statement or two then run runtimeinfo to see
> the statements, their packages and section numbers.
> <etiago> ok
> <kmarsden> After PRPSQLSTT is the SQLSTT which is the actual SQL statement
> text. This I believe can already be encoded UTF-8 even without ACR7007, so
> you can have international characters in SQL even now.
> <kmarsden> Once the statement is prepared, you execute it with EXCSQLSTT. Of
> course when you do this, you send again the package name, consistency token
> and section number so the server can look it up and match it to the prepared
> statement
> <etiago> hmm but is it actually done right now in Derby?
> <kmarsden> SQLDTA contains the parameter data and may just have place holders
> for large objects which are sent in an EXTDTA
> <etiago> ok
> <kmarsden> If the statement is a query, the AR sends an OPNQRY, and gets an
> OPNQRYRM and then iterates with CNTQRY (sort of like next but for many rows)
> and gets QRYDTA and possibly EXTDTA objects back with the data.
> <kmarsden> In the end the server sends an ENDQRYRM and a SQLCARD indicating
> the end of data.
> <kmarsden> The SQLCARD is an interesting thing. Generally it is used for
> SQLExceptions, but there is a special one for end of data.
> <etiago> so a simple SELECT statement actually represents several DRDA
> commands, right?
> <kmarsden> right, but not nearly as many as in that book. I still don't
> understand why it's so big. The ones I mentioned here are pretty much the
> pages you will go to over and over again
> <etiago> hehe fair enough :)
> <kmarsden> When the client connection is all done, there is no final end
> connection command that it sends, which is really irritating. The socket is
> closed on the client and the server has to detect that and clean up. This has
> led to many leak issues over the years. Maybe someday there will be an
> actual command which will make things easier.
> <etiago> maybe another ACR is due
> <kmarsden> So I guess that's the end of the protocol story for today, unless
> you have questions.
> <kmarsden> I wonder too if dag is here and if so how many lies he can spot.
> <etiago> haha I think that covers it for now, I will have questions over time
> I'm sure
> <kmarsden> Yes, maybe you can write the ACRs for extending the character
> string lengths and a connection termination command. It will look nice on
> your resume and nice for opengroup to have input from a more diverse group.
> <etiago> maybe #:)
> <etiago> maybe I still have too simplistic an insight into DRDA on Derby, but
> I think this will all boil down to detecting the level of the encoding and
> instantiating the right encoder class based on that
> <kmarsden> I am a bit torn sometimes about whether we should just branch out
> since we have our own client.
> <kmarsden> Yes, that sounds right.
> ------------------------------------8<----------------------
>
> Regards,
> Tiago Espinha
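
The UNICODEMGR negotiation and the RDBNAM byte limit discussed in the transcript can be illustrated with a small Java sketch. It is only an illustration under stated assumptions: the class and method names are hypothetical and are not Derby code, a peer that does not know UNICODEMGR is modelled as null, and the 255-byte limit is the figure recalled from memory in the chat, not one checked against the spec. The rule shown (UTF-8 only when both sides send 1208, otherwise EBCDIC) is the one Kathey describes, with her own caveat that it should be checked.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the negotiation described in the chat; not Derby's classes. */
final class UnicodeMgrSketch {
    /** UNICODEMGR level meaning "stay with EBCDIC". */
    static final int EBCDIC = 0;
    /** UNICODEMGR level meaning UTF-8 (1208 is the CCSID for UTF-8). */
    static final int UTF8 = 1208;
    /** Assumed RDBNAM limit, taken from "I think RDBNAM is 255 bytes". */
    static final int RDBNAM_MAX_BYTES = 255;

    /**
     * Settles the UNICODEMGR level from the values sent in EXCSAT and EXCSATRM.
     * A null argument models a peer that does not know this manager at all.
     */
    static int settle(Integer clientLevel, Integer serverLevel) {
        if (clientLevel == null || serverLevel == null) {
            return EBCDIC; // old client or old server: fall back to EBCDIC
        }
        return (clientLevel == UTF8 && serverLevel == UTF8) ? UTF8 : EBCDIC;
    }

    /** Returns true if the database name fits the assumed limit once encoded as UTF-8. */
    static boolean rdbnamFits(String rdbnam) {
        return rdbnam.getBytes(StandardCharsets.UTF_8).length <= RDBNAM_MAX_BYTES;
    }

    /** Small helper: repeats a string n times (avoids needing Java 11's String.repeat). */
    static String repeat(String s, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(settle(1208, 1208)); // both sides want UTF-8
        System.out.println(settle(1208, 0));    // server answers EBCDIC
        System.out.println(settle(null, 1208)); // client predates UNICODEMGR
        // U+6570 takes three bytes in UTF-8, so 100 characters need 300 bytes.
        System.out.println(rdbnamFits(repeat("\u6570", 100)));
    }
}
```

The last call shows the gotcha from the chat: a name that is well under 255 characters can still exceed a 255-byte limit once multi-byte characters are involved.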

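The part of the transcript about PRPSQLSTT and EXCSQLSTT says that Network Server keeps a hash table of prepared statements keyed on package and section number. A minimal Java sketch of that idea follows. Everything in it is hypothetical: the class names, the package name "PKG_A", and the use of the SQL text as a stand-in for the real prepared statement object are assumptions made for illustration, not Derby's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch of a statement table keyed on package name and section number. */
final class StatementTableSketch {

    /** Composite key: the package name and section number sent with PRPSQLSTT/EXCSQLSTT. */
    static final class PkgSection {
        final String pkgName;
        final int section;

        PkgSection(String pkgName, int section) {
            this.pkgName = pkgName;
            this.section = section;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof PkgSection)) {
                return false;
            }
            PkgSection other = (PkgSection) o;
            return section == other.section && pkgName.equals(other.pkgName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(pkgName, section);
        }
    }

    /** The SQL text stands in for the prepared statement a real server would store. */
    private final Map<PkgSection, String> prepared = new HashMap<>();

    /** Models PRPSQLSTT followed by SQLSTT: remember the statement under its key. */
    void prepare(String pkgName, int section, String sql) {
        prepared.put(new PkgSection(pkgName, section), sql);
    }

    /** Models the lookup on EXCSQLSTT: same key in, prepared statement out (or null). */
    String lookup(String pkgName, int section) {
        return prepared.get(new PkgSection(pkgName, section));
    }

    public static void main(String[] args) {
        StatementTableSketch table = new StatementTableSketch();
        table.prepare("PKG_A", 1, "SELECT * FROM T");
        System.out.println(table.lookup("PKG_A", 1)); // found under the same key
        System.out.println(table.lookup("PKG_A", 2)); // never prepared
    }
}
```

The key needs equals and hashCode defined on both fields, otherwise the lookup at execute time would not find the entry stored at prepare time.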