Hi Jay,

On Tue, Oct 28, 2008 at 12:05:00PM -0400, Jay Pipes wrote:
> > The Drizzle protocol works over TCP, UDP, and Unix Domain Sockets
> > (UDS, also known as IPC sockets), although there are limitations when
> > using UDP (this is discussed below). In the case of TCP and UDS,
> > a connection is made, a command is sent, and a response loop is
> > started. Socket communication ends when either side closes the
> > connection or a QUIT command is issued.
>
> AFAIK, Drizzle does not support Unix domain sockets...that was one of
> the earlier things that was ripped out.

The library will still have support for them (for MySQL protocol
interfaces), so I thought I'd keep them in there in case we want to
re-introduce them. It's really no extra cost to support them.

> > The sequence of packets for a simple connection and command that
> > responds with an OK packet:
> >
> >   C: Command
> >   S: OK
> >
> > The sequence of packets for a simple connection and query command
> > with results:
> >
> >   C: Command
> >   S: OK
> >   S: Fields (optional, multiple packets)
> >   S: Rows (multiple packets)
> >   S: EOF
>
> What does the OK packet signify from the server? That the query passes
> syntax checks? That the query has already been streamed into a result?
> That the storage engine contains the tables in the SELECT? That the
> command itself is a registered command the server can respond to? That
> the next packet will be a Fields packet?

The OK was there to note the start of fields and row responses. It may
also contain parameters for things like field count, row count (if it's
known at the time), and other info. Thinking about it more, any info
could just be stuffed as parameters into the first field or row packet,
and the lack of an error response will signify success. If we did want
to keep the OK packet, it would be small and could be stuffed into the
same network packet as the first field/row packet (no extra RTTs). I
think in the next draft I'll just drop it. :)

> > When authentication is required for a command, the server will ask
> > for it. For example:
> >
> >   C: Command
> >   S: Authentication Required
> >   C: Authentication Credentials
> >   S: OK
> >   S: Fields
> >   S: Rows
> >   S: EOF
>
> Is the authentication pluggable in the client? Meaning, can the client
> send, for instance, LDAP authentication to a server expecting it? How
> does the server tell the client what it expects to see?

We had a good discussion about this last weekend, and it was decided
that for the protocol, we only need to worry about a couple of
authentication methods (hash, scramble, and maybe kerberos at some
point). The server can then take this and authenticate it against
anything it likes (LDAP, ...). I'm more on the side of keeping it
completely pluggable, but I think for the first round we decided it
will be more static. When the server responds with an auth required
message, it will include which methods it supports, and any data for
methods that are enabled (such as the nonce). Any thoughts on how
pluggable this should be?

> > The server will use the most recent credential information when
> > processing subsequent commands. If a client wishes to multiplex
> > commands on a single connection, it can do so using the command
> > identifiers. Here is an example of how the packets could be ordered,
> > but this will largely depend on the servers ability to process the
> > commands concurrently and the processing time for each command.
> >
> >   C: Command (Command ID=1)
> >   C: Command (Command ID=2)
> >   S: OK (Command ID=2)
> >   S: Field (Command ID=2)
> >   S: OK (Command ID=1)
> >   S: Fields (Command ID=2)
> >   S: Rows (Command ID=2)
> >   S: EOF
> >
> > As you can see, the commands may be executed with results generated
> > in any order, and the packet containing the results may be interleaved.
>
> Same comment as above on the OK packet.
>
> Also, does a zero-rows result packet get sent? Or does the Fields
> packet contain information telling the client that there are no rows to
> expect in a Result packet?

The field packets are optional (controlled by a parameter in the
command packet), so if there are no rows to send, only an EOF packet is
sent (which may have other info about the query in parameters).

> > Length Encoding
> > ---------------
> >
> > Some lengths used within the protocol packets are length encoded.
> > This means the size of the length field will vary between 1 and 9
> > bytes, and is determined by the value of the first byte.
> >
> >   0-252 - Actual length
> >   253   - NULL value (only applicable in row results)
> >   254   - Following 8 bytes hold length
> >   255   - Depends on context, usually signifies end
>
> Awesome. Looks like you took Mats advice on this one :)
>
> What is the maximum length supported? Is this defined in the client
> library API? Is this max length versioned? Must it match the server's
> definition of the maximum length supported?

The maximum is whatever fits in the 8-byte value, so theoretically
2^64-1. Now, this is just the protocol limit; the actual client and
server limits could be configured to something else (perhaps only
32-bit for things like row data). This should prevent us from having to
change the protocol anytime soon. :)

> > Packets
> > -------
> >
> > Packets consist of two layers. The first is meant to be small,
> > simple, and have just enough information for fast router and proxy
> > processing. It consists of a fixed-size part, along with a variable
> > sized client id (explained later), a series of chunked data, followed
> > by a checksum at the end. The chunked transfer encoding allows for
> > not having to pre-compute the packet data length before sending,
> > and support packets of any size. It also allows for a large packet
> > to be aborted gracefully (without having to close the connection)
> > in the event of an error.
> >
> > The first part of a packet is:
> >
> > 1-byte  Magic number, the value should be 0x44.
> >
> > 1-byte  Protocol version, currently 1.
> >
> > 2-byte  Command ID. This is a unique number among all other queries
> >         currently being executed on the connection. The client is
> >         responsible for choosing a unique number while generating a
> >         command packet, and all response packets associated with that
> >         command must have the same command ID. Once a command has been
> >         completed, the client may reuse the ID.
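
For reference, here is a minimal Python sketch of how the length
encoding table quoted above could be read and written. It is only an
illustration under stated assumptions, not code from the Drizzle
library: the function names encode_length and decode_length are
hypothetical, and the draft as quoted does not say which byte order the
8-byte form uses, so little-endian is assumed here.

```python
import struct

NULL_MARKER = 253      # NULL value (only applicable in row results)
LONG_MARKER = 254      # following 8 bytes hold the length
CONTEXT_MARKER = 255   # depends on context, usually signifies end


def encode_length(n):
    """Encode a length as 1 byte (0-252) or as marker 254 plus 8 bytes."""
    if n < 0 or n > 2**64 - 1:
        raise ValueError("length out of range for the 8-byte form")
    if n <= 252:
        return bytes([n])
    # ASSUMPTION: little-endian; the draft does not state the byte order.
    return bytes([LONG_MARKER]) + struct.pack("<Q", n)


def decode_length(buf, offset=0):
    """Return (kind, value, next_offset) for the field starting at offset.

    kind is "length", "null", or "context"; value is None unless kind
    is "length".
    """
    first = buf[offset]
    if first <= 252:
        return ("length", first, offset + 1)
    if first == NULL_MARKER:
        return ("null", None, offset + 1)
    if first == LONG_MARKER:
        if len(buf) < offset + 9:
            raise ValueError("truncated 8-byte length")
        (value,) = struct.unpack_from("<Q", buf, offset + 1)
        return ("length", value, offset + 9)
    # 255: meaning depends on context, so it is left to the caller.
    return ("context", None, offset + 1)
```

Returning the kind separately keeps the NULL and "end" markers from
being confused with a real length of 253 or 255, both of which have to
go through the 9-byte form.
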
>
> Assume a session which issues the following series:
>
> BEGIN TRANSACTION;
> INSERT INTO t1 VALUES (1);
> INSERT INTO t1 VALUES (2);
> INSERT INTO t1 VALUES (N); # Where N == 75000
> COMMIT;
>
> Is a 2 byte command ID going to be enough? Can the client realistically
> re-use command IDs after each INSERT returns an OK packet? On the
> server side, the Query_id is a 64-bit unsigned integer (see
> drizzled/query_id.h). Should the client command ID mimick this?

This is something else we discussed for a while. We decided there
probably wouldn't be more than 64k commands running concurrently on any
given connection. Chances are some responses will start coming in
before the ID starts to roll around for reuse, and if the client hasn't
started getting responses by the roll around (say there is a table lock
blocking the INSERTs), it should probably throttle down until they do
start flushing (state information in the client will start getting
large too). If you REALLY want to start shoving more concurrent queries
down, you can always open up more connections too.

> > OK/ERROR
> > --------
> >
> > The server responds with an OK or ERROR if no row data is given. A
> > list of parameters may follow, and the marked with an end of parameter
> > value.
>
> I prefer something other than ERROR if no row data is given. What about
> valid zero-row results? Not necessarily an ERROR, right? Or am I
> missing something?
>
> Cheers, and thanks for all your great work so far, Eric!

Of course, thanks for the comments!

-Eric

