Hi Christoph,

On Mon, Oct 14, 2013 at 10:02 AM, Christoph Badura <b...@bsd.de> wrote:
> First, I find the usage of the "buf" terminology confusing. In kernel
> context I associate "buf" with the file system buffer cache "buf" structure.
> Packet buffers are called "mbufs". I would appreciate it if the terminology
> was consistent with the kernel or at least not confusing.

This is due to my lack of creativity =). I'm quite open to naming
suggestions.

> Also, having to switch mentally between zero-based arrays in the kernel C
> code and 1-based arrays in the Lua code make my head ache.

It doesn't bother me that much, but if necessary this userdata could be
changed to 0-based indexing.

> On Thu, Oct 10, 2013 at 03:15:54PM -0300, Lourival Vieira Neto wrote:
>> C API:
>>
>> lbuf_new(lua_State *L, void * buffer, size_t length, lua_Alloc free, bool
>> net);
>>
>> * creates a new lbuf userdatum and pushes it on the Lua stack. The net
>> flag indicates if it is necessary to perform endianness conversion.
>
> What is "buffer" and how does it relate to mbufs? How do I create a new
> "lbuf" from an mbuf? Or from an array of bytes?

Note that non-contiguous buffers are still an open problem in lbuf. I don't
know whether I should use a ptrdiff_t to pass the distance to the 'next'
field, a 'next()' function to return the 'next' field, or something else.
However, you could create an lbuf from an mbuf header as follows:

    lbuf_new(L, mbuf->m_data, mbuf->m_len, NULL, true);

or from an array:

    uint8_t array[ N ];
    lbuf_new(L, (void *) array, N, NULL, false); // 'false' means 'use the platform endianness'

Then you could call a Lua function, passing it this lbuf. For example:

    lua_getglobal(L, "handler");
    lbuf_new(L, mbuf->m_data, mbuf->m_len, NULL, true);
    lua_pcall(L, 1, 0, 0);

> In order to indicate that endianness conversion is necessary I need to
> know the future uses of the buffer. Clairvoyance excepted, that is kinda
> hard.

It's a generic data structure that could be used to handle bit fields or
nonaligned data.

> If you are going to make the buffers endianness aware, why not record the
> endianness that the packet is encoded in. And byteswapping can be
> performed automatically depending on the consumers endianness. I think
> this way a lot of redundant code can be avoided.
>
> And you don't describe under what circumstances endianness conversion is
> performed.

Yes, mea culpa =(. I wasn't clear about that. The 'net' flag was the way I
found to 'record' the buffer endianness. That is, it is true if the buffer
uses BE and false if it uses HE. It has the same semantics as the hton* and
ntoh* functions. I don't know if it would be better to pass the endianness
itself as a flag (e.g., enum { BIG_ENDIAN, LITTLE_ENDIAN, HOST_ENDIAN }).
What do you think?

So, if you set the net flag to true, then whenever you access a bit field
the conversion to and from big endian, if needed, is done automatically,
taking the smallest aligned set of bits. For example:

buf:rawget(0, 9) ~>
  If the net flag is *true*: takes 16 bits from the beginning of the buffer
  (as is); converts these 2 bytes from BE to HE (if necessary); and returns
  these 2 bytes masked to preserve only the most significant 9 bits (zeroing
  the remaining bits) and shifted to the LSB.
  If net is *false*: just returns the first 2 bytes masked and shifted
  (without conversion).
  Then these 2 bytes are expanded to the lua_Number type (int64_t in the
  kernel).

That is:

a) If the net flag is _true_ and the platform is LE:
1- Takes 16 bits:
[ b0 | b1 | b2 | b3 | b4 | b5 | b6 | b7 ][ b8 | b9 | b10 | b11 | b12 | b13 | b14 | b15 ]
2- Converts them to LE:
[ b8 | b9 | b10 | b11 | b12 | b13 | b14 | b15 ][ b0 | b1 | b2 | b3 | b4 | b5 | b6 | b7 ]
3- Returns the first 2 bytes masked and shifted:
[ b1 | b2 | b3 | b4 | b5 | b6 | b7 | b8 ][ 0 | 0 | 0 | 0 | 0 | 0 | 0 | b0 ]

b) If the net flag is _false_ and the platform is LE:
1- Takes 16 bits:
[ b0 | b1 | b2 | b3 | b4 | b5 | b6 | b7 ][ b8 | b9 | b10 | b11 | b12 | b13 | b14 | b15 ]
2- Returns the first 2 bytes masked and shifted:
[ b9 | b10 | b11 | b12 | b13 | b14 | b15 | b0 ][ 0 | 0 | 0 | 0 | 0 | 0 | 0 | b8 ]

c) If the net flag is _true or false_ and the platform is BE:
1- Takes 16 bits:
[ b0 | b1 | b2 | b3 | b4 | b5 | b6 | b7 ][ b8 | b9 | b10 | b11 | b12 | b13 | b14 | b15 ]
2- Returns the first 2 bytes masked and shifted:
[ 0 | 0 | 0 | 0 | 0 | 0 | 0 | b0 ][ b1 | b2 | b3 | b4 | b5 | b6 | b7 | b8 ]

>> Lua API:
>>
>> - array access (1)
>>
>> lbuf:mask(alignment [, offset, length])
>> buf[ix] ~> accesses 'alignment' bits from 'alignment*(ix -1)+offset' position
>>
>> e.g.:
>> buf:mask(3)
>> buf[3] ~> accesses 3 bits from bit-6 position
>
> What does that mean? Does it return the top-most 2 bits from the first
> byte plus the least significant bit from the second byte of the buffer?

It means the least significant 2 bits of the first byte and the LSB of the
second.

> What is 'length' for?

Offset and length can be used to impose boundaries on the mask. For
example, if you want to analyse a segment of the buffer that is organized
as a logical array of 2-byte elements, starting at the second byte and
holding 3 elements, you could do: buf:mask(16, 8, 3).

> How does endianness conversion fit in?

Endianness conversion is done using the smallest aligned number of bits; in
this case, 1 byte, to which endianness does not apply.

>> - array access (2)
>>
>> buf:mask{ length_pos1, length_pos2, ... }
>> buf[ix] ~> accesses 'length_pos(ix)' bits from 'length_pos1 + ...
>> length_pos(ix-1)' position
>>
>> e.g.:
>> buf:mask{ 2, 2, 32, 9 }
>> buf[2] ~> accesses 2 bits from bit-2 position
>
> What exactly would "buf[3]" return. Please be explicit in whether you are
> counting byte offsets or bit offsets. I can't figure that out from your
> description.

It would return 32 bits (converted or not, depending on the 'net' flag)
starting at bit 4 (MSB-ordered). mask{ ... } receives bit offsets and array
access receives a mask field index. I'm always counting bit offsets. Bytes
are only used for endianness conversion.

> Personally, the idea of making array access to the buffer depend on
> state stored in the buffer does not look appealing to me. It prevents
> buffers to be passed around because consumers don't know what they will
> get back on array access.

I think it could be useful for accessing nonaligned and aligned data easily
without having to care about naming fields.

>> buf:mask{ field = { offset, length }, ... }
>> buf.field ~> 'field.length' bits from 'offset' position
>
> This actually makes some sense to me.

=)

>> buf:segment(offset [, length])
>> returns a new lbuf corresponding a 'buf' segment.
>
> What is a a 'segment' actually?

A segment is a sub-buffer. You could use just a portion of a main buffer
with another mask (e.g., to dissect a payload).

>> - mask reusing
>> lbuf.mask{ ... }
>
> This makes sense again...

=)

>> function filter(packet)
>> packet:mask(ethernet_mask)
>> if packet.type == 0x88CC then
>> lldp_pdu = packet.segment(payload_offset):mask(lldp_mask)
>> if packet.version < 1 return DROP end
>> end
>> return PASS
>> end
>
> ... except the code seems to be not runnable. Where does 'payload_offset'
> come from?

It's a variable that could be set by the script itself or loaded by the C
module. I could have used the value itself (as with 0x88CC), but I just
wanted to save the time of reading the standard.

> And don't you mean lldp_pdu.version?

Yes, sorry about that.

> I find it not helpful when the examples do not actually work.

Well, the library is not ready yet. This example is just a draft to
discuss a concept. However, this fragment should be runnable once the lib
is implemented (except for the packet.version mistake).

> --chris

Regards,
--
Lourival Vieira Neto
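
P.S. In case it helps, here is a rough C sketch of the buf:rawget(0, 9)
walk-through above. It is only an illustration under stated assumptions,
not lbuf code: the name lbuf_rawget_sketch is hypothetical, it handles only
offset 0 with lengths of 1 to 16 bits, and it always reads exactly two
bytes from the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of lbuf): returns the 'nbits' most
 * significant bits of the first 16 bits of 'buf', shifted down to the LSB.
 * Assumes offset 0, 1 <= nbits <= 16 and at least 2 readable bytes.
 */
static int64_t
lbuf_rawget_sketch(const uint8_t *buf, unsigned nbits, bool net)
{
	uint16_t v;

	assert(nbits >= 1 && nbits <= 16);

	if (net) {
		/* Buffer is big endian: assemble the value byte by byte,
		 * which yields host order on any platform (like ntohs). */
		v = (uint16_t)(((unsigned)buf[0] << 8) | buf[1]);
	} else {
		/* Buffer is host endian: take the 2 bytes as they are. */
		memcpy(&v, buf, sizeof(v));
	}

	/* Keep the top 'nbits' bits and shift them to the LSB, then
	 * widen to the kernel's lua_Number (int64_t). */
	return (int64_t)(v >> (16 - nbits));
}
```

For the bytes { 0xAB, 0xCD } and nbits = 9, net = true gives 0x157 on any
platform (cases a and c), while net = false gives 0x19B on an LE host
(case b) and 0x157 on a BE host (case c).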