Kevin Fishburne wrote:
> On 04/07/2011 12:20 PM, Benoît Minisini wrote:
>
>> When you write the data to the socket, all data are converted from the CPU
>> endianness (little endian for Intel/AMD) to the network endianness (big
>> endian by definition).
>>
>> When you read the data from the socket, the data must be converted back from
>> big endian to little endian, and it is done automatically by READ.
>>
>> But if you read everything inside a string, that is not done, because string
>> are bytes, and so do not have endianness.
>>
>> Do you have an example of code that takes the socket data from a big message
>> like you described and decode it?
>>
>> We will find a solution!
>>
> Ahh, I understand a little better. I don't know much about endianness,
> so please excuse my ignorance here. When reading socket data into a
> string how is the data affected exactly? I can think of a few possibilities:
>
> 1) The byte order of the entire string is reversed ("abcd" becomes "dcba")
> 2) The bit order of the entire string is reversed (00001111 becomes
> 11110000)
> 3) The byte order of the values is reversed (0000 1111, 1010 1100
> becomes 1111 0000, 1100 1010)
> 4) The bit order of the values is reversed (0000 1111, 1010 1100 becomes
> 1111 0000, 0011 0101)
>

Endianness refers to the order in which the bytes of a multi-byte numeric
value are kept in memory. Strictly speaking, strings are not affected.

For example, human beings are big endian: the number 10, composed of two
digits, is written with the most significant digit first. If human beings
were little endian, they would write this number as "01".
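As a rough illustration of the points above (a Python sketch rather than
Gambas, and not code from this thread), here is how the 16-bit number 10
looks in each byte order, and what happens if bytes in network order are
decoded with the wrong order. The variable names are made up for the
example. Note that only the order of the bytes within the value changes:
the bits inside each byte are untouched, and a plain single-byte string is
not reordered at all.

```python
import struct

value = 10  # the number used in the analogy above

# The same 16-bit value laid out in both byte orders.
big = struct.pack(">H", value)     # network order: most significant byte first
little = struct.pack("<H", value)  # Intel/AMD order: least significant byte first

print(big.hex())     # 000a
print(little.hex())  # 0a00

# Decoding network-order bytes with the wrong byte order gives another number.
wrong = struct.unpack("<H", big)[0]  # 0x0a00 = 2560
right = struct.unpack(">H", big)[0]  # 10

print(wrong, right)

# A single-byte string has no multi-byte values, so nothing gets reordered.
text = "abcd".encode("ascii")
print(text)  # b'abcd'
```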
Computers don't use base ten (well, not always), but the principle is the
same. If you have a word composed of two bytes, you can write those bytes in
two different orders: MSB (most significant byte, or "heavier byte") first,
or LSB (least significant byte) first. The number 512 in hex is formed by
two bytes, MSB=2 and LSB=0, and a human being writes it as "&h0200". A
computer can store those bytes (or send them over a channel) in either
order. The same mechanism applies to 4-byte numbers and to floating point
numbers. The latter can be a little different, because they have two parts,
mantissa and exponent, but normally the mantissa is considered less
significant than the exponent.

Note that even strings can be affected if they use an encoding such as
UTF-16, where a single character can occupy two or more bytes.

So Benoît is right that endianness matters on the network, but it is not
only a network issue: the same problem arises when a computer writes some
data to a file and that file is transferred to another computer. Unicode
text is a clear example: files containing UTF-16 text can be problematic
because it is not always clear which endianness was used. Correctly composed
files start with a marker that states the endianness (the BOM, or byte order
mark). This is why some text editors try different encodings when opening a
file, and assume they chose the wrong one if an illegal character is
detected.

While it is good that sending data over a network is an endianness-aware
operation, an easy way to play with single bytes would also be handy. A
swapendianness() or changeendianness() function could be used to swap the
bytes inside a variable, be it two, four or eight bytes long.

Just an idea.

Regards,
Doriano