Thank you very much for your answer.

> You have to be able to match bytes, not strings.

May I ask you to elaborate on this? Sorry, I am not a native English
speaker. The buffer I receive is a byte-like buffer.

> I don't think you'll be able to 100% reliably match bytes in this way.
> You're asking it to make analysis of multiple bytes and to interpret
> them according to which character they would represent if decoded from
> UTF-8.
>
> My recommendation: Even if your buffer is multiple gigabytes, just
> decode it anyway. Maybe you can decode your buffer in chunks, but
> otherwise, just bite the bullet and do the decode. You may be
> pleasantly surprised at how little you suffer as a result; Python is
> quite decent at memory management, and even if you DO get pushed into
> the swapper by this, it's still likely to be faster than trying to
> code around all the possible problems that come from mismatching your
> text search.
>
> ChrisA

That's what I was afraid of.
It would be nice if the "world" could commit itself to one standard,
but I'm afraid that won't happen in my lifetime, I guess. :-(

Thx
Eren
-- 
https://mail.python.org/mailman/listinfo/python-list
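
For reference, here is a minimal sketch of the "decode your buffer in
chunks" idea from the quoted reply. It assumes the buffer is a bytes or
bytearray object holding UTF-8 data. The name decode_in_chunks and the
chunk_size default are made up for illustration; the standard-library
codecs.getincrementaldecoder is what keeps a multi-byte character intact
when it happens to straddle a chunk boundary.

```python
import codecs


def decode_in_chunks(buffer, chunk_size=1 << 20, encoding="utf-8"):
    """Yield decoded text pieces from a bytes-like buffer, one chunk at a time.

    Hypothetical helper: the name and defaults are illustrative only.
    Assumes `buffer` is bytes or bytearray containing valid `encoding` data.
    """
    # The incremental decoder remembers incomplete byte sequences between calls.
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    view = memoryview(buffer)  # slicing a memoryview does not copy the buffer
    for start in range(0, len(view), chunk_size):
        # Only the current chunk is copied into a bytes object.
        chunk = bytes(view[start:start + chunk_size])
        text = decoder.decode(chunk)
        if text:
            yield text
    # final=True raises UnicodeDecodeError if the data ends mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


if __name__ == "__main__":
    data = "héllo wörld €".encode("utf-8")
    # A tiny chunk size forces multi-byte characters to be split across chunks.
    for piece in decode_in_chunks(data, chunk_size=3):
        print(repr(piece))
```

Note that a search pattern could still span two decoded pieces, so a
real text search over the chunks would need to carry some overlap from
one piece to the next.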