Hi Stepan,

No, there's no easy way to detect the corruption you describe. In fact,
for most serialization formats, there's no solution to this problem. Once
you've lost track of message boundaries, it's impossible to tell the
difference between the start of a new message vs. data in the previous
message, since any message can contain arbitrary byte blobs (e.g. via the
`Data` type).

If what you describe is a requirement for your use case, you could
accomplish it with an additional framing layer.

Option 1: Choose a 128-bit unguessable random number before you start
writing. Write that number before each message. Now you can scan the bytes
of the file looking for this 128-bit sequence and, if you see it, you can
be fairly certain that a new message starts after it (the chance of a
false match at any given position is ~= 2^-128). You have to use a new
random number for every file in case you ever embed a whole file into
another file.
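
A rough sketch of Option 1 in Python, not tied to any Cap'n Proto API. The
names `new_marker`, `append_message`, and `scan_spans` are made up for
illustration, payloads are treated as opaque bytes, and it assumes the
reader learns the marker from the first 16 bytes of the file.

```python
import os

MARKER_LEN = 16  # 128 bits


def new_marker() -> bytes:
    """Pick a fresh unguessable marker; use a new one for every file."""
    return os.urandom(MARKER_LEN)


def append_message(buf: bytearray, marker: bytes, payload: bytes) -> None:
    """Write the marker, then the serialized message bytes."""
    buf += marker
    buf += payload


def scan_spans(data: bytes) -> list:
    """Return (offset, length) for the bytes that follow each marker.

    Assumption: the file starts with the marker, because it is written
    before every message, including the first one.
    """
    marker = bytes(data[:MARKER_LEN])
    if len(marker) < MARKER_LEN:
        return []
    starts = []
    pos = data.find(marker)
    while pos != -1:
        starts.append(pos)
        pos = data.find(marker, pos + MARKER_LEN)
    # Each span ends where the next marker begins (or at end of file).
    ends = starts[1:] + [len(data)]
    return [(s + MARKER_LEN, e - s - MARKER_LEN) for s, e in zip(starts, ends)]
```

With 24-byte messages, writing two, truncating 8 bytes, and appending a
third gives the spans (16, 24), (56, 16), (88, 24). The short middle span
is the broken message; deciding that a span is too short (or otherwise
invalid) is still up to the caller.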

Option 2: Choose a magic number to write before each message, *and* scan
the contents of each message for this number, replacing it with an "escape
sequence" if seen. Do the opposite transformation while reading. This
allows you to detect boundaries "perfectly" (zero probability of false
positive) but you lose the benefits of zero-copy due to the need to process
escape sequences.
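
A rough sketch of Option 2 in Python. To keep the escaping easy to get
right, it swaps in SLIP-style byte stuffing with a one-byte delimiter
instead of a multi-byte magic number; the byte values and the function
names (`escape`, `unescape`, `frame`, `split_frames`) are assumptions for
illustration, not part of Cap'n Proto.

```python
END = 0xC0      # delimiter written before each message
ESC = 0xDB      # introduces an escape sequence
ESC_END = 0xDC  # ESC, ESC_END stands for a literal END byte
ESC_ESC = 0xDD  # ESC, ESC_ESC stands for a literal ESC byte


def escape(payload: bytes) -> bytes:
    """Rewrite payload so that it never contains the END byte."""
    out = bytearray()
    for b in payload:
        if b == END:
            out += bytes([ESC, ESC_END])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    return bytes(out)


def unescape(body: bytes) -> bytes:
    """Inverse of escape(); raises ValueError on a malformed escape."""
    out = bytearray()
    it = iter(body)
    for b in it:
        if b != ESC:
            out.append(b)
            continue
        nxt = next(it, None)
        if nxt == ESC_END:
            out.append(END)
        elif nxt == ESC_ESC:
            out.append(ESC)
        else:
            raise ValueError("malformed or truncated escape sequence")
    return bytes(out)


def frame(payload: bytes) -> bytes:
    """Delimiter followed by the escaped message bytes."""
    return bytes([END]) + escape(payload)


def split_frames(data: bytes) -> list:
    """Split on END and decode each body; None marks an undecodable one."""
    frames = []
    # Anything before the first END is not a frame, so skip it.
    for body in data.split(bytes([END]))[1:]:
        try:
            frames.append(unescape(body))
        except ValueError:
            frames.append(None)
    return frames
```

Because END can only ever appear as a delimiter, boundaries are found
exactly, but every byte is copied on both write and read. A truncated
message usually still decodes to a shorter payload, so the caller has to
validate each decoded frame.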

-Kenton

On Fri, Apr 14, 2017 at 12:35 PM, <stepan.buj...@gmail.com> wrote:

> I have a message that serializes into 24 bytes. I write two messages to a
> file resulting in a file that's 48 bytes long. Now I truncate the file to 40
> bytes and write one message, so the file now looks like this: 1 full
> message, one broken, 1 full message. Is there any way to iterate over the
> file and when encountering the broken message detect that it is broken and
> skip directly to the second full message? I've been using python to read
> such file with following code
>
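> import capnp  # pycapnp; must be imported before the *_capnp module
> import date_capnp
>
> def main():
>     # Cap'n Proto messages are binary, so open the file in binary mode.
>     with open('dates.txt', 'rb') as fp:
>         for date in date_capnp.Date.read_multiple(fp):
>             print(date)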
>
> But it fails with following message:
>
> Message contains non-struct pointer where struct pointer was expected
>
> Also, if it's possible to detect such message, is it possible to get its
> position and length? Thank you.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Cap'n Proto" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to capnproto+unsubscr...@googlegroups.com.
> Visit this group at https://groups.google.com/group/capnproto.
>
