[protobuf] Issue 378 in protobuf: UTF8 string validation for repeated strings only checks the first element when deserializing

protobuf Tue, 20 Mar 2012 16:43:00 -0700

Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 378 by josephrd...@gmail.com: UTF8 string validation for repeated strings only checks the first element when deserializing

http://code.google.com/p/protobuf/issues/detail?id=378

What steps will reproduce the problem?

1. Serialize a message with a repeated string field containing two strings. The first string should be valid UTF8, the second should not.

2. Deserialize the previously serialized message.

3. Observe that the invalid UTF8 warning is only printed from the serialization step, and not from the deserialization step.

4. If you reverse the order in which the strings are encoded, the deserialization step will flag two invalid UTF8 strings instead of one.

5. This was observed in the C++ code.

What is the expected output? What do you see instead?

Explained above.

Additionally, the problem in the source is in the file cpp_string_field.cc, in the method


void RepeatedStringFieldGenerator::
GenerateMergeFromCodedStream(io::Printer* printer) const;

When passing arguments to the VerifyUTF8String method, it always passes the 0th element, i.e.:


    printer->Print(variables_,
      "::google::protobuf::internal::WireFormat::VerifyUTF8String(\n"
      "  this->$name$(0).data(), this->$name$(0).length(),\n"
      "  ::google::protobuf::internal::WireFormat::PARSE);\n");

Instead of the 0th element, it should reference the last element of the repeated field, since the most recently decoded element resides there. There's no simple accessor for the last element, but $name$_size() - 1 works for obtaining the index.
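
To illustrate the behaviour described in steps 3 and 4, here is a small standalone sketch that does not use protobuf at all. It only simulates the parse loop: each decoded string is appended to the field, and then either element 0 (the current behaviour) or the last element (the proposed behaviour) is verified. The names IsStructurallyValidUtf8 and CountParseWarnings are hypothetical, and the UTF8 check is a simplified structural check that I am assuming is good enough for the illustration; it is not the check protobuf performs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified structural UTF-8 check (hypothetical helper, NOT protobuf's
// implementation). It validates lead and continuation bytes only and does
// not reject overlong encodings or surrogate code points.
bool IsStructurallyValidUtf8(const std::string& s) {
  std::size_t i = 0;
  while (i < s.size()) {
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    std::size_t trailing = 0;
    if (lead < 0x80) {
      trailing = 0;  // single-byte (ASCII) sequence
    } else if ((lead & 0xE0) == 0xC0) {
      trailing = 1;
    } else if ((lead & 0xF0) == 0xE0) {
      trailing = 2;
    } else if ((lead & 0xF8) == 0xF0) {
      trailing = 3;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (trailing > s.size() - i - 1) return false;  // truncated sequence
    for (std::size_t k = 1; k <= trailing; ++k) {
      const unsigned char c = static_cast<unsigned char>(s[i + k]);
      if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
    }
    i += trailing + 1;
  }
  return true;
}

// Simulates parsing a repeated string field (hypothetical helper). After
// each element is appended, one element is verified: index 0 when
// check_last_element is false (the behaviour reported in this issue), or
// size() - 1 when it is true (the proposed fix).
int CountParseWarnings(const std::vector<std::string>& wire_values,
                       bool check_last_element) {
  std::vector<std::string> field;
  int warnings = 0;
  for (const std::string& value : wire_values) {
    field.push_back(value);
    const std::size_t index = check_last_element ? field.size() - 1 : 0;
    if (!IsStructurallyValidUtf8(field[index])) ++warnings;
  }
  return warnings;
}
```

With the inputs {"ok", "\xff"} and {"\xff", "ok"}, checking index 0 counts zero and two warnings respectively, which matches what is described above. Checking the last element counts exactly one warning in both orders.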



