Re: intermittent issue with encode (version 2.0.3)

2009-07-10 Thread Kenton Varda
As long as your threads do not start before main() is called, you are fine.
 Again, the function in question is guaranteed to be called before main()
starts.
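
As an illustration of how such a guarantee can arise, here is a minimal sketch. All names in it (Registry, ForceRegistryInit) are hypothetical stand-ins, not protobuf's own classes. A namespace-scope object is constructed during static initialization, so a call made from its constructor completes before main() begins and before any thread that main() starts.

```cpp
#include <cassert>

// Hypothetical stand-in for a lazily constructed singleton; this is not the
// protobuf GeneratedMessageFactory, only a model of the same pattern.
class Registry {
 public:
  static Registry* singleton() {
    static Registry instance;  // constructed the first time control passes here
    return &instance;
  }
  static int constructions() { return construction_count_; }

 private:
  Registry() { ++construction_count_; }
  static int construction_count_;
};

// Constant-initialized, so it is already zero before any constructor runs.
int Registry::construction_count_ = 0;

// Hypothetical helper: a namespace-scope object whose constructor makes the
// first call to singleton() during static initialization, that is, before
// main() and therefore before main() can start any thread.
struct ForceRegistryInit {
  ForceRegistryInit() { Registry::singleton(); }
};
static ForceRegistryInit force_registry_init;
```

With this arrangement, every later call to Registry::singleton() from a worker thread finds the object already constructed, provided the threads are started from main() or later.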

On Fri, Jul 10, 2009 at 3:52 PM, Rizzuto, Raymond wrote:

>  Is there something I need to do in the application to cause that to
> happen before I’ve spun off multiple threads?  Currently I spin up threads,
> which then start making calls to fill in and then encode google
> protobuffers.
>
>
>  --
>
> *From:* Kenton Varda [mailto:ken...@google.com]
> *Sent:* Thursday, July 09, 2009 6:05 PM
>
> *To:* Rizzuto, Raymond
> *Cc:* protobuf@googlegroups.com
> *Subject:* Re: intermittent issue with encode (version 2.0.3)
>
>
>
> As the comment says, the first call will always occur at startup time when
> there is only one thread anyway, so it's perfectly safe.  The parenthetical
> about GCC4 is just an aside.
>
> On Thu, Jul 9, 2009 at 2:47 PM, Rizzuto, Raymond 
> wrote:
>
> I am a bit nervous about the GCC4 comment in
> GeneratedMessageFactory::singleton  (message.cc):
>
>
>
>   // No need for thread-safety here because this will be called at static
>
>   // initialization time.  (And GCC4 makes this thread-safe anyway.)
>
>
>
> I’m using gcc 3.3.3.
>
>
>
> The singleton object in GeneratedMessageFactory::singleton, is a local
> static of non-POD type.  The C++ standard says:
>
>
>
> An implementation is permitted to perform early initialization of other
> local objects with static storage duration under the same conditions that
> an implementation is permitted to statically initialize an object with
> static storage duration in namespace scope (3.6.2). Otherwise such an
> object is initialized the first time control passes through its
> declaration; such an object is considered initialized upon the completion
> of its initialization.
>
>
>
> I don’t think the language standard addresses what “first time control
> passes through its declaration” means when two threads call the function
> simultaneously.  Perhaps gcc4 provides features that make that safe.  I
> don’t know if that is something that can be relied on in all compilers,
> however.
>
>
>
> Ray
>
>
>  --------------
>
> *From:* Kenton Varda [mailto:ken...@google.com]
> *Sent:* Thursday, July 09, 2009 5:08 PM
>
>
> *To:* Rizzuto, Raymond
> *Cc:* protobuf@googlegroups.com
> *Subject:* Re: intermittent issue with encode (version 2.0.3)
>
>
>
> I suppose you could also temporarily edit the header file.
>
> On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond 
> wrote:
>
> I’m trying to, without success.   Breakpoints in header files, at least
> with the version of tools I have, don’t work very well.
>
>
>  --
>
> *From:* Kenton Varda [mailto:ken...@google.com]
> *Sent:* Thursday, July 09, 2009 5:02 PM
> *To:* Rizzuto, Raymond
> *Cc:* protobuf@googlegroups.com
> *Subject:* Re: intermittent issue with encode (version 2.0.3)
>
>
>
> Run in a debugger and set a breakpoint at wire_format_inl.h:289.
>
> On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond 
> wrote:
>
> I think I have an error in my code (C++) that only occurs when I have
> multiple threads, and a lot of message volume.   Even then, I can run the
> same test many times, but only get a failure on some runs.  With 7 threads
> running on a 4 core machine, and generating 480384 google protocol buffer
> messages, I get 33 errors like this to stdout:
>
>
>
> libprotobuf ERROR
> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
> Encountered string containing invalid UTF-8 data while serializing protocol
> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
>
>
> I believe that the data is in error since I get similar errors decoding the
> messages:
>
>
>
> libprotobuf ERROR
> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
> Encountered string containing invalid UTF-8 data while parsing protocol
> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
>
>
> Is there any way that I can check for this at run time so that I can print
> out more context?  I do call IsInitialized before serializing, but that
> doesn’t check for this case.
>
>
>
> I am running on SLES9SP4, using gcc 3.3.3 as the compiler.
>
>
>
> Ray
>
>
>  --
>
> Ray Rizzuto
>
> raymond.rizz...@sig.com
>
> Susquehanna International Group
>
> (61

RE: intermittent issue with encode (version 2.0.3)

2009-07-10 Thread Rizzuto, Raymond
Is there something I need to do in the application to cause that to happen 
before I've spun off multiple threads?  Currently I spin up threads, which then 
start making calls to fill in and then encode google protobuffers.


From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 6:05 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)

As the comment says, the first call will always occur at startup time when 
there is only one thread anyway, so it's perfectly safe.  The parenthetical 
about GCC4 is just an aside.
On Thu, Jul 9, 2009 at 2:47 PM, Rizzuto, Raymond wrote:

I am a bit nervous about the GCC4 comment in GeneratedMessageFactory::singleton 
 (message.cc):



  // No need for thread-safety here because this will be called at static

  // initialization time.  (And GCC4 makes this thread-safe anyway.)



I'm using gcc 3.3.3.



The singleton object in GeneratedMessageFactory::singleton, is a local static 
of non-POD type.  The C++ standard says:



An implementation is permitted to perform early initialization of other
local objects with static storage duration under the same conditions that
an implementation is permitted to statically initialize an object with
static storage duration in namespace scope (3.6.2). Otherwise such an
object is initialized the first time control passes through its
declaration; such an object is considered initialized upon the completion
of its initialization.



I don't think the language standard addresses what "first time control passes 
through its declaration" means when two threads call the function 
simultaneously.  Perhaps gcc4 provides features that make that safe.  I don't 
know if that is something that can be relied on in all compilers, however.



Ray





From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:08 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)



I suppose you could also temporarily edit the header file.

On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond wrote:

I'm trying to, without success.   Breakpoints in header files, at least with 
the version of tools I have, don't work very well.





From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:02 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)



Run in a debugger and set a breakpoint at wire_format_inl.h:289.

On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond wrote:

I think I have an error in my code (C++) that only occurs when I have multiple 
threads, and a lot of message volume.   Even then, I can run the same test many 
times, but only get a failure on some runs.  With 7 threads running on a 4 core 
machine, and generating 480384 google protocol buffer messages, I get 33 errors 
like this to stdout:



libprotobuf ERROR 
/siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
 Encountered string containing invalid UTF-8 data while serializing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.



I believe that the data is in error since I get similar errors decoding the 
messages:



libprotobuf ERROR 
/siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
 Encountered string containing invalid UTF-8 data while parsing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.



Is there any way that I can check for this at run time so that I can print out 
more context?  I do call IsInitialized before serializing, but that doesn't 
check for this case.



I am running on SLES9SP4, using gcc 3.3.3 as the compiler.



Ray





Ray Rizzuto

raymond.rizz...@sig.com

Susquehanna International Group

(610)747-2336 (W)

(215)776-3780 (C)









IMPORTANT: The information contained in this email and/or its attachments is 
confidential. If you are not the intended recipient, please notify the sender 
immediately by reply and immediately delete this message and all its 
attachments. Any review, use, reproduction, disclosure or dissemination of this 
message or any attachment by an unintended recipient is strictly prohibited. 
Neither this message nor any attachment is intended as or should be construed 
as an offer, solicitation or recomme

Re: intermittent issue with encode (version 2.0.3)

2009-07-09 Thread Kenton Varda
As the comment says, the first call will always occur at startup time when
there is only one thread anyway, so it's perfectly safe.  The parenthetical
about GCC4 is just an aside.

On Thu, Jul 9, 2009 at 2:47 PM, Rizzuto, Raymond wrote:

>  I am a bit nervous about the GCC4 comment in
> GeneratedMessageFactory::singleton  (message.cc):
>
>
>
>   // No need for thread-safety here because this will be called at static
>
>   // initialization time.  (And GCC4 makes this thread-safe anyway.)
>
>
>
> I’m using gcc 3.3.3.
>
>
>
> The singleton object in GeneratedMessageFactory::singleton, is a local
> static of non-POD type.  The C++ standard says:
>
>
>
> An implementation is permitted to perform early initialization of other
> local objects with static storage duration under the same conditions that
> an implementation is permitted to statically initialize an object with
> static storage duration in namespace scope (3.6.2). Otherwise such an
> object is initialized the first time control passes through its
> declaration; such an object is considered initialized upon the completion
> of its initialization.
>
>
>
> I don’t think the language standard addresses what “first time control
> passes through its declaration” means when two threads call the function
> simultaneously.  Perhaps gcc4 provides features that make that safe.  I
> don’t know if that is something that can be relied on in all compilers,
> however.
>
>
>
> Ray
>
>
>  --
>
> *From:* Kenton Varda [mailto:ken...@google.com]
> *Sent:* Thursday, July 09, 2009 5:08 PM
>
> *To:* Rizzuto, Raymond
> *Cc:* protobuf@googlegroups.com
> *Subject:* Re: intermittent issue with encode (version 2.0.3)
>
>
>
> I suppose you could also temporarily edit the header file.
>
> On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond 
> wrote:
>
> I’m trying to, without success.   Breakpoints in header files, at least
> with the version of tools I have, don’t work very well.
>
>
>  --------------
>
> *From:* Kenton Varda [mailto:ken...@google.com]
> *Sent:* Thursday, July 09, 2009 5:02 PM
> *To:* Rizzuto, Raymond
> *Cc:* protobuf@googlegroups.com
> *Subject:* Re: intermittent issue with encode (version 2.0.3)
>
>
>
> Run in a debugger and set a breakpoint at wire_format_inl.h:289.
>
> On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond 
> wrote:
>
> I think I have an error in my code (C++) that only occurs when I have
> multiple threads, and a lot of message volume.   Even then, I can run the
> same test many times, but only get a failure on some runs.  With 7 threads
> running on a 4 core machine, and generating 480384 google protocol buffer
> messages, I get 33 errors like this to stdout:
>
>
>
> libprotobuf ERROR
> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
> Encountered string containing invalid UTF-8 data while serializing protocol
> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
>
>
> I believe that the data is in error since I get similar errors decoding the
> messages:
>
>
>
> libprotobuf ERROR
> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
> Encountered string containing invalid UTF-8 data while parsing protocol
> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
>
>
> Is there any way that I can check for this at run time so that I can print
> out more context?  I do call IsInitialized before serializing, but that
> doesn’t check for this case.
>
>
>
> I am running on SLES9SP4, using gcc 3.3.3 as the compiler.
>
>
>
> Ray
>
>
>  --
>
> Ray Rizzuto
>
> raymond.rizz...@sig.com
>
> Susquehanna International Group
>
> (610)747-2336 (W)
>
> (215)776-3780 (C)
>

RE: intermittent issue with encode (version 2.0.3)

2009-07-09 Thread Rizzuto, Raymond
I am a bit nervous about the GCC4 comment in GeneratedMessageFactory::singleton 
 (message.cc):

  // No need for thread-safety here because this will be called at static
  // initialization time.  (And GCC4 makes this thread-safe anyway.)

I'm using gcc 3.3.3.

The singleton object in GeneratedMessageFactory::singleton, is a local static 
of non-POD type.  The C++ standard says:

An implementation is permitted to perform early initialization of other
local objects with static storage duration under the same conditions that
an implementation is permitted to statically initialize an object with
static storage duration in namespace scope (3.6.2). Otherwise such an
object is initialized the first time control passes through its
declaration; such an object is considered initialized upon the completion
of its initialization.

I don't think the language standard addresses what "first time control passes 
through its declaration" means when two threads call the function 
simultaneously.  Perhaps gcc4 provides features that make that safe.  I don't 
know if that is something that can be relied on in all compilers, however.
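
To make the concern concrete, here is a hand-written approximation of what an unguarded function-local static amounts to. It is a sketch, not actual compiler output, and Factory and UnsafeSingleton are hypothetical names. The flag is tested and set without any lock, so two threads entering at the same moment can both see it as false.

```cpp
#include <cassert>

// Hypothetical non-POD type standing in for the real factory class.
struct Factory {
  Factory() { ++constructed; }
  static int constructed;  // counts how many times the constructor ran
};
int Factory::constructed = 0;

// Hand-written equivalent of "static Factory instance;" on a compiler that
// adds no locking around the first-use check. This models the pattern only.
Factory* UnsafeSingleton() {
  static bool initialized = false;  // plain flag: no lock, no atomic
  static Factory* instance = 0;
  if (!initialized) {               // (1) two threads can both read false here
    instance = new Factory();       // (2) and then both run the constructor
    initialized = true;             // (3) flag is set only after construction
  }
  return instance;
}
```

Called from a single thread, the constructor runs exactly once. The window between steps (1) and (3) is what a compiler with thread-safe statics closes; GCC 4 added that locking, and C++11 later required it of every conforming implementation. On an older compiler the practical defence is to make the first call before any thread exists.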

Ray


From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:08 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)

I suppose you could also temporarily edit the header file.
On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond wrote:

I'm trying to, without success.   Breakpoints in header files, at least with 
the version of tools I have, don't work very well.





From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:02 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)



Run in a debugger and set a breakpoint at wire_format_inl.h:289.

On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond wrote:

I think I have an error in my code (C++) that only occurs when I have multiple 
threads, and a lot of message volume.   Even then, I can run the same test many 
times, but only get a failure on some runs.  With 7 threads running on a 4 core 
machine, and generating 480384 google protocol buffer messages, I get 33 errors 
like this to stdout:



libprotobuf ERROR 
/siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
 Encountered string containing invalid UTF-8 data while serializing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.



I believe that the data is in error since I get similar errors decoding the 
messages:



libprotobuf ERROR 
/siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
 Encountered string containing invalid UTF-8 data while parsing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.



Is there any way that I can check for this at run time so that I can print out 
more context?  I do call IsInitialized before serializing, but that doesn't 
check for this case.



I am running on SLES9SP4, using gcc 3.3.3 as the compiler.



Ray





Ray Rizzuto

raymond.rizz...@sig.com

Susquehanna International Group

(610)747-2336 (W)

(215)776-3780 (C)









IMPORTANT: The information contained in this email and/or its attachments is 
confidential. If you are not the intended recipient, please notify the sender 
immediately by reply and immediately delete this message and all its 
attachments. Any review, use, reproduction, disclosure or dissemination of this 
message or any attachment by an unintended recipient is strictly prohibited. 
Neither this message nor any attachment is intended as or should be construed 
as an offer, solicitation or recommendation to buy or sell any security or 
other financial instrument. Neither the sender, his or her employer nor any of 
their respective affiliates makes any warranties as to the completeness or 
accuracy of any of the information contained herein or that this message or any 
of its attachments is free of viruses.








RE: intermittent issue with encode (version 2.0.3)

2009-07-09 Thread Rizzuto, Raymond
Exactly.  Basically I want to perform any "standard" validation on content 
across all fields of the message, not application level validation.  Since I 
have a number of complicated messages in my application, it would be somewhat 
tedious to write that code.  If the compiler can add it automatically, or if 
there was a way of iterating/introspecting the object to validate it, that 
would be great.
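
As a rough idea of the kind of per-field check meant here, the following is a minimal standalone sketch of a UTF-8 well-formedness test. The function name IsValidUtf8 is hypothetical and this is not the checker protobuf uses internally; it only follows the general UTF-8 rules (no stray continuation bytes, no truncated or overlong sequences, no surrogates, nothing above U+10FFFF).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if `s` is well-formed UTF-8. Hypothetical helper for
// application code; not part of the protobuf API.
bool IsValidUtf8(const std::string& s) {
  const std::size_t n = s.size();
  std::size_t i = 0;
  while (i < n) {
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    if (lead < 0x80) {  // single-byte (ASCII) character
      ++i;
      continue;
    }
    std::size_t extra;         // continuation bytes expected after the lead
    unsigned long code_point;  // value decoded so far
    unsigned long min_value;   // smallest code point legal for this length
    if ((lead & 0xE0) == 0xC0) {
      extra = 1; code_point = lead & 0x1F; min_value = 0x80;
    } else if ((lead & 0xF0) == 0xE0) {
      extra = 2; code_point = lead & 0x0F; min_value = 0x800;
    } else if ((lead & 0xF8) == 0xF0) {
      extra = 3; code_point = lead & 0x07; min_value = 0x10000;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (n - i <= extra) return false;  // sequence is cut off by end of string
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char c = static_cast<unsigned char>(s[i + k]);
      if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
      code_point = (code_point << 6) | (c & 0x3F);
    }
    if (code_point < min_value) return false;                         // overlong
    if (code_point >= 0xD800 && code_point <= 0xDFFF) return false;   // surrogate
    if (code_point > 0x10FFFF) return false;                          // out of range
    i += extra + 1;
  }
  return true;
}
```

Calling something like this on each string field just before serializing would let the application log the field name and the offending bytes, which is the extra context the library's error message does not give.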


From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:34 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)

Sorry, I think I misread your message.

You just want there to be a method like IsInitialized() that you can call to 
validate UTF-8 stuff.  I'll think about that.

On Thu, Jul 9, 2009 at 2:32 PM, Kenton Varda wrote:
This is something you can do in your own code -- just call your validation 
function before serializing.  If this were to be a "feature" of protocol 
buffers, then we'd have to store a pointer to your validator function 
somewhere.  Storing it in the message object itself would harm performance and 
memory usage, but storing it in a static location (such that it applies to all 
instances of the type) would bring all the myriad problems commonly associated 
with singletons.  So I don't think there's any reasonable way for the protobuf 
system to provide this.

On Thu, Jul 9, 2009 at 2:14 PM, Rizzuto, Raymond wrote:

I'm going to try that.   Since another group builds and packages the libraries 
I use, it'll take a bit to make a private copy with that change.



As an enhancement request, I wish there was a function I could call to validate 
the message content before serialize, that would tell me about any fields of 
the message that are in error.  I.e. so I could catch that issue similarly to 
catching uninitialized fields:



// Report every missing required field by name before giving up.
if (!m.IsInitialized())
{
    std::string error = name + " is missing fields: ";
    std::vector<std::string> errors;
    m.FindInitializationErrors(&errors);
    std::vector<std::string>::const_iterator it;
    for (it = errors.begin(); it != errors.end(); ++it)
    {
        if (it != errors.begin())
            error += ", ";
        error += *it;
    }
    throw SPException(error.c_str());
}



It might not be something I'd do in production, but it sure would help during 
development.





From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:08 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)



I suppose you could also temporarily edit the header file.

On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond wrote:

I'm trying to, without success.   Breakpoints in header files, at least with 
the version of tools I have, don't work very well.





From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:02 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)



Run in a debugger and set a breakpoint at wire_format_inl.h:289.

On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond wrote:

I think I have an error in my code (C++) that only occurs when I have multiple 
threads, and a lot of message volume.   Even then, I can run the same test many 
times, but only get a failure on some runs.  With 7 threads running on a 4 core 
machine, and generating 480384 google protocol buffer messages, I get 33 errors 
like this to stdout:



libprotobuf ERROR 
/siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
 Encountered string containing invalid UTF-8 data while serializing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.



I believe that the data is in error since I get similar errors decoding the 
messages:



libprotobuf ERROR 
/siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
 Encountered string containing invalid UTF-8 data while parsing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.



Is there any way that I can check for this at run time so that I can print out 
more context?  I do call IsInitialized before serializing, but that doesn't 
check for this case.



I am running on SLES9SP4, using gcc 3.3.3 as the compiler.



Ray




Re: intermittent issue with encode (version 2.0.3)

2009-07-09 Thread Kenton Varda
Sorry, I think I misread your message.
You just want there to be a method like IsInitialized() that you can call to
validate UTF-8 stuff.  I'll think about that.

On Thu, Jul 9, 2009 at 2:32 PM, Kenton Varda  wrote:

> This is something you can do in your own code -- just call your validation
> function before serializing.  If this were to be a "feature" of protocol
> buffers, then we'd have to store a pointer to your validator function
> somewhere.  Storing it in the message object itself would harm performance
> and memory usage, but storing it in a static location (such that it applies
> to all instances of the type) would bring all the myriad problems commonly
> associated with singletons.  So I don't think there's any reasonable way for
> the protobuf system to provide this.
>
>
> On Thu, Jul 9, 2009 at 2:14 PM, Rizzuto, Raymond 
> wrote:
>
>>  I’m going to try that.   Since another group builds and packages the
>> libraries I use, it’ll take a bit to make a private copy with that change.
>>
>>
>>
>> As an enhancement request, I wish there was a function I could call to
>> validate the message content before serialize, that would tell me about any
>> fields of the message that are in error.  I.e. so I could catch that issue
>> similarly to catching uninitialized fields:
>>
>>
>>
>> // Report every missing required field by name before giving up.
>> if (!m.IsInitialized())
>> {
>>     std::string error = name + " is missing fields: ";
>>     std::vector<std::string> errors;
>>     m.FindInitializationErrors(&errors);
>>     std::vector<std::string>::const_iterator it;
>>     for (it = errors.begin(); it != errors.end(); ++it)
>>     {
>>         if (it != errors.begin())
>>             error += ", ";
>>         error += *it;
>>     }
>>     throw SPException(error.c_str());
>> }
>>
>>
>>
>> It might not be something I’d do in production, but it sure would help
>> during development.
>>
>>
>>  --
>>
>> *From:* Kenton Varda [mailto:ken...@google.com]
>> *Sent:* Thursday, July 09, 2009 5:08 PM
>>
>> *To:* Rizzuto, Raymond
>> *Cc:* protobuf@googlegroups.com
>> *Subject:* Re: intermittent issue with encode (version 2.0.3)
>>
>>
>>
>> I suppose you could also temporarily edit the header file.
>>
>> On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond 
>> wrote:
>>
>> I’m trying to, without success.   Breakpoints in header files, at least
>> with the version of tools I have, don’t work very well.
>>
>>
>>  --
>>
>> *From:* Kenton Varda [mailto:ken...@google.com]
>> *Sent:* Thursday, July 09, 2009 5:02 PM
>> *To:* Rizzuto, Raymond
>> *Cc:* protobuf@googlegroups.com
>> *Subject:* Re: intermittent issue with encode (version 2.0.3)
>>
>>
>>
>> Run in a debugger and set a breakpoint at wire_format_inl.h:289.
>>
>> On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond 
>> wrote:
>>
>> I think I have an error in my code (C++) that only occurs when I have
>> multiple threads, and a lot of message volume.   Even then, I can run the
>> same test many times, but only get a failure on some runs.  With 7 threads
>> running on a 4 core machine, and generating 480384 google protocol buffer
>> messages, I get 33 errors like this to stdout:
>>
>>
>>
>> libprotobuf ERROR
>> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
>> Encountered string containing invalid UTF-8 data while serializing protocol
>> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>>
>>
>>
>> I believe that the data is in error since I get similar errors decoding
>> the messages:
>>
>>
>>
>> libprotobuf ERROR
>> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
>> Encountered string containing invalid UTF-8 data while parsing protocol
>> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>>
>>
>>
>> Is there any way that I can check for this at run time so that I can print
>> out more context?  I do call IsInitialized before serializing, but that
>> doesn’t check for this case.
>>
>>
>>
>> I am runnin

Re: intermittent issue with encode (version 2.0.3)

2009-07-09 Thread Kenton Varda
This is something you can do in your own code -- just call your validation
function before serializing.  If this were to be a "feature" of protocol
buffers, then we'd have to store a pointer to your validator function
somewhere.  Storing it in the message object itself would harm performance
and memory usage, but storing it in a static location (such that it applies
to all instances of the type) would bring all the myriad problems commonly
associated with singletons.  So I don't think there's any reasonable way for
the protobuf system to provide this.
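
A minimal sketch of that "validate in your own code" approach follows. FakeMessage, ValidateFakeMessage and ValidateThenSerialize are hypothetical names used only for illustration; FakeMessage models nothing of a generated class beyond a SerializeToString(std::string*) member. The validator is passed in at the call site, so nothing has to be stored in the message object or in a static location.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a generated message class. Only the one member
// function used below is modeled.
struct FakeMessage {
  std::string name;
  bool SerializeToString(std::string* out) const {
    *out = name;
    return true;
  }
};

// Hypothetical application-level check; here it only rejects an empty name.
bool ValidateFakeMessage(const FakeMessage& m, std::string* error) {
  if (m.name.empty()) {
    *error = "name is empty";
    return false;
  }
  return true;
}

// Runs the caller-supplied validator first and serializes only on success.
// On failure, *error describes what went wrong and *out is left untouched
// by the validator path.
template <typename Message, typename Validator>
bool ValidateThenSerialize(const Message& m, Validator validate,
                           std::string* out, std::string* error) {
  if (!validate(m, error)) return false;
  if (!m.SerializeToString(out)) {
    *error = "serialization failed";
    return false;
  }
  return true;
}
```

Each message type would get its own validator function, and the call that used to be m.SerializeToString(&out) becomes ValidateThenSerialize(m, ValidateFakeMessage, &out, &error).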

On Thu, Jul 9, 2009 at 2:14 PM, Rizzuto, Raymond wrote:

>  I’m going to try that.   Since another group builds and packages the
> libraries I use, it’ll take a bit to make a private copy with that change.
>
>
>
> As an enhancement request, I wish there was a function I could call to
> validate the message content before serialize, that would tell me about any
> fields of the message that are in error.  I.e. so I could catch that issue
> similarly to catching uninitialized fields:
>
>
>
> // Report every missing required field by name before giving up.
> if (!m.IsInitialized())
> {
>     std::string error = name + " is missing fields: ";
>     std::vector<std::string> errors;
>     m.FindInitializationErrors(&errors);
>     std::vector<std::string>::const_iterator it;
>     for (it = errors.begin(); it != errors.end(); ++it)
>     {
>         if (it != errors.begin())
>             error += ", ";
>         error += *it;
>     }
>     throw SPException(error.c_str());
> }
>
>
>
> It might not be something I’d do in production, but it sure would help
> during development.
>
>
>  --
>
> *From:* Kenton Varda [mailto:ken...@google.com]
> *Sent:* Thursday, July 09, 2009 5:08 PM
>
> *To:* Rizzuto, Raymond
> *Cc:* protobuf@googlegroups.com
> *Subject:* Re: intermittent issue with encode (version 2.0.3)
>
>
>
> I suppose you could also temporarily edit the header file.
>
> On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond 
> wrote:
>
> I’m trying to, without success.   Breakpoints in header files, at least
> with the version of tools I have, don’t work very well.
>
>
>  --
>
> *From:* Kenton Varda [mailto:ken...@google.com]
> *Sent:* Thursday, July 09, 2009 5:02 PM
> *To:* Rizzuto, Raymond
> *Cc:* protobuf@googlegroups.com
> *Subject:* Re: intermittent issue with encode (version 2.0.3)
>
>
>
> Run in a debugger and set a breakpoint at wire_format_inl.h:289.
>
> On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond 
> wrote:
>
> I think I have an error in my code (C++) that only occurs when I have
> multiple threads, and a lot of message volume.   Even then, I can run the
> same test many times, but only get a failure on some runs.  With 7 threads
> running on a 4 core machine, and generating 480384 google protocol buffer
> messages, I get 33 errors like this to stdout:
>
>
>
> libprotobuf ERROR
> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
> Encountered string containing invalid UTF-8 data while serializing protocol
> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
>
>
> I believe that the data is in error since I get similar errors decoding the
> messages:
>
>
>
> libprotobuf ERROR
> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
> Encountered string containing invalid UTF-8 data while parsing protocol
> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
>
>
> Is there any way that I can check for this at run time so that I can print
> out more context?  I do call IsInitialized before serializing, but that
> doesn’t check for this case.
>
>
>
> I am running on SLES9SP4, using gcc 3.3.3 as the compiler.
>
>
>
> Ray
>
>
>  --
>
> Ray Rizzuto
>
> raymond.rizz...@sig.com
>
> Susquehanna International Group
>
> (610)747-2336 (W)
>
> (215)776-3780 (C)
>

RE: intermittent issue with encode (version 2.0.3)

2009-07-09 Thread Rizzuto, Raymond
I'm going to try that.   Since another group builds and packages the libraries 
I use, it'll take a bit to make a private copy with that change.

As an enhancement request, I wish there were a function I could call to validate
the message content before serializing, one that would tell me which fields of
the message are in error, so that I could catch that issue the same way I catch
uninitialized fields:

if (!m.IsInitialized())
{
    // Build one message listing every missing required field.
    std::string error = name + " is missing fields: ";
    std::vector<std::string> errors;
    m.FindInitializationErrors(&errors);
    std::vector<std::string>::const_iterator it;
    for (it = errors.begin(); it != errors.end(); ++it)
    {
        if (it != errors.begin())
            error += ", ";
        error += *it;
    }
    throw SPException(error.c_str());
}

It might not be something I'd do in production, but it sure would help during 
development.
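
For the kind of pre-serialization check asked for here, one possible sketch is a
standalone UTF-8 validator that an application could run over its own string
values. The name FirstInvalidUtf8 is hypothetical and is not part of protobuf.
It applies the strict RFC 3629 rules, which may not match the library's own
check byte for byte, and it sticks to C++98 so that an older compiler such as
gcc 3.3.3 should accept it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not a protobuf API.
// Returns the byte offset of the first invalid UTF-8 sequence in s,
// or std::string::npos when the whole string is valid UTF-8.
std::size_t FirstInvalidUtf8(const std::string& s) {
  const std::size_t n = s.size();
  std::size_t i = 0;
  while (i < n) {
    const unsigned char c = static_cast<unsigned char>(s[i]);
    if (c < 0x80) {  // plain ASCII byte
      ++i;
      continue;
    }
    std::size_t len = 0;       // total bytes in this sequence
    unsigned long cp = 0;      // decoded code point
    unsigned long min_cp = 0;  // smallest code point allowed for this length
    if ((c & 0xE0) == 0xC0) {
      len = 2; cp = c & 0x1F; min_cp = 0x80;
    } else if ((c & 0xF0) == 0xE0) {
      len = 3; cp = c & 0x0F; min_cp = 0x800;
    } else if ((c & 0xF8) == 0xF0) {
      len = 4; cp = c & 0x07; min_cp = 0x10000;
    } else {
      return i;  // stray continuation byte or invalid lead byte
    }
    if (n - i < len) return i;  // sequence is cut off at the end of the string
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char cc = static_cast<unsigned char>(s[i + k]);
      if ((cc & 0xC0) != 0x80) return i;  // not a continuation byte
      cp = (cp << 6) | (cc & 0x3F);
    }
    if (cp < min_cp) return i;                   // overlong encoding
    if (cp > 0x10FFFF) return i;                 // beyond the Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return i;  // UTF-16 surrogate
    i += len;
  }
  return std::string::npos;
}
```

A caller could run this over each string value before serializing and throw
SPException with the field name and the returned offset, in the same way as the
IsInitialized check above.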


From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:08 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)

I suppose you could also temporarily edit the header file.
On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond <raymond.rizz...@sig.com> wrote:

I'm trying to, without success.   Breakpoints in header files, at least with 
the version of tools I have, don't work very well.





From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:02 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)



Run in a debugger and set a breakpoint at wire_format_inl.h:289.

On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond <raymond.rizz...@sig.com> wrote:

I think I have an error in my code (C++) that only occurs when I have multiple 
threads, and a lot of message volume.   Even then, I can run the same test many 
times, but only get a failure on some runs.  With 7 threads running on a 4 core 
machine, and generating 480384 google protocol buffer messages, I get 33 errors 
like this to stdout:



libprotobuf ERROR 
/siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
 Encountered string containing invalid UTF-8 data while serializing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.



I believe that the data is in error since I get similar errors decoding the 
messages:



libprotobuf ERROR 
/siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
 Encountered string containing invalid UTF-8 data while parsing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.



Is there any way that I can check for this at run time so that I can print out 
more context?  I do call IsInitialized before serializing, but that doesn't 
check for this case.



I am running on SLES9SP4, using gcc 3.3.3 as the compiler.



Ray





Ray Rizzuto

raymond.rizz...@sig.com

Susquehanna International Group

(610)747-2336 (W)

(215)776-3780 (C)










Re: intermittent issue with encode (version 2.0.3)

2009-07-09 Thread Kenton Varda
I suppose you could also temporarily edit the header file.

On Thu, Jul 9, 2009 at 2:05 PM, Rizzuto, Raymond wrote:

>  I’m trying to, without success.   Breakpoints in header files, at least
> with the version of tools I have, don’t work very well.
>
>
>  --
>
> *From:* Kenton Varda [mailto:ken...@google.com]
> *Sent:* Thursday, July 09, 2009 5:02 PM
> *To:* Rizzuto, Raymond
> *Cc:* protobuf@googlegroups.com
> *Subject:* Re: intermittent issue with encode (version 2.0.3)
>
>
>
> Run in a debugger and set a breakpoint at wire_format_inl.h:289.
>
> On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond 
> wrote:
>
> I think I have an error in my code (C++) that only occurs when I have
> multiple threads, and a lot of message volume.   Even then, I can run the
> same test many times, but only get a failure on some runs.  With 7 threads
> running on a 4 core machine, and generating 480384 google protocol buffer
> messages, I get 33 errors like this to stdout:
>
>
>
> libprotobuf ERROR
> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
> Encountered string containing invalid UTF-8 data while serializing protocol
> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
>
>
> I believe that the data is in error since I get similar errors decoding the
> messages:
>
>
>
> libprotobuf ERROR
> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
> Encountered string containing invalid UTF-8 data while parsing protocol
> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
>
>
> Is there any way that I can check for this at run time so that I can print
> out more context?  I do call IsInitialized before serializing, but that
> doesn’t check for this case.
>
>
>
> I am running on SLES9SP4, using gcc 3.3.3 as the compiler.
>
>
>
> Ray
>
>
>  --
>
> Ray Rizzuto
>
> raymond.rizz...@sig.com
>
> Susquehanna International Group
>
> (610)747-2336 (W)
>
> (215)776-3780 (C)
>
>
>
>
>
>

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



RE: intermittent issue with encode (version 2.0.3)

2009-07-09 Thread Rizzuto, Raymond
I'm trying to, without success.   Breakpoints in header files, at least with 
the version of tools I have, don't work very well.


From: Kenton Varda [mailto:ken...@google.com]
Sent: Thursday, July 09, 2009 5:02 PM
To: Rizzuto, Raymond
Cc: protobuf@googlegroups.com
Subject: Re: intermittent issue with encode (version 2.0.3)

Run in a debugger and set a breakpoint at wire_format_inl.h:289.
On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond <raymond.rizz...@sig.com> wrote:

I think I have an error in my code (C++) that only occurs when I have multiple 
threads, and a lot of message volume.   Even then, I can run the same test many 
times, but only get a failure on some runs.  With 7 threads running on a 4 core 
machine, and generating 480384 google protocol buffer messages, I get 33 errors 
like this to stdout:



libprotobuf ERROR 
/siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
 Encountered string containing invalid UTF-8 data while serializing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.



I believe that the data is in error since I get similar errors decoding the 
messages:



libprotobuf ERROR 
/siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
 Encountered string containing invalid UTF-8 data while parsing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.



Is there any way that I can check for this at run time so that I can print out 
more context?  I do call IsInitialized before serializing, but that doesn't 
check for this case.



I am running on SLES9SP4, using gcc 3.3.3 as the compiler.



Ray





Ray Rizzuto

raymond.rizz...@sig.com

Susquehanna International Group

(610)747-2336 (W)

(215)776-3780 (C)









Re: intermittent issue with encode (version 2.0.3)

2009-07-09 Thread Kenton Varda
Run in a debugger and set a breakpoint at wire_format_inl.h:289.

On Thu, Jul 9, 2009 at 1:56 PM, Rizzuto, Raymond wrote:

>  I think I have an error in my code (C++) that only occurs when I have
> multiple threads, and a lot of message volume.   Even then, I can run the
> same test many times, but only get a failure on some runs.  With 7 threads
> running on a 4 core machine, and generating 480384 google protocol buffer
> messages, I get 33 errors like this to stdout:
>
>
>
> libprotobuf ERROR
> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
> Encountered string containing invalid UTF-8 data while serializing protocol
> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
>
>
> I believe that the data is in error since I get similar errors decoding the
> messages:
>
>
>
> libprotobuf ERROR
> /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
> Encountered string containing invalid UTF-8 data while parsing protocol
> buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
>
>
>
> Is there any way that I can check for this at run time so that I can print
> out more context?  I do call IsInitialized before serializing, but that
> doesn’t check for this case.
>
>
>
> I am running on SLES9SP4, using gcc 3.3.3 as the compiler.
>
>
>
> Ray
>
>
>  --
>
> Ray Rizzuto
>
> raymond.rizz...@sig.com
>
> Susquehanna International Group
>
> (610)747-2336 (W)
>
> (215)776-3780 (C)
>
>
>
>
>



intermittent issue with encode (version 2.0.3)

2009-07-09 Thread Rizzuto, Raymond
I think I have an error in my code (C++) that only occurs when I have multiple 
threads and a high message volume. Even then, I can run the same test many 
times and only get a failure on some runs. With 7 threads running on a 4-core 
machine and generating 480384 Google protocol buffer messages, I get 33 errors 
like this on stdout:

libprotobuf ERROR /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:289]
Encountered string containing invalid UTF-8 data while serializing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.

I believe that the data is in error since I get similar errors decoding the 
messages:

libprotobuf ERROR /siglinux/tc/sles9sp4_gcc-3.3.3_i686/sig1/protobuf-2.0.3/include/google/protobuf/wire_format_inl.h:138]
Encountered string containing invalid UTF-8 data while parsing protocol 
buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.

Is there any way that I can check for this at run time so that I can print out 
more context?  I do call IsInitialized before serializing, but that doesn't 
check for this case.
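
On the "print out more context" part, one option is to log a suspect string
value in escaped form before handing the message to the serializer, so that
corrupted bytes show up in the log. This is only a sketch: EscapeForLog is a
hypothetical helper rather than anything protobuf provides, and it is written
in C++98 style with gcc 3.3.3 in mind.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not a protobuf API.
// Returns a copy of s that is safe to print: printable ASCII is kept as-is,
// and every other byte (and the backslash itself) is written as \xHH.
std::string EscapeForLog(const std::string& s) {
  static const char kHex[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(s.size());
  for (std::size_t i = 0; i < s.size(); ++i) {
    const unsigned char c = static_cast<unsigned char>(s[i]);
    if (c >= 0x20 && c < 0x7F && c != '\\') {
      out += static_cast<char>(c);
    } else {
      out += "\\x";
      out += kHex[c >> 4];
      out += kHex[c & 0x0F];
    }
  }
  return out;
}
```

Writing the escaped value together with the thread and message identifiers the
application already tracks should make it easier to see which field was
corrupted, and whether the bad bytes look like overwritten memory or simply
non-UTF-8 input.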

I am running on SLES9SP4, using gcc 3.3.3 as the compiler.

Ray


Ray Rizzuto
raymond.rizz...@sig.com
Susquehanna International Group
(610)747-2336 (W)
(215)776-3780 (C)



