[protobuf] Memory leak (fragmentation?)

2009-11-24 Thread Greg Burri
Hi,

I'm using Protocol Buffers in an open source project[1]. I have a tree
structure in memory which represents some directories and files.
This structure must be persisted, so I've defined some protocol
messages[2] to do that.

The persist routine[3] will run through the tree structure to build a
'Protos.FileCache.Hashes' message and serialize it to the hard drive,
then this message will be deleted.

Each time I persist, memory usage increases by roughly 200 KB (very
approximately) for a tree which contains ~30'000 files and ~4'000
directories.
I tried removing all the 'set' calls, such as
'dirToFill.set_name(this->getName().toStdString());'[4], to rule out a
leak in my own code, keeping only the Protocol Buffers methods. I also
removed the serialize call
'Common::PersistantData::setValue(FILE_CACHE, hashes);'[3].
Nonetheless I still observe a memory leak.

Is this due to memory fragmentation? It's very strange because the
method 'FileManager::persistCacheToFile()'[3] should free all its
memory when it returns. Should I use another memory allocator, such as
tcmalloc?

[1] : http://dev.euphorik.ch/projects/show/pmp

[2] :
http://git.euphorik.ch/index.cgi?p=aybabtu.git;a=blob;f=application/Protos/files_cache.proto;h=b6dca154ae46b57c29cf3a4b66c9bece52dd3953;hb=b60e04e926bff06ec1707cc13aa70b6a4b8aaa1c

[3] :  FileManager::persistCacheToFile() :
http://git.euphorik.ch/index.cgi?p=aybabtu.git;a=blob;f=application/Core/FileManager/priv/FileManager.cpp;h=c33f94986bef3edcb6ca2ce539fdc11df0bb2b1e;hb=b60e04e926bff06ec1707cc13aa70b6a4b8aaa1c#l289

[4] :
http://git.euphorik.ch/index.cgi?p=aybabtu.git;a=blob;f=application/Core/FileManager/priv/Cache/Directory.cpp;h=0818fefdf7ba2b63918c0157f295afc72c492f03;hb=b60e04e926bff06ec1707cc13aa70b6a4b8aaa1c#l76


Thanks in advance !

/Greg

--

You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.




[protobuf] Protocol Buffers and asynchronous sockets

2009-11-24 Thread Gilad Ben-Ami
Hey,

I'm using the ACE library for C++ and its Reactor pattern to handle
asynchronous reads from and writes to sockets.
I'm trying to integrate Protocol Buffers into my solution in order to
exchange data with another process developed in Java.

The way asynchronous I/O works forces me to know the expected message
size in advance, and to try parsing with PB only after I have all the
data.
What is the best way to use PB in this scenario? Is there any stream I
can use to hold the data that has arrived? And can I recover from a
parse that failed because not enough data had arrived?

Your help is appreciated.
Thanks.





Re: [protobuf] Protocol Buffers and asynchronous sockets

2009-11-24 Thread Mika Raento
Protobufs are pretty much designed to be read all at once. The normal
thing would be to define a stream format that prefixes the serialized
protobufs with their length and buffer the data until a whole protobuf
has been read.

In other words: you should not describe the whole stream as a single
protobuf (like you often would with, say, XML) but instead use a
different format for framing a stream of protobufs.
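
The buffering described above can be sketched in C++ roughly as
follows. This is a minimal illustration, not code from the thread: the
FrameBuffer class and its method names are hypothetical, and a 4-byte
big-endian length prefix is assumed (the prefix width and byte order
are choices the stream format has to make). Each payload it returns
could then be handed to the message's ParseFromString().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: accumulates bytes from a socket and hands back
// complete length-prefixed payloads. Assumes a 4-byte big-endian prefix.
class FrameBuffer {
 public:
  // Append whatever bytes the socket delivered, however few.
  void Append(const char* data, std::size_t size) {
    if (size > 0) pending_.append(data, size);
  }

  // If a whole frame is buffered, move its payload into *payload and
  // return true. Otherwise leave the buffer untouched and return false.
  bool NextFrame(std::string* payload) {
    assert(payload != nullptr);
    const std::size_t kPrefixSize = 4;
    if (pending_.size() < kPrefixSize) return false;  // prefix incomplete

    std::uint32_t length = 0;
    for (std::size_t i = 0; i < kPrefixSize; ++i) {
      length = (length << 8) | static_cast<unsigned char>(pending_[i]);
    }
    if (pending_.size() - kPrefixSize < length) return false;  // body incomplete

    payload->assign(pending_, kPrefixSize, length);
    pending_.erase(0, kPrefixSize + length);
    return true;
  }

 private:
  std::string pending_;  // bytes received but not yet returned as a frame
};
```

A real implementation would probably also reject implausibly large
lengths before buffering, so that a corrupt prefix cannot make the
buffer grow without bound.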

Regards,
   Mika





[protobuf] Re: Protocol Buffers and asynchronous sockets

2009-11-24 Thread Gilad Ben-Ami
Hey,

The protocol we've defined for this kind of solution is to send a
fixed 4-byte unsigned integer that represents the length of the
following PB message, read the PB message, and then wait for the next
size.

So in this case, what is the best way to use PB?
Should I use SerializeToArray and ParseFromArray instead of the
protobuf::io streams (because the data I'm buffering is stored in a
char* array)? Does PB provide any stream I can feed with data until
I've read all the expected bytes and then ask it to parse?
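
The framing described above (a fixed 4-byte unsigned length followed
by the PB message) could look roughly like this on the sending side.
This is only a sketch under assumptions: the byte order is not stated
in the thread, so big-endian is assumed, and FramePayload and
ReadLengthPrefix are hypothetical helper names. The payload stands for
bytes that have already been serialized, for example with
SerializeToArray.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: prefixes already-serialized message bytes with a
// fixed 4-byte unsigned length. Big-endian order is an assumption.
std::string FramePayload(const std::string& payload) {
  assert(payload.size() <= 0xFFFFFFFFu);  // length must fit in 32 bits
  const std::uint32_t length = static_cast<std::uint32_t>(payload.size());

  std::string frame;
  frame.reserve(4 + payload.size());
  frame.push_back(static_cast<char>((length >> 24) & 0xFF));
  frame.push_back(static_cast<char>((length >> 16) & 0xFF));
  frame.push_back(static_cast<char>((length >> 8) & 0xFF));
  frame.push_back(static_cast<char>(length & 0xFF));
  frame.append(payload);
  return frame;
}

// Reads the 4-byte big-endian length back from the start of a buffer
// that holds at least 4 bytes.
std::uint32_t ReadLengthPrefix(const char* data) {
  std::uint32_t length = 0;
  for (int i = 0; i < 4; ++i) {
    length = (length << 8) | static_cast<unsigned char>(data[i]);
  }
  return length;
}
```

With big-endian order, the Java side can read the prefix with
DataInputStream.readInt(); lengths above 2^31 - 1 would show up as
negative values there.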

Thanks.






[protobuf] Re: Getting all fields of Message through the java reflection api

2009-11-24 Thread Thibaut
Thanks,

It's working fine now!





[protobuf] Re: Protocol Buffers and asynchronous sockets

2009-11-24 Thread Gilad Ben-Ami
Thanks for the suggestion.

Do you think that using std::iostream in the following scenario would
work / be a good choice?
1. Read message_length.
2. Buffer message_length bytes into an iostream variable.
3. When all the data has been received, use IstreamInputStream to wrap
the iostream and have it parsed with ParseFromZeroCopyStream().

Does the iostream handle releasing the bytes already read by PB?

Thanks.

On Nov 24, 5:29 pm, Evan Jones ev...@mit.edu wrote:
 Gilad Ben-Ami wrote:
  So in this case, what is the best method to use PB?
  Should i use SerializeToArray and ParseFromArray instead of using the
  protobuf::io streams?

 To use protocol buffers with an asynchronous library, you need to
 collect the data for the message in some data structure until you know
 it is all there. If performance is not critical, the least-effort
 approach is:

 1. Read the message_length from the stream in some way.
 2. Create a std::string.
 3. Read message_length bytes from the stream, appending them to the
 std::string.
 4. Use message.ParseFromString() to parse the message.

 This can be bad for performance because the data may be copied many
 times. If performance is really critical, you basically need to
 efficiently collect the bytes into some buffer data structure. I'm
 assuming the ACE library probably provides something that does this?
 Then, once you have at least message_length bytes, you parse it via a
 ZeroCopyInputStream implementation.

 For my asynchronous library, my implementation is approximately:

 // assume we read the message_length from input somehow
 if (input.availableBytes() < message_length) {
    // not enough data yet: get called back later
    return IO_WAIT;
 }

 // MyInputWrapper implements google::protobuf::io::ZeroCopyInputStream
 MyInputWrapper wrapper(input, message_length);
 MyProtocolBuffer message;
 // ParseFromZeroCopyStream() takes a pointer to the stream
 message.ParseFromZeroCopyStream(&wrapper);

 I hope this helps,

 Evan

 --
 Evan Jones
 http://evanjones.ca/





Re: [protobuf] Re: Protocol Buffers and asynchronous sockets

2009-11-24 Thread Evan Jones
Gilad Ben-Ami wrote:
 Do you think that using std::iostream in the following scenario would
 work / be a good choice?
 1. read message_length
 2. buffer message_length bytes into iostream variable.
 3. when all data is received, use  IstreamInputStream to wrap the
 iostream and have it parsed with ParseFromZeroCopyStream()

If your application doesn't have a buffer already, I recommend using 
std::string. AFAIK, the C++ standard library doesn't provide anything 
more appropriate. It will do a good enough job, particularly if you 
re-use one std::string rather than allocating a new one for each message.
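
Re-using one std::string could look something like the sketch below.
MessageAssembler and its methods are hypothetical names invented for
illustration, and the caller is assumed to have read message_length
already. Once Done() returns true, the buffer can be passed to
ParseFromString().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: collects exactly one message of known length at a
// time into a single std::string that is re-used across messages.
class MessageAssembler {
 public:
  // Start collecting a new message of message_length bytes.
  void Begin(std::size_t message_length) {
    buffer_.clear();                  // drops old contents; implementations
    buffer_.reserve(message_length);  // typically keep the allocated capacity
    expected_ = message_length;
  }

  // Feed bytes as they arrive. Consumes at most what the current message
  // still needs and returns how many bytes were consumed.
  std::size_t Append(const char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    const std::size_t wanted = std::min(size, expected_ - buffer_.size());
    if (wanted > 0) buffer_.append(data, wanted);
    return wanted;
  }

  // True once all message_length bytes have been collected.
  bool Done() const { return buffer_.size() == expected_; }

  // Safe to parse once Done() returns true.
  const std::string& buffer() const { return buffer_; }

 private:
  std::string buffer_;
  std::size_t expected_ = 0;
};
```

Any bytes that Append() does not consume belong to the next length
prefix or message, so the caller has to keep them for the next round.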

The reason to use something more complicated is because lots of 
applications already have some sort of buffer, and you want to try and 
avoid extra copies.

Evan

-- 
Evan Jones
http://evanjones.ca/





Re: [protobuf] Memory leak (fragmentation?)

2009-11-24 Thread Jason Hsueh
Are you sure the leak is due to persistCacheToFile()? It looks like your
FileManager::loadCacheFromFile() method is leaking the protocol buffer when
you are reading the cache back in.

If that's not the case, can you send a small reproduction of the problem?





Re: [protobuf] Memory leak (fragmentation?)

2009-11-24 Thread Neil T. Dantam

Jason Hsueh wrote:
 Are you sure the leak is due to persistCacheToFile()? 

 Does it due to memory fragmentation ? It's very strange because this
 method 'FileManager::persistCacheToFile()'[3] should delete all its
 memory when it returns. Should I use a other memory allocater like
 tcmalloc ?

I find valgrind to be an invaluable tool for tracking down these
kinds of problems (and many others).

http://valgrind.org/

--
Neil





[protobuf] Re: Installation/usage - Java version

2009-11-24 Thread dp
[...@lothlorien ~]$ java -version
java version "1.6.0_0"
OpenJDK Runtime Environment (IcedTea6 1.6) (fedora-21.b16.fc10-i386)
OpenJDK Server VM (build 14.0-b16, mixed mode)
[...@lothlorien ~]$


On Nov 20, 6:01 pm, Kenton Varda ken...@google.com wrote:
 What version of Java are you using?  It looks like the regular expression
 API is missing some methods.



 On Fri, Nov 20, 2009 at 5:03 PM, dp decimusphos...@gmail.com wrote:
  Hi,

  I needed some help with the Java version. Followed the steps in the
  README(s) [C++/Java], but got some errors with the test step. Have
  added the details below. (Also, saw similar errors with mvn package as
  well)

  It would be great if someone could point me in the right direction.
  Thanks.

  Regards,
  dp

  Disclaimer: Java + protocol buffers neophyte. :)

  Details:

  [...@lothlorien java]$ mvn test
  [INFO] Scanning for projects...
  [INFO]
  
  [INFO] Building Protocol Buffer Java API
  [INFO]    task-segment: [test]
  [INFO]
  
  [INFO] [antrun:run {execution: generate-sources}]
  [INFO] Executing tasks
  [INFO] Executed tasks
  [INFO] Registering compile source root /home/dp/protocol_buffers/
  protobuf-2.1.0/java/target/generated-sources
  [INFO] [resources:resources {execution: default-resources}]
  [WARNING] Using platform encoding (UTF-8 actually) to copy filtered
  resources, i.e. build is platform dependent!
  [INFO] skip non existing resourceDirectory /home/dp/protocol_buffers/
  protobuf-2.1.0/java/src/main/resources
  [INFO] [compiler:compile {execution: default-compile}]
  [INFO] Compiling 2 source files to /home/dp/protocol_buffers/
  protobuf-2.1.0/java/target/classes
  --
  1. ERROR in /home/dp/protocol_buffers/protobuf-2.1.0/java/src/main/
  java/com/google/protobuf/TextFormat.java (at line 440)
         while (pos < matcher.regionStart()) {
                              ^^^
  The method regionStart() is undefined for the type Matcher
  --
  2. ERROR in /home/dp/protocol_buffers/protobuf-2.1.0/java/src/main/
  java/com/google/protobuf/TextFormat.java (at line 451)
         if (matcher.regionStart() == matcher.regionEnd()) {
                     ^^^
  The method regionStart() is undefined for the type Matcher
  --
  3. ERROR in /home/dp/protocol_buffers/protobuf-2.1.0/java/src/main/
  java/com/google/protobuf/TextFormat.java (at line 451)
         if (matcher.regionStart() == matcher.regionEnd()) {
                                              ^
  The method regionEnd() is undefined for the type Matcher
  --
  4. ERROR in /home/dp/protocol_buffers/protobuf-2.1.0/java/src/main/
  java/com/google/protobuf/TextFormat.java (at line 455)
         matcher.usePattern(TOKEN);
                 ^^
  The method usePattern(Pattern) is undefined for the type Matcher
  --
  5. ERROR in /home/dp/protocol_buffers/protobuf-2.1.0/java/src/main/
  java/com/google/protobuf/TextFormat.java (at line 458)
         matcher.region(matcher.end(), matcher.regionEnd());
                                               ^
  The method regionEnd() is undefined for the type Matcher
  --
  6. ERROR in /home/dp/protocol_buffers/protobuf-2.1.0/java/src/main/
  java/com/google/protobuf/TextFormat.java (at line 462)
         matcher.region(pos + 1, matcher.regionEnd());
                                         ^
  The method regionEnd() is undefined for the type Matcher
  --
  7. ERROR in /home/dp/protocol_buffers/protobuf-2.1.0/java/src/main/
  java/com/google/protobuf/TextFormat.java (at line 474)
         matcher.usePattern(WHITESPACE);
                 ^^
  The method usePattern(Pattern) is undefined for the type Matcher
  --
  8. ERROR in /home/dp/protocol_buffers/protobuf-2.1.0/java/src/main/
  java/com/google/protobuf/TextFormat.java (at line 476)
         matcher.region(matcher.end(), matcher.regionEnd());
                                               ^
  The method regionEnd() is undefined for the type Matcher
  --
  8 problems (8 errors)[INFO]
  
  [ERROR] BUILD FAILURE
  [INFO]
  
  [INFO] Compilation failure
  Failure executing javac, but could not parse the error:

  [INFO]
  
  [INFO] For more information, run Maven with the -e switch
  [INFO]
  
  [INFO] Total time: 18 seconds
  [INFO] Finished at: Fri Nov 20 16:46:29 GMT-08:00 2009
  [INFO] Final Memory: 35M/42M
  [INFO]
  
  [...@lothlorien java]$ protoc --version
  libprotoc 2.1.0
  

[protobuf] Invalid free() / delete / delete[]

2009-11-24 Thread Vlad
When I link the protobuf library into an empty program on SUSE Linux,
valgrind starts to complain:

==26306== Invalid free() / delete / delete[]
==26306==    at 0x4A1F99E: free (vg_replace_malloc.c:323)
==26306==    by 0x6467D1A: free_mem (in /lib64/libc-2.4.so)
==26306==    by 0x6467991: __libc_freeres (in /lib64/libc-2.4.so)
==26306==    by 0x491C31C: _vgnU_freeres (vg_preloaded.c:60)
==26306==    by 0x63A82F4: exit (in /lib64/libc-2.4.so)
==26306==    by 0x639315A: (below main) (in /lib64/libc-2.4.so)
==26306==  Address 0x4031928 is not stack'd, malloc'd or (recently) free'd
==26306==

Any help?
Thanks \/lad.





Re: [protobuf] Invalid free() / delete / delete[]

2009-11-24 Thread Kenton Varda
I can't tell anything from that stack trace, sorry.





Re: [protobuf] Re: Protocol Buffers and asynchronous sockets

2009-11-24 Thread Kenton Varda
Yes, use std::string.  The only potential problem is if your messages are
very large -- allocating large contiguous blocks of memory (as std::string
does) could lead to memory fragmentation.  But for small and medium-sized
messages, there's no reason not to use std::string as the buffer.  Parsing
from an std::string (or a simple array -- they're essentially the same) is
(slightly) faster than parsing from any other data structure.





[protobuf] [De]serialization of messages to java strings

2009-11-24 Thread Will Morton
Hello all;

I need to serialize a protobuf message to a string so that it can be
passed outside my program.  The below fails, I'm guessing due to UTF8
encoding issues:

byte[] arr = msg.toByteArray();
String str = new String(arr);
// ... pass str around ...
// Throws InvalidProtocolBufferException:
MsgType msg2 = MsgType.parseFrom(str.getBytes());

So, reading the API, I thought I should use ByteStrings, with their
handy UTF8 encoding methods, but this doesn't work either:

ByteString bs = msg.toByteString();
String str = bs.toStringUtf8();
// ... pass str around ...
ByteString bs2 = ByteString.copyFromUtf8(str);
MsgType msg2 = MsgType.parseFrom(bs2); // <-- still throws exception

What am I doing wrong?  What's the best way to do java string
serialization of protobuf messages?

Thanks in advance,

Will





Re: [protobuf] [De]serialization of messages to java strings

2009-11-24 Thread Kenton Varda
Strings contain text, not arbitrary bytes.  Encoded protocol buffers are
arbitrary bytes, not text.  So, they aren't compatible.  You would need to
do something like base-64 encode the data in order to put it in a String.





Re: [protobuf] [De]serialization of messages to java strings

2009-11-24 Thread Will Morton
2009/11/25 Adam Vartanian flo...@google.com:
 What am I doing wrong?  What's the best way to do java string
 serialization of protobuf messages?

 If you absolutely have to pass things around as a String, you're going
 to need to do so in some kind of encoding that supports arbitrary
 data.  For example, you could encode it in Base64.


Great, thanks guys... I was wondering if protobuf had a more efficient
string-safe encoding, but I'll just base64 it.

Cheers!

Will





Re: [protobuf] [De]serialization of messages to java strings

2009-11-24 Thread Kenton Varda
You can use TextFormat but it is probably *less* efficient than base64.





Re: [protobuf] [De]serialization of messages to java strings

2009-11-24 Thread Adam Vartanian
 What am I doing wrong?  What's the best way to do java string
 serialization of protobuf messages?

The native wire format of protocol buffers is just a sequence of
bytes, so it can contain values that are invalid UTF-8 (or any
encoding that has invalid byte sequences).  Trying to pack that into a
String, which holds Unicode character data, isn't going to work well;
Strings are welcome to mangle the bytes however they want as long as
the same characters are represented.  If you want to pass a serialized
protocol buffer to something else, you should generally use a
ByteString, byte[], or ByteBuffer.

If you absolutely have to pass things around as a String, you're going
to need to do so in some kind of encoding that supports arbitrary
data.  For example, you could encode it in Base64.

- Adam





[protobuf] Re: Fix for Protobuf issue 136: memoized serialized size of packed fields was invalid (issue157169)

2009-11-24 Thread Kenton Varda
I guess it's technically the case that a packed field always has non-zero
size, but maybe we should initialize the cached sizes to -1 anyway to make
it more clear when they haven't been initialized?

On Tue, Nov 24, 2009 at 7:04 PM, jas...@google.com wrote:

> Reviewers: kenton,
>
> Description:
> Fix Issue 136: the memoized serialized size for packed fields may not
> be properly set. writeTo() may be invoked without a call to
> getSerializedSize(), so the generated serialization methods would
> write a length of 0 for non-empty packed fields. Since the object
> is immutable, we know that the memoized size is 0 or set to the
> correct size. In the generated code, check to see if the memoized
> size is 0, and if so, call getSerializedSize() before actually
> emitting the field.
>
> Tested: new unittest case in WireFormatTest.java now passes
>
> Please review this at http://codereview.appspot.com/157169
>
> Affected files:
>   M java/src/test/java/com/google/protobuf/WireFormatTest.java
>   M src/google/protobuf/compiler/java/java_primitive_field.cc
>
>
> Index: java/src/test/java/com/google/protobuf/WireFormatTest.java
> ===================================================================
> --- java/src/test/java/com/google/protobuf/WireFormatTest.java  (revision 246)
> +++ java/src/test/java/com/google/protobuf/WireFormatTest.java  (working copy)
> @@ -102,6 +102,28 @@
>      assertEquals(rawBytes, rawBytes2);
>    }
>
> +  public void testSerializationPackedWithoutGetSerializedSize()
> +      throws Exception {
> +    // Write directly to an OutputStream, without invoking getSerializedSize()
> +    // This used to be a bug where the size of a packed field was incorrect,
> +    // since getSerializedSize() was never invoked.
> +    TestPackedTypes message = TestUtil.getPackedSet();
> +
> +    // Directly construct a CodedOutputStream around the actual OutputStream,
> +    // because writeTo() now invokes getSerializedSize();
> +    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
> +    CodedOutputStream codedOutput = CodedOutputStream.newInstance(outputStream);
> +
> +    message.writeTo(codedOutput);
> +
> +    codedOutput.flush();
> +
> +    TestPackedTypes message2 = TestPackedTypes.parseFrom(
> +        outputStream.toByteArray());
> +
> +    TestUtil.assertPackedFieldsSet(message2);
> +  }
> +
>    public void testSerializeExtensionsLite() throws Exception {
>      // TestAllTypes and TestAllExtensions should have compatible wire formats,
>      // so if we serialize a TestAllExtensions then parse it as TestAllTypes
> Index: src/google/protobuf/compiler/java/java_primitive_field.cc
> ===================================================================
> --- src/google/protobuf/compiler/java/java_primitive_field.cc   (revision 246)
> +++ src/google/protobuf/compiler/java/java_primitive_field.cc   (working copy)
> @@ -384,8 +384,17 @@
>  void RepeatedPrimitiveFieldGenerator::
>  GenerateSerializationCode(io::Printer* printer) const {
>    if (descriptor_->options().packed()) {
> +    // writeTo() may be called without getSerializedSize() ever having been
> +    // called, so we may need to compute the length of the packed data. Since
> +    // the object is immutable, we know that the memoized size is either set to
> +    // 0 (via object initialization) or else it has the correct size from a
> +    // previous getSerializedSize() call. If the field is not empty and the size
> +    // has not yet been computed, just call getSerializedSize()
>      printer->Print(variables_,
>        "if (get$capitalized_name$List().size() > 0) {\n"
> +      "  if ($name$MemoizedSerializedSize == 0) {\n"
> +      "    getSerializedSize();\n"
> +      "  }\n"
>        "  output.writeRawVarint32($tag$);\n"
>        "  output.writeRawVarint32($name$MemoizedSerializedSize);\n"
>        "}\n"
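
For context on why the length has to be known before the field is emitted:
a packed repeated field is written as a tag, then the byte length of the
payload, then the elements back to back. The sketch below hand-encodes that
layout for varint elements. It is not the protobuf library;
PackedFieldSketch and its helpers are hypothetical, and it assumes
non-negative int values.

```java
import java.io.ByteArrayOutputStream;

public class PackedFieldSketch {
    /** Writes value as a base-128 varint, low-order group first. */
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);  // more groups follow
            value >>>= 7;
        }
        out.write(value);
    }

    /** Number of bytes writeVarint() would emit for value. */
    static int varintSize(int value) {
        int n = 1;
        while ((value & ~0x7F) != 0) {
            n++;
            value >>>= 7;
        }
        return n;
    }

    /** Encodes a packed repeated varint field: tag, payload length, payload. */
    static byte[] writePacked(int fieldNumber, int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (values.length > 0) {
            // The length prefix precedes the data, so it must be computed
            // first. Writing a stale 0 here is the symptom described above.
            int payloadSize = 0;
            for (int v : values) {
                payloadSize += varintSize(v);
            }
            writeVarint(out, (fieldNumber << 3) | 2);  // wire type 2: length-delimited
            writeVarint(out, payloadSize);
            for (int v : values) {
                writeVarint(out, v);
            }
        }
        return out.toByteArray();
    }

    /** Formats bytes as space-separated lowercase hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints "22 06 03 8e 02 9e a7 05"
        System.out.println(hex(writePacked(4, new int[] {3, 270, 86942})));
    }
}
```

Had the length byte been written as 00 instead of 06, a parser would read
an empty field and then try to interpret the payload bytes as further tags.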








[protobuf] Re: Fix for Protobuf issue 136: memoized serialized size of packed fields was invalid (issue157169)

2009-11-24 Thread Kenton Varda
On Tue, Nov 24, 2009 at 7:40 PM, Kenton Varda <ken...@google.com> wrote:

> I guess it's technically the case that a packed field always has non-zero
> size, but maybe we should initialize the cached sizes to -1 anyway to make
> it more clear when they haven't been initialized?


That was a confusing sentence.  Change "when they haven't been initialized"
to "when getSerializedSize() hasn't been called".












Re: [protobuf] Re: Add another option to support java_implement_interface

2009-11-24 Thread Kenton Varda
On Tue, Nov 24, 2009 at 6:45 PM, Alex Antonov <a...@antonov.ws> wrote:

>> Rewriting the protocol compiler to use some sort of template system is not
>> something we really have resources for.  That said, the plugins thing I'm
>> working on may be useful to similar ends.
>
> Oh, the plugins stuff sounds interesting.  Is there anything on the
> discussion board about it, or still in the works?


I sent an e-mail a while back discussing my design.  Search for "plugin" or
"plugins".
