I certainly wouldn't want to see the global interface design you proposed. That would be awkward.
I was thinking something more along the lines of your factory idea, but without hardcoding the serializer styles. Instead, it would be up to the factory to select the appropriate serializer. Then it becomes an exercise in classifying protocols so that it's clear which kind of serializer they should use. For instance, Binary and (existing) Compact both use the "context free" serializer; JSON would probably use a custom serializer, as would a rewritten Compact. I'm not sure how to go into too much more detail without writing code. Are you in a position where you want to start hacking on this project? If so, we can chat offline about how to get a prototype going.

-Bryan

On Sun, May 22, 2011 at 10:51 AM, David Nadlinger <c...@klickverbot.at> wrote:

> Hey Bryan,
>
> First, I'd like to thank you a lot for your offer – I very much appreciate
> any help from more experienced Thrift users or developers.
>
> I thought a bit more about this issue, and while I agree that the current
> scheme makes it really hard to implement alternative protocols differing
> from the flat, »context-free« nature of the default binary protocol, I'm not
> sure how pluggable serializers would be implemented in your idea.
>
> More specifically, I can't quite see how structs would really be serialized
> after the change. Would you propose to replace the protocol interface by a
> project-specific generated serializer interface having a write method for
> all defined struct types, like in the following example?
>
> ---
> struct Foo {
>   int a;
>   // No read/write method here.
> }
>
> struct Bar { … }
>
> interface TSerializer {
>   void writeFoo(Foo f);
>   void writeBar(Bar b);
> }
>
> class TBinarySerializer implements TSerializer {
>   this (TTransport t) { … }
>   void writeFoo(Foo f) { … }
>   void writeBar(Bar b) { … }
> }
>
> class TJsonSerializer implements TSerializer { … }
> ---
>
> Having such a single global interface doesn't seem quite right to me
> (extensibility, etc.)
> even if it would be generated, and indeed you wrote
> about serializer classes being generated for each struct. But how would you
> connect serializers to protocols then, or what would the protocol interface
> (i.e. TProtocol and friends) look like in the first place to allow for
> writing protocol-agnostic code? It appears to me that somewhere all possible
> »protocol styles« (i.e. serializer types) would have to be enumerated,
> because otherwise there would be no way for the write() methods to be able
> to select the correct serializer, which doesn't seem like a great solution
> either.
>
> To clarify what I mean, another example of how I think this approach could be
> implemented:
>
> ---
> interface TProtocol {
>   void writeStruct(TStructSerializerFactory s);
>   …
> }
>
> class TBinaryProtocol implements TProtocol {
>   void writeStruct(TStructSerializerFactory s) {
>     s.getBinarySerializer().writeTo(this);
>   }
>   …
> }
>
> interface TBinarySerializer { void writeTo(TBinaryProtocol t); }
> interface TJsonSerializer { void writeTo(TJsonProtocol t); }
>
> interface TStructSerializerFactory {
>   // Have to enumerate all possible protocol »styles« here.
>   TBinarySerializer getBinarySerializer();
>   TJsonSerializer getJsonSerializer();
>   …
> }
>
> struct Foo {
>   int a;
>   void write(TProtocol t) {
>     t.writeStruct(new FooSerializerFactory(this));
>   }
> }
>
> class FooSerializerFactory implements TStructSerializerFactory {
>   Foo f_;
>   this(Foo f) {
>     f_ = f;
>   }
>   TBinarySerializer getBinarySerializer() {
>     return new FooBinarySerializer(f_);
>   }
>   // other factory methods
>   …
> }
>
> class FooBinarySerializer implements TBinarySerializer {
>   Foo f_;
>   this(Foo f) {
>     f_ = f;
>   }
>   void writeTo(TBinaryProtocol t) {
>     // The code currently generated into Foo.write().
>     …
>   }
> }
> ---
>
> There are of course a few other possible ways to implement this, but I
> couldn't really come up with a design to connect serializers and protocols
> that doesn't seem hackish or overly complex.
>
> But isn't the problem really just that the current TProtocol interface
> makes it hard to implement protocols that have some kind of »scope« or
> »nesting«, like JSON does, because everything is »flattened« to a single
> layer, only to painstakingly reconstruct the structure from the
> write*Begin() and write*End() calls later?
>
> I think it would help quite a bit to just replace all the pairs of *Begin()
> and *End() calls with a single function, e.g. writeStruct(), which takes a
> delegate/lambda (or whatever it is called in the respective language) for
> writing the children. A little piece of D-style pseudocode to illustrate
> what I mean:
>
> ---
> interface TProtocol {
>   void writeStruct(string name, void delegate() writeMembers);
>   …
> }
>
> class TJsonProtocol implements TProtocol {
>   void writeStruct(string name, void delegate() writeMembers) {
>     // Do some setup work, open a new JSON object.
>     …
>     // Call the passed in delegate, which calls other write* functions
>     // on this protocol instance to write out all the members.
>     writeMembers();
>
>     // Do some cleanup work, close the JSON object definition, being
>     // able to access any data stored in local variables above.
>   }
>
>   …
> }
>
> struct Foo {
>   int a;
>   void write(TProtocol t) {
>     t.writeStruct("Foo", {
>       // Write all the members of Foo to t, just like we do now:
>       t.writeField(1, …);
>     } );
>   }
> }
> ---
>
> This way, you don't need an excessive amount of bookkeeping to persist the
> information about the structure across the different calls by just mapping
> the structure to recursive function calls, but there is still a simple
> common interface for all protocols.
> I'll give it a try when implementing the
> protocols in D, let's see how this works out…
>
> Thanks for reading through all this,
> David
>
> On 4/30/11 11:26 PM, Bryan Duxbury wrote:
>
>> Hey David -
>>
>> I don't think it's been explored in great detail anywhere yet, but my idea
>> was that we'd introduce a layer of abstraction between struct and protocol
>> called serializer. This new object would basically take the guts of the
>> write() and read() methods and move them into a separate class, which the
>> compiler would generate for each struct.
>>
>> The first draft of this would just be an exercise in refactoring, but once
>> the code was generated in a different class, we could extend the model to
>> generate different kinds of serializers that work better with different
>> protocols. For instance, I could imagine a "CompactSerializer" that meant we
>> didn't have to keep a stateful Protocol, or a JsonSerializer that just made
>> JSON without all the existing machinations.
>>
>> I wish I had more to offer here, but I just haven't had the time to
>> experiment. If you're starting from scratch on a new language
>> implementation, I'd recommend just porting the Java library as directly as
>> you can manage. It's extremely mature and robust - and it has pretty decent
>> tests.
>>
>> Let me know if you run into specific roadblocks. I'm always happy to help
>> new languages come on board!
>>
>> -Bryan
>>
>> On Fri, Apr 29, 2011 at 4:36 PM, David Nadlinger <c...@klickverbot.at> wrote:
>>
>>> Hello list,
>>>
>>> as this is my first post here, let me quickly introduce myself first: My
>>> name is David Nadlinger, I'm a student from Austria, and I am going to work
>>> on a Thrift-related project during this year's Google Summer of Code under
>>> the umbrella of Digital Mars: a Thrift implementation for/in the D
>>> programming language.
>>> [1]
>>>
>>> While preparing my project proposal, I came across a JIRA entry which
>>> discusses the idea of pluggable serializers [2], and as I will implement a
>>> new language library during the course of the project, this obviously caught
>>> my attention. As I am somewhat familiar with the way serialization is
>>> currently implemented, I can see the limitations of the existing approach,
>>> but are there any details on what exactly the design of the proposed new
>>> solution would look like? Maybe there is some previous discussion on the
>>> topic I missed while looking through the mailing list archives? Otherwise,
>>> Bryan, would you mind quickly sketching how you envision the design?
>>>
>>> As I am currently thinking about the library design for D, I would be
>>> grateful for any feedback, also regarding any other lessons learned about
>>> the current C++/Java library design.
>>>
>>> Thanks a lot,
>>> David
>>>
>>>
>>> [1] http://klickverbot.at/code/gsoc/thrift/ (nothing of interest there yet)
>>>
>>> [2] https://issues.apache.org/jira/browse/THRIFT-769
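
For readers following the thread, the callback-based writeStruct() idea from David's May 22 message might be sketched roughly as below. This is only an illustration in Java (the thread's own pseudocode is D-style, and the Java library is the one recommended as a reference). Every name in it (Protocol, JsonLikeProtocol, writeField, writeI32, Foo, Bar, demo) is hypothetical and not part of any Thrift API, and the toy protocol emits JSON-like text without string escaping.

```java
// Sketch only: every name here is hypothetical, not part of the Thrift API.
class ScopedProtocolSketch {

    /** One scoped call per struct replaces a write*Begin()/write*End() pair. */
    interface Protocol {
        void writeStruct(String name, Runnable writeMembers);
        void writeField(String fieldName, Runnable writeValue);
        void writeI32(int value);
    }

    /** Toy JSON-like protocol; no string escaping, struct names are ignored. */
    static final class JsonLikeProtocol implements Protocol {
        private final StringBuilder out = new StringBuilder();
        private boolean firstMember = true;

        @Override
        public void writeStruct(String name, Runnable writeMembers) {
            // The enclosing scope's state is kept in a local variable,
            // so nesting maps onto the call stack instead of explicit bookkeeping.
            boolean outerFirstMember = firstMember;
            firstMember = true;
            out.append('{');
            writeMembers.run(); // the callback writes the members via this protocol
            out.append('}');
            firstMember = outerFirstMember;
        }

        @Override
        public void writeField(String fieldName, Runnable writeValue) {
            if (!firstMember) {
                out.append(',');
            }
            firstMember = false;
            out.append('"').append(fieldName).append("\":");
            writeValue.run();
        }

        @Override
        public void writeI32(int value) {
            out.append(value);
        }

        String result() {
            return out.toString();
        }
    }

    static final class Foo {
        final int a;

        Foo(int a) {
            this.a = a;
        }

        void write(Protocol p) {
            p.writeStruct("Foo", () -> p.writeField("a", () -> p.writeI32(a)));
        }
    }

    static final class Bar {
        final Foo foo;
        final int b;

        Bar(Foo foo, int b) {
            this.foo = foo;
            this.b = b;
        }

        void write(Protocol p) {
            p.writeStruct("Bar", () -> {
                p.writeField("foo", () -> foo.write(p)); // nested struct
                p.writeField("b", () -> p.writeI32(b));
            });
        }
    }

    static String demo() {
        JsonLikeProtocol p = new JsonLikeProtocol();
        new Bar(new Foo(1), 2).write(p);
        return p.result();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {"foo":{"a":1},"b":2}
    }
}
```

The struct code only ever talks to the Protocol interface, so a flat binary-style protocol could implement writeStruct() by simply running the callback, while the JSON-like one uses the surrounding call to open and close its scope.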