You don't sound dismissive; I think you're right. Reflection can't possibly be as fast as generated code if it is doing all the same useful operations but with extra lookups and conditionals to get there. I guess I was just wondering where you make the tradeoff between more concise/clear/friendly generated code and more performance. One valid answer is that performance trumps 'friendliness', because people shouldn't be looking at the generated code anyway.
I do like your suggestion of looking into serializing non-generated objects, except that it detracts from the value of the IDL, which may be my favorite part of Thrift. I'm curious if it would be feasible to make a version of Thrift that generated ONLY annotated interfaces, and used reflection, factories, and dynamic proxies to do the rest. It would be extremely slow, that's for sure, but it might be appealing enough to cost me a weekend. -- Jeff DeCew On Thu, Jul 22, 2010 at 8:08 PM, Bryan Duxbury <[email protected]> wrote: > On Thu, Jul 22, 2010 at 6:10 PM, Jeffrey DeCew <[email protected]> wrote: > > > There's something to be said for simplifying the generated code and > moving > > responsibilities to the library. It would certainly make bug fixes > easier > > to incorporate if they only affected library code. > > > > I agree in principle, but it remains to be seen if there this could be done > without negative impact in other areas. > > > > > > Thrift has done a great job of adapting to the standards of various > > languages, but I think that it has been harder to reach the bar set by > > Java, > > with it's much more pervasive and defined conventions. One such > > 'convention' of Java (or perhaps it's a backlash against EJB and other > > codegen-heavy Java frameworks that have lost favor) is the lack of > > generated > > source code in favor of greater use of interfaces as well as annotations > > and > > reflection. I think Thrift could gain a lot more Java support (certainly > > in > > my company) if it embraced reflection (where it can remain high > > performance) > > to reduce the massive amounts of (often highly redundant) generated code. > > > > I don't mean to sound dismissive, but if you were able to make changes to > start using reflection in non-trivial parts of Thrift (like the > serialization and deserialization code), and it *didn't* have a negative > performance impact, I'd be astounded. I've spent a lot of time staring at a > profiler, and the first thing I usually do is get rid of reflection. Maybe > annotations aren't as much of an issue, but where would we use them in > particular? > > Is the lack of reflection use seriously hampering Thrift adoption at your > company? Or do you just mean that the bloated code turns people off, and > reflection is a solution? > > > > > > > Regardless, it would be good to see some benchmark data comparing this > > reflection-based system with the current Thrift baseline, just to see the > > real performance cost we are starting with. > > > > I'd also love to see this benchmark. Is this something you could put > together? I may be able to assist somewhat. > > > > > > I work in a group that is primarily Java-based, but we occasionally use > > Thrift because we have some C++ apps (current and future) that > desperately > > want to be better connected to our environment. The main technology that > > we > > pitted Thrift against is Spring remoting, which is pretty much Java-only, > > and uses dynamic proxies on the client side to implement a plain Java > > interface at runtime and has library classes which use reflection to > > process > > messages on the server side. These may be strange concepts to Thrift > (and > > certainly to C++), but they are more and more common in Java, and we > should > > encourage Java developers to help make Thrift more comfortable for > > themselves to work in, assuming it doesn't sacrifice performance too > much. > > > > I don't personally have experience with Spring remoting, and it's not in > the > benchmarks I've seen, but maybe you could find something similar here: > http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2 > > However, if you were so inclined, you could absolutely use reflection to > serialize arbitrary objects with Thrift protocols, and you've have > interoperability with a bunch of languages. I think this would be a really > interesting contribution to Thrift. > > > > > > -- > > Jeff DeCew > > > > > > On Thu, Jul 22, 2010 at 1:35 PM, Bryan Duxbury <[email protected]> > wrote: > > > > > This is pretty interesting. I think that you're right in your > assessment > > > that these changes would adversely affect performance, but I don't > agree > > > that the performance changes are unimportant on the client. In my > > > experience, using Java Reflection for anything that is vaguely > > > performance-critical gets you in a lot of trouble. All the generated > > > metadata stuff is designed with this in mind. > > > > > > I'm a bit surprised that you got that degree of size reduction from > your > > > changes. I would be open to incorporating some of them - the toString > > > helper > > > method, for instance - and perhaps adding switches for things like > equals > > > and compareTo, if it's reasonable to remove them at times. I don't > think > > it > > > makes sense to branch the serialization code, though. > > > > > > One question I have in general, though, is why does the size of the jar > > > matter at all? It's not like it's going to take up tons of memory or > slow > > > you down particularly when being transferred around. > > > > > > -Bryan > > > > > > On Thu, Jul 22, 2010 at 11:34 AM, Emmanuel Bourg <[email protected]> > > > wrote: > > > > > > > Hi all, > > > > > > > > I'm new to Thrift and I started playing lately with a generated Java > > > > client. I was somewhat struck by the size of the resulting jar, about > > 3MB > > > > for an interface consisting of about 30 service methods and 150 > structs > > > of > > > > variable size. > > > > > > > > The generated classes contain code that isn't always useful, like the > > > > equals() and compareTo() methods. The implementation of toString() > > could > > > > easily be delegated to an utility class. Annotations could be used to > > > > represent the metadata. The service implementation contains the > client > > > and > > > > the server, but I rarely need both at the same time. > > > > > > > > From these observations I tweaked the compiler to generate minimal > > > classes. > > > > For example a simple struct like this one: > > > > > > > > struct Person { > > > > 1:i32 id, > > > > 2:string name, > > > > 3:string email > > > > } > > > > > > > > will translate into the following class: > > > > > > > > @ThriftStruct("Person") > > > > public class Person implements Serializable { > > > > > > > > @ThriftField(name="id", type=TType.I32, id=1) > > > > public int id; > > > > > > > > @ThriftField(name="name", type=TType.STRING, id=2) > > > > public String name; > > > > > > > > @ThriftField(name="email", type=TType.STRING, id=3) > > > > public String email; > > > > > > > > public int getId() { > > > > return this.id; > > > > } > > > > > > > > public void setId(int id) { > > > > this.id = id; > > > > } > > > > > > > > public String getName() { > > > > return this.name; > > > > } > > > > > > > > public void setName(String name) { > > > > this.name = name; > > > > } > > > > > > > > public String getEmail() { > > > > return this.email; > > > > } > > > > > > > > public void setEmail(String email) { > > > > this.email = email; > > > > } > > > > > > > > public String toString() { > > > > return ToStringBuilder.toString(this); > > > > } > > > > } > > > > > > > > The read() and write() methods are now implemented in a utility class > > > using > > > > reflection to find out the fields being serialized. That's probably > > > slower > > > > than the static code generated by the original compiler but that's > less > > > > critical on the client side. > > > > > > > > With these modifications the resulting jar was about 10 times smaller > > in > > > my > > > > case. > > > > > > > > I uploaded the code on Github if someone is interested to look into > it. > > > > It's derived from the 0.2 branch. > > > > > > > > http://github.com/ebourg/thrift > > > > http://wiki.github.com/ebourg/thrift > > > > > > > > > > > > Emmanuel Bourg > > > > > > > > > > > > > >
