if you recall, i'm working on a project called xthrift, which adds passing
objects by-reference on top of thrift. the project seemed very promising up
until yesterday, when i realized thrift generates way too much code for it
to be feasible.

i made a test case of 6 classes, each with 6 methods and 6 attributes, and
6 service functions that expose those. i attached the thrift file that's
generated from my xthrift file -- it contains around 100 functions.

generating java code using the thrift compiler yields a 2.2 MB java source
file! when compiled, it yields a 1.6 MB jar! in csharp and python, the
situation is slightly better: ~700 KB. just to get a sense of the entropy,
compressing the generated java code with bz2 yields a 34 KB file (a ratio
of ~65!)
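
for anyone who wants to repeat the measurement on their own generated code, here is a minimal python sketch of the ratio computation. the repeated line is only a made-up stand-in for generated source, not actual thrift output, and `compression_ratio` is a name i picked for illustration.

```python
import bz2


def compression_ratio(data: bytes) -> float:
    """return the original size divided by the bz2-compressed size."""
    if not data:
        raise ValueError("cannot compute a ratio for empty input")
    return len(data) / len(bz2.compress(data))


# made-up stand-in for a generated source file: one line repeated many times
sample = b"    out.write_field(name, value)\n" * 2000
ratio = compression_ratio(sample)
print(f"{len(sample)} bytes, ratio {ratio:.0f}")
```

to run it on a real file, read the file in binary mode and pass the bytes to `compression_ratio`.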

for our project, which contains ~100 classes, each with ~10 methods and ~5
attributes, plus ~50 functions, the generated java code would weigh tens if
not hundreds of MBs, which is unacceptable, of course.
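
the estimate can be sketched as a back-of-the-envelope calculation. this assumes the generated size scales linearly with the number of functions, and that each attribute turns into two functions (a getter and a setter); both are assumptions made for illustration, not measurements.

```python
# figures from the test case above
TEST_FUNCTIONS = 100   # "around 100 functions"
TEST_JAVA_MB = 2.2     # size of the generated java source


def estimate_java_mb(classes, methods, attributes, service_functions,
                     functions_per_attribute=2):
    """extrapolate the generated java source size, in MB.

    functions_per_attribute=2 assumes a getter and a setter per attribute;
    that mapping is an assumption, as is the linear scaling.
    """
    functions = (classes * (methods + attributes * functions_per_attribute)
                 + service_functions)
    return functions * TEST_JAVA_MB / TEST_FUNCTIONS


project_mb = estimate_java_mb(classes=100, methods=10, attributes=5,
                              service_functions=50)
print(f"estimated generated java source: {project_mb:.0f} MB")
```

with these assumptions the project comes to about 2,050 functions, i.e. roughly 45 MB of java source, which is in the "tens of MBs" range.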

looking at the generated code, it's easy to spot the redundancy: thrift
employs a "full beta-reduction policy", i.e., it doesn't encapsulate common
functionality into functions; instead, it just repeats it over and over.
this yields ~80,000 lines of code that mostly repeat one another.
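
to illustrate the difference, here is a toy generator in python that emits the same stubs twice: once with the common steps repeated inside every function, and once with the steps factored into a shared helper. the boilerplate lines and all names (`send_message`, `send_func0`, ...) are hypothetical and are not what thrift actually emits.

```python
# hypothetical common steps, standing in for per-function serialization code
BOILERPLATE = [
    "    out.begin_message(name)",
    "    out.write_args(args)",
    "    out.end_message()",
    "    out.flush()",
]


def generate_inlined(n):
    """emit n stubs, each repeating the common steps in full."""
    lines = []
    for i in range(n):
        lines.append(f"def send_func{i}(out, args):")
        lines.append(f"    name = 'func{i}'")
        lines.extend(BOILERPLATE)
        lines.append("")
    return "\n".join(lines)


def generate_shared(n):
    """emit one shared helper plus n one-line stubs that call it."""
    lines = ["def send_message(out, name, args):"]
    lines.extend(BOILERPLATE)
    lines.append("")
    for i in range(n):
        lines.append(f"def send_func{i}(out, args):")
        lines.append(f"    send_message(out, 'func{i}', args)")
        lines.append("")
    return "\n".join(lines)


inlined = generate_inlined(100)
shared = generate_shared(100)
print(len(inlined), len(shared))
```

in the inlined version every extra common step costs one line per function; in the shared version it costs one line in total, so the gap widens as the boilerplate grows.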

judging from the code size, i understand thrift is not meant to handle more
than ~50 functions per project, unless you are willing to accept tens of MBs
of library footprint.[1]

is there any "compiler switch" or planned feature to eliminate this code
bloat?

if not, my company will have to drop thrift and adopt an in-house solution
(which we really hoped to avoid...)


thanks in advance,
-tomer

[1] a 100 MB library, on today's hardware, is not unheard of, but our
project's RAM footprint is ~30 MB... it would be a pity to require such a
big footprint just for glue code.



An NCO and a Gentleman
