[ 
https://issues.apache.org/jira/browse/GIRAPH-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128664#comment-13128664
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-37:
-----------------------------------------

OK, I emailed Marius Eriksen (Finagle lead, among other things), and here's his 
feedback so far:

{quote}
that's great! (that they're doing this). would be happy to help in any way to 
make it work.

> 1) why a custom thrift compiler? makes distribution of code hard, have
> to make devs install that

this sucks, but it is sadly necessary (unless we can get our work integrated with 
the standard thrift stack). we do require custom codegen in order to interface 
with the finagle thrift codec.

we now actually have our own entirely-in-JVM code generator that parses thrift 
IDL, etc., so at the very least we'll have something portable that also 
shouldn't require any installation; presumably the various build systems can 
download it as a build-only dependency, etc. we're using this internally for 
a few projects already, but still working out how to widely distribute it.

> 2) gigantic hard to understand stack traces

that's mostly a fact of life, sadly. i mean, with any asynchronous system you 
have much less context in your stack traces generally, but with proliferation 
of anonymous closures in the finagle codebase, it's often made even worse.

a few things here: (1) as of 1.9.3 (i notice this patch uses 1.9.0) stacks are 
now unwound per responder per thread. this means roughly that the stacks you 
observe will only ever be one callback deep. now this might be even worse in 
terms of debugging, but it does produce cleaner/smaller stack traces.

debuggability is a big concern (both for finagle, and for general use of 
Futures). one interesting difference between asynchronous systems and 
synchronous ones is that stack traces don't tell the story, or may tell only 
part of the story. really what you want is a dispatch *graph*. we have a 
mechanism in twitter futures (called Locals-- they're like thread locals but 
instead they're local to the dispatch graph) where we can record dispatches. 
this would now give us our graph. a little weird, maybe, but certainly something 
that would be very helpful in many circumstances. i'm still toying around with 
how to expose them (eg. we could synthesize stacks that are really a topological 
sort of the dispatch graph in all exceptions encoded by finagle...)

> 3) some stability issues, apparently

i looked at his patch briefly.  this part is suspect (the fact that he throws 
in a callback).

{code}
+    @Override
+    public void onFailure(Throwable cause) {
+      cdl.countDown();
+      throw new RuntimeException("Hit exception in proxied call", cause);
+    }
{code}
and would cause that exception to be thrown. it's actually harmless in terms of 
functionality, but it will report the wrong underlying reason.

none of the user provided handlers should throw exceptions. at the same time, 
the fact that it's reported as "result set multiple times" may indicate a bug 
somewhere. i'm going to look into that probably by ~wed or so (my schedule is 
pretty filled up until then).

it's difficult to debug what's going on there (2/3s successful runs) without 
getting some stats out of the system, and/or diving deeper into the code. it 
sounds like perhaps the client isn't tuned properly for the particular use case.

anyhow. in my experience, almost *all* debugging of these sorts of systems can 
be done by looking at the client/server stats. and finagle exports a rich set 
of stats for both.

use the .reportTo() method in the builder to report to either ostrich or 
science/commons stats, or provide your own StatsReceiver.

{quote}
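
To illustrate the "handlers should not throw" advice above, here is a minimal sketch of one alternative: record the failure in the callback and rethrow it from the thread that waits on the latch. This is only an assumption about how the patch could be restructured; the class name RecordingCallback and its await method are hypothetical and are not part of Finagle or of the attached patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper: stores a failure instead of throwing inside the callback. */
public class RecordingCallback {
  // Written by the callback thread, read by the waiting thread.
  private final AtomicReference<Throwable> failure = new AtomicReference<>();
  private final CountDownLatch cdl = new CountDownLatch(1);

  public void onSuccess() {
    cdl.countDown();
  }

  public void onFailure(Throwable cause) {
    // Keep the first cause; do not throw from the handler itself.
    failure.compareAndSet(null, cause);
    cdl.countDown();
  }

  /** Blocks until the call completes, then surfaces any recorded failure. */
  public void await(long timeout, TimeUnit unit) throws InterruptedException {
    if (!cdl.await(timeout, unit)) {
      throw new RuntimeException("Timed out waiting for proxied call");
    }
    Throwable cause = failure.get();
    if (cause != null) {
      throw new RuntimeException("Hit exception in proxied call", cause);
    }
  }
}
```

With this shape the exception is raised on the caller's thread with the original cause attached, so the underlying reason is not masked by whatever the callback machinery reports.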
                
> Implement Netty-backed rpc solution
> -----------------------------------
>
>                 Key: GIRAPH-37
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-37
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>         Attachments: GIRAPH-37-wip.patch
>
>
> GIRAPH-12 considered replacing the current Hadoop based rpc method with 
> Netty, but went in another direction. I think there is still value in 
> this approach, and will also look at Finagle.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

