Vladimir, thanks for your reply! What you suggest definitely makes sense and it looks more reasonable solution. But there remains other thing. The second issue, that this solution solves, is prevent log pollution with GridServiceNotFoundException. The reason why it happens is because GridServiceProxy#invokeMethod() designed to catch it and ClusterTopologyCheckedException and retry over and over again unless service is become available, but on the same time there are tons of stack traces printed on remote node.
If we want to avoid changing API, probably we need to add option to mute that exceptions in GridJobWorker. What do you think? On Tue, Nov 15, 2016 at 5:10 PM, Vladimir Ozerov <voze...@gridgain.com> wrote: > Also we implemented the same thing for platforms some time ago. In short, > job result processing was implemented as follows (pseudocode): > > // Execute. > Object res; > > try { > res = job.run(); > } > catch (Exception e) { > res = e > } > > // Serialize result. > try { > SERIALIZE(res); > } > catch (Exception e) { > try{ > // Serialize serialization error. > SERIALIZE(new IgniteException("Failed to serialize result.", e)); > } > catch (Exception e) { > // Cannot serialize serialization error, so pass only string to > exception. > SERIALIZE(new IgniteException("Failed to serialize result: " + > e.getMessage()); > } > } > > On Tue, Nov 15, 2016 at 5:05 PM, Vladimir Ozerov <voze...@gridgain.com> > wrote: > > > Dmitriy, > > > > Shouldn't we report serialization problem properly instead? We already > had > > a problem when node hanged during job execution in case it was impossible > > to deserialize the job on receiver side. It was resolved properly - we > > caught exception on receiver side and reported it back sender side. > > > > I believe we must do the same for services. Otherwise we may end up in > > messy API which doesn't resolve original problem. > > > > Vladimir. > > > > On Tue, Nov 15, 2016 at 4:38 PM, Dmitriy Karachentsev < > > dkarachent...@gridgain.com> wrote: > > > >> Hi Igniters! > >> > >> I'd like to modify our public API and add IgniteServices.serviceProxy() > >> method with timeout argument as part of the task > >> https://issues.apache.org/jira/browse/IGNITE-3862 > >> > >> In short, without timeout, in case of serialization error, or so, > service > >> acquirement may hang and infinitely log errors. > >> > >> Do you have any concerns about this change? > >> > >> Thanks! > >> Dmitry. > >> > > > > >