[jira] [Commented] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset

2015-09-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903152#comment-14903152
 ] 

Hadoop QA commented on HDFS-9095:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12761698/HDFS-9095.001.patch |
| Optional Tests | javadoc javac unit |
| git revision | trunk / cc2b473 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12605/console |


This message was automatically generated.

> RPC client should fail gracefully when the connection is timed out or reset
> ---
>
> Key: HDFS-9095
> URL: https://issues.apache.org/jira/browse/HDFS-9095
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-9095.000.patch, HDFS-9095.001.patch
>
>
> The RPC client should fail gracefully when the connection is timed out or 
> reset, instead of bailing out. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset

2015-09-22 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903134#comment-14903134
 ] 

Haohui Mai commented on HDFS-9095:
--

Thanks [~James Clampffer] and [~bobhansen] for the reviews. The v1 patch 
changes {{CMAKE_CURRENT_SOURCE_DIR}} to {{CMAKE_CURRENT_LIST_DIR}}.



[jira] [Commented] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset

2015-09-21 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901053#comment-14901053
 ] 

Haohui Mai commented on HDFS-9095:
--

bq. You may want to use CMAKE_CURRENT_LIST_DIR rather than 
CMAKE_CURRENT_SOURCE_DIR as a more stable root directory.

I don't understand why it's an issue here. I've not seen many people use 
{{CMAKE_CURRENT_LIST_DIR}} in practice. {{CMAKE_CURRENT_SOURCE_DIR}} will 
point to {{hadoop-hdfs-project/hadoop-hdfs-client/src/main/native/libhdfspp}}. 
When can it be a problem?

bq. Following the experiences learned from the Java client, should the server 
address be passed in with the options (eventually, they will probably all be 
loaded from the same XML files at startup)?

No. It's important to make the distinction here. Options specifically mean 
tunable parameters, while server addresses are input for the RPC library.
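
The distinction drawn above can be sketched in a few lines of C++. This is only an illustration under assumed names: {{Endpoint}}, {{RpcEngineSketch}}, and the fields of {{Options}} are hypothetical and are not taken from the patch.

```cpp
#include <string>
#include <utility>

// Hypothetical tunables; the real option names in the patch may differ.
struct Options {
  int rpc_timeout_ms = 30000;  // how long to wait for a reply
  int max_rpc_retries = 0;     // retry budget
};

// Hypothetical description of the server to talk to.
struct Endpoint {
  std::string host;
  int port;
};

class RpcEngineSketch {
 public:
  // Tunable parameters are fixed when the engine is constructed.
  explicit RpcEngineSketch(Options options) : options_(options) {}

  // The server address is an input to the call, not a tunable.
  void Connect(Endpoint server) { server_ = std::move(server); }

  const Options &options() const { return options_; }
  const Endpoint &server() const { return server_; }

 private:
  const Options options_;
  Endpoint server_{"", 0};
};
```

With this split, the same {{Options}} value could be reused across engines that connect to different servers.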

bq. In RpcConnection methods, should we be calling into the handler while 
holding the lock on the engine state? If any method there does synchronous I/O 
or hangs for any reason, the whole Rpc system locks up.
bq. Can we have assertions that the lock is held in RpcConnection rather than 
comments stating that it should be?

This is a known issue coming from 
https://github.com/haohui/libhdfspp/issues/39. Please feel free to file jiras 
to fix it.

bq. In RpcConnectionImpl, should options_ and next_layer_ be const?

{{next_layer_}} cannot be const, but {{options_}} should be. Will fix it.
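
A minimal sketch of why the two members differ, assuming the next layer is a socket-like object whose I/O methods are non-const. {{FakeSocket}} and {{RpcConnectionImplSketch}} are hypothetical stand-ins, not the classes in the patch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical tunables.
struct Options {
  int rpc_timeout_ms = 30000;
};

// Stand-in for a socket-like transport: writing mutates its state,
// so write() cannot be a const member function.
struct FakeSocket {
  std::size_t bytes_written = 0;
  void write(const std::string &data) { bytes_written += data.size(); }
};

template <class NextLayer>
class RpcConnectionImplSketch {
 public:
  explicit RpcConnectionImplSketch(const Options &options)
      : options_(options) {}

  // Needs a mutable next_layer_ because write() is non-const.
  void Send(const std::string &payload) { next_layer_.write(payload); }

  int timeout_ms() const { return options_.rpc_timeout_ms; }
  const NextLayer &next_layer() const { return next_layer_; }

 private:
  const Options options_;  // read-only after construction
  NextLayer next_layer_;   // declaring this const would break Send()
};
```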



[jira] [Commented] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset

2015-09-21 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900971#comment-14900971
 ] 

Bob Hansen commented on HDFS-9095:
--

You may want to use CMAKE_CURRENT_LIST_DIR rather than CMAKE_CURRENT_SOURCE_DIR 
as a more stable root directory.

I'm glad you started to add some logging and the start of an options 
architecture.  I was going to file another Jira for both of those (I probably 
will to make a space for more full-featured efforts).  

Following the experiences learned from the Java client, should the server 
address be passed in with the options (eventually, they will probably all be 
loaded from the same XML files at startup)?

In RpcConnection methods, should we be calling into the handler while holding 
the lock on the engine state?  If any method there does synchronous I/O or 
hangs for any reason, the whole Rpc system locks up.
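
One common way to address this concern is to copy the handler while holding the lock and invoke it only after the lock is released. The sketch below illustrates that pattern under assumed names ({{UnlockedDispatchSketch}} and its members are hypothetical); it is not how the patch is written.

```cpp
#include <functional>
#include <mutex>
#include <utility>

class UnlockedDispatchSketch {
 public:
  void set_handler(std::function<void(int)> handler) {
    std::lock_guard<std::mutex> guard(mutex_);
    handler_ = std::move(handler);
  }

  void OnRecvCompleted(int call_id) {
    std::function<void(int)> handler;
    {
      std::lock_guard<std::mutex> guard(mutex_);
      ++received_;
      handler = handler_;  // copy while the state is protected
    }
    // The lock is released here, so a slow or blocking handler
    // does not stall other users of the engine state.
    if (handler) handler(call_id);
  }

  int received() {
    std::lock_guard<std::mutex> guard(mutex_);
    return received_;
  }

 private:
  std::mutex mutex_;
  std::function<void(int)> handler_;
  int received_ = 0;
};
```

Because the handler runs without the lock, it may safely call back into {{received()}}; with the lock still held, that call would try to lock the same {{std::mutex}} twice on one thread.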

Can we have assertions that the lock is held in RpcConnection rather than 
comments stating that it should be?
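
{{std::mutex}} has no portable "is this held by me" query, so one possible way to turn such a comment into an assertion is to pass the {{std::unique_lock}} to the method that needs it and check it. A sketch under assumed names ({{EngineStateSketch}} and {{HandleLocked}} are hypothetical):

```cpp
#include <cassert>
#include <mutex>

class EngineStateSketch {
 public:
  void OnRecvCompleted() {
    std::unique_lock<std::mutex> lock(mutex_);
    HandleLocked(lock);
  }

  int handled() {
    std::lock_guard<std::mutex> guard(mutex_);
    return handled_;
  }

 private:
  // Precondition: the caller holds mutex_ through `lock`.
  void HandleLocked(const std::unique_lock<std::mutex> &lock) {
    // Checked in debug builds instead of being stated only in a comment.
    assert(lock.owns_lock() && lock.mutex() == &mutex_);
    ++handled_;
  }

  std::mutex mutex_;
  int handled_ = 0;
};
```

The assertion compiles away when {{NDEBUG}} is defined, so it costs nothing in release builds.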

In RpcConnectionImpl, should options_ and next_layer_ be const?









[jira] [Commented] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset

2015-09-21 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901293#comment-14901293
 ] 

James Clampffer commented on HDFS-9095:
---

Agree with Bob about making the CMakeLists as robust as possible, otherwise +1 
on the patch.  Getting in the basics for logging is very nice as well.

Re: In RpcConnection methods, should we be calling into the handler while 
holding the lock on the engine state? If any method there does synchronous I/O 
or hangs for any reason, the whole Rpc system locks up.

This was done to avoid using a std::recursive_mutex because right now that 
handler only gets called from OnRecvCompleted.  I don't think the handler is 
going to be changing much unless we start using multiple connections from a 
single RpcEngine.  Lock contention is one of the things I hope to start 
profiling soon; if the overhead is negligible I'll switch that back to a 
recursive_mutex and grab the lock in the handler as well (I'll file a jira if 
that's the case).
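
For reference, a minimal sketch of the recursive_mutex alternative described above, under assumed names ({{RecursiveEngineSketch}} and its methods are hypothetical). Locking a plain {{std::mutex}} twice from the same thread is undefined behavior, whereas {{std::recursive_mutex}} lets the owning thread lock again, so the handler can take the lock itself.

```cpp
#include <functional>
#include <mutex>
#include <utility>

class RecursiveEngineSketch {
 public:
  void set_handler(std::function<void()> handler) {
    std::lock_guard<std::recursive_mutex> guard(mutex_);
    handler_ = std::move(handler);
  }

  void OnRecvCompleted() {
    std::lock_guard<std::recursive_mutex> guard(mutex_);
    ++received_;
    if (handler_) handler_();  // the handler may lock mutex_ again
  }

  // Meant to be called from the handler; grabs the lock on its own.
  void MarkHandled() {
    std::lock_guard<std::recursive_mutex> guard(mutex_);
    ++handled_;
  }

  int received() {
    std::lock_guard<std::recursive_mutex> guard(mutex_);
    return received_;
  }

  int handled() {
    std::lock_guard<std::recursive_mutex> guard(mutex_);
    return handled_;
  }

 private:
  std::recursive_mutex mutex_;
  std::function<void()> handler_;
  int received_ = 0;
  int handled_ = 0;
};
```

The handler still runs while the lock is held in this variant, so it does not by itself remove the risk of a blocking handler stalling the engine.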



[jira] [Commented] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset

2015-09-21 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901244#comment-14901244
 ] 

Bob Hansen commented on HDFS-9095:
--

Re: CMAKE_CURRENT_LIST_DIR vs. CMAKE_CURRENT_SOURCE_DIR: 
According to ye olde 
[StackOverflow|http://stackoverflow.com/questions/15662497/in-cmake-what-is-the-difference-between-cmake-current-source-dir-and-cmake-curr],
 it becomes more of an issue when files are included across directories (as 
some of the protobuf stuff is).  The difference is what led to hours of angst 
in HDFS-9025 where the cwd was under the CMakeLists.txt.  It's not a super-big 
deal, but once bitten, twice shy.

Re: Options - what you have here is a good start; we can discuss an 
architectural solution under HDFS-9117.



[jira] [Commented] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803187#comment-14803187
 ] 

Hadoop QA commented on HDFS-9095:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12756394/HDFS-9095.000.patch |
| Optional Tests | javadoc javac unit |
| git revision | trunk / 58d1a02 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12508/console |


This message was automatically generated.
