[
https://issues.apache.org/jira/browse/STORM-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699054#comment-14699054
]
ASF GitHub Bot commented on STORM-969:
--------------------------------------
Github user arunmahadevan commented on a diff in the pull request:
https://github.com/apache/storm/pull/664#discussion_r37160568
--- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/bolt/HdfsBolt.java ---
@@ -80,6 +86,11 @@ public HdfsBolt addRotationAction(RotationAction action){
return this;
}
+ public HdfsBolt withTickTupleIntervalSeconds(int interval) {
--- End diff --
Could we give this a more meaningful name that conveys its actual purpose
(e.g. `withFlushIntervalSeconds`)? I think it also needs a default value, so
that a sync is always done even if the user doesn't specify this option. The
value should also be kept within lower and upper thresholds, so that tuples are
acked within TOPOLOGY_MESSAGE_TIMEOUT_SECS and a sync doesn't happen too
frequently.
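A minimal sketch of the default-plus-bounds idea, not the code in the pull
request. The class `FlushIntervalPolicy`, its lower-bound constant, and the
choice of half the message timeout as both the upper bound and the default are
assumptions for illustration only; the comment above does not specify any of
these values.

```java
// Hypothetical helper, not part of storm-hdfs: resolves a flush (sync)
// interval that always has a value and stays inside sane bounds.
final class FlushIntervalPolicy {
    // Assumed lower bound, so that a sync does not happen too frequently.
    static final int MIN_FLUSH_INTERVAL_SECS = 1;

    /**
     * @param requestedSecs      interval set by the user, or null if unset
     * @param messageTimeoutSecs the topology's TOPOLOGY_MESSAGE_TIMEOUT_SECS
     * @return an interval between the lower bound and the assumed upper bound
     */
    static int resolve(Integer requestedSecs, int messageTimeoutSecs) {
        // Assumed upper bound: half the message timeout, leaving room for
        // tuples to be synced and acked before they time out.
        int max = Math.max(MIN_FLUSH_INTERVAL_SECS, messageTimeoutSecs / 2);
        // Assumed default: fall back to the upper bound when unset.
        int value = (requestedSecs == null) ? max : requestedSecs;
        // Clamp into [MIN_FLUSH_INTERVAL_SECS, max].
        return Math.min(max, Math.max(MIN_FLUSH_INTERVAL_SECS, value));
    }
}
```

With a 30 second message timeout, an unset option resolves to 15, a request
of 0 is raised to 1, and a request of 100 is lowered to 15.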
> HDFS Bolt can end up in an unrecoverable state
> ----------------------------------------------
>
> Key: STORM-969
> URL: https://issues.apache.org/jira/browse/STORM-969
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-hdfs
> Reporter: Aaron Dossett
> Assignee: Aaron Dossett
>
> The body of the HdfsBolt.execute() method is essentially one try-catch block.
> The catch block reports the error and fails the current tuple. In some
> cases the bolt's FSDataOutputStream object (named 'out') is in an
> unrecoverable state and no subsequent calls to execute() can succeed.
> To reproduce this scenario:
> - process some tuples through the HDFS bolt
> - put the underlying HDFS system into safemode
> - process some more tuples and receive an expected ClosedChannelException
> - take the underlying HDFS system out of safemode
> - subsequent tuples continue to fail with the same exception
> The three fundamental operations that execute() performs (writing, syncing,
> rotating) need to be isolated so that errors from each are handled
> specifically.
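A rough sketch of what isolating the three operations could look like. It is
not the actual HdfsBolt code or the fix in the pull request: `IoStep`,
`IsolatedExecute`, and `Outcome` are hypothetical names, and reopening the
stream after a write or sync failure is an assumed recovery strategy.

```java
import java.io.IOException;

// Hypothetical stand-in for one I/O operation of the bolt; not a storm-hdfs type.
interface IoStep {
    void run() throws IOException;
}

final class IsolatedExecute {
    enum Outcome { ACKED, WRITE_FAILED, SYNC_FAILED, ROTATE_FAILED }

    /**
     * Runs write, sync, and rotate in separate try blocks so the caller can
     * tell which one failed. After a write or sync failure the reopen step
     * is attempted (assumed recovery), so later calls are not stuck with a
     * dead output stream.
     */
    static Outcome execute(IoStep write, IoStep sync, IoStep rotate, IoStep reopen) {
        try {
            write.run();
        } catch (IOException e) {
            tryReopen(reopen);
            return Outcome.WRITE_FAILED; // caller would fail the tuple
        }
        try {
            sync.run();
        } catch (IOException e) {
            tryReopen(reopen);
            return Outcome.SYNC_FAILED; // caller would fail the tuple
        }
        try {
            // The rotate step is assumed to decide for itself whether a
            // rotation is due.
            rotate.run();
        } catch (IOException e) {
            // The data is already written and synced at this point.
            return Outcome.ROTATE_FAILED;
        }
        return Outcome.ACKED;
    }

    private static void tryReopen(IoStep reopen) {
        try {
            reopen.run();
        } catch (IOException e) {
            // Still unavailable (for example, HDFS in safemode); the next
            // failed call attempts the reopen again.
        }
    }
}
```

Because each operation reports its own outcome, a caller can fail the tuple on
a write or sync error while treating a rotation error separately.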