[
https://issues.apache.org/jira/browse/STORM-1971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389892#comment-15389892
]
Jakes commented on STORM-1971:
------------------------------
Thanks for your reply. I think the overhead is very high in the current case:
writing a message of size x to an HDFS cluster y times, versus writing a single
message of size xy once. HDFS is best at large streaming reads and writes.
What are the advantages of the current implementation (one write per message)
over the proposed one (batched writes)?
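
To make the comparison concrete, here is a minimal sketch that only counts how
many write calls reach a sink in each case. It does not talk to HDFS and does
not model the HDFS client's own buffering; BatchedWriteSketch, CountingSink,
writeOneByOne and writeBatched are hypothetical names made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BatchedWriteSketch {

    /** In-memory sink (hypothetical) that counts the write calls it receives. */
    static final class CountingSink extends OutputStream {
        private final ByteArrayOutputStream data = new ByteArrayOutputStream();
        int writeCalls = 0;

        @Override
        public void write(int b) {
            writeCalls++;
            data.write(b);
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            writeCalls++;
            data.write(buf, off, len);
        }

        int size() {
            return data.size();
        }
    }

    /** Sends each message on its own: y messages of size x cost y writes. */
    static void writeOneByOne(OutputStream sink, byte[] message, int count) {
        try {
            for (int i = 0; i < count; i++) {
                sink.write(message, 0, message.length);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Accumulates the messages in memory, then sends one write of size xy. */
    static void writeBatched(OutputStream sink, byte[] message, int count) {
        ByteArrayOutputStream batch = new ByteArrayOutputStream(message.length * count);
        for (int i = 0; i < count; i++) {
            batch.write(message, 0, message.length);
        }
        byte[] all = batch.toByteArray();
        try {
            sink.write(all, 0, all.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] message = "tuple-payload\n".getBytes(StandardCharsets.UTF_8);
        CountingSink single = new CountingSink();
        CountingSink batched = new CountingSink();
        writeOneByOne(single, message, 1000);
        writeBatched(batched, message, 1000);
        System.out.println("one-by-one: " + single.writeCalls + " writes, " + single.size() + " bytes");
        System.out.println("batched:    " + batched.writeCalls + " writes, " + batched.size() + " bytes");
    }
}
```

Both paths deliver the same bytes; only the number of calls differs. Whether
that difference matters in practice depends on how the real writer buffers and
syncs, which this sketch deliberately leaves out.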
> HDFS Timed Synchronous Policy
> -----------------------------
>
> Key: STORM-1971
> URL: https://issues.apache.org/jira/browse/STORM-1971
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-hdfs
> Affects Versions: 0.10.0, 1.0.0
> Reporter: darion yaphet
> Assignee: darion yaphet
>
> When the amount of data to be written to HDFS is not very large, we
> need a timed synchronous policy to flush cached data into HDFS periodically.
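
One way such a timed policy could look is sketched below, as an illustration
only and not the patch attached to this issue. The SyncDecision interface is a
hypothetical stand-in with a mark/reset shape, not the actual storm-hdfs
interface, and the clock is injected so the behaviour can be checked without
sleeping.

```java
import java.util.function.LongSupplier;

public class TimedSyncSketch {

    /** Hypothetical contract: mark() says whether to sync now, reset() is called after a sync. */
    interface SyncDecision {
        boolean mark(long offset);

        void reset();
    }

    /** Requests a sync once at least intervalMillis have passed since the last one. */
    static final class TimedSyncDecision implements SyncDecision {
        private final long intervalMillis;
        private final LongSupplier clock;
        private long lastSyncMillis;

        TimedSyncDecision(long intervalMillis, LongSupplier clock) {
            if (intervalMillis <= 0) {
                throw new IllegalArgumentException("intervalMillis must be positive");
            }
            this.intervalMillis = intervalMillis;
            this.clock = clock;
            this.lastSyncMillis = clock.getAsLong();
        }

        @Override
        public boolean mark(long offset) {
            // The offset is ignored: only elapsed time decides.
            return clock.getAsLong() - lastSyncMillis >= intervalMillis;
        }

        @Override
        public void reset() {
            lastSyncMillis = clock.getAsLong();
        }
    }

    public static void main(String[] args) {
        long[] now = {0L}; // fake clock, advanced by hand
        TimedSyncDecision policy = new TimedSyncDecision(1000L, () -> now[0]);

        now[0] = 400L;
        System.out.println("t=400  sync? " + policy.mark(128L));
        now[0] = 1200L;
        boolean due = policy.mark(256L);
        System.out.println("t=1200 sync? " + due);
        if (due) {
            policy.reset(); // the caller would flush to HDFS here, then reset
        }
        System.out.println("after reset sync? " + policy.mark(384L));
    }
}
```

A policy of this shape is only consulted when a tuple arrives, so on an idle
stream nothing would trigger the flush; a real implementation would presumably
also need a timer or tick-driven check.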