I don’t see why it can’t happen even though it uses the tick tuple. What if the 
worker dies in between? The file will never get rotated to its final destination.

From: Arun Iyer <ai...@hortonworks.com>
Reply-To: "dev@storm.apache.org" <dev@storm.apache.org>
Date: Thursday, 10 September 2015 12:35 pm
To: "dev@storm.apache.org" <dev@storm.apache.org>
Subject: Re: How to rotate file incase of storm worker fails?

Sachin,

STORM-969 makes use of tick tuples to periodically flush and ack the tuples, so 
the scenario you mentioned would not happen. The tickTupleInterval is 
configurable.
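
To illustrate the idea only, here is a minimal Java sketch of a tick-driven 
flush. It is not the HdfsBolt source: the class TickFlushSketch, its methods, 
and the syncCount threshold are hypothetical names, and the "flush" is just a 
counter standing in for a sync to HDFS. It covers flushing and acking only, 
not file rotation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a bolt that batches writes; not the HdfsBolt code.
class TickFlushSketch {
    private final int syncCount;                            // flush after this many data tuples
    private final List<String> pending = new ArrayList<>(); // written but not yet acked
    private final List<String> acked = new ArrayList<>();   // acked after a flush
    private int flushes = 0;

    TickFlushSketch(int syncCount) {
        this.syncCount = syncCount;
    }

    // A data tuple is buffered; a flush happens only when the count threshold is reached.
    void onTuple(String tuple) {
        pending.add(tuple);
        if (pending.size() >= syncCount) {
            flushAndAck();
        }
    }

    // A tick tuple forces a flush of whatever is pending, so a slow stream
    // does not leave tuples unacked for longer than one tick interval.
    void onTick() {
        if (!pending.isEmpty()) {
            flushAndAck();
        }
    }

    private void flushAndAck() {
        flushes++;              // stands in for syncing the open file
        acked.addAll(pending);  // ack everything covered by this flush
        pending.clear();
    }

    int pendingCount() { return pending.size(); }
    int ackedCount() { return acked.size(); }
    int flushCount() { return flushes; }

    public static void main(String[] args) {
        TickFlushSketch bolt = new TickFlushSketch(1000);
        for (int i = 0; i < 250; i++) {
            bolt.onTuple("t" + i);
        }
        System.out.println("pending before tick: " + bolt.pendingCount());
        bolt.onTick();
        System.out.println("pending after tick: " + bolt.pendingCount());
    }
}
```

In this sketch the number of unacked tuples is bounded by syncCount or by one 
tick interval, whichever comes first.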

- Arun




On 9/9/15, 11:47 PM, "Sachin Pasalkar" <sachin_pasal...@symantec.com> wrote:

Hi,

I was looking at the code where HDFSBolt writes incoming tuples to a file. I 
also had a look at JIRA 
STORM-969<https://issues.apache.org/jira/browse/STORM-969>, and I have the 
following questions about it:

1) Let's say I have set up a fileRotation policy at 64 MB and have written 
59 MB to the file. If my worker fails now, the file I was writing to will never 
get rotated to its final location.
2) As per STORM-969, they have added the forceSync approach, but they keep all 
tuples in memory and delay the acks sent back to the spout. In our case, to 
write 64 MB of data we need to store 5,400,000 process messages, which leads to 
a lot of data in memory. This may lead to unnecessary replay of tuples from the 
spout. (I am aware it is at-least-once and that I can increase 
TOPOLOGY_MESSAGE_TIMEOUT_SECS to fulfill my requirement, but is there another 
way?)
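
To put rough numbers on point 2, here is a small Java sketch. It assumes acks 
are held until the file rotates, and it uses an ingest rate of 10,000 tuples 
per second, which is a made-up figure for illustration and not a measurement. 
The class and method names are hypothetical.

```java
// Back-of-the-envelope estimate; names and the ingest rate are assumptions.
class AckDelaySketch {
    // Seconds needed to accumulate messagesPerFile tuples at messagesPerSecond,
    // rounded up so that a partial second still counts.
    static long secondsToFillFile(long messagesPerFile, long messagesPerSecond) {
        return (messagesPerFile + messagesPerSecond - 1) / messagesPerSecond;
    }

    public static void main(String[] args) {
        long messagesPerFile = 5_400_000L; // figure from point 2 above
        long assumedRate = 10_000L;        // hypothetical tuples per second
        long fillSeconds = secondsToFillFile(messagesPerFile, assumedRate);
        System.out.println("seconds to fill one file: " + fillSeconds);
    }
}
```

Under those assumptions the first tuple in a file would wait about 540 seconds 
for its ack, so TOPOLOGY_MESSAGE_TIMEOUT_SECS would have to be larger than that 
to avoid replays.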

Thanks,
Sachin
