agresch opened a new pull request #3363:
URL: https://github.com/apache/storm/pull/3363


   ## What is the purpose of the change
   
   Reduce the load that listing files places on a Hadoop NameNode by checking 
a single timestamp first when updating blobstores.  The Hadoop blobstore will 
update the modTime on the base blobstore directory any time a blob is updated.  
Supervisors will fetch that timestamp once per call to AsyncLocalizer 
updateBlobs().  For each local blob, if the modTime recorded at its last check 
matches the fetched modTime, the supervisor will not query the remote Hadoop 
blob.  This reduces polling on the NameNode.
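
The check described above can be sketched roughly as follows. This is an illustration only, not the code in this pull request: `BlobUpdateSketch`, `RemoteBlobStore`, `FakeRemote`, and every method name here are hypothetical stand-ins, and the real AsyncLocalizer and Hadoop blobstore classes are not used.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the "check one timestamp first" idea; not Storm code. */
public final class BlobUpdateSketch {

    /** Assumed stand-in for the remote Hadoop blobstore. */
    interface RemoteBlobStore {
        long baseDirModTime();        // one cheap call per update cycle
        long blobVersion(String key); // per-blob call we want to avoid
    }

    /** In-memory fake that counts how often individual blobs are queried. */
    static final class FakeRemote implements RemoteBlobStore {
        long modTime = 1L;
        long version = 1L;
        int blobQueries = 0;

        @Override public long baseDirModTime() { return modTime; }
        @Override public long blobVersion(String key) { blobQueries++; return version; }
    }

    private final RemoteBlobStore remote;
    // Local blob key -> last version downloaded (-1 means never fetched).
    private final Map<String, Long> localVersion = new HashMap<>();
    // Local blob key -> base-directory modTime seen at that blob's last check.
    private final Map<String, Long> lastCheckedModTime = new HashMap<>();

    BlobUpdateSketch(RemoteBlobStore remote) { this.remote = remote; }

    void addLocalBlob(String key) { localVersion.put(key, -1L); }

    /** Runs one update cycle; returns how many blobs needed a remote query. */
    int updateBlobs() {
        long modTime = remote.baseDirModTime(); // fetched once per cycle
        int queried = 0;
        for (Map.Entry<String, Long> entry : localVersion.entrySet()) {
            Long last = lastCheckedModTime.get(entry.getKey());
            if (last != null && last.longValue() == modTime) {
                continue; // base directory unchanged since last check: skip
            }
            entry.setValue(remote.blobVersion(entry.getKey()));
            lastCheckedModTime.put(entry.getKey(), modTime);
            queried++;
        }
        return queried;
    }

    public static void main(String[] args) {
        FakeRemote remote = new FakeRemote();
        BlobUpdateSketch localizer = new BlobUpdateSketch(remote);
        localizer.addLocalBlob("topology-jar");
        localizer.addLocalBlob("topology-conf");
        System.out.println("first cycle queried:  " + localizer.updateBlobs());
        System.out.println("second cycle queried: " + localizer.updateBlobs());
        remote.modTime = 2L; // simulate a blob update touching the base directory
        System.out.println("third cycle queried:  " + localizer.updateBlobs());
    }
}
```

In this sketch the comparison is an equality test on the stored modTime, mirroring the "matches" wording above. The trade-off is that an update to any one blob changes the base directory modTime, so every local blob is rechecked once on the next cycle.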
   
   ## How was the change tested
   
   Ran the code with debug logging on internal dev clusters with Hadoop and ran 
blobstore-related integration tests with topologies.  Ran the 
storm-client/server/hdfs-blobstore unit tests.  Ran a word count topology on a 
local cluster setup for 15 minutes.
   

