Thanks for the response, Chinmay. Yes, this issue occurs during the restart of a failed application.
From what I observed, my DAG/application failed during a Resource Manager failover, and during the failover something went wrong with one of the files being written to HDFS. The application tried many times to restore the file by launching the operator that writes to HDFS in many containers, and failed. When I restarted the application, it again tried many times to restore the HDFS file and again launched many containers to recover. The app took a very long time, about 4-5 hours, to successfully launch those HDFS operators and resume.

Regards,
Raja.

From: Chinmay Kolhatkar <chin...@datatorrent.com>
Reply-To: "users@apex.apache.org" <users@apex.apache.org>
Date: Saturday, July 23, 2016 at 12:11 AM
To: "users@apex.apache.org" <users@apex.apache.org>
Subject: Re: hdfs output file operator

Hi Raja,

I can see such a log message in AbstractFileOutputOperator at line 455. Because this code is called from the operator's setup, the operator is being deployed and then failing while restoring the existing file, because the length of the file does not match the offset the operator stored previously. From the code, it looks like it handles such cases and restores the file. From what I understand, either the file was changed in some other way or the offset management has a problem.

Are you restarting the application from the previous application ID? To narrow down the problem, can you please try changing the destination path and see if that works?

Thanks,
Chinmay.

On Sat, Jul 23, 2016 at 5:00 AM, Sandesh Hegde <sand...@datatorrent.com> wrote:

Please check:
1. AppMaster logs
2. Cluster resources

On Fri, Jul 22, 2016 at 1:14 PM Raja.Aravapalli <raja.aravapa...@target.com> wrote:

Hi,

I have a file output operator that writes to HDFS files.
The application has been trying for a long time to deploy the operator that writes to HDFS files, in many different containers, but is not succeeding. The status shows as PENDING_DEPLOY.

In the logs of the container where the application is trying to deploy the HDFS write operator, I can only see "path corrupted".

Can someone please guide me or suggest something on this?

Regards,
Raja.