Hi Sergey Onuchin,

Sorry to hear that. Unfortunately, we cannot offer concrete suggestions based only on the information you have provided. More on-site information would make this easier to trace, such as the deployment architecture, the NameNode log, and jstack output. In my experience, I have not come across cases where a directory was deleted without leaving any trace. Have you checked for operations (rename and delete) on the parent directories? Good luck!
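In case it helps with the parent-directory check, here is a minimal Python sketch. It assumes the usual tab-separated hdfs-audit.log layout with cmd=, src= and dst= fields; please verify that against your own log before relying on it. The function name find_parent_ops and the example path are hypothetical, used only for illustration.

```python
import re
from pathlib import PurePosixPath

# Assumption: audit fields are tab-separated "key=value" pairs, e.g.
# "...\tcmd=delete\tsrc=/a/b\tdst=null\t...". Adjust if your layout differs.
FIELD = re.compile(r"\t(cmd|src|dst)=([^\t\n]*)")


def ancestors(path):
    """Return the path itself plus every parent directory, excluding "/"."""
    p = PurePosixPath(path)
    return {str(p)} | {str(a) for a in p.parents if str(a) != "/"}


def find_parent_ops(lines, lost_path):
    """Collect delete/rename audit entries whose src is lost_path or a parent of it."""
    targets = ancestors(lost_path)
    hits = []
    for line in lines:
        fields = dict(FIELD.findall(line))
        cmd = fields.get("cmd", "")
        # A rename issued with options may be logged with a suffix after
        # "rename", so match on the prefix rather than on equality.
        destructive = cmd == "delete" or cmd.startswith("rename")
        if destructive and fields.get("src") in targets:
            hits.append((cmd, fields["src"], fields.get("dst"), line.rstrip("\n")))
    return hits


# Example usage (hypothetical file name and path):
# with open("hdfs-audit.log", encoding="utf-8", errors="replace") as f:
#     for cmd, src, dst, raw in find_parent_ops(f, "/warehouse/db/table/dt=2023-10-14"):
#         print(raw)
```

A rename of a parent directory would not mention the missing partition path at all, which is why the sketch matches on every ancestor rather than only on the exact path.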
Best Regards,
- He Xiaoqiao

On Mon, Oct 16, 2023 at 11:58 PM <hdfs-issues-reject-1697471875.2027154.pkchcedhioidkhech...@hadoop.apache.org> wrote:

>
> ---------- Forwarded message ----------
> From: Sergey Onuchin <sergey.onuc...@acronis.com.invalid>
> To: "hdfs-iss...@hadoop.apache.org" <hdfs-iss...@hadoop.apache.org>
> Cc:
> Bcc:
> Date: Mon, 16 Oct 2023 15:57:47 +0000
> Subject: HDFS loses directories with production data
>
> Hello,
>
> We’ve been using Hadoop (+Spark) for 3 years on production w/o major issues.
>
> Lately we observe that whole non-empty directories (table partitions) are disappearing in random ways.
> We see in application logs (and in hdfs-audit) logs creation of the directory + data files.
> Then later we see NO this directory in HDFS.
>
> hdfs-audit.log shows no traces of deletes or renames for the disappeared directories.
> We can trust these logs, as we see our manual operations are present in the logs.
>
> Time between creation and disappearing is 1-2 days.
>
> Maybe we are losing individual files as well, we just cannot find this out reliably.
>
> This is a blocker issue for us, we have to stop production data processing until we find out and fix data loss root cause.
>
> Please help to identify the root cause or find the right direction for search/further questions.
>
> -- Hadoop version: ------------------------------
> Hadoop 3.2.1
> Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r b3cbbb467e22ea829b3808f4b7b01d07e0bf3842
> Compiled by rohithsharmaks on 2019-09-10T15:56Z
> Compiled with protoc 2.5.0
> From source with checksum 776eaf9eee9c0ffc370bcbc1888737
>
> Thank you!
> Sergey Onuchin