Hello,

We've been using Hadoop (+Spark) in production for 3 years without major issues.
Lately we have observed that whole non-empty directories (table partitions) are disappearing at random. In the application logs and in hdfs-audit.log we see the creation of the directory and its data files. Later, the directory no longer exists in HDFS.

hdfs-audit.log shows no trace of deletes or renames for the disappeared directories. We can trust these logs, since our manual operations do appear in them.

The time between creation and disappearance is 1-2 days.

We may be losing individual files as well; we just cannot determine this reliably.

This is a blocker for us: we have to stop production data processing until we find and fix the root cause of the data loss.

Please help us identify the root cause, or point us in the right direction for further investigation or questions.

--
Hadoop version:
------------------------------
Hadoop 3.2.1
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r b3cbbb467e22ea829b3808f4b7b01d07e0bf3842
Compiled by rohithsharmaks on 2019-09-10T15:56Z
Compiled with protoc 2.5.0
From source with checksum 776eaf9eee9c0ffc370bcbc1888737

Thank you!
Sergey Onuchin