Hello,

We've been using Hadoop (+Spark) in production for 3 years without major issues.

Lately we have observed whole non-empty directories (table partitions) 
disappearing at random.
The application logs (and hdfs-audit.log) show the creation of the directory 
and its data files.
Later, the directory no longer exists in HDFS.

hdfs-audit.log shows no traces of deletes or renames for the disappeared 
directories.
We can trust these logs, since our own manual operations do appear in them.
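
For reference, a minimal sketch of the kind of audit-log check described 
above. It assumes the usual key=value audit line layout (cmd=, src=, dst=); the 
function name and the sample paths are hypothetical, not taken from our 
cluster. It also matches deletes and renames of ancestor directories, since 
such an operation would remove a partition without ever naming its path.

```python
import re

# Assumed audit line layout: "... cmd=<op> src=<path> dst=<path or null> ..."
# Some operations may append extra text after the command name, so only the
# leading word of cmd is captured. Paths containing whitespace are not handled.
AUDIT_RE = re.compile(r"cmd=(\w+)[^\t\n]*?\s+src=(\S+)\s+dst=(\S+)")
DESTRUCTIVE = {"delete", "rename"}


def find_destructive_ops(lines, target):
    """Return (cmd, src, dst) for deletes/renames of target or any ancestor."""
    target = target.rstrip("/")
    hits = []
    for line in lines:
        m = AUDIT_RE.search(line)
        if not m:
            continue
        cmd, src, dst = m.groups()
        if cmd not in DESTRUCTIVE:
            continue
        src = src.rstrip("/")
        # Match the directory itself, or a parent directory that contains it.
        if target == src or target.startswith(src + "/"):
            hits.append((cmd, src, dst))
    return hits


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "allowed=true\tugi=etl\tip=/10.0.0.1\tcmd=create\t"
        "src=/warehouse/t/dt=2020-01-01/part-0\tdst=null\tperm=null",
        "allowed=true\tugi=etl\tip=/10.0.0.1\tcmd=delete\t"
        "src=/warehouse/t\tdst=null\tperm=null",
    ]
    for hit in find_destructive_ops(sample, "/warehouse/t/dt=2020-01-01"):
        print(hit)
```

In practice the lines would come from reading hdfs-audit.log (and its rotated 
files) for the window between creation and disappearance.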

The time between creation and disappearance is 1-2 days.

We may be losing individual files as well, but we cannot determine this 
reliably.

This is a blocker for us: we have to stop production data processing 
until we find and fix the root cause of the data loss.

Please help us identify the root cause, or point us in the right direction 
for further investigation.


-- Hadoop version: ------------------------------
Hadoop 3.2.1
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r 
b3cbbb467e22ea829b3808f4b7b01d07e0bf3842
Compiled by rohithsharmaks on 2019-09-10T15:56Z
Compiled with protoc 2.5.0
From source with checksum 776eaf9eee9c0ffc370bcbc1888737

Thank you!
Sergey Onuchin
