Re: Data Remanence in HDFS

2024-01-13 Thread Jim Halfpenny
Hi Daniel, In short, you can’t create an HDFS block with unallocated data. You can create a zero-length block, which will result in a zero-byte file being created on the data node, but you can’t create a sparse file in HDFS. While HDFS has a block size, e.g. 128MB, if you create a small file then th

Re: Data Remanence in HDFS

2024-01-12 Thread Daniel Howard
Thanks Jim, The scenario I have in mind is something like: 1) Ask HDFS to create a file that is 32k in length. 2) Attempt to read the contents of the file. Can I even attempt to read the contents of a file that has not yet been written? If so, what data would get sent? For example, I asked a vers

Re: Data Remanence in HDFS

2024-01-12 Thread Jim Halfpenny
Hi Danny, This depends on a number of circumstances, mostly file permissions. If, for example, a file is deleted without the -skipTrash option, then it will be moved to the .Trash directory. From there it could be read, but the original file permissions will be preserved. Therefore if a

Data Remanence in HDFS

2024-01-11 Thread Daniel Howard
Is it possible for a user with HDFS access to read the contents of a file previously deleted by a different user? I know a user can employ KMS to encrypt files with a personal key, making this sort of data leakage effectively impossible. But, without KMS, is it possible to allocate a file with uni