If your data is in different partitions in HDFS, you can simply use tools
like Hive or Pig to read the data in a given partition, filter out the bad
data, and overwrite the partition. Data cleansing like this is common
practice; I'm not sure why there is so much back and forth on this topic.
Of course
HBas
Following up on this, I was able to extract winutils.exe and hadoop.dll from
a Hadoop install for Windows, and set up HADOOP_HOME and PATH to find them. It
makes no difference to security, apparently.
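
For anyone repeating the setup, here is a rough Python sketch of it. It
assumes the usual layout where winutils.exe and hadoop.dll sit in a bin
folder under HADOOP_HOME. The function name configure_hadoop_home and the
C:\hadoop path are made up for illustration, and the change only affects the
current process and anything it launches.

```python
import os
from pathlib import Path


def configure_hadoop_home(hadoop_home):
    """Point HADOOP_HOME at a local directory and put its bin folder on PATH.

    Returns the names of the expected native files that are missing from bin.
    """
    home = Path(hadoop_home)
    bin_dir = home / "bin"

    os.environ["HADOOP_HOME"] = str(home)

    # Prepend bin to PATH only once so repeated calls do not grow PATH.
    entries = [e for e in os.environ.get("PATH", "").split(os.pathsep) if e]
    if str(bin_dir) not in entries:
        os.environ["PATH"] = os.pathsep.join([str(bin_dir)] + entries)

    # Assumed layout: both native files live directly under bin.
    expected = ["winutils.exe", "hadoop.dll"]
    return [name for name in expected if not (bin_dir / name).is_file()]


if __name__ == "__main__":
    # Hypothetical install location; substitute the real one.
    missing = configure_hadoop_home(r"C:\hadoop")
    print("missing native files:", missing or "none")
```

An empty result only means the files were found where expected; it says
nothing about whether the security setup itself will then work.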
John
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Saturday, August 23, 2014 2: