Hi,

What are the best mechanisms for hiding data destined for Hive tables?
Let us assume that we are loading a large number of CSV files into Hive. The way I do it is:

--1 Move the CSV data into an HDFS staging area
--2 Create an external table
--3 Create the ORC table if needed
--4 Insert or append the data from the external table into the Hive ORC table
--5 Remove the CSV files from the staging area

During steps 1 to 5 (which may take a good while), sensitive data residing on HDFS can be exposed. I would be interested to know possible solutions to this potential security exposure.

Thanks,

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
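
For concreteness, steps 1 to 5 above could be sketched roughly as follows. This is only an illustration under assumptions: the helper name build_load_plan, the paths, the table names and the column list are all hypothetical, and the script only builds the command text for each step without running anything against a cluster.

```python
from typing import List, Tuple

# Each step is (where it runs, command text); "shell" means the HDFS CLI,
# "hive" means a HiveQL statement.
Step = Tuple[str, str]


def build_load_plan(local_csv: str, staging_dir: str, ext_table: str,
                    orc_table: str, columns: str) -> List[Step]:
    """Return the five load steps in order. All arguments are hypothetical."""
    return [
        # 1. Move the CSV data into the HDFS staging area.
        ("shell", f"hdfs dfs -put {local_csv} {staging_dir}"),
        # 2. Create an external table over the staging directory.
        ("hive", f"CREATE EXTERNAL TABLE IF NOT EXISTS {ext_table} ({columns}) "
                 f"ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
                 f"STORED AS TEXTFILE LOCATION '{staging_dir}'"),
        # 3. Create the ORC table if needed.
        ("hive", f"CREATE TABLE IF NOT EXISTS {orc_table} ({columns}) "
                 f"STORED AS ORC"),
        # 4. Append the data from the external table to the ORC table.
        ("hive", f"INSERT INTO TABLE {orc_table} SELECT * FROM {ext_table}"),
        # 5. Remove the CSV files from the staging area.
        ("shell", f"hdfs dfs -rm {staging_dir}/*.csv"),
    ]


if __name__ == "__main__":
    plan = build_load_plan("/tmp/sales.csv", "/staging/sales",
                           "sales_ext", "sales_orc", "id INT, amount DOUBLE")
    for number, (where, command) in enumerate(plan, start=1):
        print(f"--{number} [{where}] {command}")
```

The plaintext CSV files sit in the staging directory from step 1 until step 5 completes, which is the window of exposure I am asking about.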
