Then probably the easiest option would be in INSERT/SELECT from external table to target table and make that column NULL
Check the VAT column here that I made it NULL DROP TABLE IF EXISTS stg_t2; CREATE EXTERNAL TABLE stg_t2 ( INVOICENUMBER string ,PAYMENTDATE string ,NET string ,VAT string ,TOTAL string ) COMMENT 'from csv file from excel sheet xxxx' ROW FORMAT serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde' STORED AS TEXTFILE LOCATION '/data/stg/table2' TBLPROPERTIES ("skip.header.line.count"="1") ; --3) DROP TABLE IF EXISTS t2; CREATE TABLE t2 ( INVOICENUMBER INT ,PAYMENTDATE timestamp ,NET DECIMAL(20,2) ,VAT DECIMAL(20,2) ,TOTAL DECIMAL(20,2) ) COMMENT 'from csv file from excel sheet xxxx' CLUSTERED BY (INVOICENUMBER) INTO 256 BUCKETS STORED AS ORC TBLPROPERTIES ( "orc.compress"="ZLIB", "transactional"="true") ; --4) Put data in target table. do the conversion and ignore empty rows INSERT INTO TABLE t2 SELECT INVOICENUMBER , CAST(UNIX_TIMESTAMP(paymentdate,'DD/MM/YYYY')*1000 as timestamp) , CAST(REGEXP_REPLACE(net,'[^\\d\\.]','') AS DECIMAL(20,2)) , NULL , CAST(REGEXP_REPLACE(total,'[^\\d\\.]','') AS DECIMAL(20,2)) FROM stg_t2 WHERE -- INVOICENUMBER > 0 AND CAST(REGEXP_REPLACE(total,'[^\\d\\.]','') AS DECIMAL(20,2)) > 0.0 -- Exclude empty rows HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 17 March 2016 at 15:32, Ajay Chander <hadoopde...@gmail.com> wrote: > Mich, I am okay with replacing the columns data with some characters like > asterisk. Thanks > > > On Thursday, March 17, 2016, Mich Talebzadeh <mich.talebza...@gmail.com> > wrote: > >> Hi Ajay, >> >> Do you want to be able to unmask it (at any time) or just have it totally >> scrambled (for example replace the column with random characters) in Hive? >> >> Dr Mich Talebzadeh >> >> >> >> LinkedIn * >> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw >> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* >> >> >> >> http://talebzadehmich.wordpress.com >> >> >> >> On 17 March 2016 at 15:14, Ajay Chander <hadoopde...@gmail.com> wrote: >> >>> Mich thbaks for looking into this. I have a 'csvfile.txt ' on hdfs. I >>> have created an external table 'xyz' to load that data into it. One of the >>> columns data 'ssn' needs to be masked. Is there any built in function is >>> give that I could use? >>> >>> >>> On Thursday, March 17, 2016, Mich Talebzadeh <mich.talebza...@gmail.com> >>> wrote: >>> >>>> Are you loading your CSV file from an External table into Hive table.? >>>> >>>> Basically you want to scramble that column before putting into Hive >>>> table? >>>> >>>> Dr Mich Talebzadeh >>>> >>>> >>>> >>>> LinkedIn * >>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw >>>> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* >>>> >>>> >>>> >>>> http://talebzadehmich.wordpress.com >>>> >>>> >>>> >>>> On 17 March 2016 at 14:37, Ajay Chander <hadoopde...@gmail.com> wrote: >>>> >>>>> Tustin, Is there anyway I can deidentify it in hive ? >>>>> >>>>> >>>>> On Thursday, March 17, 2016, Marcin Tustin <mtus...@handybook.com> >>>>> wrote: >>>>> >>>>>> This is a classic transform-load problem. You'll want to anonymise it >>>>>> once before making it available for analysis. >>>>>> >>>>>> On Thursday, March 17, 2016, Ajay Chander <hadoopde...@gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Hi Everyone, >>>>>>> >>>>>>> I have a csv.file which has some sensitive data in a particular >>>>>>> column in it. Now I have to create a table in hive and load the data >>>>>>> into >>>>>>> it. But when loading the data I have to make sure that the data is >>>>>>> masked. >>>>>>> Is there any built in function is used ch supports this or do I have to >>>>>>> write UDF ? Any suggestions are appreciated. Thanks >>>>>> >>>>>> >>>>>> Want to work at Handy? Check out our culture deck and open roles >>>>>> <http://www.handy.com/careers> >>>>>> Latest news <http://www.handy.com/press> at Handy >>>>>> Handy just raised $50m >>>>>> <http://venturebeat.com/2015/11/02/on-demand-home-service-handy-raises-50m-in-round-led-by-fidelity/> >>>>>> led >>>>>> by Fidelity >>>>>> >>>>>> >>>> >>