Analysts usually will not have access to data stored in the PCI zone. You
could mask the sensitive information and write the masked data out to a
separate table that the analysts are allowed to read.

For example:


> val mask_udf = udf((info: String) => info.patch(0, "*" * 12, 7))
> val df = sc.parallelize(Seq(("user1", "400-000-444"))).toDF("user", "sensitive_info")
> df.show

+-----+--------------+
| user|sensitive_info|
+-----+--------------+
|user1|   400-000-444|
+-----+--------------+

> df.withColumn("sensitive_info", mask_udf($"sensitive_info")).show

+-----+----------------+
| user|  sensitive_info|
+-----+----------------+
|user1|************-444|
+-----+----------------+
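The UDF above hardcodes the lengths (12 asterisks replacing the first 7
characters), so it only suits values shaped like this sample, and it would
throw on a null value. A minimal sketch of a length-independent, null-safe
variant follows. The names Masking and maskAllButLast and the default of 4
visible characters are my own assumptions for illustration, not anything from
Spark itself. Unlike the fixed 12 asterisks above, this version keeps the
original length visible, which may or may not be acceptable for your data.

```scala
// Hypothetical helper (illustrative names, not part of Spark).
object Masking {
  // Replace every character except the last `keep` with '*'.
  // A null input is returned as null; an input no longer than `keep`
  // is masked completely so that short values do not leak.
  def maskAllButLast(info: String, keep: Int = 4): String =
    if (info == null) null
    else if (info.length <= keep) "*" * info.length
    else "*" * (info.length - keep) + info.takeRight(keep)
}

// In spark-shell it could be wrapped the same way as the UDF above:
//   val mask_udf = udf((info: String) => Masking.maskAllButLast(info))
//   df.withColumn("sensitive_info", mask_udf($"sensitive_info")).show
```

With the sample row, "400-000-444" would come out as "*******-444".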


On Sat, Aug 19, 2017 at 10:42 PM, 李斌松 <libinsong1...@gmail.com> wrote:

> For example, the user's bank card number cannot be viewed by an analyst
> and replaced by an asterisk. How do you do that in spark?
>



-- 
Cheers!
