Analysts usually will not have access to data stored in the PCI zone. You could instead write the data out to a separate table for them, with the sensitive information masked.
Eg:

> val mask_udf = udf((info: String) => info.patch(0, "*" * 12, 7))
> val df = sc.parallelize(Seq(("user1", "400-000-444"))).toDF("user", "sensitive_info")
> df.show
+-----+--------------+
| user|sensitive_info|
+-----+--------------+
|user1|   400-000-444|
+-----+--------------+

> df.withColumn("sensitive_info", mask_udf($"sensitive_info")).show
+-----+----------------+
| user|  sensitive_info|
+-----+----------------+
|user1|************-444|
+-----+----------------+

On Sat, Aug 19, 2017 at 10:42 PM, 李斌松 <libinsong1...@gmail.com> wrote:

> For example, the user's bank card number cannot be viewed by an analyst
> and replaced by an asterisk. How do you do that in spark?
>

--
Cheers!
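
One caveat with mask_udf as written: Spark passes null cells straight into a UDF that takes a String, so info.patch would throw on a row with a missing value, and the fixed 12 asterisks do not reflect the real length of the field. A minimal sketch of a null-safe variant in plain Scala follows. maskKeepLast and its visible parameter are hypothetical names for illustration, not part of Spark, and keeping the last 4 characters is only an assumed policy.

```scala
// Hypothetical helper (not part of Spark): mask every character except the
// last `visible` ones. Null is passed through unchanged, and values that are
// not longer than `visible` are masked completely so nothing leaks.
def maskKeepLast(info: String, visible: Int = 4): String =
  if (info == null) null
  else {
    // Number of trailing characters to leave readable.
    val keep = if (info.length > visible) math.max(visible, 0) else 0
    ("*" * (info.length - keep)) + info.takeRight(keep)
  }

println(maskKeepLast("400-000-444"))
```

In spark-shell this could presumably be wrapped the same way as the UDF above, e.g. udf((s: String) => maskKeepLast(s)), and applied with withColumn before writing the masked table out for the analysts.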