Checking Data Integrity in Spark

2015-03-27 Thread Sathish Kumaran Vairavelu
Hello, I want to check whether there is any way to verify the integrity of data files. The use case is to perform data integrity checks on large files with 100+ columns and reject records (writing them to another file) that do not meet criteria (such as NOT NULL, date format, etc.). Since there are a lot of

Re: Checking Data Integrity in Spark

2015-03-27 Thread Arush Kharbanda
It's not possible to configure Spark to do checks based on XML. You would need to write jobs to do the validations you need.
On Fri, Mar 27, 2015 at 5:13 PM, Sathish Kumaran Vairavelu vsathishkuma...@gmail.com wrote: Hello, I want to check if there is any way to check the data integrity of
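
As a rough illustration of the "write jobs to do the validations" suggestion, the per-record logic such a job applies might look like the sketch below. It is plain Python with no Spark dependency, and everything in it is an assumption for illustration: the comma delimiter, the RULES table, the column positions, the date format, and the helper names (not_null, is_date, validate, split_records) are hypothetical and do not come from the thread.

```python
from datetime import datetime


def not_null(value):
    """True when the field is present and not blank."""
    return value is not None and value.strip() != ""


def is_date(fmt):
    """Build a check that accepts values parseable with the given format."""
    def check(value):
        try:
            datetime.strptime(value, fmt)
            return True
        except (TypeError, ValueError):
            return False
    return check


# Hypothetical rule table: column index -> checks applied in order.
RULES = {
    0: [not_null],
    2: [not_null, is_date("%Y-%m-%d")],
}


def validate(line, delimiter=","):
    """Return the sorted column indexes that fail a rule (empty if valid)."""
    fields = line.split(delimiter)
    failed = []
    for idx, checks in RULES.items():
        # A missing column is treated the same as a null value.
        value = fields[idx] if idx < len(fields) else None
        if not all(check(value) for check in checks):
            failed.append(idx)
    return sorted(failed)


def split_records(lines):
    """Partition lines into (accepted, rejected) according to RULES."""
    accepted, rejected = [], []
    for line in lines:
        (rejected if validate(line) else accepted).append(line)
    return accepted, rejected
```

In a PySpark job the same validate function could, under these assumptions, be passed to RDD.filter twice (once for records with no failures, once for records with failures), with each result written out via saveAsTextFile, so rejected records land in a separate location.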