Hi Griffin community,

I am exploring Apache Griffin as a data quality tool for our data lake environment. I am wondering whether you have any documentation on running Apache Griffin on a GCP Dataproc cluster.

Your documentation also mentions that Griffin supports two kinds of data sources: batch data and real-time data. Can the source be a configuration, or only a data schema, for data accuracy validation? Another use case we have: can the source be Kafka and the destination be Hive (batch) when performing a data quality test?
https://github.com/apache/griffin/blob/griffin-0.6.0-rc1/griffin-doc/intro.md

I would appreciate your help getting started. Please share any detailed material, videos, or documentation you have.

Thanks and regards,
Vipin A