GitHub user ajithme opened a pull request: https://github.com/apache/carbondata/pull/2495
Added for Kafka integration with Carbon StreamSQL:

1. Pass the source table properties to streamReader.load()
2. Do not pass a schema when calling sparkSession.readStream
3. Remove the querySchema validation against the sink, since a dataFrame created from a Kafka source does not carry the data schema (the data is written into the `value` column of the Kafka record)
4. At writeStream, extract the dataframe that contains the actual data schema from the Kafka source

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

 - [ ] Any interfaces changed? No
 - [ ] Any backward compatibility impacted? No
 - [ ] Document update required? Yes: need to use a CSV parser
 - [ ] Testing done? Done
 - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ajithme/carbondata kafkaStreamSQLIntegration

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2495.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2495

----
commit 0560c5e69c61d6594a91994da918493335bd0cb4
Author: Ajith <ajith2489@...>
Date:   2018-07-12T03:47:22Z

    Added for Kafka integration with Carbon StreamSQL

    1. Pass the source table properties to streamReader.load()
    2. Do not pass a schema when calling sparkSession.readStream
    3. Remove the querySchema validation against the sink, since a dataFrame created from a Kafka source does not carry the data schema (the data is written into the `value` column of the Kafka record)
    4. At writeStream, extract the dataframe that contains the actual data schema from the Kafka source

----
---
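For reviewers unfamiliar with the Kafka source in Spark Structured Streaming, the four points in the description can be sketched roughly as below. This is an illustrative sketch of the mechanism, not the PR's actual diff; the `spark` session, topic name, bootstrap servers, and checkpoint path are placeholder assumptions, and the real change wires the options from the source table properties instead of hard-coding them.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate() // placeholder session

// (1) + (2): pass the source table properties through as reader options and
// do NOT call .schema(...) -- a Kafka source always yields the fixed Kafka
// columns (key, value, topic, partition, offset, timestamp, ...), so a
// user-supplied schema is rejected.
val kafkaDF = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // from source table properties (assumed value)
  .option("subscribe", "carbon_topic")                 // from source table properties (assumed value)
  .load()

// (3) + (4): the actual row data is serialized inside the binary `value`
// column, so the query schema cannot be validated against the sink up front;
// the payload is extracted (here as CSV text, hence the CSV-parser note in
// the checklist) before handing the stream to writeStream.
val dataDF = kafkaDF.selectExpr("CAST(value AS STRING) AS value")

dataDF.writeStream
  .format("carbondata")                      // CarbonData streaming sink
  .option("checkpointLocation", "/tmp/cp")   // illustrative path
  .start()
```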