Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2273#discussion_r187116998

    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -151,6 +154,33 @@ public CarbonTable getOrCreateCarbonTable(Configuration configuration) throws IO
           SegmentStatusManager segmentStatusManager = new SegmentStatusManager(identifier);
           SegmentStatusManager.ValidAndInvalidSegmentsInfo segments = segmentStatusManager
               .getValidAndInvalidSegments(loadMetadataDetails, this.readCommittedScope);
    +
    +      // For NonTransactional table, compare the schema of all index files with inferred schema.
    +      // If there is a mismatch throw exception. As all files must be of same schema.
    +      if (!carbonTable.getTableInfo().isTransactionalTable()) {
    +        SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
    +        for (Segment segment : segments.getValidSegments()) {
    +          Map<String, String> indexFiles = segment.getCommittedIndexFile();
    +          for (Map.Entry<String, String> indexFileEntry : indexFiles.entrySet()) {
    +            Path indexFile = new Path(indexFileEntry.getKey());
    +            org.apache.carbondata.format.TableInfo tableInfo = CarbonUtil.inferSchemaFromIndexFile(
    +                indexFile.toString(), carbonTable.getTableName());
    +            TableInfo wrapperTableInfo = schemaConverter.fromExternalToWrapperTableInfo(
    +                tableInfo, identifier.getDatabaseName(),
    +                identifier.getTableName(),
    +                identifier.getTablePath());
    +            List<ColumnSchema> indexFileColumnList =
    +                wrapperTableInfo.getFactTable().getListOfColumns();
    +            List<ColumnSchema> tableColumnList =
    +                carbonTable.getTableInfo().getFactTable().getListOfColumns();
    +            if (!compareColumnSchemaList(indexFileColumnList, tableColumnList)) {
    +              throw new IOException("All the files schema doesn't match. "
    --- End diff --

@kunal642 @sounakr I agree with @gvramana: skipping a data file is not correct, because the query would then miss some records, which is not acceptable. Blocking the user at write time is not possible either.
I think throwing an exception is the correct behaviour. @ajantha-bhat Can you please check how Parquet works in a similar scenario?
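
To make the fail-fast option concrete, here is a minimal, self-contained sketch of the behaviour being argued for. It is not the code in this PR: `SchemaCheckSketch` and `validateAll` are hypothetical names, and columns are simplified to plain `"name:TYPE"` strings rather than `ColumnSchema` objects. The comparison uses `List.equals`, which is order-sensitive; whether `compareColumnSchemaList` behaves the same way is an assumption.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the check in the diff; not CarbonData API.
class SchemaCheckSketch {

  // Throws on the first index file whose columns differ from the table schema,
  // instead of silently skipping that file (which would drop its records).
  static void validateAll(List<String> tableColumns,
      Map<String, List<String>> columnsByIndexFile) throws IOException {
    for (Map.Entry<String, List<String>> entry : columnsByIndexFile.entrySet()) {
      // List.equals compares element by element, in order.
      if (!tableColumns.equals(entry.getValue())) {
        throw new IOException("Schema mismatch in index file " + entry.getKey()
            + ": expected " + tableColumns + " but found " + entry.getValue());
      }
    }
  }

  public static void main(String[] args) {
    List<String> tableColumns = Arrays.asList("id:INT", "name:STRING");

    // Illustrative file names only.
    Map<String, List<String>> files = new LinkedHashMap<>();
    files.put("part-0.carbonindex", Arrays.asList("id:INT", "name:STRING"));
    files.put("part-1.carbonindex", Arrays.asList("id:INT", "name:STRING", "age:INT"));

    try {
      validateAll(tableColumns, files);
      System.out.println("all index files match the table schema");
    } catch (IOException e) {
      // The read fails as a whole; no partial result is returned.
      System.out.println("read rejected: " + e.getMessage());
    }
  }
}
```

With this shape the reader either sees every record or gets an explicit error naming the offending file; it never gets a result that is quietly incomplete.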
---