GitHub user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2583#discussion_r207097923

    --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java ---
    @@ -2651,8 +2651,17 @@ public static int isFilterPresent(byte[][] filterValues,
           carbonIndexSize = getCarbonIndexSize(fileStore, locationMap);
           for (Map.Entry<String, List<String>> entry : indexFilesMap.entrySet()) {
             // get the size of carbondata files
    +        String tempBlockFilePath = null;
             for (String blockFile : entry.getValue()) {
    -          carbonDataSize += FileFactory.getCarbonFile(blockFile).getSize();
    +          // the indexFileMap contains all the blocklets and index file mapping. For example, if one
    +          // block contains 3 blocklets, then entry.getValue() will list all the blocklets of all
    +          // the block present in it. Since all the three blocklets will have the same block path,
    +          // so just get the size of one block path for exact data size and avoid wrong datasize
    +          // calculation.
    +          if (!blockFile.equals(tempBlockFilePath)) {
    --- End diff --

    I feel this fix is not required. Please check PR https://github.com/apache/carbondata/pull/2596 to avoid duplicates from indexfileMap.
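
    For illustration only: the double counting discussed above (one block path listed once per blocklet) could also be avoided by remembering which block paths have already been counted. The sketch below is not the change made in PR 2596, whose contents are not shown in this thread. The class `BlockSizeSketch`, the method `distinctBlockSize`, and the `sizeOf` callback are hypothetical names; `sizeOf` stands in for the `FileFactory.getCarbonFile(blockFile).getSize()` call from the diff, and the sketch assumes each block path should be counted once across all index files.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.ToLongFunction;

// Hypothetical helper, not part of CarbonData.
class BlockSizeSketch {

  /**
   * Sums the size of every distinct block path found in indexFilesMap.
   *
   * @param indexFilesMap index file path -> block file paths, where a block
   *                      path may repeat once per blocklet
   * @param sizeOf        assumed size lookup for a block path (stand-in for
   *                      FileFactory.getCarbonFile(path).getSize())
   */
  static long distinctBlockSize(Map<String, List<String>> indexFilesMap,
      ToLongFunction<String> sizeOf) {
    Set<String> seenBlockPaths = new HashSet<>();
    long carbonDataSize = 0L;
    for (Map.Entry<String, List<String>> entry : indexFilesMap.entrySet()) {
      for (String blockFile : entry.getValue()) {
        // Set.add returns false when the path was already counted, so
        // repeated entries for the same block are skipped even when they
        // are not adjacent in the list.
        if (seenBlockPaths.add(blockFile)) {
          carbonDataSize += sizeOf.applyAsLong(blockFile);
        }
      }
    }
    return carbonDataSize;
  }
}
```

    Unlike comparing each path with the previous one (`tempBlockFilePath` in the diff), a set does not rely on duplicate paths being adjacent in `entry.getValue()`, at the cost of holding the distinct paths in memory.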
---