Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151827453
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -687,16 +689,17 @@ protected Expression getFilterPredicates(Configuration configuration) {
         // get tokens for all the required FileSystem for table path
         TokenCache.obtainTokensForNamenodes(job.getCredentials(),
            new Path[] { new Path(absoluteTableIdentifier.getTablePath()) }, job.getConfiguration());
    -
    -    TableDataMap blockletMap = DataMapStoreManager.getInstance()
    -        .getDataMap(absoluteTableIdentifier, BlockletDataMap.NAME,
    -            BlockletDataMapFactory.class.getName());
    +    boolean distributedCG = Boolean.parseBoolean(CarbonProperties.getInstance()
    +        .getProperty(CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP,
    +            CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP_DEFAULT));
    +    TableDataMap blockletMap =
    +        DataMapStoreManager.getInstance().chooseDataMap(absoluteTableIdentifier);
         DataMapJob dataMapJob = getDataMapJob(job.getConfiguration());
         List<ExtendedBlocklet> prunedBlocklets;
    -    if (dataMapJob != null) {
    +    if (distributedCG || blockletMap.getDataMapFactory().getDataMapType() == DataMapType.FG) {
    --- End diff --
    
    It seems distributedCG and FG behave the same, right?
    Earlier I thought the FG datamap would be called in ScanRDD.compute, but it seems not?
    If we collect all the pruned blocklets in the driver, will that be too many for the driver?

