[ https://issues.apache.org/jira/browse/CARBONDATA-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213673#comment-17213673 ]
Prasanna Ravichandran commented on CARBONDATA-3807:
---------------------------------------------------

Model plan with bloom details: (Could not attach the screenshot)

== CarbonData Profiler ==
Table Scan on uniqdata
 - total: 2 blocks, 2 blocklets
 - filter: (cust_name <> null and cust_name = CUST_NAME_00000)
 - pruned by Main Index
    - skipped: 0 blocks, 0 blocklets
 *- pruned by CG Index*
    *- name: datamapuniq_b1*
    *- provider: bloomfilter*
    - skipped: 0 blocks, 0 blocklets

== Physical Plan ==
AdaptiveSparkPlan(isFinalPlan=false)
+- HashAggregate(keys=[], functions=[count(1)])
   +- Exchange SinglePartition, true, [id=#129]
      +- HashAggregate(keys=[], functions=[partial_count(1)])
         +- Project
            +- Scan carbondata default.uniqdata[] PushedFilters: [IsNotNull(cust_name), EqualTo(cust_name,CUST_NAME_00000)], ReadSchema: struct<cust_name:string>

> Filter queries and projection queries with bloom columns are not hitting the
> bloom datamap.
> -------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-3807
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3807
>             Project: CarbonData
>          Issue Type: Bug
>         Environment: Ant cluster - opensource
>            Reporter: Prasanna Ravichandran
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: bloom-filtercolumn-plan.png, bloom-show index.png
>
>
> Filter queries and projection queries with bloom columns are not hitting the
> bloom datamap.
> Bloom datamap is unused as per plan, even though created.
> Test queries:
> drop table if exists uniqdata;
> CREATE TABLE uniqdata (cust_id int,cust_name String,active_emui_version
> string, dob timestamp, doj timestamp, bigint_column1 bigint,bigint_column2
> bigint,decimal_column1 decimal(30,10), decimal_column2
> decimal(36,36),double_column1 double, double_column2 double,integer_column1
> int) stored as carbondata;
> load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into
> table uniqdata
> options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force');
> create datamap datamapuniq_b1 on table uniqdata(cust_name) as 'bloomfilter'
> PROPERTIES ('BLOOM_SIZE'='640000', 'BLOOM_FPP'='0.00001');
> show indexes on uniqdata;
> explain select count(*) from uniqdata where cust_name="CUST_NAME_00000";
> --not hitting;
> explain select cust_name from uniqdata; --not hitting;
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
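
Since the screenshot could not be attached, the check "did the query hit the bloom datamap?" can also be done on the EXPLAIN text itself. The following is a minimal sketch, assuming the profiler section is laid out like the plan quoted in the comment above (a "pruned by CG Index" entry followed by "name:" and "provider:" items). The helper names `cg_indexes_used` and `bloom_index_hit` are hypothetical and are not part of CarbonData.

```python
from typing import Dict, List

# Profiler text copied from the comment above (Jira bold markers kept on purpose).
SAMPLE_PLAN = """\
== CarbonData Profiler ==
Table Scan on uniqdata
 - total: 2 blocks, 2 blocklets
 - filter: (cust_name <> null and cust_name = CUST_NAME_00000)
 - pruned by Main Index
    - skipped: 0 blocks, 0 blocklets
 *- pruned by CG Index*
    *- name: datamapuniq_b1*
    *- provider: bloomfilter*
    - skipped: 0 blocks, 0 blocklets
== Physical Plan ==
AdaptiveSparkPlan(isFinalPlan=false)
+- HashAggregate(keys=[], functions=[count(1)])
"""


def cg_indexes_used(profiler_text: str) -> List[Dict[str, str]]:
    """Collect the key/value items listed under each 'pruned by CG Index' entry.

    Assumption: the layout matches the plan quoted in this thread.
    """
    indexes: List[Dict[str, str]] = []
    in_cg = False
    for raw in profiler_text.splitlines():
        # Drop indentation and the '*' bold markers that Jira adds.
        line = raw.strip().strip("*").strip()
        if line.startswith("=="):
            in_cg = False  # a new section (e.g. Physical Plan) ends the entry
            continue
        if not line.startswith("-"):
            continue
        item = line.lstrip("-").strip()
        if item.startswith("pruned by"):
            in_cg = item == "pruned by CG Index"
            if in_cg:
                indexes.append({})
            continue
        if in_cg and ":" in item:
            key, value = item.split(":", 1)
            indexes[-1][key.strip()] = value.strip()
    return indexes


def bloom_index_hit(profiler_text: str, index_name: str) -> bool:
    """True if the named index appears as a bloomfilter CG index in the plan."""
    return any(
        idx.get("name") == index_name and idx.get("provider") == "bloomfilter"
        for idx in cg_indexes_used(profiler_text)
    )


if __name__ == "__main__":
    print(bloom_index_hit(SAMPLE_PLAN, "datamapuniq_b1"))
```

Run against the text returned by the `explain select count(*) ...` query in the test steps, `bloom_index_hit(plan_text, "datamapuniq_b1")` would indicate whether the datamap shows up in the pruning section. It says nothing about how many blocklets were actually skipped.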