suyash yadav created CARBONDATA-4085:
----------------------------------------

             Summary: How to improve query execution time further
                 Key: CARBONDATA-4085
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4085
             Project: CarbonData
          Issue Type: Improvement
          Components: sql
    Affects Versions: 2.0.1
            Reporter: suyash yadav
             Fix For: 2.0.1


Hi Team,

We are doing a POC where we would like our query execution to be faster, 
mostly in the range of 3 to 4 seconds.

We have read the CarbonData documentation, which claims that CarbonData can 
help scan petabytes of data and present results in 3 to 4 seconds. That 
does not seem to be the case as per our observation.

Our table size is 1.6 billion and the query fetches only 4K records, but it 
still takes around 22 to 25 seconds to execute.

Below is the query that we are running:

==============================

spark.sql("""select ts,resource,metric,value
from fact_timestamp_global
left join tags_10_days_test on fact_timestamp_global.tags_id = tags_10_days_test.id
where metric in ('Outbound Utilization (percent)','Inbound Utilization (percent)')
and resource='10.212.7.98_if:<0001>'
and ts>='2020-09-28 00:00:00' and ts<='2020-09-28 23:55:55'""").show(false)

=================================
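
In case it helps with diagnosis, the timing and the query plan for the query above can be captured in spark-shell roughly as follows. This is only a sketch: it assumes the `spark` session that spark-shell provides, and the `query` and `df` names are introduced here purely for illustration.

```scala
// Sketch only: assumes the `spark` SparkSession provided by spark-shell.
// `query` and `df` are illustrative names, not part of our actual job.
val query =
  """select ts, resource, metric, value
    |from fact_timestamp_global
    |left join tags_10_days_test
    |  on fact_timestamp_global.tags_id = tags_10_days_test.id
    |where metric in ('Outbound Utilization (percent)', 'Inbound Utilization (percent)')
    |  and resource = '10.212.7.98_if:<0001>'
    |  and ts >= '2020-09-28 00:00:00'
    |  and ts <= '2020-09-28 23:55:55'""".stripMargin

val df = spark.sql(query)

// Print the parsed, analyzed, optimized and physical plans,
// so the pushed-down filters and the join strategy are visible.
df.explain(true)

// Execute the query and print the wall-clock time taken.
spark.time(df.show(false))
```

The extended plan output should show which filters reach the scan of each table, and `spark.time` reports the elapsed time for the action it wraps.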



The definition of fact_timestamp_global is as follows:

========================

spark.sql("""create table Fact_timestamp_GLOBAL(ts timestamp,metric string,tags_id string,value double)
partitioned by (ts2 timestamp)
stored as carbondata
TBLPROPERTIES ('SORT_COLUMNS'='ts,metric','SORT_SCOPE'='GLOBAL_SORT')""").show()

==========================

The definition of tags_10_days_test is as follows:

====================

spark.sql("""create table tags_10_days_test(id string,resource string)
stored as carbondata
TBLPROPERTIES('SORT_COLUMNS'='id,resource')""").show()

======================

 

Kindly go through the above points and help us improve the query performance further.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
