suyash yadav created CARBONDATA-4079:
----------------------------------------

             Summary: Queries with Date range are taking time
                 Key: CARBONDATA-4079
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4079
             Project: CarbonData
          Issue Type: Improvement
          Components: data-query
    Affects Versions: 2.1.0
            Reporter: suyash yadav


Hi Team,

We are doing a POC to understand how we can improve the performance of a query 
run against a table created in Apache CarbonData.

Below is the sample query:

 

spark.sql("""
  select ts, resource, metric, value
  from fact_timestamp_global
  left join tags_10_Days_test
    on fact_timestamp_global.tags_id = tags_10_Days_test.id
  where metric in ('Outbound Utilization (percent)', 'Inbound Utilization (percent)')
    and resource = '10.212.7.98_if:<0001>'
    and ts between '2020-09-21 00:00:00' and '2020-09-21 12:55:55'
  group by ts, resource, metric, value
""").show(10000, false)

As you can see, the query contains a date range filter. We have noticed that 
with this filter the query takes around 15 seconds, which does not meet our 
requirement: we need to bring the query execution time down to 3 to 4 seconds.

Could you please review the query above and suggest a better way of framing 
it, especially the date range filter, so that we can reach the desired query 
execution time?

 

Please let me know if you need more details.

 

Regards

Suyash Yadav



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
