[jira] [Commented] (CARBONDATA-4132) Number of records not matching in MVs

2021-07-14 Thread Sushant Sammanwar (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380687#comment-17380687
 ] 

Sushant Sammanwar commented on CARBONDATA-4132:
---

Thanks [~indhumuthumurugesh] for your response. 

Using EXPLAIN, we have verified that the query hits the MV as long as no 
condition is applied to the timeseries column.
If the number of rows in the MV can exceed what is expected (1/4th of the 
parent table for an hourly MV when data is inserted at 15-minute granularity), 
then the MV does not serve its purpose.
The MV consumes the same storage as the raw table, and query time suffers.
Although the MV holds partially aggregated data, it is sometimes found to have 
the same number of rows as the parent table. 
In such cases, query time against the MV is the same as against the raw table.
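For reference, a minimal sketch of the EXPLAIN check described above, reusing 
the table and hourly MV names from this issue; the exact plan text varies by 
CarbonData version, but when the query is rewritten it scans the MV table 
instead of Flow_Raw_TS:

// Sketch only: assumes the Flow_Raw_TS table and the hourly MV from this
// issue already exist. The simplified projection here may need the MV's full
// group-by column list for the plan to match.
spark.sql(
  """EXPLAIN SELECT timeseries(end_ms,'hour') AS end_ms, app_name,
    |sum(in_octets) AS octects
    |FROM Flow_Raw_TS
    |GROUP BY timeseries(end_ms,'hour'), app_name""".stripMargin
).show(false)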

> Number of records not matching in MVs
> 
>
> Key: CARBONDATA-4132
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4132
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.1
> Environment: Apache carbondata 2.0.1
>Reporter: suyash yadav
>Priority: Major
> Fix For: 2.0.1
>
>
> Hi Team, 
> We are working on a POC where we need to insert 300k records/second into a 
> table on which we have already created timeseries MVs with minute, hour, and 
> day granularity.
>  
> As per our understanding, the minute-based MV should contain 300K records 
> until the next minute's data is inserted. Likewise, the hour- and day-based 
> MVs should contain 300K records until the next hour's and the next day's 
> data arrive, respectively.
>  
> But the count of records in the MV does not match our expectation; it is 
> always higher than expected.
> The strange thing is, when we drop the MV and recreate it after inserting 
> the data into the table, the record count comes out correct. So it is clear 
> there is no problem with the MV definition or the data.
>  
> Kindly help us resolve this issue on priority. Please find more details 
> below:
> Table definition:
> ===
> spark.sql("create table Flow_Raw_TS(export_ms bigint,exporter_ip 
> string,pkt_seq_num bigint,flow_seq_num int,src_ip string,dst_ip 
> string,protocol_id smallint,src_tos smallint,dst_tos smallint,raw_src_tos 
> smallint,raw_dst_tos smallint,src_mask smallint,dst_mask smallint,tcp_bits 
> int,src_port int,in_if_id bigint,in_if_entity_id bigint,in_if_enabled 
> boolean,dst_port int,out_if_id bigint,out_if_entity_id bigint,out_if_enabled 
> boolean,direction smallint,in_octets bigint,out_octets bigint,in_packets 
> bigint,out_packets bigint,next_hop_ip string,bgp_src_as_num 
> bigint,bgp_dst_as_num bigint,bgp_next_hop_ip string,end_ms timestamp,start_ms 
> timestamp,app_id string,app_name string,src_ip_group string,dst_ip_group 
> string,policy_qos_classification_hierarchy string,policy_qos_queue_id 
> bigint,worker_id int,day bigint ) stored as carbondata TBLPROPERTIES 
> ('local_dictionary_enable'='false')")
> MV definition:
>  
> ==
> +*Minute based*+
> spark.sql("create materialized view Flow_Raw_TS_agg_001_min as select 
> timeseries(end_ms,'minute') as 
> end_ms,src_ip,dst_ip,app_name,in_if_id,src_tos,src_ip_group,dst_ip_group,protocol_id,bgp_src_as_num,
>  bgp_dst_as_num,policy_qos_classification_hierarchy, 
> policy_qos_queue_id,sum(in_octets) as octects, sum(in_packets) as packets, 
> sum(out_packets) as out_packets, sum(out_octets) as out_octects FROM 
> Flow_Raw_TS group by 
> timeseries(end_ms,'minute'),src_ip,dst_ip,app_name,in_if_id,src_tos,src_ip_group,
>  
> dst_ip_group,protocol_id,bgp_src_as_num,bgp_dst_as_num,policy_qos_classification_hierarchy,
>  policy_qos_queue_id").show()
> +*Hour Based*+
> val startTime = System.nanoTime
> spark.sql("create materialized view Flow_Raw_TS_agg_001_hour as select 
> timeseries(end_ms,'hour') as end_ms,app_name,sum(in_octets) as octects, 
> sum(in_packets) as packets, sum(out_packets) as out_packets, sum(out_octets) 
> as out_octects, in_if_id,src_tos,src_ip_group, 
> dst_ip_group,protocol_id,src_ip, dst_ip,bgp_src_as_num, 
> bgp_dst_as_num,policy_qos_classification_hierarchy, policy_qos_queue_id FROM 
> Flow_Raw_TS group by 
> timeseries(end_ms,'hour'),in_if_id,app_name,src_tos,src_ip_group,dst_ip_group,protocol_id,src_ip,
>  dst_ip,bgp_src_as_num,bgp_dst_as_num,policy_qos_classification_hierarchy, 
> policy_qos_queue_id").show()
> val endTime = System.nanoTime
> val elapsedSeconds = (endTime - startTime) / 1e9d



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4239) Carbondata 2.1.1 MV : Incremental refresh : Does not aggregate data correctly

2021-07-14 Thread Sushant Sammanwar (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380685#comment-17380685
 ] 

Sushant Sammanwar commented on CARBONDATA-4239:
---

Thanks [~Indhumathi27] for your response.

If the MV is expected to write data to a new segment, then what benefit is the 
MV giving here?
I have data being inserted every 15 minutes, and for the hourly MV all 4 rows 
are present in the parent table as well as in the MV.
I do not get any benefit in terms of storage.
As far as query time is concerned, since the number of rows in the MV is the 
same, a query will take the same time to run against the MV as against the 
table.

> Carbondata 2.1.1 MV : Incremental refresh : Does not aggregate data correctly 
> -
>
> Key: CARBONDATA-4239
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4239
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, data-load
>Affects Versions: 2.1.1
> Environment: RHEL  spark-2.4.5-bin-hadoop2.7 for carbon 2.1.1 
>Reporter: Sushant Sammanwar
>Priority: Major
>  Labels: Materialistic_Views, materializedviews, refreshnodes
>
> Hi Team ,
> We are doing a POC with CarbonData using MVs.
> Our MV does not contain the AVG function because we wanted to use the 
> incremental refresh feature.
> But with incremental refresh, we noticed that the MV does not aggregate 
> values correctly.
> If a row is inserted, it creates another row in the MV instead of adding the 
> incremental value.
> As a result, the number of rows in the MV is almost the same as in the raw 
> table.
> This does not happen with a full-refresh MV. 
> Below is the data in the MV with 3 rows:
> Below is the data in MV with 3 rows :
> scala> carbon.sql("select * from fact_365_1_eutrancell_21_30_minute").show()
> ++---+---+--+-+-++
> |fact_365_1_eutrancell_21_tags_id|fact_365_1_eutrancell_21_metric| ts| 
> sum_value|min_value|max_value|fact_365_1_eutrancell_21_ts2|
> ++---+---+--+-+-++
> | ff6cb0f7-fba0-413...| eUtranCell.HHO.X2...|2020-09-25 
> 06:30:00|5412.68105| 31.345| 4578.112| 2020-09-25 05:30:00|
> | ff6cb0f7-fba0-413...| eUtranCell.HHO.X2...|2020-09-25 05:30:00| 1176.7035| 
> 392.2345| 392.2345| 2020-09-25 05:30:00|
> | ff6cb0f7-fba0-413...| eUtranCell.HHO.X2...|2020-09-25 06:00:00| 58.112| 
> 58.112| 58.112| 2020-09-25 05:30:00|
> ++---+---+--+-+-++
> Below, I am inserting data for the 6th hour; it should add the incremental 
> values to the 6th-hour row of the MV. 
> Note the data being inserted: the columns which are part of the group-by 
> clause have the same values as the existing data.
> scala> carbon.sql("insert into fact_365_1_eutrancell_21 values ('2020-09-25 
> 06:05:00','eUtranCell.HHO.X2.InterFreq.PrepAttOut','ff6cb0f7-fba0-4134-81ee-55e820574627',118.112,'2020-09-25
>  05:30:00')").show()
> 21/06/28 16:01:31 AUDIT audit: \{"time":"June 28, 2021 4:01:31 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"7332282307468267","opStatus":"START"}
> 21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:33 AUDIT audit: \{"time":"June 28, 2021 4:01:33 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"7332284066443156","opStatus":"START"}
> [Stage 40:=>(199 + 1) / 
> 200]21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row 
> batch one more time.
> 21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:44 AUDIT audit: \{"time":"June 28, 2021 4:01:44 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"7332284066443156","opStatus":"SUCCESS","opTime":"11343 
> ms","table":"default.fact_365_1_eutrancell_21_30_minute","extraInfo":{}}
> 21/06/28 16:01:44 AUDIT audit: \{"time":"June 28, 2021 4:01:44 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"7332282307468267","opStatus":"SUCCESS","opTime":"13137 
> ms","table":"default.fact_365_1_eutrancell_21","extraInfo":{}}
> +--+
> |Segment ID|
> +--+
> | 8|
> +--+
> Below we can see it has added another row for 2020-09-25 06:00:00.
> Note: all columns which are part of the group-by clause have the same values.

[jira] [Commented] (CARBONDATA-4132) Number of records not matching in MVs

2021-07-14 Thread Indhumathi (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380491#comment-17380491
 ] 

Indhumathi commented on CARBONDATA-4132:


Hi Suyash,

I think this is not an issue. The data that is stored in the MV is 
partially aggregated, because of the incremental data-loading concept.

Doing a select */count on the mv_table will give partially aggregated results. 
If you want to check data correctness, fire the query on which you created the 
MV; it will do the final aggregation on the partially aggregated data 
stored in the MV.

That should give you correct results. It is recommended to check the results 
of the MV query to verify data correctness.

To check whether a query is hitting the MV table or not, you can run the 
EXPLAIN command with the query and inspect the plan.

Refer to [https://github.com/apache/carbondata/blob/master/docs/mv-guide.md] 
for more info.
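As a rough sketch of that check, reusing the table and hourly MV names from 
the reporter's example (names assumed, not verified here):

// Counting the MV table directly reflects the partially aggregated rows
// stored across its segments:
spark.sql("SELECT count(*) FROM Flow_Raw_TS_agg_001_hour").show()

// Firing the query the MV was created from performs the final aggregation
// over those partial rows, so this is the count to validate against:
spark.sql(
  """SELECT timeseries(end_ms,'hour') AS end_ms, app_name,
    |sum(in_octets) AS octects
    |FROM Flow_Raw_TS
    |GROUP BY timeseries(end_ms,'hour'), app_name""".stripMargin
).count()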

> Number of records not matching in MVs
> 
>
> Key: CARBONDATA-4132
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4132
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.1
> Environment: Apache carbondata 2.0.1
>Reporter: suyash yadav
>Priority: Major
> Fix For: 2.0.1
>
>
> Hi Team, 
> We are working on a POC where we need to insert 300k records/second into a 
> table on which we have already created timeseries MVs with minute, hour, and 
> day granularity.
>  
> As per our understanding, the minute-based MV should contain 300K records 
> until the next minute's data is inserted. Likewise, the hour- and day-based 
> MVs should contain 300K records until the next hour's and the next day's 
> data arrive, respectively.
>  
> But the count of records in the MV does not match our expectation; it is 
> always higher than expected.
> The strange thing is, when we drop the MV and recreate it after inserting 
> the data into the table, the record count comes out correct. So it is clear 
> there is no problem with the MV definition or the data.
>  
> Kindly help us resolve this issue on priority. Please find more details 
> below:
> Table definition:
> ===
> spark.sql("create table Flow_Raw_TS(export_ms bigint,exporter_ip 
> string,pkt_seq_num bigint,flow_seq_num int,src_ip string,dst_ip 
> string,protocol_id smallint,src_tos smallint,dst_tos smallint,raw_src_tos 
> smallint,raw_dst_tos smallint,src_mask smallint,dst_mask smallint,tcp_bits 
> int,src_port int,in_if_id bigint,in_if_entity_id bigint,in_if_enabled 
> boolean,dst_port int,out_if_id bigint,out_if_entity_id bigint,out_if_enabled 
> boolean,direction smallint,in_octets bigint,out_octets bigint,in_packets 
> bigint,out_packets bigint,next_hop_ip string,bgp_src_as_num 
> bigint,bgp_dst_as_num bigint,bgp_next_hop_ip string,end_ms timestamp,start_ms 
> timestamp,app_id string,app_name string,src_ip_group string,dst_ip_group 
> string,policy_qos_classification_hierarchy string,policy_qos_queue_id 
> bigint,worker_id int,day bigint ) stored as carbondata TBLPROPERTIES 
> ('local_dictionary_enable'='false')")
> MV definition:
>  
> ==
> +*Minute based*+
> spark.sql("create materialized view Flow_Raw_TS_agg_001_min as select 
> timeseries(end_ms,'minute') as 
> end_ms,src_ip,dst_ip,app_name,in_if_id,src_tos,src_ip_group,dst_ip_group,protocol_id,bgp_src_as_num,
>  bgp_dst_as_num,policy_qos_classification_hierarchy, 
> policy_qos_queue_id,sum(in_octets) as octects, sum(in_packets) as packets, 
> sum(out_packets) as out_packets, sum(out_octets) as out_octects FROM 
> Flow_Raw_TS group by 
> timeseries(end_ms,'minute'),src_ip,dst_ip,app_name,in_if_id,src_tos,src_ip_group,
>  
> dst_ip_group,protocol_id,bgp_src_as_num,bgp_dst_as_num,policy_qos_classification_hierarchy,
>  policy_qos_queue_id").show()
> +*Hour Based*+
> val startTime = System.nanoTime
> spark.sql("create materialized view Flow_Raw_TS_agg_001_hour as select 
> timeseries(end_ms,'hour') as end_ms,app_name,sum(in_octets) as octects, 
> sum(in_packets) as packets, sum(out_packets) as out_packets, sum(out_octets) 
> as out_octects, in_if_id,src_tos,src_ip_group, 
> dst_ip_group,protocol_id,src_ip, dst_ip,bgp_src_as_num, 
> bgp_dst_as_num,policy_qos_classification_hierarchy, policy_qos_queue_id FROM 
> Flow_Raw_TS group by 
> timeseries(end_ms,'hour'),in_if_id,app_name,src_tos,src_ip_group,dst_ip_group,protocol_id,src_ip,
>  dst_ip,bgp_src_as_num,bgp_dst_as_num,policy_qos_classification_hierarchy, 
> policy_qos_queue_id").show()
> val endTime = System.nanoTime
> val elapsedSeconds = (endTime - startTime) / 1e9d



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-4239) Carbondata 2.1.1 MV : Incremental refresh : Does not aggregate data correctly

2021-07-14 Thread Indhumathi (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Indhumathi closed CARBONDATA-4239.
--
Resolution: Won't Fix

> Carbondata 2.1.1 MV : Incremental refresh : Does not aggregate data correctly 
> -
>
> Key: CARBONDATA-4239
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4239
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, data-load
>Affects Versions: 2.1.1
> Environment: RHEL  spark-2.4.5-bin-hadoop2.7 for carbon 2.1.1 
>Reporter: Sushant Sammanwar
>Priority: Major
>  Labels: Materialistic_Views, materializedviews, refreshnodes
>
> Hi Team ,
> We are doing a POC with CarbonData using MVs.
> Our MV does not contain the AVG function because we wanted to use the 
> incremental refresh feature.
> But with incremental refresh, we noticed that the MV does not aggregate 
> values correctly.
> If a row is inserted, it creates another row in the MV instead of adding the 
> incremental value.
> As a result, the number of rows in the MV is almost the same as in the raw 
> table.
> This does not happen with a full-refresh MV. 
> Below is the data in the MV with 3 rows:
> Below is the data in MV with 3 rows :
> scala> carbon.sql("select * from fact_365_1_eutrancell_21_30_minute").show()
> ++---+---+--+-+-++
> |fact_365_1_eutrancell_21_tags_id|fact_365_1_eutrancell_21_metric| ts| 
> sum_value|min_value|max_value|fact_365_1_eutrancell_21_ts2|
> ++---+---+--+-+-++
> | ff6cb0f7-fba0-413...| eUtranCell.HHO.X2...|2020-09-25 
> 06:30:00|5412.68105| 31.345| 4578.112| 2020-09-25 05:30:00|
> | ff6cb0f7-fba0-413...| eUtranCell.HHO.X2...|2020-09-25 05:30:00| 1176.7035| 
> 392.2345| 392.2345| 2020-09-25 05:30:00|
> | ff6cb0f7-fba0-413...| eUtranCell.HHO.X2...|2020-09-25 06:00:00| 58.112| 
> 58.112| 58.112| 2020-09-25 05:30:00|
> ++---+---+--+-+-++
> Below, I am inserting data for the 6th hour; it should add the incremental 
> values to the 6th-hour row of the MV. 
> Note the data being inserted: the columns which are part of the group-by 
> clause have the same values as the existing data.
> scala> carbon.sql("insert into fact_365_1_eutrancell_21 values ('2020-09-25 
> 06:05:00','eUtranCell.HHO.X2.InterFreq.PrepAttOut','ff6cb0f7-fba0-4134-81ee-55e820574627',118.112,'2020-09-25
>  05:30:00')").show()
> 21/06/28 16:01:31 AUDIT audit: \{"time":"June 28, 2021 4:01:31 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"7332282307468267","opStatus":"START"}
> 21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:33 AUDIT audit: \{"time":"June 28, 2021 4:01:33 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"7332284066443156","opStatus":"START"}
> [Stage 40:=>(199 + 1) / 
> 200]21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row 
> batch one more time.
> 21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:44 AUDIT audit: \{"time":"June 28, 2021 4:01:44 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"7332284066443156","opStatus":"SUCCESS","opTime":"11343 
> ms","table":"default.fact_365_1_eutrancell_21_30_minute","extraInfo":{}}
> 21/06/28 16:01:44 AUDIT audit: \{"time":"June 28, 2021 4:01:44 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"7332282307468267","opStatus":"SUCCESS","opTime":"13137 
> ms","table":"default.fact_365_1_eutrancell_21","extraInfo":{}}
> +--+
> |Segment ID|
> +--+
> | 8|
> +--+
> Below we can see it has added another row for 2020-09-25 06:00:00.
> Note: all columns which are part of the group-by clause have the same values.
> This means there should have been a single row for 2020-09-25 06:00:00.
> scala> carbon.sql("select * from 
> fact_365_1_eutrancell_21_30_minute").show(1000,false)
> ++--+---+--+-+-++
> |fact_365_1_eutrancell_21_tags_id |fact_365_1_eutrancell_21_metric |ts 
> |sum_value 

[jira] [Commented] (CARBONDATA-4239) Carbondata 2.1.1 MV : Incremental refresh : Does not aggregate data correctly

2021-07-14 Thread Indhumathi Muthumurugesh (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380485#comment-17380485
 ] 

Indhumathi Muthumurugesh commented on CARBONDATA-4239:
--

Hi Suyash,

The incremental data-loading concept in MV will aggregate the new incoming 
data (new Load/Insert) and write it to a new segment. It will not append to an 
existing segment.

Full refresh mode will do aggregation on the table data (all segments), i.e., 
an insert-overwrite operation, whereas incremental refresh will create a new 
segment for the new incoming data.

So, in the INSERT case, the number of rows will be the same as in the parent 
table. And when you do select * from the mv_table, the data is partially 
aggregated.

When the query that you created the MV on is fired, it will do the final 
aggregation on this partially aggregated data and return the results.

So, in your case, this is not an issue. For the INSERT case, if you don't want 
to load to the MV for each row, you can create the MV "with deferred refresh" 
and refresh it when required.

Please have a look at the design document linked below for more understanding.

[https://docs.google.com/document/d/1AACOYmBpwwNdHjJLOub0utSc6JCBMZn8VL5CvZ9hygA/edit]
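For illustration, a minimal sketch of the deferred-refresh option described 
above; the MV name and the simplified column list here are hypothetical, and 
the syntax follows the CarbonData mv-guide:

// Sketch only: column names (ts, metric, value) are assumed, not the
// reporter's actual schema. With deferred refresh, inserts into the parent
// table do not trigger a per-insert MV load.
carbon.sql(
  """CREATE MATERIALIZED VIEW fact_mv_30_minute
    |WITH DEFERRED REFRESH
    |AS SELECT timeseries(ts,'thirty_minute') AS ts, metric,
    |sum(value) AS sum_value, min(value) AS min_value, max(value) AS max_value
    |FROM fact_365_1_eutrancell_21
    |GROUP BY timeseries(ts,'thirty_minute'), metric""".stripMargin).show()

// The MV is then brought up to date on demand:
carbon.sql("REFRESH MATERIALIZED VIEW fact_mv_30_minute").show()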

 

> Carbondata 2.1.1 MV : Incremental refresh : Does not aggregate data correctly 
> -
>
> Key: CARBONDATA-4239
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4239
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, data-load
>Affects Versions: 2.1.1
> Environment: RHEL  spark-2.4.5-bin-hadoop2.7 for carbon 2.1.1 
>Reporter: Sushant Sammanwar
>Priority: Major
>  Labels: Materialistic_Views, materializedviews, refreshnodes
>
> Hi Team ,
> We are doing a POC with CarbonData using MVs.
> Our MV does not contain the AVG function because we wanted to use the 
> incremental refresh feature.
> But with incremental refresh, we noticed that the MV does not aggregate 
> values correctly.
> If a row is inserted, it creates another row in the MV instead of adding the 
> incremental value.
> As a result, the number of rows in the MV is almost the same as in the raw 
> table.
> This does not happen with a full-refresh MV. 
> Below is the data in the MV with 3 rows:
> Below is the data in MV with 3 rows :
> scala> carbon.sql("select * from fact_365_1_eutrancell_21_30_minute").show()
> ++---+---+--+-+-++
> |fact_365_1_eutrancell_21_tags_id|fact_365_1_eutrancell_21_metric| ts| 
> sum_value|min_value|max_value|fact_365_1_eutrancell_21_ts2|
> ++---+---+--+-+-++
> | ff6cb0f7-fba0-413...| eUtranCell.HHO.X2...|2020-09-25 
> 06:30:00|5412.68105| 31.345| 4578.112| 2020-09-25 05:30:00|
> | ff6cb0f7-fba0-413...| eUtranCell.HHO.X2...|2020-09-25 05:30:00| 1176.7035| 
> 392.2345| 392.2345| 2020-09-25 05:30:00|
> | ff6cb0f7-fba0-413...| eUtranCell.HHO.X2...|2020-09-25 06:00:00| 58.112| 
> 58.112| 58.112| 2020-09-25 05:30:00|
> ++---+---+--+-+-++
> Below, I am inserting data for the 6th hour; it should add the incremental 
> values to the 6th-hour row of the MV. 
> Note the data being inserted: the columns which are part of the group-by 
> clause have the same values as the existing data.
> scala> carbon.sql("insert into fact_365_1_eutrancell_21 values ('2020-09-25 
> 06:05:00','eUtranCell.HHO.X2.InterFreq.PrepAttOut','ff6cb0f7-fba0-4134-81ee-55e820574627',118.112,'2020-09-25
>  05:30:00')").show()
> 21/06/28 16:01:31 AUDIT audit: \{"time":"June 28, 2021 4:01:31 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"7332282307468267","opStatus":"START"}
> 21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:33 AUDIT audit: \{"time":"June 28, 2021 4:01:33 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"7332284066443156","opStatus":"START"}
> [Stage 40:=>(199 + 1) / 
> 200]21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row 
> batch one more time.
> 21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
> 21/06/28 16:01:44 AUDIT audit: \{"time":"June 28, 2021 4:01:44 PM 
> IST","username":"root","opName":"INSERT 
>