[ https://issues.apache.org/jira/browse/CARBONDATA-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384056#comment-17384056 ]

Indhumathi commented on CARBONDATA-4207:
----------------------------------------

Hi Suyash,

Can you provide the CREATE MATERIALIZED VIEW SQL you used, so that we can 
replicate the issue?
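For reference, a timeseries MV of the kind you describe would look something 
like the sketch below, run from the same spark-shell. The MV name and the 
aggregated columns are placeholders inferred from your insert statements, not 
your actual definition, and the MATERIALIZED VIEW syntax is the one from the 
current MV guide:

spark.sql("""
  CREATE MATERIALIZED VIEW flow_ts_hourly_mv AS
  SELECT timeseries(start_time, 'hour') AS start_hour,
         sum(input_byt)  AS total_input_byt,
         sum(output_byt) AS total_output_byt
  FROM Flow_TS_2day_stats_04062021
  GROUP BY timeseries(start_time, 'hour')
""")

As you note, an MV without AVG should normally be refreshed incrementally, so 
sharing the exact definition will help confirm which refresh mode was chosen 
for your MV.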

In the FULL REFRESH case, when a load (INSERT OVERWRITE) to the MV table is in 
progress and that load fails due to a system or application crash/failure, the 
MV will be left without data and will be disabled. You then have to sync the 
data again using the REFRESH MATERIALIZED VIEW command to enable it.
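A minimal sketch of that re-sync from the spark-shell, assuming the MV name is 
flow_ts_hourly_mv (a placeholder; substitute the actual MV name, e.g. the one 
that appears in your audit log):

// check whether the MV was disabled after the failed load
spark.sql("SHOW MATERIALIZED VIEWS ON TABLE Flow_TS_2day_stats_04062021").show(false)

// rebuild the MV data from the main table and re-enable the MV
spark.sql("REFRESH MATERIALIZED VIEW flow_ts_hourly_mv")

Once the refresh succeeds, the MV is enabled again and can be used for query 
rewrite.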

Please also let me know the reason for the insertion failure.

> MV data getting lost
> --------------------
>
>                 Key: CARBONDATA-4207
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4207
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 2.0.1
>            Reporter: suyash yadav
>            Priority: Major
>             Fix For: 2.0.1
>
>
> Hi Team,
> We have observed one more issue. We created a table and a timeseries MV on 
> it, and had loaded almost 15 hours of data. While we were loading the 16th 
> hour of data, the load failed for some reason, and this left the MV empty: 
> our MV now has zero rows. Could you please let us know whether this is a bug 
> or whether this is how it is supposed to work? Our MV does not use any AVG 
> function, so loading to the MV should have been incremental, and in that 
> case the MV should not have been affected when the subsequent hour's load to 
> the main table failed. Please have a look into this issue and let us know 
> what information you need.
>  
> scala> spark.sql("insert into Flow_TS_2day_stats_04062021 select 
> start_time,end_time,source_ip_address,destintion_ip_address,appname,protocol_id,source_tos,src_as,dst_as,source_mask,destination_mask,dst_tos,input_pkt,output_pkt,input_byt,output_byt,source_port,destination_port,in_interface,out_interface
>  from Flow_TS_1day_stats_24052021  where start_time>='2021-03-04 07:00:00' 
> and start_time< '2021-03-04 09:00:00'").show()
>  
> scala> spark.sql("insert into Flow_TS_2day_stats_04062021 select 
> start_time,end_time,source_ip_address,destintion_ip_address,appname,protocol_id,source_tos,src_as,dst_as,source_mask,destination_mask,dst_tos,input_pkt,output_pkt,input_byt,output_byt,source_port,destination_port,in_interface,out_interface
>  from Flow_TS_1day_stats_24052021  where start_time>='2021-03-04 15:00:00' 
> and start_time< '2021-03-04 16:00:00'").show()
> 21/06/06 14:25:33 AUDIT audit: {"time":"June 6, 2021 2:25:33 PM IST","username":"root","opName":"INSERT INTO","opId":"4069819623887063","opStatus":"START"}
> 21/06/06 14:44:14 AUDIT audit: {"time":"June 6, 2021 2:44:14 PM IST","username":"root","opName":"INSERT INTO","opId":"4070940294400824","opStatus":"START"}
> 21/06/06 16:06:05 AUDIT audit: {"time":"June 6, 2021 4:06:05 PM IST","username":"root","opName":"INSERT INTO","opId":"4070940294400824","opStatus":"SUCCESS","opTime":"4911240 ms","table":"default.Interface_Level_Agg_10min_MV_04062021","extraInfo":{"SegmentId":"6","DataSize":"4.52GB","IndexSize":"108.27KB"}}
> 21/06/06 16:06:09 AUDIT audit: {"time":"June 6, 2021 4:06:09 PM IST","username":"root","opName":"INSERT INTO","opId":"4069819623887063","opStatus":"SUCCESS","opTime":"6036073 ms","table":"default.flow_ts_2day_stats_04062021","extraInfo":{"SegmentId":"6","DataSize":"12.37GB","IndexSize":"262.43KB"}}
> [^Stack_Trace]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
