[ 
https://issues.apache.org/jira/browse/SENTRY-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540106#comment-16540106
 ] 

kalyan kumar kalvagadda edited comment on SENTRY-2305 at 7/11/18 1:42 PM:
--------------------------------------------------------------------------

[~LinaAtAustin] Bad is a relative term. The change proposed in SENTRY-2306 will 
definitely reduce the snapshot size. The impact of this change depends on the 
kind of data. If none of the partitions are stored in default locations, there 
will be no difference. If there is a huge number of partitions and all of them 
are stored in default locations, it will make a significant difference. 
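
To illustrate why the effect depends on the data, here is a minimal sketch. It assumes (this is not confirmed in this thread) that the SENTRY-2306 change amounts to leaving out partition paths that sit under their table's location. The class and method names are hypothetical and do not come from the Sentry code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Sentry.
class SnapshotSizeSketch {

    /**
     * Returns only the partition locations that are NOT under the table
     * location, i.e. the ones that would still need their own snapshot entry
     * under the assumption stated above.
     */
    static List<String> pathsToStore(String tableLocation, List<String> partitionLocations) {
        // Normalize so that "/wh/t1" does not accidentally match "/wh/t10/...".
        String prefix = tableLocation.endsWith("/") ? tableLocation : tableLocation + "/";
        List<String> kept = new ArrayList<>();
        for (String location : partitionLocations) {
            if (!location.startsWith(prefix)) {
                kept.add(location);
            }
        }
        return kept;
    }
}
```

With every partition under the table location the returned list is empty (largest saving); with every partition in a custom location the list is unchanged (no saving).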

 

The time taken to persist the snapshot varies with the environment. I 
think we should be targeting improvements in several directions.


> Optimize time taken for persistence HMS snapshot 
> -------------------------------------------------
>
>                 Key: SENTRY-2305
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2305
>             Project: Sentry
>          Issue Type: Sub-task
>          Components: Sentry
>    Affects Versions: 2.1.0
>            Reporter: kalyan kumar kalvagadda
>            Assignee: kalyan kumar kalvagadda
>            Priority: Major
>
> There are a couple of options:
> # Break the total snapshot into batches and persist all of them in 
> parallel in different transactions. As Sentry uses the repeatable_read 
> isolation level, we should be able to have parallel writes on the same table. 
> This raises an issue if persisting any of the batches fails: the approach 
> needs additional logic to clean up the partially persisted snapshot. 
> I’m evaluating this option. 
> ** *Result:* Initial results are promising. Time to persist the snapshot came 
> down by 60%.
> # Try disabling the L1 cache while persisting the snapshot.
> # Try persisting the snapshot entries sequentially in separate transactions, 
> since transactions that commit a large amount of data might take longer 
> because they spend a lot of CPU cycles keeping the rollback log up to date.
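
The first option (parallel batches, each in its own transaction, with cleanup when any batch fails) could be sketched roughly as follows. This is only an illustration under stated assumptions: `BatchStore`, `persistBatch`, `deleteSnapshot`, and the batch size and thread count parameters are hypothetical names introduced here, not Sentry APIs, and each `persistBatch` call is assumed to run in its own transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from the Sentry code base.
class ParallelSnapshotPersister {

    /** Hypothetical storage abstraction; each call is assumed to be one transaction. */
    interface BatchStore<T> {
        void persistBatch(long snapshotId, List<T> batch) throws Exception;

        /** Removes everything written so far for the given snapshot. */
        void deleteSnapshot(long snapshotId) throws Exception;
    }

    /** Splits entries into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> entries, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < entries.size(); i += batchSize) {
            int end = Math.min(i + batchSize, entries.size());
            batches.add(new ArrayList<>(entries.subList(i, end)));
        }
        return batches;
    }

    /**
     * Persists all batches in parallel. Returns true if every batch succeeded;
     * otherwise deletes the partially persisted snapshot and returns false.
     */
    static <T> boolean persist(BatchStore<T> store, long snapshotId, List<T> entries,
                               int batchSize, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> futures = new ArrayList<>();
            for (List<T> batch : partition(entries, batchSize)) {
                Callable<Void> task = () -> {
                    store.persistBatch(snapshotId, batch);
                    return null;
                };
                futures.add(pool.submit(task));
            }
            boolean failed = false;
            for (Future<Void> future : futures) {
                try {
                    future.get();
                } catch (ExecutionException e) {
                    // Keep waiting for the rest so no batch is still writing during cleanup.
                    failed = true;
                }
            }
            if (failed) {
                store.deleteSnapshot(snapshotId);
                return false;
            }
            return true;
        } finally {
            pool.shutdown();
        }
    }
}
```

Waiting for every batch before cleaning up is a deliberate choice in this sketch: it avoids a late batch committing after the partial snapshot has already been deleted.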



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
