Hot swapping cube post build
We have a use case where we want to rebuild a cube with an updated data set without downtime for requests.

Let's say we have cube C1. We get some new data and rebuild the cube; let's call the rebuilt cube C2. (Assume no change to the cube structure.)

While C2 is building, we want C1 to keep serving requests. Once C2 has finished building, we hot swap C1 with C2. This way there is no downtime for requests (or, if there is any, it is very brief).

Another problem is that C2 has to use the same Hive table names and schema as C1; for that, we can recreate the (external) tables so that they point to the data for C2.

We cannot use incremental cube builds, since as of now it is hard to figure out the change set.

What is the best way to achieve this?

Assumption: since we cannot have two cubes with the same name under the same project, we need two different cubes.

Regards,
Abhilash
Re: Hot swapping cube post build
Have you ever checked out the "refresh" function for cubes?

On Thu, Jan 28, 2016 at 7:07 PM, Abhilash L L wrote:
> [...]

--
Regards,
*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
Re: Hot swapping cube post build
Assuming the cube definition does not change, all you need is to "refresh" an existing cube segment. The old cube segment will continue serving until the new build is complete, so there is no downtime during the whole process.

Try "refresh".

On Friday, January 29, 2016, hongbin ma wrote:
> [...]
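For anyone scripting the refresh rather than using the web UI, here is a minimal sketch of how such a request might be built. It assumes the Kylin REST API exposes `PUT /kylin/api/cubes/{cubeName}/rebuild` with a JSON body of `startTime`, `endTime` (epoch milliseconds) and `buildType`, and that the server uses HTTP basic auth; please check this against the REST API documentation for your Kylin version. The host, cube name, credentials and helper names below are hypothetical, and the time range is expected to match an existing segment.

```python
import base64
import json
import urllib.request
from datetime import datetime, timezone


def to_epoch_ms(year, month, day):
    """Return midnight UTC of the given date as epoch milliseconds."""
    dt = datetime(year, month, day, tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def build_refresh_request(base_url, cube_name, start_ms, end_ms, user, password):
    """Build (but do not send) a segment refresh request.

    Assumption: endpoint path and body fields follow the Kylin REST API
    as documented for the installed version; verify before use.
    """
    body = json.dumps(
        {"startTime": start_ms, "endTime": end_ms, "buildType": "REFRESH"}
    ).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/kylin/api/cubes/{cube_name}/rebuild",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical host, cube name and credentials, for illustration only.
    req = build_refresh_request(
        "http://localhost:7070", "C1",
        to_epoch_ms(2016, 1, 1), to_epoch_ms(2016, 2, 1),
        "ADMIN", "KYLIN",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually submit the job: urllib.request.urlopen(req)
```

The request is only constructed here, not sent, so the sketch can be inspected safely before pointing it at a real server.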
Re: Hot swapping cube post build
Sure, we will try out refresh.

What do you suggest in case there are some changes in the schema, extra measures, etc.?

Regards,
Abhilash

On Sat, Jan 30, 2016 at 12:28 PM, Li Yang wrote:
> [...]
Re: Hot swapping cube post build
Hi Abhilash,

So far, any data model change requires a rebuild. Our practice is to clone the existing cube metadata as a new cube, add the extra measures/dimensions or make other changes, and then build it. Once the new cube is ready, disable the old one.

Hope this helps.

Thanks.
Luke

Best Regards!
- Luke Han

On Mon, Feb 1, 2016 at 4:21 PM, Abhilash L L wrote:
> [...]
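The clone, build, then disable sequence described in this reply could be written down as an ordered plan. This is only a sketch under assumptions: the `clone`, `rebuild` and `disable` endpoint paths and body fields are taken from the Kylin REST API as I understand it and should be verified for your version, and the function name, cube names and project name are hypothetical. Steps marked `MANUAL` are not API calls.

```python
def plan_cube_swap(old_cube, new_cube, project, start_ms, end_ms):
    """Return the ordered steps for replacing old_cube with a modified clone.

    Each step is a (method, target, body) tuple. "MANUAL" steps describe
    work done outside the REST API. Endpoint paths are assumptions; check
    them against the documentation of the installed Kylin version.
    """
    api = "/kylin/api/cubes"
    return [
        # 1. Clone the existing cube metadata under a new name.
        ("PUT", f"{api}/{old_cube}/clone",
         {"cubeName": new_cube, "project": project}),
        # 2. Apply the model change to the clone.
        ("MANUAL", "add the extra measures/dimensions to the cloned cube", None),
        # 3. Build the new cube while the old one keeps serving queries.
        ("PUT", f"{api}/{new_cube}/rebuild",
         {"startTime": start_ms, "endTime": end_ms, "buildType": "BUILD"}),
        # 4. Do not swap until the build has finished.
        ("MANUAL", "wait until the build job finishes and the new cube is ready", None),
        # 5. Only now take the old cube out of service.
        ("PUT", f"{api}/{old_cube}/disable", None),
    ]


if __name__ == "__main__":
    # Hypothetical names and time range, for illustration only.
    for method, target, body in plan_cube_swap(
        "C1", "C2", "my_project", 1451606400000, 1454284800000
    ):
        print(method, target, body)
```

Keeping the disable call as the last step is what lets the old cube serve requests for the whole duration of the new build.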