Re: [Openstack-operators] Small openstack (part 2), distributed glance
Directions: nova -> switch port, switch port -> glance, glance -> switch port (to Swift). I assume the traffic from the switch to Swift stays outside the installation. Glance-api receives and sends the same amount of traffic. It sounds like a minor issue until you start counting the CPU IRQ time of the network card (doubled compared to a single direction of traffic). Glance on the compute node will consume less CPU (because of the high-performance loopback).

On 01/21/2015 07:20 PM, Michael Dorman wrote:
> This is great info, George. Could you explain the 3x snapshot transport
> under the traditional Glance setup, please? I understand that you have
> compute -> glance, and glance -> swift. But what's the third transfer?
>
> Thanks!
> Mike
> [...]

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
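The per-port accounting in this thread can be sketched with a toy calculation (the helper name and numbers are mine, for illustration only):

```python
def snapshot_port_traffic_gb(image_gb, local_glance_api):
    """Total switch-port traffic (GB) caused by one snapshot upload.

    Centralized glance-api: compute -> switch (1x), switch -> glance (1x),
    glance -> switch toward Swift (1x) = 3x the image size.
    Local glance-api on the compute node: compute -> glance is loopback,
    so only the single hop toward Swift touches a switch port.
    """
    return image_gb if local_glance_api else 3 * image_gb

print(snapshot_port_traffic_gb(100, local_glance_api=False))  # 300
print(snapshot_port_traffic_gb(100, local_glance_api=True))   # 100
```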
Re: [Openstack-operators] Small openstack (part 2), distributed glance
This is great info, George. Could you explain the 3x snapshot transport under the traditional Glance setup, please? I understand that you have compute -> glance, and glance -> swift. But what's the third transfer?

Thanks!
Mike

On 1/21/15, 10:36 AM, "George Shuklin" wrote:
>Ok, news so far:
>
>It works like magic. Nova has the option:
>
>[glance]
>host=127.0.0.1
>
> [...]

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
Re: [Openstack-operators] Small openstack (part 2), distributed glance
Ok, news so far:

It works like magic. Nova has the option:

[glance]
host=127.0.0.1

And I do not need to cheat with endpoint resolving (my initial plan was to resolve the glance endpoint to 127.0.0.1 with /etc/hosts magic). The normal glance-api replies to external client requests (image-create/download/list/etc.), and the local glance-apis (one per compute node) are used to connect to Swift.

Glance registry works in normal mode (only on the 'official' API servers).

I don't see any reason why we should centralize all traffic to Swift through special dedicated servers, investing in fast CPUs and 10G links. With this solution the CPU load of glance-api is distributed evenly across all compute nodes, and the overall snapshot traffic (on switch ports) was cut down 3 times!

Why didn't I think of this earlier?

On 01/16/2015 12:20 AM, George Shuklin wrote:
> [...]
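For reference, the setup described above boils down to a small nova.conf fragment on every compute node (the comments are mine; only the option itself is from the message):

```ini
# /etc/nova/nova.conf on every compute node:
# nova-compute uploads snapshots to the glance-api on localhost, so the
# compute -> glance hop becomes loopback traffic instead of a switch port.
[glance]
host=127.0.0.1
```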
Re: [Openstack-operators] Small openstack (part 2), distributed glance
+1

On Sun, Jan 18, 2015 at 10:05 PM, Jay Pipes wrote:
> Honestly, the Glance project just needs to go away, IMO.
> [...]
Re: [Openstack-operators] Small openstack (part 2), distributed glance
On 01/15/2015 05:20 PM, George Shuklin wrote:
> [...]
> Any ideas on possible problems/bottlenecks? And how many
> glance-registries do I need for this?

Honestly, the Glance project just needs to go away, IMO.

The glance_store library should be the focus of all new image and volume bit-moving functionality, and the glance_store library should replace all the current code in Nova and Cinder that does any copying through the Glance API nodes.

The Glance REST API should just be an artifact repository. The glance_store library (which should be renamed to oslo.bitmover or something) should call the Glance REST API for the URIs of the image locations (sources and targets) and handle all the bit-moving operations in as efficient a manner as possible.

Best,
-jay
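A minimal sketch of the flow described above, with in-memory dicts standing in for stores (all names here are invented for illustration; this is not the real glance_store API):

```python
import io

def copy_image(get_location, source_store, target_store, image_id,
               chunk_size=1 << 20):
    """Move image bits store-to-store; the API is only asked for metadata.

    get_location stands in for a small Glance REST call that returns the
    image's location URI; no image byte ever crosses the API node.
    """
    uri = get_location(image_id)
    src = source_store[uri]
    src.seek(0)
    dst = target_store.setdefault(uri, io.BytesIO())
    copied = 0
    # stream the bits directly from source store to target store
    for chunk in iter(lambda: src.read(chunk_size), b""):
        dst.write(chunk)
        copied += len(chunk)
    return copied

# toy stores: URI -> file-like blob
source = {"swift://images/42": io.BytesIO(b"disk-bytes" * 3)}
target = {}
n = copy_image(lambda _id: "swift://images/42", source, target, "42")
print(n)  # 30
```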
Re: [Openstack-operators] Small openstack (part 2), distributed glance
We do not use centralized storage (all instances run on local drives), and I just can't express my happiness about this. Every time monitoring sends me '** PROBLEM ALERT bla-bla-bla', I know it's not a big deal: just one server. I do not want to turn gray prematurely. Just a quick look at https://www.google.com/search?q=ceph+crash+corruption gives me a strong feeling that I don't want to centralize points of failure.

Btw: if I sell the nodes designated for Ceph as normal compute nodes, it will be more effective than selling only the space from them (and buying more compute nodes for the actual work).

On 01/16/2015 12:31 AM, Abel Lopez wrote:
> That specific bottleneck can be solved by running glance on Ceph, and
> running ephemeral instances on Ceph as well. Snapshots are then a quick
> backend operation. But you've built your installation on a house of
> cards.
> [...]
Re: [Openstack-operators] Small openstack (part 2), distributed glance
That specific bottleneck can be solved by running glance on Ceph, and running ephemeral instances on Ceph as well. Snapshots are then a quick backend operation. But you've built your installation on a house of cards.

On Thursday, January 15, 2015, George Shuklin wrote:
> Hello everyone.
> [...]
> Any ideas on possible problems/bottlenecks? And how many
> glance-registries do I need for this?
[Openstack-operators] Small openstack (part 2), distributed glance
Hello everyone.

One more thing in the light of small OpenStack.

I really dislike the triple network load caused by current glance snapshot operations. When a compute node makes a snapshot, it works with the files locally, then it sends them to glance-api, and (if the glance API is linked to Swift) glance sends them to Swift. Basically, for each 100 GB disk there are 300 GB of network operations. It is especially painful for glance-api, which needs to get more CPU and network bandwidth than we want to spend on it.

So, the idea: put a glance-api without cache on each compute node.

To make compute go to the proper glance, the endpoint points to an FQDN, and on each compute node that FQDN points to localhost (where the glance-api lives). Plus a normal glance-api on the API/controller node to serve dashboard/API clients.

I didn't test it yet.

Any ideas on possible problems/bottlenecks? And how many glance-registries do I need for this?
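The FQDN trick described above can be sketched as follows (the endpoint name and controller IP are hypothetical; 9292 is glance-api's default port):

```
# The keystone image endpoint is registered with an FQDN instead of an
# IP, e.g. http://glance.internal.example.com:9292

# /etc/hosts on each compute node: resolve the FQDN to the local
# glance-api, so snapshot uploads stay on loopback.
127.0.0.1    glance.internal.example.com

# /etc/hosts on the API/controller node: resolve it to the normal
# glance-api that serves dashboard/API clients.
10.0.0.10    glance.internal.example.com
```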