Chuck et al. Let me go through the points one by one.
#1 Even though we see that the object-auditor always runs and never stops, we stopped the swift-*-auditor services and didn't see any improvement. Across all the datanodes we have an average of 8% iowait (measured with iostat); the only thing we see is the "xfsbuf" pid running once in a while, causing 99% iowait for a second. We delayed the runtime of that process and didn't see any change either. Our object-auditor config for all devices is as follows:

[object-auditor]
files_per_second = 5
zero_byte_files_per_second = 5
bytes_per_second = 3000000

#2 Our 12 proxies are 6 physical servers and 6 KVM instances running on Nova. Checking iftop, we average 15Mb/s of bandwidth usage, so I don't think we are saturating the network.

#3 The overall idle CPU on all datanodes is 80%. I'm not sure how to check the CPU usage per worker (a possible way is sketched after point #4 below); let me paste the config of one device for object, account and container.

*object-server.conf*
*------------------*
[DEFAULT]
devices = /srv/node/sda3
mount_check = false
bind_port = 6010
user = swift
log_facility = LOG_LOCAL2
log_level = DEBUG
workers = 48
disable_fallocate = true

[pipeline:main]
pipeline = object-server

[app:object-server]
use = egg:swift#object

[object-replicator]
vm_test_mode = yes
concurrency = 8
run_pause = 600

[object-updater]
concurrency = 8

[object-auditor]
files_per_second = 5
zero_byte_files_per_second = 5
bytes_per_second = 3000000

*account-server.conf*
*-------------------*
[DEFAULT]
devices = /srv/node/sda3
mount_check = false
bind_port = 6012
user = swift
log_facility = LOG_LOCAL2
log_level = DEBUG
workers = 48
db_preallocation = on
disable_fallocate = true

[pipeline:main]
pipeline = account-server

[app:account-server]
use = egg:swift#account

[account-replicator]
vm_test_mode = yes
concurrency = 8
run_pause = 600

[account-auditor]

[account-reaper]

*container-server.conf*
*---------------------*
[DEFAULT]
devices = /srv/node/sda3
mount_check = false
bind_port = 6011
user = swift
workers = 48
log_facility = LOG_LOCAL2
allow_versions = True
disable_fallocate = true

[pipeline:main]
pipeline = container-server

[app:container-server]
use = egg:swift#container
allow_versions = True

[container-replicator]
vm_test_mode = yes
concurrency = 8
run_pause = 500

[container-updater]
concurrency = 8

[container-auditor]

#4 We don't use SSL for Swift, so there is no latency there.
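Regarding #3, this is a rough sketch of how per-worker CPU could be checked, assuming the standard swift-*-server process names on these nodes and that sysstat is installed for pidstat:

  # sort all processes by CPU and keep only the Swift workers
  ps -eo pid,pcpu,pmem,etime,args --sort=-pcpu | grep -E 'swift-(object|container|account)-server' | head -20

  # or sample one service for a while to see whether its workers sit near 100% CPU
  pidstat -u -C swift-object-server 5 3

If most of the 48 workers were pinned near 100% CPU during the PUT peaks, that would point at CPU rather than disk, but this is just how we plan to measure it.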
Hope you guys can shed some light.

*Alejandro Comisario
#melicloud CloudBuilders*
Arias 3751, Piso 7 (C1430CRG)
Ciudad de Buenos Aires - Argentina
Cel: +549(11) 15-3770-1857
Tel : +54(11) 4640-8443


On Mon, Jan 14, 2013 at 1:23 PM, Chuck Thier <[email protected]> wrote:
> Hi Alejandro,
>
> I really doubt that partition size is causing these issues. It can be
> difficult to debug these types of issues without access to the
> cluster, but I can think of a couple of things to look at.
>
> 1. Check your disk io usage and io wait on the storage nodes. If
> that seems abnormally high, then that could be one of the sources of
> problems. If this is the case, then the first things that I would
> look at are the auditors, as they can use up a lot of disk io if not
> properly configured. I would try turning them off for a bit
> (swift-*-auditor) and see if that makes any difference.
>
> 2. Check your network io usage. You haven't described what type of
> network you have going to the proxies, but if they share a single GigE
> interface, if my quick calculations are correct, you could be
> saturating the network.
>
> 3. Check your CPU usage. I listed this one last as you have said
> that you have already worked at tuning the number of workers (though I
> would be interested to hear how many workers you have running for each
> service). The main thing to look for is to see if all of your
> workers are maxed out on CPU; if so, then you may need to bump
> workers.
>
> 4. SSL termination? Where are you terminating the SSL connection?
> If you are terminating SSL in Swift directly with the swift proxy,
> then that could also be a source of issues. This was only meant for
> dev and testing, and you should use an SSL-terminating load balancer
> in front of the swift proxies.
>
> That's what I could think of right off the top of my head.
>
> --
> Chuck
>
> On Mon, Jan 14, 2013 at 5:45 AM, Alejandro Comisario
> <[email protected]> wrote:
> > Chuck / John.
> > We are having 50.000 requests per minute (where 10.000+ are PUTs of
> > small objects, from 10KB to 150KB).
> >
> > We are using swift 1.7.4 with keystone token caching, so no latency
> > over there.
> > We have 12 proxies and 24 datanodes divided into 4 zones (each
> > datanode has 48gb of ram, 2 hexacore CPUs and 4 devices of 3TB each).
> >
> > The workers that are putting objects into swift are seeing awful
> > performance, and so are we,
> > with peaks of 2secs to 15secs per PUT operation coming from the
> > datanodes.
> > We tuned db_preallocation, disable_fallocate, workers and concurrency,
> > but we can't reach the request rate that we need (24.000 PUTs per
> > minute of small objects), and we don't seem to find where the problem
> > is, other than in the datanodes.
> >
> > Maybe it is worth pasting our config over here?
> > Thanks in advance.
> >
> > alejandro
> >
> > On 12 Jan 2013 02:01, "Chuck Thier" <[email protected]> wrote:
> >>
> >> Looking at this from a different perspective. Having 2500 partitions
> >> per drive shouldn't be an absolutely horrible thing either. Do you
> >> know how many objects you have per partition? What types of problems
> >> are you seeing?
> >>
> >> --
> >> Chuck
> >>
> >> On Fri, Jan 11, 2013 at 3:28 PM, John Dickinson <[email protected]> wrote:
> >> > In effect, this would be a complete replacement of your rings, and
> >> > that is essentially a whole new cluster. All of the existing data
> >> > would need to be rehashed into the new ring before it is available.
> >> >
> >> > There is no process that rehashes the data to ensure that it is
> >> > still in the correct partition. Replication only ensures that the
> >> > partitions are on the right drives.
> >> >
> >> > To change the number of partitions, you will need to GET all of the
> >> > data from the old ring and PUT it to the new ring. A more
> >> > complicated (but perhaps more efficient) solution may include
> >> > something like walking each drive and rehashing+moving the data to
> >> > the right partition and then letting replication settle it down.
> >> >
> >> > Either way, 100% of your existing data will need to at least be
> >> > rehashed (and probably moved). Your CPU (hashing), disks
> >> > (read+write), RAM (directory walking), and network (replication)
> >> > may all be limiting factors in how long it will take to do this.
> >> > Your per-disk free space may also determine what method you choose.
> >> >
> >> > I would not expect any data loss while doing this, but you will
> >> > probably have availability issues, depending on the data access
> >> > patterns.
> >> >
> >> > I'd like to eventually see something in swift that allows for
> >> > changing the partition power in existing rings, but that will be
> >> > hard/tricky/non-trivial.
> >> >
> >> > Good luck.
> >> >
> >> > --John
> >> >
> >> >
> >> > On Jan 11, 2013, at 1:17 PM, Alejandro Comisario
> >> > <[email protected]> wrote:
> >> >
> >> >> Hi guys.
> >> >> We created a swift cluster several months ago; the thing is that
> >> >> right now we can't add hardware, and we configured lots of
> >> >> partitions thinking about the final picture of the cluster.
> >> >>
> >> >> Today each datanode has 2500+ partitions per device, and even
> >> >> tuning the background processes (replicator, auditor & updater) we
> >> >> really want to try to lower the partition power.
> >> >>
> >> >> Since it's not possible to do that without recreating the ring, we
> >> >> can have the luxury of recreating it with a much lower partition
> >> >> power, and rebalancing / deploying the new ring.
> >> >>
> >> >> The question is: having a working cluster with *existing data*, is
> >> >> it possible to do this and wait for the data to move around
> >> >> *without data loss*?
> >> >> If so, would it be reasonable to expect an improvement in the
> >> >> overall cluster performance?
> >> >>
> >> >> We have no problem having a non-working cluster (while moving the
> >> >> data), even for an entire weekend.
> >> >>
> >> >> Cheers.
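For reference, a minimal sketch of the ring-recreation step John describes above, assuming a hypothetical new part power of 16 and placeholder IP/port/device/weight values; the existing data would still have to be migrated from the old ring by GET/PUT, or by walking the disks, as discussed:

  # build a new object ring: part power 16, 3 replicas, min_part_hours 1
  swift-ring-builder object.builder create 16 3 1
  # add every device in every zone (placeholder values shown)
  swift-ring-builder object.builder add z1-10.0.0.1:6010/sda3 100
  # ... repeat the add for the remaining devices and zones ...
  swift-ring-builder object.builder rebalance
  # then copy the resulting object.ring.gz to every proxy and storage node

The same steps would apply to the account and container builders on their own ports.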
_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp

