Re: [PATCH] [RFC] KVM test: Control files automatic generation to save memory
On Sun, 2010-02-14 at 12:07 -0500, Michael Goldish wrote:
> ----- Lucas Meneghel Rodrigues <l...@redhat.com> wrote:
>> As our configuration system generates a list of dicts with test
>> parameters, and that list might be potentially *very* large, keeping all
>> this information in memory might be a problem for smaller virtualization
>> hosts due to the memory pressure created. Tests made on my 4GB laptop
>> show that most of the memory is being used during a typical kvm autotest
>> session.
>>
>> So, instead of keeping all this information in memory, let's take a
>> different approach: unfold all the tests generated by the config system
>> and generate a control file:
>>
>> job.run_test('kvm', params={param1, param2, ...}, tag='foo', ...)
>> job.run_test('kvm', params={param1, param2, ...}, tag='bar', ...)
>>
>> By dumping all the dicts that were previously kept in memory to a
>> control file, the memory usage of a typical kvm autotest session is
>> drastically reduced, making it easier to run on smaller virt hosts.
>>
>> The advantages of taking this new approach are:
>>
>> * You can see what tests are going to run and the dependencies between
>>   them by looking at the generated control file
>> * The control file is ready to use; you can, for example, paste it into
>>   the web interface and profit
>> * As mentioned, a lot less memory consumption, avoiding memory pressure
>>   on virtualization hosts.
>>
>> This is a crude 1st pass at implementing this approach, so please
>> provide comments.
>>
>> Signed-off-by: Lucas Meneghel Rodrigues <l...@redhat.com>
>> ---
>
> Interesting idea!
>
> - Personally I don't like the renaming of kvm_config.py to
>   generate_control.py, and prefer to keep them separate, so that
>   generate_control.py has the create_control() function and
>   kvm_config.py has everything else. It's just a matter of naming;
>   kvm_config.py deals mostly with config files, not with control files,
>   and it can be used for other purposes than generating control files.

Fair enough, no problem.

> - I wonder why so much memory is used by the test list.
>   Our daily test sets aren't very big, so although the parser should use
>   a huge amount of memory while parsing, nearly all of that memory should
>   be freed by the time the parser is done, because the final 'only'
>   statement reduces the number of tests to a small fraction of the total
>   number in a full set. What test set did you try with that 4 GB machine,
>   and how much memory was used by the test list? If a ridiculous amount
>   of memory was used, this might indicate a bug in kvm_config.py (maybe
>   it keeps references to deleted tests, forcing them to stay in memory).

This problem wasn't found during the daily test routine; rather, it was a
comment I heard from Naphtali about typical autotest memory usage. Marcelo
made a similar comment as well, so I thought it was a problem worth looking
into. I tried to run the default test set that we selected for upstream
(3 resulting dicts) on my 4GB RAM laptop; here are my findings:

* Before autotest usage: around 20% of memory used, 10% used as cache.
* During autotest usage: about 99% of memory used, 27% used as cache.

So yes, there's a significant memory usage increase that doesn't happen
when using a flat, autogenerated control file. Sure, it doesn't make my
laptop crawl, but it is a *lot* of resource usage anyway.

Also, even if we assume that for small test sets we can reclaim all the
memory, we still have to consider large test sets. I am all for profiling
the memory usage and fixing any bugs we find, but we need to keep in mind
that one might want to run large test sets, and large test sets imply
keeping a fairly large amount of data in memory. If the amount of memory
is negligible in most use cases, then let's just fix bugs and forget about
the proposed approach.

Also, a flat control file is quicker to run, because no parsing of the
config file happens there. So this control file generation thing makes
some sense; that's why I decided to code this 1st pass attempt at doing it.
> - I don't think this approach will work for control.parallel, because
>   the tests have to be assigned dynamically to available queues, and
>   AFAIK this can't be done by a simple static control file.

Not necessarily: since the control file is a program, we can just generate
code that uses some sort of function to do the assignment. I don't fully
see everything that's needed to get the job done, but in theory it should
be possible.
Re: [PATCH] [RFC] KVM test: Control files automatic generation to save memory
----- Lucas Meneghel Rodrigues <l...@redhat.com> wrote:

> On Sun, 2010-02-14 at 12:07 -0500, Michael Goldish wrote:
>> ----- Lucas Meneghel Rodrigues <l...@redhat.com> wrote:
>>> [patch description snipped; quoted in full earlier in the thread]
>>
>> Interesting idea!
>>
>> - Personally I don't like the renaming of kvm_config.py to
>>   generate_control.py, and prefer to keep them separate, so that
>>   generate_control.py has the create_control() function and
>>   kvm_config.py has everything else. It's just a matter of naming;
>>   kvm_config.py deals mostly with config files, not with control files,
>>   and it can be used for other purposes than generating control files.
>
> Fair enough, no problem.
>> - I wonder why so much memory is used by the test list. [...]
>
> This problem wasn't found during the daily test routine; rather, it was
> a comment I heard from Naphtali about typical autotest memory usage.
> Marcelo made a similar comment as well, so I thought it was a problem
> worth looking into. I tried to run the default test set that we selected
> for upstream (3 resulting dicts) on my 4GB RAM laptop; here are my
> findings:
>
> * Before autotest usage: around 20% of memory used, 10% used as cache.
> * During autotest usage: about 99% of memory used, 27% used as cache.

Before autotest usage, were there any VMs running? 3 dicts can't possibly
take up so much space. If it is indeed kvm_config's fault (which I doubt),
there's probably a bug in it that prevents it from freeing unused memory,
and once we fix that bug the problem should be gone.

> So yes, there's a significant memory usage increase that doesn't happen
> when using a flat, autogenerated control file. Sure, it doesn't make my
> laptop crawl, but it is a *lot* of resource usage anyway.
>
> Also, even if we assume that for small test sets we can reclaim all the
> memory, we still have to consider large test sets. I am all for
> profiling the memory usage and fixing any bugs we find, but we need to
> keep in mind that one might want to run large test sets, and large test
> sets imply keeping a fairly large amount of data in memory.
> If the amount of memory is negligible in most use cases, then let's just
> fix bugs and forget about the proposed approach. Also, a flat control
> file is quicker to run, because no parsing of the config file happens
> there. So this control file

Agreed, but on the other hand, the static control file idea introduces an
extra preprocessing step (not necessarily bad).

> generation thing makes some sense; that's why I decided to code this 1st
> pass attempt at doing it.
>
>> - I don't think this approach will work for control.parallel, because
>>   the tests have to be assigned dynamically to available queues, and
>>   AFAIK this can't be done by a simple static control file.
>
> Not necessarily: since the control file is a program, we can just
> generate code that uses some sort of function to do the assignment. I
> don't fully see all that's needed to get the job done,
Re: [PATCH] [RFC] KVM test: Control files automatic generation to save memory
----- Lucas Meneghel Rodrigues <l...@redhat.com> wrote:

> [patch description snipped; quoted in full earlier in the thread]

Interesting idea!

- Personally I don't like the renaming of kvm_config.py to
  generate_control.py, and prefer to keep them separate, so that
  generate_control.py has the create_control() function and kvm_config.py
  has everything else. It's just a matter of naming; kvm_config.py deals
  mostly with config files, not with control files, and it can be used for
  other purposes than generating control files.

- I wonder why so much memory is used by the test list.
  Our daily test sets aren't very big, so although the parser should use a
  huge amount of memory while parsing, nearly all of that memory should be
  freed by the time the parser is done, because the final 'only' statement
  reduces the number of tests to a small fraction of the total number in a
  full set. What test set did you try with that 4 GB machine, and how much
  memory was used by the test list? If a ridiculous amount of memory was
  used, this might indicate a bug in kvm_config.py (maybe it keeps
  references to deleted tests, forcing them to stay in memory).

- I don't think this approach will work for control.parallel, because the
  tests have to be assigned dynamically to available queues, and AFAIK
  this can't be done by a simple static control file.

- Whether or not this is a good idea probably depends on the users. On one
  hand, users will be required to run generate_control.py before
  autotest.py, and the generated control files will be very big and ugly;
  on the other hand, maybe they won't care.

I probably haven't given this enough thought, so I might have missed a few
things.

 client/tests/kvm/control             |   64
 client/tests/kvm/generate_control.py |  586 ++
 client/tests/kvm/kvm_config.py       |  524 --
 3 files changed, 586 insertions(+), 588 deletions(-)
 delete mode 100644 client/tests/kvm/control
 create mode 100755 client/tests/kvm/generate_control.py
 delete mode 100755 client/tests/kvm/kvm_config.py

diff --git a/client/tests/kvm/control b/client/tests/kvm/control
deleted file mode 100644
index 163286e..000
--- a/client/tests/kvm/control
+++ /dev/null
@@ -1,64 +0,0 @@
-AUTHOR = """
-u...@redhat.com (Uri Lublin)
-dru...@redhat.com (Dror Russo)
-mgold...@redhat.com (Michael Goldish)
-dh...@redhat.com (David Huff)
-aerom...@redhat.com (Alexey Eromenko)
-mbu...@redhat.com (Mike Burns)
-"""
-
-TIME = 'MEDIUM'
-NAME = 'KVM test'
-TEST_TYPE = 'client'
-TEST_CLASS = 'Virtualization'
-TEST_CATEGORY = 'Functional'
-
-DOC = """
-Executes the KVM test framework on a given host. This module is separated in
-minor functions, that execute different tests for doing Quality Assurance on
-KVM (both kernelspace and userspace) code.
-
-For online docs, please refer to http://www.linux-kvm.org/page/KVM-Autotest
-"""
-
-import sys, os, logging
-# Add the KVM tests dir to the python path
-kvm_test_dir = os.path.join(os.environ['AUTODIR'],'tests/kvm')
-sys.path.append(kvm_test_dir)
-# Now we can import modules inside the KVM tests dir
-import kvm_utils, kvm_config
-
-# set English environment (command output might be localized, need to be safe)
-os.environ['LANG'] = 'en_US.UTF-8'
-
-build_cfg_path = os.path.join(kvm_test_dir, "build.cfg")
-build_cfg = kvm_config.config(build_cfg_path)
-# Make any desired changes to the build configuration here. For example:
-#build_cfg.parse_string("""
-#release_tag = 84
-#""")
-if not kvm_utils.run_tests(build_cfg.get_list(), job):
-    logging.error("KVM build
Re: [PATCH] [RFC] KVM test: Control files automatic generation to save memory
On 02/14/2010 07:07 PM, Michael Goldish wrote:
> ----- Lucas Meneghel Rodrigues <l...@redhat.com> wrote:
>> [patch description snipped; quoted in full earlier in the thread]
>
> Interesting idea!
>
> - Personally I don't like the renaming of kvm_config.py to
>   generate_control.py, and prefer to keep them separate, so that
>   generate_control.py has the create_control() function and
>   kvm_config.py has everything else. It's just a matter of naming;
>   kvm_config.py deals mostly with config files, not with control files,
>   and it can be used for other purposes than generating control files.
>
> - I wonder why so much memory is used by the test list.
>   Our daily test sets aren't very big, so although the parser should use
>   a huge amount of memory while parsing, nearly all of that memory should
>   be freed by the time the parser is done, because the final 'only'
>   statement reduces the number of tests to a small fraction of the total
>   number in a full set. What test set did you try with that 4 GB machine,
>   and how much memory was used by the test list? If a ridiculous amount
>   of memory was used, this might indicate a bug in kvm_config.py (maybe
>   it keeps references to deleted tests, forcing them to stay in memory).

I agree, it's worth getting to the bottom of it. I wonder how many objects
are created on the kvm unstable set; it should be a huge number. Besides
that, one can always call the Python garbage collection interface in order
to free unreferenced memory immediately.

> - I don't think this approach will work for control.parallel, because
>   the tests have to be assigned dynamically to available queues, and
>   AFAIK this can't be done by a simple static control file.
>
> - Whether or not this is a good idea probably depends on the users. On
>   one hand, users will be required to run generate_control.py before
>   autotest.py, and the generated control files will be very big and
>   ugly; on the other hand, maybe they won't care.
>
> I probably haven't given this enough thought, so I might have missed a
> few things.
[diffstat and patch body snipped; identical to the copy quoted earlier in
the thread]