DomGarguilo commented on code in PR #421: URL: https://github.com/apache/accumulo-website/pull/421#discussion_r1558276143
##########
_posts/blog/2024-04-09-does-a-compactor-return-memory-to-OS.md:
##########
@@ -0,0 +1,247 @@
+---
+title: "Does a compactor process return memory to the OS?"
+author: Dominic Garguilo, Kevin Rathbun
+---
+
+## Goal
+We need to determine if, once a compactor process is finished using memory, will it give the unused memory back to the OS?
+
+### Why does it matter?
+There could be a scenario where the amount of memory on a machine limits the number of compactors that can be run. For example, on a machine with 32G of memory, if each compactor process uses 6G of memory, we can only "fit" 5 compactors on that machine (32/6=5.333). Since each compactor process only runs on a single core, we would only be utilizing 5 cores on that machine where we would like to be using as many as we can.
+
+If the compactor process does not return the memory to the OS, then we are stuck with only using the following number of compactor processes:
+`(total memory)/(memory per compactor)`.
+If the compactor processes return the memory to the OS, i.e. does not stay at the maximum 6G once they reach it, then we can oversubscribe the memory allowing us to run more compactor processes on that machine.
+
+It should be noted that there is an inherent risk when oversubscribing processes that the user must be willing to accept if they choose to do oversubscribe. In this case, there is the possibility that all compactors run at the same time which might use all the memory on the machine. This could cause one or more of the compactor processes to be killed by the OOM killer.
+
+## Test Setup
+
+### Environment Prerequisites
+***Install gnuplot***
+
+This is used for plotting the memory usage of the compactor over time from the perspective of the OS
+
+1. `sudo apt install gnuplot`
+2. gnuplot can now be started with the command `gnuplot`
+
+***Install VisualVM***
+
+This is used for plotting the memory usage of the compactor over time from the perspective of the JVM
+
+1. download the zip from [visualvm.github.io](https://visualvm.github.io/)
+2. extract with `unzip visualvm_218.zip`
+3. Can now be started with the command `./path/to/visualvm_218/bin/visualvm`
+
+***Configure and start accumulo***
+
+Accumulo 2.1 will be used for experimentation. Configuring accumulo to start compactors:
+
+1. Uncomment lines in "install/accumulo-2.1.2/conf/cluster.yaml" regarding the compaction coordinator and compactor. Don't need q2 compactor. Will just be using q1. This allows these processes to start up.
+2. Configure the java args for the compactor process in "accumulo-env.sh." Line will be:
+    `compactor) JAVA_OPTS=('-Xmx256m' '-Xms256m' "${JAVA_OPTS[@]}") ;;`
+3. Start accumulo with `uno start accumulo`
+
+***Install java versions***
+
+1. Install the java versions you want to test (we used 11, 17 and 21). For example, to install Java 17:
+    1. `sudo apt install openjdk-17-jdk`
+    2. `sudo update-alternatives --config java` and select the version you want to use before starting your accumulo instance

Review Comment:
Not everyone exports this env var in their bashrc file. Maybe we can change step 3 to something like "ensure your `JAVA_HOME` environment var is up to date so uno will use the correct java version".
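To make the `JAVA_HOME` suggestion in the review comment concrete, here is a minimal sketch of a check a reader could run before starting uno. It assumes a Linux system with GNU `readlink -f`, and a JDK layout where the binary lives at `<jdk home>/bin/java`. The function names `jdk_home_of` and `java_home_matches_path` are hypothetical helpers for illustration, not part of uno or Accumulo.

```shell
# Hypothetical helper: given the full path to a java binary, print the JDK
# home directory by stripping the trailing "/bin/java" components.
jdk_home_of() {
  dirname "$(dirname "$1")"
}

# Hypothetical helper: compare JAVA_HOME with the JDK behind the `java`
# found on PATH (the one selected via update-alternatives).
# Returns 0 on match, 1 on mismatch, 2 if no java is on PATH.
java_home_matches_path() (
  java_bin="$(command -v java)" || { echo "no java found on PATH" >&2; exit 2; }
  # Follow the update-alternatives symlink chain to the real binary.
  resolved_home="$(jdk_home_of "$(readlink -f "$java_bin")")"
  # Resolve JAVA_HOME as well, in case it is itself a symlink.
  current_home="$(readlink -f "${JAVA_HOME:-}" 2>/dev/null)"
  if [ "$current_home" = "$resolved_home" ]; then
    echo "JAVA_HOME matches java on PATH: $resolved_home"
  else
    echo "JAVA_HOME='${JAVA_HOME:-}' but java on PATH resolves to $resolved_home" >&2
    exit 1
  fi
)
```

Running `java_home_matches_path` after `sudo update-alternatives --config java` would flag the case where the selected JDK and `JAVA_HOME` disagree. Whether the check is worth including in the post is a judgment call, and a one-sentence reminder may be enough.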