dr0ptp4kt added a comment.
**AWS EC2 servers**

After exploring a battery of EC2 servers, four instance types were selected and the posted commands were run on each. The configuration most like our `wdqs1021-1023` servers (third generation Intel Xeon) is listed first. The fastest of the four was a Graviton3 ARM-based configuration from Amazon.

| Time Disk ➡️ Disk | Time RAMdisk ➡️ RAMdisk | Instance Type | Cost Per Hour | HD Transfer | Processor Comment | RAM Comment |
| ----------------- | ----------------------- | ------------- | ------------- | ----------- | ----------------- | ----------- |
| 26m46.651s | 26m26.923s | m6id <https://aws.amazon.com/ec2/instance-types/m6i/>.16xlarge | $3.7968 | EBS ➡️ NVMe | 64 vCPU @ "Up to 3.5 GHz 3rd Generation Intel Xeon Scalable processors (Ice Lake 8375C)" | 256 GB @ DDR4 |
| 22m5.442s | 20m31.244s | m5zn <https://aws.amazon.com/ec2/instance-types/m5/>.metal | $3.9641 | EBS ➡️ EBS | 48 vCPU @ "2nd Generation Intel Xeon Scalable Processors (Cascade Lake 8252C) with an all-core turbo frequency up to 4.5 GHz" | 192 GiB @ DDR4 |
| 21m40.537s | 20m57.268s | c5d <https://aws.amazon.com/ec2/instance-types/c5/>.12xlarge | $2.304 | EBS ➡️ NVMe | 48 vCPU @ "C5 and C5d 12xlarge, 24xlarge, and metal instance sizes feature custom 2nd generation Intel Xeon Scalable Processors (Cascade Lake 8275CL) with a sustained all core Turbo frequency of 3.6GHz and single core turbo frequency of up to 3.9GHz." | 96 GiB @ DDR4 |
| 19m18.825s | 19m23.868s | c7gd <https://aws.amazon.com/ec2/instance-types/c7g/>.16xlarge | $2.903 | EBS ➡️ NVMe | 64 vCPU @ "Powered by custom-built AWS Graviton3 processors" | 128 GiB @ DDR5 |

**2018 gaming desktop**

The commands were then run against a gaming-class desktop from 2018. This outperformed the fastest Graviton3 configuration in AWS. The Blazegraph `bufferCapacity` configuration variable was also tested. Increasing `bufferCapacity` from 100000 to 1000000 yielded a sizable performance improvement (18m3.798s down to 13m28.076s, NVMe to the same NVMe). A run at 10000000 took 15m1.658s, faster than the 100000 baseline but slower than 1000000.

| Time Disk ➡️ Disk | Instance Type | bufferCapacity | HD Transfer | Processor Comment | RAM Comment |
| ----------------- | ------------- | -------------- | ----------- | ----------------- | ----------- |
| 18m31.647s | Alienware Aurora R7 <https://www.bestbuy.com/site/alienware-aurora-r7-gaming-desktop-intel-core-i7-8700-16gb-memory-nvidia-gtx-1070-1tb-hdd-intel-optane-memory/6155310.p?skuId=6155310> (upgraded) i7-8700 | 100000 | SATA SSD ➡️ NVMe | 6 cores @ up to 4.6 GHz (i7-8700 <https://ark.intel.com/content/www/us/en/ark/products/126686/intel-core-i7-8700-processor-12m-cache-up-to-4-60-ghz.html> page) | 64 GB @ DDR4 |
| 18m3.798s | Alienware Aurora R7 <https://www.bestbuy.com/site/alienware-aurora-r7-gaming-desktop-intel-core-i7-8700-16gb-memory-nvidia-gtx-1070-1tb-hdd-intel-optane-memory/6155310.p?skuId=6155310> (upgraded) i7-8700 | 100000 | NVMe ➡️ same NVMe | 6 cores @ up to 4.6 GHz (i7-8700 <https://ark.intel.com/content/www/us/en/ark/products/126686/intel-core-i7-8700-processor-12m-cache-up-to-4-60-ghz.html> page) | 64 GB @ DDR4 |
| 15m1.658s | Alienware Aurora R7 <https://www.bestbuy.com/site/alienware-aurora-r7-gaming-desktop-intel-core-i7-8700-16gb-memory-nvidia-gtx-1070-1tb-hdd-intel-optane-memory/6155310.p?skuId=6155310> (upgraded) | 100000**00** | NVMe ➡️ same NVMe | 6 cores @ up to 4.6 GHz (i7-8700 <https://ark.intel.com/content/www/us/en/ark/products/126686/intel-core-i7-8700-processor-12m-cache-up-to-4-60-ghz.html> page) | 64 GB @ DDR4 |
| 13m28.076s | Alienware Aurora R7 <https://www.bestbuy.com/site/alienware-aurora-r7-gaming-desktop-intel-core-i7-8700-16gb-memory-nvidia-gtx-1070-1tb-hdd-intel-optane-memory/6155310.p?skuId=6155310> (upgraded) | 100000**0** | NVMe ➡️ same NVMe | 6 cores @ up to 4.6 GHz (i7-8700 <https://ark.intel.com/content/www/us/en/ark/products/126686/intel-core-i7-8700-processor-12m-cache-up-to-4-60-ghz.html> page) | 64 GB @ DDR4 |

**2019 MacBook Pro**

The commands were run against a 2019-era Intel-based MacBook Pro. A 16 GB Java heap was used, as this laptop only has 32 GB of memory. This was a fast arrangement, but not quite as fast as the gaming-class desktop. The 2019 MacBook Pro has a faster processor, but a somewhat slower disk, than the gaming desktop.

| Time Disk ➡️ Disk | Instance Type | bufferCapacity | HD Transfer | Processor Comment | RAM Comment |
| ----------------- | ------------- | -------------- | ----------- | ----------------- | ----------- |
| 17m00.040s | 2019 MacBook Pro <https://support.apple.com/kb/SP809?locale=en_US> | 100000 | High-end SSD ➡️ Same SSD (Tom's Guide <https://www.tomsguide.com/reviews/macbook-pro-16-inch-2019>) | "Configurable to 2.4GHz 8‑core Intel Core i9, Turbo Boost up to 5.0GHz, with 16MB shared L3 cache" | 32 GB @ DDR4 |
| 16m27.390s | 2019 MacBook Pro <https://support.apple.com/kb/SP809?locale=en_US> | 100000**0** | High-end SSD ➡️ Same SSD (Tom's Guide <https://www.tomsguide.com/reviews/macbook-pro-16-inch-2019>) | "Configurable to 2.4GHz 8‑core Intel Core i9, Turbo Boost up to 5.0GHz, with 16MB shared L3 cache" | 32 GB @ DDR4 |

**Gaming desktop with MemStore**

The following represents an in-memory configuration of the Blazegraph database instead of a `.jnl` database journal file. It's fast, but do note that the operating system killed the process after completion, apparently due to memory pressure. A higher-memory instance would be required in order to load the full database, and again, it appears there isn't an easy way to serialize to disk for later restoration. But this is illustrative of what's possible.

| Time Disk ➡️ Disk | Instance Type | bufferCapacity | HD Transfer | Processor Comment | RAM Comment |
| ----------------- | ------------- | -------------- | ----------- | ----------------- | ----------- |
| 9m34.808s | Alienware Aurora R7 <https://www.bestbuy.com/site/alienware-aurora-r7-gaming-desktop-intel-core-i7-8700-16gb-memory-nvidia-gtx-1070-1tb-hdd-intel-optane-memory/6155310.p?skuId=6155310> (upgraded) | 100000 w/ MemStore | NVMe ➡️ same NVMe | 6 cores @ up to 4.6 GHz (i7-8700 <https://ark.intel.com/content/www/us/en/ark/products/126686/intel-core-i7-8700-processor-12m-cache-up-to-4-60-ghz.html> page) | 64 GB @ DDR4 |

**Queue Capacity**

As suggested at https://github.com/blazegraph/database/wiki/IOOptimization the queue capacity was set at 4000 for these runs.

  com.bigdata.btree.writeRetentionQueue.capacity=4000

This is probably an okay middle-ground value. But, for the purpose of checking out the behavior with this setting, the same command was run on the 2019 MacBook Pro with the capacity set to 8000. Indeed, it was faster as expected, at 12m29.030s, showing promise like in a previous analysis <https://addshore.com/2021/02/testing-wdqs-blazegraph-data-load-performance/#queue-capacity>. This is likely an area for further examination against the 2018 gaming desktop.

Soon to follow is some discussion of what was observed for NVMe versus SSD targets on that 2018 gaming desktop.
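For anyone wanting to compare the runs above on something other than wall-clock time, here is a minimal Python sketch that turns the `time`-style durations into seconds, estimates what one Disk ➡️ Disk run would cost on each EC2 instance, and computes the relative gain from the `bufferCapacity` change on the desktop. The figures are copied from the tables in this comment. The function names are made up for illustration, and the cost estimate assumes billing scales linearly with run time at the listed hourly price (it ignores instance setup time, storage, and any billing minimums), so treat the output as a rough guide only.

```python
import re


def parse_duration(text: str) -> float:
    """Convert a `time`-style duration such as '26m46.651s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cost_per_run(duration: str, usd_per_hour: float) -> float:
    """Estimated USD for one run, assuming linear per-second billing."""
    return parse_duration(duration) / 3600 * usd_per_hour


def percent_faster(before: str, after: str) -> float:
    """Reduction in run time from `before` to `after`, as a percentage."""
    b, a = parse_duration(before), parse_duration(after)
    return (b - a) / b * 100


# (instance type, Disk -> Disk time, Cost Per Hour), from the EC2 table.
EC2_RUNS = [
    ("m6id.16xlarge", "26m46.651s", 3.7968),
    ("m5zn.metal", "22m5.442s", 3.9641),
    ("c5d.12xlarge", "21m40.537s", 2.304),
    ("c7gd.16xlarge", "19m18.825s", 2.903),
]

for name, duration, price in EC2_RUNS:
    print(f"{name}: ~${cost_per_run(duration, price):.2f} per run")

# Desktop, NVMe -> same NVMe: bufferCapacity 100000 vs 1000000.
gain = percent_faster("18m3.798s", "13m28.076s")
print(f"bufferCapacity 100000 -> 1000000: {gain:.1f}% less time")
```

Under those assumptions the fastest EC2 instance is not necessarily the cheapest per run, which may matter if the load is repeated often.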
TASK DETAIL https://phabricator.wikimedia.org/T359062