Dear All,

Happy new year to everyone!

My goal here at the Paul Scherrer Institut is to merge two different GPFS building blocks. In particular, they are neither the same technology nor even the same brand:

- an IBM ESS-3500 (an NVMe/performance storage system), consisting of a Power9 confluent node, two AMD canisters, and 12 NVMe drives;
- a Lenovo G242 "hybrid", consisting of 4 HDD enclosures, 2 SSD enclosures, 1 Intel support node, and 2 Intel storage nodes.

The final configuration I would expect is a single building block with 4 IO nodes and 3 declustered arrays: one for the HDDs, one for the SSDs, and one for the NVMe drives (the last to be used as a cache pool).
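To make the target concrete, the end state I have in mind would be built roughly as sketched below, once the merged cluster behaves (all names here are placeholders of mine: recovery groups, vdisk sets, pools, file system; the DA names would be whatever mmvdisk actually reports, and the RAID codes, block sizes, pool and usage assignments are illustrative only):

    # ESS 3500: NVMe drives shared between the two canisters,
    # so (as far as I understand) a single shared recovery group
    mmvdisk recoverygroup create --recovery-group rg_ess --node-class ess
    # Lenovo G242: the classic pair of recovery groups
    mmvdisk recoverygroup create --recovery-group rg_dss1,rg_dss2 --node-class dss

    # one vdisk set per media type / declustered array
    mmvdisk vdiskset define --vdisk-set vs_hdd --recovery-group rg_dss1,rg_dss2 \
            --code 8+2p --block-size 8m --set-size 100% --declustered-array DA1
    mmvdisk vdiskset define --vdisk-set vs_ssd --recovery-group rg_dss1,rg_dss2 \
            --code 8+2p --block-size 4m --set-size 100% --declustered-array DA2 \
            --nsd-usage dataOnly --storage-pool ssd
    mmvdisk vdiskset define --vdisk-set vs_nvme --recovery-group rg_ess \
            --code 8+2p --block-size 4m --set-size 100% \
            --nsd-usage dataOnly --storage-pool nvme

    mmvdisk vdiskset create --vdisk-set vs_hdd,vs_ssd,vs_nvme
    mmvdisk filesystem create --file-system fs1 --vdisk-set vs_hdd,vs_ssd,vs_nvme

The NVMe pool acting as a "cache" would then, as I understand it, be a matter of a placement policy putting new files there and mmapplypolicy migrating them down to the HDD pool.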
First of all, I would like to know if anyone has already tried this solution successfully. Then, below is a description of what I have done. I will preface by saying that I was able to configure the two storage clusters separately without any problem; therefore, I would exclude any inherent problem in either building block (both were installed from scratch). But when I try to have a single cluster, with different node classes, I have problems.

The steps I followed (based on documentation I found on the IBM pages, https://www.ibm.com/docs/en/ess-p8/5.3.1?topic=command-outline-mmvdisk-use-case) are as follows:

1. access one of the two building blocks (which already has a storage cluster configured, with no recovery groups defined);
2. run "mmaddnode -N <the_two_IO_nodes_to_add_from_the_other_BB>";
3. run "mmchlicense ...";
4. run "mmvdisk nodeclass create ..." to isolate the two "new" IO nodes in a dedicated node class, for the purpose of differentiating configuration parameters, connected drive topology, and eventually recovery groups;
5. perform topology discovery with "mmvdisk server list --node-class ess --disk-topology".

The resulting cluster and node classes are:

     Node  Daemon node name  IP address      Admin node name  Designation
    ----------------------------------------------------------------------
       1   sfdssio1.psi.ch   129.129.241.67  sfdssio1.psi.ch  quorum-manager
       2   sfdssio2.psi.ch   129.129.241.68  sfdssio2.psi.ch  quorum-manager
       3   sfessio1.psi.ch   129.129.241.27  sfessio1.psi.ch  quorum-manager
       4   sfessio2.psi.ch   129.129.241.28  sfessio2.psi.ch  quorum-manager

    Node Class Name  Members
    ---------------  -------------------------------
    ess              sfessio1.psi.ch,sfessio2.psi.ch
    dss              sfdssio1.psi.ch,sfdssio2.psi.ch

The "mmaddnode" operation was performed while logged into sfdssio1 (which belongs to the Lenovo G242). Then:

    [root@sfdssio1 ~]# mmvdisk server list --node-class ess --disk-topology

     node                          needs      matching
    number  server                 attention  metric    disk topology
    ------  ---------------------  ---------  --------  -------------------------
         3  sfessio1.psi.ch        yes        -         unmatched server topology
         4  sfessio2.psi.ch        yes        -         unmatched server topology

    mmvdisk: To see what needs attention, use the command:
    mmvdisk:     mmvdisk server list -N sfessio1.psi.ch --disk-topology -L
    mmvdisk:     mmvdisk server list -N sfessio2.psi.ch --disk-topology -L

    [root@sfdssio1 ~]# mmvdisk server list -N sfessio1.psi.ch --disk-topology -L

    Unable to find a matching topology specification for topology file
    '/var/mmfs/tmp/cmdTmpDir.mmvdisk.1468913/pdisk-topology.sfessio1.psi.ch'.

    Topology component identification is using these CST stanza files:
      /usr/lpp/mmfs/data/compSpec-1304.stanza
      /usr/lpp/mmfs/data/compSpec-1400.stanza
      /usr/lpp/mmfs/data/cst/compSpec-Lenovo.stanza
      /usr/lpp/mmfs/data/cst/compSpec-topology.stanza

    Server component:
      serverType 'ESS3500-5141-FN2' serverArch 'x86_64' serverName 'sfessio1.psi.ch'
    Enclosure components: 1 found connected to HBAs
      Enclosure component: serialNumber '78E4395' enclosureClass 'unknown'
    HBA components: none found connected to enclosures
    Cabling: enclosure '78E4395' controller '' cabled to HBA slot 'UNKNOWN' port 'unknown'
    Disks: 12 SSDs 0 HDDs
    NVRAM: 0 devices/partitions

    Unable to match these components to a serverTopology specification.

    mmvdisk: Command failed. Examine previous error messages to determine cause.
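In case it helps the analysis, I believe the raw topology that mmvdisk is complaining about can also be collected and summarized by hand on one of the unmatched nodes, along these lines (paths from my installation; I am assuming topsummary is still shipped under samples/vdisk):

    # run directly on the unmatched ESS 3500 canister
    /usr/lpp/mmfs/bin/mmgetpdisktopology > /tmp/pdisk-topology.sfessio1
    /usr/lpp/mmfs/samples/vdisk/topsummary /tmp/pdisk-topology.sfessio1

topsummary then reports the enclosures, HBAs, and disks it recognizes, which should at least show what the CST stanza matching sees on that canister.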
If I try the symmetric operation (logging into an IO node of the IBM ESS3500, adding the Lenovo nodes, and trying to discover their drive topology), I get the same error; but, of course, the topology involved this time is that of the Lenovo hardware.

Now, I suspect there is a (hidden?) step I am supposed to know about, but unfortunately I don't (this is my first experience merging different, heterogeneous building blocks). So I would welcome any suggestions, including a pointer to better documentation (if any) covering this particular use case.

I hope the description of the context is clear enough; if it is not, I apologize, and please just ask for any further details needed to understand my environment.

Thank you very much,

Alvise Dorigo
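P.S. For clarity, the consolidated sequence I ran from sfdssio1 was essentially the following (the exact mmchlicense invocation is from memory; I presume it was the standard server-license one):

    # on sfdssio1 (Lenovo G242 side, existing storage cluster)
    mmaddnode -N sfessio1.psi.ch,sfessio2.psi.ch
    mmchlicense server --accept -N sfessio1.psi.ch,sfessio2.psi.ch    # from memory
    mmvdisk nodeclass create --node-class ess -N sfessio1.psi.ch,sfessio2.psi.ch
    mmvdisk server list --node-class ess --disk-topology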