Re: [Gluster-devel] [erik.jacob...@hpe.com: [Gluster-users] gluster forcing IPV6 on our IPV4 servers, glusterd fails (was gluster update question regarding new DNS resolution requirement)]
On Tue, 21 Sep 2021, Yaniv Kaul wrote:

> However, I do feel 'transport.address-family' should have been set to
> IPv4 to force IPv4 regardless. Then the question is why
> socket_client_get_remote_sockaddr() is calling
> client_fill_address_family() to get the address family, but the next
> step in the flow, a call to af_inet_client_get_remote_sockaddr() -
> which has this information - ignores it (as we see above).
>
> Y.

This isn't the only problem. The code in the gluster servers has pretty strong assumptions built in that an IP must resolve to a hostname that resolves back to that IP. Presumably this is a security check, presumably from the original (pre-SSL) security model.

This is a fragile assumption, both security-wise and operationally. With every additional network and/or address, this constraint gets harder to maintain and more fragile. And of course, dual-stack networks will have multiple addresses, and you don't necessarily want to list an IPv6 address in the main service hostname. You may have IPv6-only hosts and dual-stack hosts. You may use IPv6 privacy addresses, which rotate. The assumption may work initially, but then fail later down the road. It makes gluster very flaky operationally with IPv6.

When you're using SSL for security, which provides a _secure_ notion of identity (unlike IP and hostname), independent of addresses, this fragile assumption feels antiquated.

regards,
--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune: At the source of every error which is blamed on the computer you will find at least two human errors, including the error of blaming it on the computer.

---
Community Meeting Calendar:
Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel
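For reference, the 'transport.address-family' setting Yaniv mentions is normally pinned in the glusterd volfile. A hedged sketch of what that looks like - the path and surrounding option lines are as on a typical install; verify against the volfile shipped with your version:

```text
# /etc/glusterfs/glusterd.vol (sketch - check option names against your release)
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket
    # Force IPv4 so name resolution never prefers AF_INET6:
    option transport.address-family inet
end-volume
```

glusterd must be restarted for a volfile change to take effect.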
Re: [Gluster-devel] [erik.jacob...@hpe.com: [Gluster-users] gluster forcing IPV6 on our IPV4 servers, glusterd fails (was gluster update question regarding new DNS resolution requirement)]
body: [2021-09-20 15:50:41.731543 +] E [name.c:265:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host 172.23.0.16

As you can see, we have deliberately defined everything using IPs, but in 9.3 this method appears to fail. Are there any suggestions, short of putting real host names in the peer files?

FYI, this supercomputer will be using gluster for part of its system management. It is how we deploy the image objects (squashfs images) hosted on NFS today, served by gluster leader nodes; we also store system logs, console logs, and other data there.

https://www.olcf.ornl.gov/frontier/

Erik

Community Meeting Calendar:
Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
gluster-us...@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune: About the only thing on a farm that has an easy time is the dog.
[Gluster-devel] docs: certmonger for gluster SSL certs
Hi,

I found certmonger made life easier for managing certs for gluster with a CA, so I wrote something up for gluster-docs. Would this be useful to add to the docs for others?

Pull req here: https://github.com/gluster/glusterdocs/pull/694

regards,
--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune: Do you know Montana?
[Gluster-devel] On Identity, UUIDs, IPs, SSL, authentication/verification and fragility
Hi,

I'm having issues with Gluster with Peer Rejected states. Adding new peers to a cluster pool tends to be pot luck, and if I have a host that others don't like, there doesn't seem to be any way to fix it (from within gluster).

Debugging, my issues seem to stem from Gluster's desire to (implicitly, via various lookups for the peerinfo) validate the IP of a connection against the reverse DNS lookup for the hostname, against the forward lookup for that hostname, and against known hostnames. I guess this stems from a history of IP-as-authentication-handle in Gluster, but it's a bit frustrating as I'm using SSL. All the packets were already authenticated - IP/DNS validation is not really adding anything for me security-wise (indeed, I don't want any assumption of security by IP).

Part of the background here is that I have a network where:

- Hosts may have a number of IPs:
  - site-local
  - public
  - different interfaces on different subnets (e.g. using VLANs through L2 switches to give servers a presence on multiple subnets, while avoiding having to bounce packets in and out of a router; very common)
  - service addresses
  - v4
  - v6:
    - ephemeral privacy addresses
    - EUI-64
    - stable
- A DNS hostname may list multiple IPs:
  - v4 and v6
  - or v6 only
- The PTR for an IP may not point at the expected hostname
- There may be no PTR for an IP
- An IP may not be listed in any A/AAAA record

E.g., ephemeral and other addresses need not have DNS records. You generally wouldn't want them to have DNS records either.

What I would like to do is make Gluster just accept the SSL identity, use that for the UUID binding, and not do any host/IP validation (other than checking for self). One way would be to just use the SSL peer identity as the hostname, and store that in the peerinfo, instead of the IP or DNS/name-service host. Another would be to bolt on the SSL identity alongside in the peerinfo, and check both.
My inclination would be to just add an option to use the SSL identity as the canonical hostname, and ignore IPs/name-service hostnames - I just don't see any value in validating the latter when you have strong TLS authentication in place.

Would such a patch be welcome? Because I really need some kind of fix for the current behaviour of gluster - it just doesn't work on more complex network setups, especially with v6.

Note: the client mount code seems to have similar issues. I will dig into that (it has a workaround), but I need to fix the server side first for myself.

regards,
--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune: There is no fool to the old fool. -- John Heywood
Re: [Gluster-devel] mem-pool.c magic trailer causes SIGBUS, fix or remove?
Hi Amar,

A fix is what I did first, but someone was wondering about the additional cost of the alignment calculation in the memory-allocation path. (Though the unaligned write of the trailer might cost as many cycles as a round-up calculation on platforms that handle unaligned access in microcode.)

The pull request with the fix is at: https://github.com/gluster/glusterfs/pull/2669

With that, glusterd can now run on SPARC64 without crashing in one of the first few memory allocations.

Though I still cannot get glusterd on SPARC64 to sync up with the pool, whether with the trailer fixed or the trailer removed. I end up with cksum errors on /var/lib/glusterfs/ files - starting from a completely clean /var/lib/glusterfs on the SPARC64 box. The two big differences between the SPARC64 host and the other hosts that are happily synced are:

- distro (the SPARC64 box is Debian; the others are Fedora)
- endianness

Wondering if either of those could be a factor?

Thanks,
Paul

On Thu, 29 Jul 2021, Amar Tumballi wrote:

> Thanks for the initiative Paul. Let me answer the major question from the thread.
>
>> Remove the trailer? Or fix it?
>
> The ideal one is to fix it, as we do use mem-pool info to identify leaks etc. through statedumps of the process. Removal can be an option on SPARC to start with, if fixing is time-consuming. I recommend removing the trailer within an #ifdef code block, so it may continue to work in places where it's already working.

regards,
--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune: Honesty pays, but it doesn't seem to pay enough to suit some people. -- F. M. Hubbard
[Gluster-devel] mem-pool.c magic trailer causes SIGBUS, fix or remove?
Hi,

The magic trailer added at the end of memory allocations in mem-pool.c doesn't have its alignment ensured. This can lead to SIGBUS and abnormal exits on some platforms (e.g., SPARC64).

I have a patch to round up the header and allocation size to get the right alignment for the trailer (I think). It does complicate the memory allocation a little further.

The question is whether it would just be simpler to remove the trailer and simplify the code. A number of external tools exist to debug memory allocations, from runtime-loadable debug malloc libraries, to compiler features (ASAN, etc.), to valgrind. Glusterfs could just rely on those, and so simplify and (perhaps) speed up its own general-case memory code.

Remove the trailer? Or fix it?

regards,
--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune: Many people are secretly interested in life.