Re: [Gluster-devel] [erik.jacob...@hpe.com: [Gluster-users] gluster forcing IPV6 on our IPV4 servers, glusterd fails (was gluster update question regarding new DNS resolution requirement)]

2021-09-21 Thread Paul Jakma

On Tue, 21 Sep 2021, Yaniv Kaul wrote:

However, I do feel 'transport.address-family' should have been set to 
IPv4, to force IPv4 regardless. Then the question is why 
socket_client_get_remote_sockaddr() calls 
client_fill_address_family() to get the address family, when in the 
next step of the flow, af_inet_client_get_remote_sockaddr() - which 
has this information - ignores it (as we see above). Y.
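
(For reference, pinning the address family to IPv4 would look roughly 
like the below - the option name is real, but treat the exact placement 
as illustrative and check your version's docs:)

    # in /etc/glusterfs/glusterd.vol:
    option transport.address-family inet

    # and/or per volume:
    gluster volume set <VOLNAME> transport.address-family inet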


This isn't the only problem.

The code in the gluster servers has a pretty strong built-in assumption 
that an IP must reverse-resolve to a hostname that in turn 
forward-resolves to that IP. Presumably this is a security check, left 
over from the original (pre-SSL) security model.


This is a fragile assumption, both security-wise and operationally. 
With every additional network and/or address, the constraint becomes 
harder to maintain and more fragile still.
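
(To make the constraint concrete: it amounts to a forward-confirmed 
reverse DNS round-trip, roughly like this sketch - my illustration of 
the check, not gluster's actual code:)

    #include <stdbool.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <netdb.h>

    /* Sketch of the assumed check: the connecting address must
     * reverse-resolve to a name that forward-resolves back to that
     * same address. Illustrative only, not gluster's code. */
    static bool fcrdns_ok(const struct sockaddr *sa, socklen_t salen)
    {
        char host[NI_MAXHOST], numeric[NI_MAXHOST];
        bool ok = false;

        /* reverse lookup: address -> name */
        if (getnameinfo(sa, salen, host, sizeof(host), NULL, 0,
                        NI_NAMEREQD))
            return false;
        /* the original address as a string, for comparison */
        if (getnameinfo(sa, salen, numeric, sizeof(numeric), NULL, 0,
                        NI_NUMERICHOST))
            return false;

        /* forward lookup: name -> addresses; require a match */
        struct addrinfo *res = NULL, *ai;
        if (getaddrinfo(host, NULL, NULL, &res))
            return false;
        for (ai = res; ai; ai = ai->ai_next) {
            char buf[NI_MAXHOST];
            if (getnameinfo(ai->ai_addr, ai->ai_addrlen, buf,
                            sizeof(buf), NULL, 0,
                            NI_NUMERICHOST) == 0 &&
                strcmp(buf, numeric) == 0) {
                ok = true;
                break;
            }
        }
        freeaddrinfo(res);
        /* an extra address, a missing or mismatched PTR - all fail */
        return ok;
    }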


And of course, dual-stack networks will have multiple addresses. You 
don't necessarily want to list an IPv6 address in the main service 
hostname. You may have IPv6-only hosts and dual-stack hosts. You may 
use IPv6 privacy addresses, which rotate.


This assumption may work initially, but then fail later down the road. 
It makes gluster very flaky operationally with IPv6.


When you're using SSL for security - which provides a _secure_ notion 
of identity (unlike IP and hostname), independent of addresses - this 
fragile assumption feels antiquated.


regards,
--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
At the source of every error which is blamed on the computer you will find
at least two human errors, including the error of blaming it on the computer.
---

Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] [erik.jacob...@hpe.com: [Gluster-users] gluster forcing IPV6 on our IPV4 servers, glusterd fails (was gluster update question regarding new DNS resolution requirement)]

2021-09-21 Thread Paul Jakma

[2021-09-20 15:50:41.731543 +] E 
[name.c:265:af_inet_client_get_remote_sockaddr] 0-management: DNS 
resolution failed on host 172.23.0.16



As you can see, we have deliberately defined everything using IPs, but 
in 9.3 this method appears to fail. Are there any suggestions short of 
putting real host names in the peer files?
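
(For context, the peer definitions in question look roughly like this - 
illustrative contents; the real files live under 
/var/lib/glusterd/peers/:)

    uuid=<peer-uuid>
    state=3
    hostname1=172.23.0.16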



FYI

This supercomputer will be using gluster for part of its system 
management. It is how we deploy the Image Objects (squashfs images), 
hosted on NFS today and served by gluster leader nodes, and it is also 
where we store system logs, console logs, and other data.

https://www.olcf.ornl.gov/frontier/


Erik




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
gluster-us...@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users





--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
About the only thing on a farm that has an easy time is the dog.



[Gluster-devel] docs: certmonger for gluster SSL certs

2021-08-08 Thread Paul Jakma

Hi,

I found certmonger made life easier for managing gluster's certificates 
with a CA, and wrote something up for gluster-docs. Would this be 
useful to add to the docs for others? Pull request here:


https://github.com/gluster/glusterdocs/pull/694
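
(The gist of it: a getcert request pointed at the cert/key paths 
gluster's SSL support expects - roughly the below; CA nickname and 
paths are illustrative, see the pull request for the full write-up:)

    getcert request -c MyCA \
        -k /etc/ssl/glusterfs.key \
        -f /etc/ssl/glusterfs.pem \
        -N "CN=$(hostname -f)"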

regards,
--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
Do you know Montana?



[Gluster-devel] On Identity, UUIDs, IPs, SSL, authentication/verification and fragility

2021-08-07 Thread Paul Jakma

Hi,

I'm having issues with Gluster and Peer Rejected states. Adding new 
peers to a cluster pool tends to be pot luck, and if I have a host that 
the others don't like, there doesn't seem to be any way to fix it (from 
within gluster).


Debugging this, my issues seem to stem from Gluster's desire to 
(implicitly, via various lookups on the peerinfo) validate the IP of a 
connection against the reverse DNS lookup for the hostname, against the 
forward lookup for that hostname, and against known hostnames.


I guess this stems from a history of IP-as-authentication-handle in 
Gluster, but it's a bit frustrating as I'm using SSL. All the packets 
are already authenticated - IP/DNS validation adds nothing for me 
security-wise (indeed, I don't want any assumption of security by 
IP).


Part of the background here is that I have a network where:

- Hosts may have a number of IPs:
  - site-local
  - public
  - on different interfaces on different subnets (e.g. using VLANs
    through L2 switches to give servers presence on multiple subnets,
    while avoiding having to bounce packets in and out of a router;
    very common)
  - service addresses
  - v4
  - v6
    - ephemeral privacy addresses
    - EUI-64
    - stable

- A DNS hostname may list multiple IPs
  - v4 and v6
  - or v6 only
- The hostname in an IP's PTR record may not resolve back to that IP
- There may be no PTR for an IP
- An IP may not be listed in any A/AAAA record

E.g., ephemeral and other addresses need not have DNS records. You 
generally wouldn't want them to have DNS records either.


What I would like to do is make Gluster just accept the SSL identity, 
use that for the UUID binding, and not do any host/IP validation (other 
than checking for self).


One way would be to just use the SSL peer identity as the hostname, and 
store that in the peerinfo, instead of IP or DNS/name-service host. 
Another would be to bolt on SSL identity alongside in the peerinfo, and 
check both.


My inclination would be to just add an option to use SSL identity as the 
canonical hostname, and ignore IPs/name-service hostnames - I just don't 
see any value in validating the latter when you have strong TLS 
authentication in place.
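
(Mechanically, "the SSL identity" here is just the peer certificate's 
subject CN once the TLS handshake has verified it - something like this 
sketch using plain OpenSSL calls, not gluster's rpc-transport code:)

    #include <openssl/ssl.h>
    #include <openssl/x509.h>

    /* Sketch: extract the peer identity (certificate CN) from a
     * completed, verified TLS handshake. This is the strongly
     * authenticated name that could be bound to the peer UUID,
     * instead of an IP/DNS-derived hostname. */
    static int peer_identity(SSL *ssl, char *buf, int buflen)
    {
        X509 *cert = SSL_get_peer_certificate(ssl);
        if (!cert)
            return -1; /* no certificate presented */
        int ret = X509_NAME_get_text_by_NID(
            X509_get_subject_name(cert), NID_commonName, buf, buflen);
        X509_free(cert);
        return (ret < 0) ? -1 : 0;
    }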


Would such a patch be welcome?

Because I really need some kind of fix for the current behaviour of 
gluster - it just doesn't work on more complex network setups, 
especially with v6.


Note: the client mount code seems to have similar issues. I will dig 
into that (it has a workaround), but I need to fix the server side 
first for myself.


regards,
--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
There is no fool to the old fool.
-- John Heywood



Re: [Gluster-devel] mem-pool.c magic trailer causes SIGBUS, fix or remove?

2021-07-29 Thread Paul Jakma

Hi Amar,

So a fix is what I did first, but someone was wondering about the 
additional cost of the alignment calculation in the memory allocation 
path. (Though the unaligned write of the trailer might cost as many 
cycles as a round-up calculation does, on platforms that handle 
unaligned access in microcode.)


The pull request with the fix is at: 
https://github.com/gluster/glusterfs/pull/2669


With that, glusterd can now run on SPARC64 - without crashing in one of 
the first few memory allocations.


Though I still cannot get glusterd on SPARC64 to sync up with the 
pool, whether with the trailer fixed or with it removed. I end up 
with cksum errors on /var/lib/glusterfs/ files - even starting from a 
completely clean /var/lib/glusterfs on the SPARC64 box.


The two big differences between the SPARC64 host and the other hosts 
that are happily synced are:

- distro (the SPARC64 box is Debian; the others are Fedora)
- endianness

Wondering if either of those could be a factor?
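
(Endianness would bite if any of the cksum data is produced from, or 
written as, raw in-memory words - a generic illustration, not gluster's 
actual cksum code:)

    #include <stdint.h>
    #include <stdio.h>
    #include <arpa/inet.h>

    /* Generic illustration, not gluster's cksum code: dumping a
     * checksum as a raw word gives different bytes per endianness;
     * serialising via htonl() would not. */
    int main(void)
    {
        uint32_t cksum = 0x12345678;
        /* raw: 78 56 34 12 on x86, 12 34 56 78 on SPARC64 */
        fwrite(&cksum, sizeof(cksum), 1, stdout);
        /* portable: 12 34 56 78 everywhere */
        uint32_t wire = htonl(cksum);
        fwrite(&wire, sizeof(wire), 1, stdout);
        return 0;
    }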

Thanks,

Paul

On Thu, 29 Jul 2021, Amar Tumballi wrote:


Thanks for the initiative, Paul. Let me answer the major question from 
the thread.


Remove the trailer? Or fix it?


The ideal is to fix it, as we do use the mem-pool info to identify 
leaks etc. through statedumps of the process. Removal can be an option 
on SPARC to start with, if fixing is time-consuming. I recommend 
removing the trailer within an #ifdef code block, so it can continue 
to work in places where it's already working.
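
(A minimal sketch of that suggestion - the guard macro name, magic 
value, and helper are illustrative, not gluster's actual identifiers:)

    #include <stddef.h>
    #include <stdint.h>

    #define TRAILER_MAGIC 0xBAADF00Du /* illustrative value */

    /* Illustrative: keep the trailer by default, let strict-alignment
     * platforms (e.g. SPARC64) opt out at build time via a
     * hypothetical DISABLE_MEMPOOL_TRAILER switch. */
    static void write_trailer(char *ptr, size_t size)
    {
    #ifndef DISABLE_MEMPOOL_TRAILER
        /* write the magic just past the user region */
        *(uint32_t *)(ptr + size) = TRAILER_MAGIC;
    #else
        (void)ptr;
        (void)size;
    #endif
    }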


regards,
--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
Honesty pays, but it doesn't seem to pay enough to suit some people.
-- F. M. Hubbard



[Gluster-devel] mem-pool.c magic trailer causes SIGBUS, fix or remove?

2021-07-27 Thread Paul Jakma

Hi,

The magic trailer added at the end of memory allocations in mem-pool.c 
doesn't have its alignment ensured. This can lead to SIGBUS and 
abnormal exits on some platforms (e.g., SPARC64).


I have a patch to round up the header and allocation size to get the 
right alignment for the trailer (I think).
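
(A minimal sketch of the approach - my illustration of the idea, not 
the actual patch:)

    #include <stdint.h>
    #include <stdlib.h>

    /* Round x up to the next multiple of a (a power of two). */
    #define ROUND_UP(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

    #define TRAILER_MAGIC 0xBAADF00Du /* illustrative value */

    /* Illustrative allocator: pad the user region so the uint32_t
     * trailer lands on a naturally aligned boundary; on SPARC64 an
     * unaligned store here raises SIGBUS. */
    static void *alloc_with_trailer(size_t size)
    {
        size_t padded = ROUND_UP(size, sizeof(uint32_t));
        char *p = malloc(padded + sizeof(uint32_t));
        if (!p)
            return NULL;
        /* aligned by construction: malloc is max-aligned, and padded
         * is a multiple of sizeof(uint32_t) */
        *(uint32_t *)(p + padded) = TRAILER_MAGIC;
        return p;
    }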


It does complicate the memory allocation a little further.

The question is whether it would just be simpler to remove the trailer 
and simplify the code.


There are a number of external tools that exist to debug memory 
allocations, from runtime-loadable debug malloc libraries, to compiler 
features (ASan, etc.), to valgrind.


Glusterfs could just rely on those, and so simplify and (perhaps) 
speed up its own general-case memory code.
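
(E.g., something along these lines - illustrative invocations, exact 
flags may vary:)

    # run glusterd in the foreground under valgrind
    valgrind --leak-check=full glusterd -N

    # or build with AddressSanitizer
    CFLAGS='-fsanitize=address -g' ./configure && make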


Remove the trailer? Or fix it?

regards,
--
Paul Jakma | p...@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
Many people are secretly interested in life.