In other words, you are hitting an internal interface of a VR? For a test, I would replace bind9 with a default DNSmasq setup, specifying its upstream (root) DNS server to be the VR IP, i.e. client --> DNSmasq (internal server) --> DNSmasq (VR). See if that works, so you can possibly draw some conclusions.
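A minimal sketch of that test setup, assuming the internal DNSmasq instance reads its options from /etc/dnsmasq.conf (the usual default path, but an assumption here) and that the VR is the 10.102.199.148 address mentioned in the thread. With no-resolv and only the domain-specific server line, names outside cloud.mydomain.com will not be forwarded anywhere, which is fine for an isolated test:

```
# Do not read upstream servers from /etc/resolv.conf
no-resolv
# Forward queries for the guest network domain to the VR
server=/cloud.mydomain.com/10.102.199.148
# Alternative: use the VR as the only upstream for all queries
#server=10.102.199.148
# Log each query so the forwarding path is visible while testing
log-queries
```

Running the same 'host eric-gui.cloud.mydomain.com <internal-dnsmasq-ip>' check against this instance (the address is a placeholder) should show whether a second forwarding hop to the VR works at all, independent of bind9.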
Andrija

On Fri, 24 May 2019 at 21:12, Eric Lee Green <eric.lee.gr...@gmail.com> wrote:

> On 5/24/19 10:16 AM, Andrija Panic wrote:
> > Eric,
> >
> > your BIND9 servers is on a "Public" network (trying to talk to the Public
> > IP of the VR during forwarding DNS requests) or a VM inside an Isolated
> > network behind VR)?
>
> It's on *a* public network, but not *the* public network. I don't have
> any Isolated networks, though I have them enabled from VLAN 1000-2000. I
> am using "Advanced Networking" but for my own purposes -- I have one
> "Shared" guest network at VLAN 102, and then several isolated specialty
> physical "Shared" networks like "Security Cameras" (VLAN 103) and
> "Storage Network" (VLAN 200) that are attached to virtual machines that
> need access to those things. The "Shared" guest network (VLAN 102) is
> routed by my layer 3 switch with the rest of my network's public VLANs
> so if I am on e.g. 10.31.1.2 (VLAN 31), which is similarly a routed
> public VLAN (but not one that Cloudstack is allowed to directly talk to
> or manage, it has to go thru the layer 3 switch) or 10.120.0.5 (VLAN
> 120), I can talk directly to 10.102.199.148 since all are routed into
> the common fabric via the layer 3 switch. I only care about the VM's
> that are VLAN 102, which are supposed to be publicly available to my
> users, thus why my quicky script hack to generate a zone file out of the
> database does
>
> select v.name, n.ip4_address from vm_instance as v, nics as n where
> v.removed is null and n.instance_id = v.id and n.ip4_address like
> '10.102.%' and type = 'User' order by n.ip4_address;
>
> in order to select out the name and IP address of virtual machines with
> NIC's on that VLAN. (Which, if it's a different list from the last list
> that was queried, then gets massaged into a zone file for
> name.cloud.mydomain.com by the script, which then scp's to my master
> domain server and does a reload to reload the zone file from the new
> version).
>
> Both of my BIND9 servers can talk directly to 10.102.199.148 (the IP of
> the virtual router for the 10.102.xxx.xxx network, VLAN 102) if I use
> 'host' to directly query 10.102.199.148 for an API address like, say,
> 'api-default1.cloud.mydomain.com' but when I try to put a forward domain
> there, nope. This was working, but now is not. I suspect it's got to do
> with the recent changes in DNS software, both bind9 and dnsmasq, to
> deal with multiple attacks on the domain name system, but I'm having
> trouble figuring out why, or what my solution should be.
>
> Note that it's quite reasonable / feasible / viable to put a DNS server
> actually inside the Cloudstack constellation if that's necessary and
> then do a two-stage hop if necessary. I'm just trying to figure out the
> "right" way to do this right now so I can retire my hack script.
>
> > On Fri, 24 May 2019 at 02:15, Eric Lee Green <eric.lee.gr...@gmail.com>
> > wrote:
> >
> >> I had this working under 4.9. All I did was, on my main BIND9 servers,
> >> point a forward zone at 'cloud.<mydomain>.com' to the virtual router
> >> associated with all VM's that were publicly available. I could then
> >> resolve all foo.cloud.<mydomain>.com names on my global network.
> >>
> >> Somehow, though, this quit working after I updated to 4.11. I'm not
> >> quite sure why.
> >>
> >> The 'Guest Network' is defined with domain 'cloud.mydomain.com'.
> >>
> >> Okay, so my router for the 'Guest Network' advanced networking is
> >> located at 10.102.199.148.
> >> In my master BIND9 DNS server at 10.31.1.2 I have this:
> >> zone "cloud.mydomain.com" IN {
> >>     type forward;
> >>     forward only;
> >>     forwarders {
> >>         10.102.199.148;
> >>     };
> >> };
> >>
> >> If I send a NAMED request directly to the virtual router while logged
> >> into my main name server, it works:
> >>
> >> [root@ypbind ~]# host eric-gui.cloud.mydomain.com 10.102.199.148
> >> Using domain server:
> >> Name: 10.102.199.148
> >> Address: 10.102.199.148#53
> >> Aliases:
> >>
> >> eric-gui.cloud.mydomain.com has address 10.102.199.234
> >>
> >> If I try to use the name server however, it doesn't work:
> >>
> >> [root@ypbind logs]# host eric-gui.cloud.mydomain.com
> >> Host eric-gui.cloud.viakoo.com not found: 3(NXDOMAIN)
> >>
> >> I'm baffled, because this *was* working.
> >>
> >> So I disabled any dnssec in the {options} on bind9 and gave all
> >> permissions to see if that was the problem (note that this is internal
> >> to my infrastructure, so DNS amplification isn't an issue):
> >>
> >> dnssec-enable no;
> >> dnssec-validation no;
> >> dnssec-lookaside auto;
> >> recursion yes;
> >> allow-recursion { any; };
> >> allow-query { any; };
> >> allow-query-cache { any; };
> >>
> >> Still nope. Still baffled.
> >>
> >> Anybody got any clues as to what I may be doing wrong? I'm thinking it
> >> has to be on the BIND9 side, because I can resolve the host name if I
> >> talk to the virtual router directly, but for some reason it's not
> >> allowing me to get any records from the router.
> >>
> >> Right now I've temporarily worked around this with a script that
> >> directly queries the MySQL database every few minutes and generates a
> >> revised zone file on my master DNS server when the list of virtual
> >> machines queried out of the database changes, but that's clearly not the
> >> right way to do it. The question is, what *is* the right way to do it?
> >>
> >> -Eric
> >>
> >

--
Andrija Panić