Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

Mark David Dumlao Tue, 18 Nov 2008 17:01:36 -0800

On Mon, Nov 17, 2008 at 11:06 AM, Steve Cayford <[EMAIL PROTECTED]> wrote:

> Mark David Dumlao wrote:
> > On Sat, Nov 15, 2008 at 7:47 AM, JF Straeten <[EMAIL PROTECTED]> wrote:
> >
> >     Perhaps could you give gPXE a try, instead of etherboot, and see if
> >     it change anything ?
> >
> > Will do, but I'm just annoyed by the individual ROM generation from
> > ROM-o-matic. My refurbished units all have different LAN cards and I was
> > hoping for a universal diskette bootrom.
> >
> > Any help? :)
>
> I think if you download the gPXE source you can build yourself a
> universal loader. I forget the details, but I think it's possible.

Thanks Steve, I didn't know gPXE was universal because I only downloaded
images from rom-o-matic.

I was just about to pat myself on the back after gPXE was able to boot the
whole lab in about 10-15 minutes. However, after I rebooted and tried again
later, the problem came back: clients were again unable to get addresses.
It's apparently intermittent, and I think it is closely related to either a
huge system slowdown or DNS, though I don't know why DNS would be involved
at all.

Previously I configured my DNS under bind. However, I noticed that after
booting 4 or 5 thin clients, bind9 would magically stop replying to queries.
At this point, sudo would become very slow and few of my clients would be
able to boot or even get IP addresses from the server.

Neither dhcp3-server nor sudo should be making DNS lookups. My hosts file
looks like so:

===
127.0.0.1    localhost
127.0.1.1    mars.schoolsite.local    mars

192.168.1.8    mars.schoolsite.local    mars
192.168.11.254    mars.schoolsite.local    mars

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
===
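One way to see whether name resolution on the server is the slow part is to time a forward and a reverse lookup through the system resolver. The following is only a rough sketch: the function names `timed` and `check_host` are made up for illustration, and it defaults to `localhost` so it runs anywhere. On the server the interesting arguments would be `mars` and `192.168.11.254` from the hosts file above.

```python
import socket
import sys
import time


def timed(label, func, *args):
    """Run one resolver call and return (label, result, seconds taken)."""
    start = time.monotonic()
    try:
        result = func(*args)
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        result = "failed: %s" % exc
    return label, result, time.monotonic() - start


def check_host(name, address):
    """Time a forward lookup of name and a reverse lookup of address."""
    return [
        timed("forward " + name, socket.gethostbyname, name),
        timed("reverse " + address, socket.gethostbyaddr, address),
    ]


if __name__ == "__main__":
    # Usage: script.py [name] [address]; the defaults are safe placeholders.
    name = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    address = sys.argv[2] if len(sys.argv) > 2 else "127.0.0.1"
    for label, result, elapsed in check_host(name, address):
        print("%-28s %6.3fs  %s" % (label, elapsed, result))
```

If a lookup that should be answered from the hosts file takes seconds rather than milliseconds, that would suggest the resolver is falling through to DNS for that name or address.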

Nevertheless, when my bind9 server dies, my hosts are unable to get IPs from
the dhcp3-server, and sudo takes forever. What the human is going on, I
wonder.

Thinking it was a problem with bind9, I replaced it with pdns. PDNS has a
bind backend, which basically just reads your bind zones, so it is easy to
migrate. But no joy. After 4 or 5 boots, dhcp3-server goes to hell and
stops responding.

I tried all sorts of network dumping and monitoring. After a while I noticed
that there were reverse IP queries somewhere, and not bothering to find out
why, I gave the entire subnet reverse entries in my PDNS.
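Giving a whole subnet reverse entries is tedious by hand, so it could be generated. This is a minimal sketch, not what was actually loaded into PDNS: the /24 mask on 192.168.11.0, the `ws<N>` host naming, and the function name `ptr_records` are all assumptions for illustration. It prints PTR lines in BIND zone-file syntax.

```python
import ipaddress


def ptr_records(network, domain, prefix="ws"):
    """Return one PTR line per usable host address in the given network."""
    lines = []
    for host in ipaddress.ip_network(network).hosts():
        last_octet = str(host).split(".")[-1]
        # reverse_pointer gives e.g. "1.11.168.192.in-addr.arpa"
        lines.append("%s. IN PTR %s%s.%s."
                     % (host.reverse_pointer, prefix, last_octet, domain))
    return lines


if __name__ == "__main__":
    # Assumed client subnet and hypothetical naming scheme.
    for line in ptr_records("192.168.11.0/24", "schoolsite.local"):
        print(line)
```

Note that this would also emit a `ws254` entry for 192.168.11.254, which the hosts file assigns to mars, so that line would need to be adjusted by hand.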

Amazingly, I was able to boot the whole lab. In under 10 minutes. It brought
tears to my eyes. However, the problem came back, and checking my logs I
found:

Nov 17 16:53:40 mars pdns[2981]: Scheduling exit on remote request

And I was like, what the hell.

I'm not 100% sure, but it would seem that the bots in my infested waters
(worms everywhere in the network, it wasn't maintained well) are trying to
remotely control my bind or something like that, and maybe attempting all
sorts of poisoning tricks. Since my pdns server shared my bind rndc key, if
the key was weak, then it's conceivable that bots are shutting it down.

Anyways, it would seem that I have two problems that ask for a solution:
1) Are my local services performing DNS lookups? Why? How do I get them to
stop doing that?
2) Is my pdns / bind / whatever dns under attack by bots using rndc? How do
I stop them?
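For the second question, a first check might be whether the rndc control channel is reachable from the network at all. BIND's control channel listens on TCP port 953 by default. The sketch below only tests whether that port accepts connections on a given address. It says nothing about the key or about PowerDNS's own control mechanism, and the function name `port_open` is made up for illustration.

```python
import socket
import sys


def port_open(address, port=953, timeout=2.0):
    """Return True if a TCP connection to address:port succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


if __name__ == "__main__":
    # Usage: script.py [address ...]; defaults to loopback only.
    for address in sys.argv[1:] or ["127.0.0.1"]:
        state = "open" if port_open(address) else "closed or filtered"
        print("%s port 953: %s" % (address, state))
```

Run on the server against 192.168.1.8 and 192.168.11.254, or from another machine on the LAN, an "open" result on a non-loopback address would mean other hosts can at least reach the control port.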


