[Telrad] Fixing & resetting a CPE8000 the hard way

Nathan Anderson Thu, 09 Feb 2017 04:08:04 -0800

Okay, here is how to repair this problem if you find yourself affected by it:


First, you need to determine if you can access the shell on the CPE via the 
ethernet port or not.

If you cannot ping your CPE on the ethernet port at the address you think it 
should be at (default 192.168.254.251, or whatever you set yours to if you 
don't leave it at the default), and if it doesn't hand out IPs via DHCP 
anymore, try to see if it responds to 192.168.0.1.  If that doesn't work, odds 
are good that the CPE will actually respond to IPv6 on its link-local address 
(fe80::/10 space); the CPE8000 uses this convention for generating its 
link-local address based on the ethernet MAC address: 
http://www.sput.nl/internet/ipv6/ll-mac.html
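
If you would rather not work the address out by hand, here is a minimal sketch 
that does the conversion in a POSIX shell on your PC.  It assumes the CPE uses 
the usual modified EUI-64 scheme described at the link above (flip the 
universal/local bit of the first octet, insert ff:fe in the middle).  The 
function name mac_to_linklocal is made up for this example; it is not something 
that exists on the CPE.

```shell
# Derive a modified EUI-64 IPv6 link-local address from an ethernet MAC.
# Usage: mac_to_linklocal 00:11:22:33:44:55
# The body runs in a subshell "( ... )" so the IFS change does not leak out.
mac_to_linklocal() (
    # Split the MAC into its six octets on ':' or '-'.
    IFS=':-'
    set -- $1
    [ $# -eq 6 ] || { echo "expected a 6-octet MAC address" >&2; exit 1; }

    # Flip the universal/local bit (0x02) of the first octet.
    first=$(( 0x$1 ^ 0x02 ))

    # Insert ff:fe between octets 3 and 4, then print as four 16-bit groups.
    printf 'fe80::%x:%x:%x:%x\n' \
        $(( (first << 8) | 0x$2 )) \
        $(( (0x$3 << 8) | 0xff )) \
        $(( 0xfe00 | 0x$4 )) \
        $(( (0x$5 << 8) | 0x$6 ))
)
```

Link-local addresses need a zone, so append your PC's interface name when you 
use the result, e.g. "ping6 fe80::211:22ff:fe33:4455%eth0" (eth0 is an 
assumption; substitute whatever interface is cabled to the CPE).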

If you are extremely lucky and the web interface still works somehow, see if 
you can enable telnet and/or SSH.  If the web interface is dead, see if the CPE 
will accept a connection via telnet/SSH anyway.  If it does, try to log in with 
username 'root', password 'root123'.

If none of this works, or if it does but the CPE won't accept either the 
default root password or whatever root password you were sure you set it to, 
you are going to have to get out your screwdriver.

Unscrew and remove the antenna/cover from the top of the CPE.  On the CPE's 
main circuit board, you should see a 6-pin header in the upper right-hand 
corner.  This is a serial console port.  You are either going to need a 
TTL-to-RS232 converter, or a TTL-compatible serial port on your PC (you can 
purchase direct USB-to-TTL-serial adapters).  These are the only important 
signals you need to worry about (I am counting the pins starting at #1 on the 
left):

Pin 2: ground
Pin 3: receive (PC transmit)
Pin 4: transmit (PC receive)

(If you have a USB-to-TTL adapter that supplies +5V on a fourth wire, do not 
hook that up to anything.)

Fire up a terminal emulator, point it at your serial interface, and set 
bitrate/data/parity/stop/flow-control to 115200/8/none/1/none.  Now apply power 
to your CPE.  You should see the bootup messages scrolling by.  Once text stops 
scrolling, press Enter and you should be greeted with a shell prompt.
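
On a Linux PC, one way to do this is with GNU screen.  This is only a sketch: 
it assumes screen is installed and that your adapter shows up as /dev/ttyUSB0 
(yours may well differ; check dmesg after plugging it in).  open_cpe_console is 
just a name picked for this example.

```shell
# Open the CPE serial console at 115200 baud, 8 data bits, no flow control.
# Usage: open_cpe_console [DEVICE]   (DEVICE defaults to /dev/ttyUSB0, an
# assumed name for a USB-to-TTL adapter)
open_cpe_console() {
    dev=${1:-/dev/ttyUSB0}

    # Refuse to continue if the path is not a character device.
    if [ ! -c "$dev" ]; then
        echo "no serial device at $dev" >&2
        return 1
    fi

    # cs8 = 8 data bits; -ixon/-ixoff = software flow control off.
    # Parity and stop bits are assumed to be left at the usual none/1.
    screen "$dev" 115200,cs8,-ixon,-ixoff
}
```

Run "open_cpe_console" before applying power so that you catch the boot 
messages.  To leave screen, press Ctrl-A and then k.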

Okay, now that you have a shell (either via ethernet or via serial terminal), 
time to get to work.  In the following steps, any line preceded by a hash (#) 
is a command that you will copy and paste into your shell/session with the CPE; 
however, leave off the hash:

- - -

1. Search and destroy any virus files taking up valuable storage space:

# find / -iname Photo.scr -exec echo "{}" ";" -exec rm "{}" ";"
# find / -iname IMG001.scr -exec echo "{}" ";" -exec rm "{}" ";"
# find / -iname Info.zip -exec echo "{}" ";" -exec rm "{}" ";"

(Each command prints the path of every matching file it finds and then 
deletes that file.)
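
If you end up with a longer list of file names, the same search can be wrapped 
in a small function.  This is a sketch under a couple of assumptions: that the 
CPE's shell accepts function definitions, and that its find supports -type and 
-print in addition to the -iname and -exec used above.  purge_named_files is a 
hypothetical name; nothing by that name ships on the CPE.

```shell
# Print and then delete every regular file under ROOT whose name matches
# one of the given NAMEs (case-insensitively).
# Usage: purge_named_files ROOT NAME [NAME...]
purge_named_files() {
    root=$1
    shift
    for name in "$@"; do
        # -type f keeps rm away from directories that happen to match.
        find "$root" -type f -iname "$name" -print -exec rm -f "{}" ";"
    done
}
```

On the CPE you would call it as 
"purge_named_files / Photo.scr IMG001.scr Info.zip".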

2. There may be other worms doing similar things with different file names not 
covered above.  Check to make sure the partition containing the active config 
has plenty of free space:

# df /nvm

It will show you something like this:

Filesystem                Size      Used Available Use% Mounted on
/dev/mtdblock5            3.0M    536.0K      2.5M  17% /nvm

"Use%" should normally be under 20.  If it is considerably higher than that, 
look at the contents of that mount point:

# ls -l /nvm

All you should see are directories "bsp", "etc", and possibly "CrushPoint".  If 
you see a big (> 1MB) file there as well, make a note of its name, and re-run 
the same find command from step #1, but substitute in this new file name.
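
If you have several units to go through, the "is Use% too high?" check can be 
scripted.  A rough sketch, assuming awk is available wherever you run it (I 
have not confirmed that it is present on the CPE itself); check_usage is a 
made-up name, and the default limit of 20 simply mirrors the rule of thumb 
above.

```shell
# Read `df` output on stdin.  Print a warning for every filesystem whose
# Use% is above LIMIT and return non-zero if there was at least one.
# Usage: df /nvm | check_usage [LIMIT]   (LIMIT defaults to 20)
check_usage() {
    limit=${1:-20}
    awk -v limit="$limit" '
        {
            # Find the "NN%" column; the mount point is the field after it.
            for (i = 1; i < NF; i++) {
                if ($i ~ /^[0-9]+%$/) {
                    pct = $i
                    sub(/%/, "", pct)
                    if (pct + 0 > limit + 0) {
                        print $(i + 1) " is " $i " full"
                        bad = 1
                    }
                }
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

For example, "df /nvm | check_usage 20 || ls -l /nvm" lists the contents of 
the partition only when it looks suspiciously full.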

3. Time to wipe the half-baked config and reset to defaults:

# /usr/local/bin/restore-defaults.sh

After it's done, it will probably tell you that it is about to reboot, but it 
does not actually reboot on its own.

4. Make double-sure that the unsecured FTP server does *not* start up:

# /usr/local/bin/set_admin_services.sh commercial

5. Finally, reboot the CPE:

# reboot

- - -

After you complete these steps, you should be greeted by a CPE that: a) has its 
default ethernet IP of 192.168.254.251 back, b) has a DHCP server running on the 
ethernet interface, c) has a working web interface, d) is back to stock factory 
defaults in every other way (config, passwords, etc.).

Hope this helps,

-- Nathan

From: [email protected] On Behalf Of Nathan Anderson
Sent: Thursday, February 09, 2017 2:54 AM
To: Telrad List
Subject: Re: [Telrad] UE upgrade failure rate

After much trial and error, I managed to learn that the white port is a 
TTL-level serial interface.  And there was much rejoicing.

ALSO, I FIGURED OUT WHAT HAS BEEN KILLING (at least our) CPE8000s.

Remember that problem that the EPC firmware had back when it was first 
released?  Back when root access was still available on the EPC firmware, there 
was an FTP server running on it that accepted connections via the PDN IP 
address, and if you didn't change the root password from the default insecure 
one (which was ironically named), then infected machines trying to spread that 
stupid Photo.scr worm would successfully log into the EPC via FTP and, thinking 
that it had managed to log into a public web server somewhere, upload a 
bajillion copies of the virus to it in various directories, filling up the disk.

The exact same thing is happening here, believe it or not.  It hadn't ever 
occurred to me to test for this, but it turns out that under certain 
circumstances that I haven't yet managed to nail down, the CPE8000 firmware 
actually starts running an FTP server.  Even worse, this FTP server, once 
enabled, does not ask for any credentials.  You can literally type in any 
username when prompted, and you are in.  I see no config option on the web 
interface for the CPE that allows you to turn this on and off...but whatever is 
triggering it ends up creating a ready and completely unsecured backdoor to the 
CPE.

*headdesk*

If you guys give out routable IPs to your LTE users, or if you have somebody on 
your network that has a PC infected with this particular virus, then this 
might also explain your CPE8000 firmware upgrade problems.

After figuring out the serial port bit and examining the "dead" CPEs more 
in-depth, I found the filesystems littered with files named things like 
Photo.scr, IMG001.scr, Info.zip, etc.  Once the writable partition with the CPE 
configuration is completely full, if at that point you issue either a 
reset-to-defaults, or upload a configuration backup, or initiate a firmware 
upgrade (which has to migrate your configuration from the old firmware version 
during the process), your CPE gets bricked because there isn't enough disk 
space left for it to properly finish writing the config changes to disk.  So it 
gets only half-done, and the configuration is left in an inconsistent state.

I've managed to fix my dead units, and also found the mechanism for disabling 
the FTP server.  Still not sure how it is getting toggled on in the first place 
(perhaps there is some other vulnerability that is getting exploited first?), 
but I'll keep looking.

I'll write up some instructions for y'all and post them here soon.

-- Nathan

From: [email protected] On Behalf Of Nathan Anderson
Sent: Wednesday, February 08, 2017 1:49 PM
To: Telrad List; Adam Moffett
Subject: Re: [Telrad] UE upgrade failure rate

Does anybody happen to know if the 6-pin white connector on the 8000's board is 
either a serial port or a JTAG interface?

-- Nathan

From: [email protected] On Behalf Of Nathan Anderson
Sent: Wednesday, February 08, 2017 1:40 PM
To: Telrad List; Adam Moffett
Subject: Re: [Telrad] UE upgrade failure rate

This thread is interesting because I was just complaining last night to our 
vendor about how fragile the software on the CPE8000s seems to be.

We have not had specific issues with flashing CPEs "over the air" from the web 
interface, but sometimes ACS-initiated updates don't complete correctly.  On 
7000s it usually takes the form of the upgrade not completing and the UE 
falling off of the ACS, but the radio stays up and attached to the network.  We 
go in via the web interface OTA and reboot it and it comes back with the same 
version of firmware it was already running.  Second time is usually the charm, 
and I'm thinking that perhaps if the UE had been freshly-rebooted before 
attempting the update, we might have a higher success rate.  (We have also seen 
7000s just stop talking to the ACS without us touching the firmware, and even 
though they are otherwise working fine.  Again, rebooting the CPE fixes this.  
Although it is rare, we have seen this even on the latest .116.)

We once had a 7000 that did drop off the network after pushing the update via 
ACS.  We never checked what state it was in from the ethernet side, but we had 
the customer powercycle it themselves and it came back…again running the same 
firmware.  So the upgrade did not take, but it didn't brick it either and 
resetting config to defaults on the UE was not (and at least for us never has 
been) necessary.

So we have never had to truck-roll to a 7000 as a result of a failed firmware 
upgrade.  The 8000s, however, seem to be another story.  I am so scared to 
touch the ones we have in the field anymore.  We have had a couple that seem to 
get their configs corrupted after a firmware change, and get into very funky 
states.

One of them had these symptoms: defaulted to a 192.168.0.1 IP on the ethernet 
(!), no web server running, no DHCP server running, had telnet access that 
didn't prompt for a password (!!).  Fixed it by resetting to defaults (found a 
shell script that performs this function on the CPE's filesystem).  I got lucky 
with this one.

One that I have sitting on my desk now is one that we tried to roll back the 
firmware on (customer was experiencing random network detaches, and the latest 
8000 firmware doesn't reattach for 15 minutes on-the-dot, so customer was -- I 
think justifiably -- getting a bit pissy).  Current symptoms are: NO IPv4 on 
the ethernet, IPv6 link-local responds, no web server running, no DHCP server 
running, telnet responds (calls itself "KZTECH") but default root/root123 
doesn't work, so I have NO way to get in and reset the damn thing, and the 
8000s don't seem to have a reset button.  Thus it seems that it is possible for 
a scrambled config to completely brick an 8000.

If anybody has reliable information on how to get the 8000 to wipe its config 
during bootup even though it seemingly lacks a reset button, I would be 
eternally grateful...

-- Nathan

From: [email protected] On Behalf Of Matthew Carpenter
Sent: Wednesday, February 08, 2017 6:38 AM
To: Adam Moffett; Telrad List
Subject: Re: [Telrad] UE upgrade failure rate

Hi,

So far we have had only 1 CPE8000 UE that did not come back after a firmware 
update.  Normally a hard reboot would fix it, but in this case we had to 
replace it.
I have that CPE8000 on my desk and need to see what the status is from the LAN 
side.  Thanks for the info on defaulting it, will try it.

Matt C.



On Wed, Feb 8, 2017 at 8:23 AM, Adam Moffett <[email protected]> wrote:
We've had a helluva time upgrading UE firmware over the air.  It was worse with 
Wimax.  On Wimax it was more like 75% of the time we would lose the channel 
scan table and have to go on site to add it back in.  It became SOP to leave 
the operator password at default so we had the option of having the customer 
log in and fix the scan table for us.

I think we've had more success since going to LTE.  However, failed firmware 
updates were one of the incentives to set up a dedicated management bearer.  I 
was hoping it would help with these things.  We haven't pushed out an update 
recently enough to say whether it helped.

-Adam



------ Original Message ------
From: "Shayne Lebrun" <[email protected]>
To: [email protected]
Sent: 2/8/2017 9:14:36 AM
Subject: [Telrad] UE upgrade failure rate

Does anybody else experience a ten to fifteen percent failure rate when 
upgrading UEs?  The behavior is, you upgrade the firmware, reboot, and the 
device doesn’t come back.  Logging into the UE’s management from LAN, you’ll 
see it’s stuck in ‘device init.’  Defaulting the unit and rebooting allows it 
to boot and attach.

We’re not using the residential gateway device or anything, and the only config 
we put in is device name, SNMP and ACS settings.  Sometimes we hardcode the 
client’s device in the DHCP server, to turn on DMZ to allow port forwarding, 
but that doesn’t seem to be a causal factor.

_______________________________________________
Telrad mailing list
[email protected]
http://lists.wispa.org/mailman/listinfo/telrad



--
Matthew Carpenter
806-316-5071 office
806-236-9558 cell
