Okay, here is how to repair this problem if you find yourself affected by it:
First, you need to determine whether you can reach a shell on the CPE via the ethernet port.

If you cannot ping your CPE on the ethernet port at the address you think it should be at (default 192.168.254.251, or whatever you set yours to if you don't leave it at the default), and if it doesn't hand out IPs via DHCP anymore, try to see if it responds to 192.168.0.1. If that doesn't work, odds are good that the CPE will actually respond to IPv6 on its link-local address (fe80::/10 space); the CPE8000 uses this convention for generating its link-local address based on the ethernet MAC address:

http://www.sput.nl/internet/ipv6/ll-mac.html

If you are extremely lucky and the web interface still works somehow, see if you can enable telnet and/or SSH. If the web interface is dead, see if the CPE will accept a connection via telnet/SSH anyway. If it does, try to log in with username 'root', password 'root123'.

If none of this works, or if it does but the CPE won't accept either the default root password or whatever root password you were sure you set it to, you are going to have to get out your screwdriver.

Unscrew and remove the antenna/cover from the top of the CPE. On the CPE's main circuit board, you should see a 6-pin header at the upper right-hand corner. This is a serial console port. You are going to need either a TTL-to-RS232 converter or a TTL-compatible serial port on your PC (you can purchase direct USB-to-TTL-serial adapters). These are the only signals you need to worry about (I am counting the pins starting at #1 on the left):

Pin 2: ground
Pin 3: receive (PC transmit)
Pin 4: transmit (PC receive)

(If you have a USB-to-TTL adapter that supplies +5V on a fourth wire, do not hook that up to anything.)

Fire up a terminal emulator, point it at your serial interface, and set bitrate/data/parity/stop/flow-control to 115200/8/none/1/none. Now apply power to your CPE. You should see the bootup messages scrolling by.
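For the IPv6 fallback mentioned above, here is a minimal Python sketch for working out the link-local address on your PC. It assumes the convention on the linked page is the standard modified EUI-64 scheme (flip the universal/local bit of the first octet, insert ff:fe in the middle of the MAC, prepend fe80::/64). The function name and the sample MAC address are made up for illustration; use the MAC printed on your own unit.

```python
import ipaddress


def mac_to_link_local(mac: str) -> str:
    """Derive a modified EUI-64 IPv6 link-local address from a 48-bit MAC."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit (0x02) of the first octet and
    # insert ff:fe between the third and fourth octets.
    interface_id = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    # fe80::/64 prefix (fe80 followed by six zero bytes) plus the 64-bit interface ID.
    return str(ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id))


# Hypothetical MAC address, for illustration only.
print(mac_to_link_local("00:11:22:33:44:55"))
```

Link-local addresses are only meaningful per interface, so most tools need to be told which interface to use, typically by appending a scope such as %eth0 to the address (the interface name here is an assumption; substitute the one the CPE is plugged into).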
Once text stops scrolling, press Enter and you should be greeted with a shell prompt.

Okay, now that you have a shell (either via ethernet or via serial terminal), time to get to work. In the following steps, any line preceded by a hash (#) is a command that you will copy and paste into your shell/session with the CPE; however, leave off the hash:

- - -

1. Search and destroy any virus files taking up valuable storage space:

# find / -iname Photo.scr -exec echo "{}" ";" -exec rm "{}" ";"
# find / -iname IMG001.scr -exec echo "{}" ";" -exec rm "{}" ";"
# find / -iname Info.zip -exec echo "{}" ";" -exec rm "{}" ";"

(In addition to removing any files it finds, each command prints out every location where a file was found.)

2. There may be other worms doing similar things with different file names not covered above. Check to make sure the partition containing the active config has plenty of free space:

# df /nvm

It will show you something like this:

Filesystem                Size      Used Available Use% Mounted on
/dev/mtdblock5            3.0M    536.0K      2.5M  17% /nvm

"Use%" should normally be under 20. If it is considerably higher than that, look at the contents of that mount point:

# ls -l /nvm

All you should see are directories "bsp", "etc", and possibly "CrushPoint". If you see a big (> 1MB) file there as well, make a note of its name, and re-run the same find command from step #1, but substitute in this new file name.

3. Time to wipe the half-baked config and reset to defaults:

# /usr/local/bin/restore-defaults.sh

After it's done, it will probably tell you that it is about to reboot, but it does not actually reboot.

4. Make double-sure that the unsecured FTP server does *not* start up:

# /usr/local/bin/set_admin_services.sh commercial

5. Finally, reboot the CPE:

# reboot

- - -

After you complete these steps, you should be greeted by a CPE that:

a) has its default ethernet IP of 192.168.254.251 back,
b) has a DHCP server running on the ethernet interface,
c) has a working web interface,
d) is back to stock factory defaults in every other way (config, passwords, etc.).

Hope this helps,

-- Nathan

From: [email protected] On Behalf Of Nathan Anderson
Sent: Thursday, February 09, 2017 2:54 AM
To: Telrad List
Subject: Re: [Telrad] UE upgrade failure rate

After much trial and error, I managed to learn that the white port is a TTL-level serial interface. And there was much rejoicing.

ALSO, I FIGURED OUT WHAT HAS BEEN KILLING (at least our) CPE8000s.

Remember that problem that the EPC firmware had back when it was first released? Back when root access was still available on the EPC firmware, there was an FTP server running on it that accepted connections via the PDN IP address, and if you didn't change the root password from the default insecure one (which was ironically named), then infected machines trying to spread that stupid Photo.scr worm would successfully log into the EPC via FTP and, thinking that they had managed to log into a public web server somewhere, upload a bajillion copies of the virus to it in various directories, filling up the disk.

The exact same thing is happening here, believe it or not. It hadn't ever occurred to me to test for this, but it turns out that under certain circumstances that I haven't yet managed to nail down, the CPE8000 firmware actually starts running an FTP server. Even worse, this FTP server, once enabled, does not ask for any credentials. You can literally type in any username when prompted, and you are in. I see no config option on the web interface for the CPE that allows you to turn this on and off...but whatever is triggering it ends up creating a ready and completely unsecured backdoor to the CPE.
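One way to check your own CPEs for the open FTP server described above is a small probe like the following Python sketch. It uses only the standard ftplib module; the function name, the throwaway username, and the ftp_factory parameter (there only so the logic can be exercised without a real device) are assumptions made for illustration, not anything taken from the CPE firmware. Only point it at equipment you operate.

```python
import ftplib


def ftp_accepts_any_login(host, user="anyuser", password="",
                          timeout=5.0, ftp_factory=ftplib.FTP):
    """Return True if an FTP server on `host` lets an arbitrary username log in."""
    ftp = ftp_factory()
    try:
        ftp.connect(host, 21, timeout)   # standard FTP control port
        ftp.login(user, password)        # raises error_perm if the login is refused
        return True
    except ftplib.all_errors:
        # Covers refused logins as well as connection failures and timeouts.
        return False
    finally:
        ftp.close()
```

For example, calling ftp_accepts_any_login("192.168.254.251") from a PC on the ethernet side would report whether a CPE at the default address lets a made-up user in. A False result only means the login did not succeed from where you tested; it says nothing about whether the server is reachable from the LTE side.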
*headdesk*

If you guys give out routable IPs to your LTE users, or if you have somebody on your network that has a PC infected with this particular virus, then it might be that this could also explain your CPE8000 firmware upgrade problems.

After figuring out the serial port bit and examining the "dead" CPEs more in-depth, I found the filesystems littered with files named things like Photo.scr, IMG001.scr, Info.zip, etc. Once the writable partition with the CPE configuration is completely full, if at that point you issue either a reset-to-defaults, or upload a configuration backup, or initiate a firmware upgrade (which has to migrate your configuration from the old firmware version during the process), your CPE gets bricked because there isn't enough disk space left for it to properly finish writing the config changes to disk. So it gets only half-done, and the configuration is left in an inconsistent state.

I've managed to fix my dead units, and also found the mechanism for disabling the FTP server. Still not sure how it is getting toggled on in the first place (perhaps there is some other vulnerability that is getting exploited first?), but I'll keep looking.

I'll write up some instructions for y'all and post them here soon.

-- Nathan

From: [email protected] On Behalf Of Nathan Anderson
Sent: Wednesday, February 08, 2017 1:49 PM
To: Telrad List; Adam Moffett
Subject: Re: [Telrad] UE upgrade failure rate

Does anybody happen to know if the 6-pin white connector on the 8000's board is either a serial port or a JTAG interface?

-- Nathan

From: [email protected] On Behalf Of Nathan Anderson
Sent: Wednesday, February 08, 2017 1:40 PM
To: Telrad List; Adam Moffett
Subject: Re: [Telrad] UE upgrade failure rate

This thread is interesting because I was just complaining last night to our vendor about how fragile the software on the CPE8000s seems to be.
We have not had specific issues with flashing CPEs "over the air" from the web interface, but sometimes ACS-initiated updates don't complete correctly. On 7000s it usually takes the form of the upgrade not completing and the UE falling off of the ACS, but the radio stays up and attached to the network. We go in via the web interface OTA and reboot it and it comes back with the same version of firmware it was already running. Second time is usually the charm, and I'm thinking that perhaps if the UE had been freshly-rebooted before attempting the update, we might have a higher success rate. (We have also seen 7000s just stop talking to the ACS without us touching the firmware, and even though they are otherwise working fine. Again, rebooting the CPE fixes this. Although it is rare, we have seen this even on the latest .116)

We once had a 7000 that did drop off the network after pushing the update via ACS. We never checked what state it was in from the ethernet side, but we had the customer powercycle it themselves and it came back…again running the same firmware. So the upgrade did not take, but it didn't brick it either and resetting config to defaults on the UE was not (and at least for us never has been) necessary. So we have never had to truck-roll to a 7000 as a result of a failed firmware upgrade.

The 8000s, however, seem to be another story. I am so scared to touch the ones we have in the field anymore. We have had a couple that seem to get their configs corrupted after a firmware change, and get into very funky states.

One of them had these symptoms: defaulted to a 192.168.0.1 IP on the ethernet (!), no web server running, no DHCP server running, had telnet access that didn't prompt for a password (!!). Fixed it by resetting to defaults (found a shell script that performs this function on the CPE's filesystem). I got lucky with this one.
One that I have sitting on my desk now is one that we tried to roll back the firmware on (customer was experiencing random network detaches, and the latest 8000 firmware doesn't reattach for 15 minutes on-the-dot, so customer was -- I think justifiably -- getting a bit pissy). Current symptoms are: NO IPv4 on the ethernet, IPv6 link-local responds, no web server running, no DHCP server running, telnet responds (calls itself "KZTECH") but default root/root123 doesn't work, so I have NO way to get in and reset the damn thing, and the 8000s don't seem to have a reset button. Thus it seems that it is possible for a scrambled config to completely brick an 8000.

If anybody has reliable information on how to get the 8000 to wipe its config during bootup even though it seemingly lacks a reset button, I would be eternally grateful...

-- Nathan

From: [email protected] On Behalf Of Matthew Carpenter
Sent: Wednesday, February 08, 2017 6:38 AM
To: Adam Moffett; Telrad List
Subject: Re: [Telrad] UE upgrade failure rate

Hi,

So far we have had only 1 CPE8000 UE that did not come back after a firmware update. Normally a hard reboot would fix it, but in this case we had to replace it. I have that CPE8000 on my desk and need to see what the status is from the LAN side. Thanks for the info on defaulting it, will try it.

Matt C.

On Wed, Feb 8, 2017 at 8:23 AM, Adam Moffett <[email protected]> wrote:

We've had a helluva time upgrading UE firmware over the air. It was worse with Wimax. On Wimax it was more like 75% of the time we would lose the channel scan table and have to go on site to add it back in. It became SOP to leave the operator password at default so we had the option of having the customer log in and fix the scan table for us.

I think we've had more success since going to LTE. However, failed firmware updates were one of the incentives to set up a dedicated management bearer.
I was hoping it would help with these things. We haven't pushed out an update recently enough to say whether it helped.

-Adam

------ Original Message ------
From: "Shayne Lebrun" <[email protected]>
To: [email protected]
Sent: 2/8/2017 9:14:36 AM
Subject: [Telrad] UE upgrade failure rate

Does anybody else experience a ten to fifteen percent failure rate when upgrading UEs?

The behavior is, you upgrade the firmware, reboot, and the device doesn’t come back. Logging into the UE’s management from LAN, you’ll see it’s stuck in ‘device init.’ Defaulting the unit and rebooting allows it to boot and attach.

We’re not using the residential gateway device or anything, and the only config we put in is device name, SNMP and ACS settings. Sometimes we hardcode the client’s device in the DHCP server, to turn on DMZ to allow port forwarding, but that doesn’t seem to be a causal factor.

_______________________________________________
Telrad mailing list
[email protected]
http://lists.wispa.org/mailman/listinfo/telrad

--
Matthew Carpenter
806-316-5071 office
806-236-9558 cell
