This is exactly it. At first we didn't have enough visibility to see what was causing the poor throughput (yet another one of our longstanding frustrations with the platform), but this is the problem that Jeremy and I were referring to.
I'm glad to say that we have not (knowingly) experienced the CPU usage fluctuations on our EPCs. As for the data corruption issue, you likely will not have run up against it unless you are running a preproduction release of 6.7.

The symptoms are that we see clusters of 4 consecutive bytes that have various bits flipped (usually bytes 1 and 2 are zeroed out, and bytes 3 and 4 are completely different from what they would normally be, but the exact pattern of what is changed is not clear to us yet). On average we see between 12 and 60 bytes in this state per 100 MB transferred per user.

The VERY BAD and VERY SCARY part is that if you do a packet capture, you will see that exactly zero TCP packets have a checksum that does not validate. So it's not that data is getting corrupted, most of the damaged packets are being thrown out because the checksum doesn't match, and a handful slip through. No, every single packet has a valid checksum, even the ones with corrupt data in them. What this means is that 1) HTTPS transfers just stop and die when the corruption occurs, and 2) HTTP/FTP/other unencrypted transfers introduce silent data corruption into the download that you won't discover until it is too late.

That all packets have a checksum that validates would seem to suggest that the EPC is ingesting TCP packets from the PDN interface, throwing out the original TCP checksum (as a shortcut, or...? what valid reasons would you possibly have for doing this?), doing something internally that causes random corruption, and then recomputing a new checksum from scratch before sending the packet on to the target user over S1-U. That a bug like this is even *possible* BLOWS MY MIND. If you're going to ignore the original checksum that the packet arrives with, what's the point of the checksum in the first place?
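To make the checksum reasoning concrete, here is a minimal Python sketch, not anything taken from the EPC itself. It assumes the standard RFC 1071 Internet checksum that TCP uses, and for simplicity computes it over the payload only (a real TCP checksum also covers the TCP header and a pseudo-header). The `corrupt_cluster` helper and the sample payload are hypothetical; the helper just mimics the "bytes 1 and 2 zeroed, bytes 3 and 4 changed" pattern described above.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 16-bit ones' complement checksum (the algorithm TCP uses)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def corrupt_cluster(payload: bytes, offset: int) -> bytes:
    """Hypothetical corruption: zero bytes 1-2 and invert bytes 3-4 of a 4-byte cluster."""
    cluster = bytes([0, 0, payload[offset + 2] ^ 0xFF, payload[offset + 3] ^ 0xFF])
    return payload[:offset] + cluster + payload[offset + 4:]


# Hypothetical payload as it arrives from the PDN side, with the sender's checksum.
original = bytes(range(64))
sender_checksum = internet_checksum(original)

# Corruption introduced somewhere in the middle of the path.
damaged = corrupt_cluster(original, 16)

# Case 1: the original checksum is carried through, so the receiver catches the damage.
end_to_end_ok = internet_checksum(damaged) == sender_checksum

# Case 2: the middle box recomputes the checksum AFTER the damage,
# so the receiver's check passes no matter what happened to the data.
recomputed_checksum = internet_checksum(damaged)
hop_check_ok = internet_checksum(damaged) == recomputed_checksum

print("original checksum kept:  packet validates ->", end_to_end_ok)
print("checksum recomputed:     packet validates ->", hop_check_ok)
```

With the original checksum preserved, the damaged payload fails validation and TCP would retransmit it. Once the checksum is recomputed over the already damaged bytes, validation passes trivially, which matches a capture showing zero bad checksums alongside corrupt data.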
How can I ever trust the data flowing through this device again knowing that it is working around and subverting a key component that helps to ensure and preserve data integrity?

-- Nathan

From: telrad-boun...@wispa.org [mailto:telrad-boun...@wispa.org] On Behalf Of Adam Moffett
Sent: Tuesday, March 14, 2017 8:34 PM
To: telrad@wispa.org; telrad@wispa.org
Subject: Re: [Telrad] Uplink throughput again

* UE getting stuck at MCS4....apparently until an S1 reset. This may or may not be the same throughput issue that you guys were talking about earlier in the thread.
_______________________________________________
Telrad mailing list
Telrad@wispa.org
http://lists.wispa.org/mailman/listinfo/telrad