ReLU is a literal switch. An electrical switch is n volts in, n volts out when on, and zero volts out when off. And a weighted sum (dot product) of a number of weighted sums is still a weighted sum, so a composition of them is still a linear system.
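A quick numpy check of that last claim (the layer sizes here are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 3))   # one bank of weighted sums
W2 = rng.normal(size=(2, 5))   # a second bank stacked on top
x  = rng.normal(size=3)

# A weighted sum of weighted sums collapses into a single weighted sum.
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)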
For a particular input to a ReLU neural network, all the switches are decidedly in either the on or the off state, so a particular linear projection is in effect between the input and the output. For a particular input and a particular output neuron there is a particular composition of weighted sums, which can be condensed down into a single equivalent weighted sum. You can look at that equivalent weight vector to see what the neuron is looking at in the input, or calculate metrics like the angle between the input and the weight vector.

If the angle is near 90 degrees and the output of the neuron is large, then the length of the weight vector must be large (the output is |x||w|cos(angle), so a small cosine forces a large |w|). That makes the output very sensitive to noise in the input. If the angle is near zero, there are averaging and central limit theorem effects that provide some error correction.

Since ReLU switches at zero, there are no sudden discontinuities in the output of a ReLU neural network for a gradual change in the input. It is a seamless system of switched linear projections.

There are efficient algorithms for calculating certain fixed banks of dot products, like the FFT or the WHT. There is no reason you cannot incorporate those directly into ReLU neural networks, since they are fully compatible. All dot products are friends!
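Going back to the condensed weighted sum, here is a minimal numpy sketch of the idea, assuming a tiny two-layer ReLU net with no biases (layer sizes are arbitrary): record the on/off state of each switch for one input, multiply the weight matrices through those fixed states, and compute the angle metric for one output neuron.

import numpy as np

rng = np.random.default_rng(0)

# Tiny 2-layer ReLU net, no biases, just to illustrate the idea.
W1 = rng.normal(size=(16, 8))   # first bank of weighted sums
W2 = rng.normal(size=(4, 16))   # second bank of weighted sums
x  = rng.normal(size=8)         # a particular input

# Forward pass: record which switches are on for this input.
h = W1 @ x
mask = (h > 0).astype(float)    # 1 = switch on, 0 = switch off
y = W2 @ (mask * h)             # actual network output

# With the switch states frozen, the whole net is one weighted sum per output.
W_eq = W2 @ np.diag(mask) @ W1  # equivalent weight matrix for this input
assert np.allclose(y, W_eq @ x)

# Angle between the input and the equivalent weight vector of one output neuron.
j = 0
w = W_eq[j]
cos_angle = (w @ x) / (np.linalg.norm(w) * np.linalg.norm(x))
angle_deg = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
print(f"output {y[j]:.3f}, |w| = {np.linalg.norm(w):.3f}, angle = {angle_deg:.1f} deg")

And a minimal sketch of the last point: a fast Walsh-Hadamard transform used as a fixed bank of +/-1 weighted sums in front of a ReLU layer. The sign-flip vector here is just an illustrative stand-in for whatever cheap parameters you might actually train.

import numpy as np

def fwht(x):
    # Fast Walsh-Hadamard transform, O(n log n); len(x) must be a power of two.
    x = np.asarray(x, dtype=float).copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

rng = np.random.default_rng(0)
x = rng.normal(size=16)
flips = rng.choice([-1.0, 1.0], size=16)   # cheap per-element parameters

# One "layer": a fixed fast dot-product bank (the WHT) followed by ReLU switches.
y = np.maximum(0.0, fwht(flips * x))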