Hi, I'm wondering if anyone has any rules of thumb around model size and memory usage for SGD? I'm doing some testing of it myself, but thought I would ask to see how it compares.
Thanks, Grant