Netflix is a small dataset. 12G for that seems quite excessive. Note also that this is before you have done any work.
Ideally, 100 million observations should take << 1GB.

On Tue, Jul 16, 2013 at 8:19 AM, Peng Cheng <[email protected]> wrote:
> The second idea is indeed splendid: we should separate the
> time-complexity-first and space-complexity-first implementations. What
> I'm not quite sure about is whether we really need to create two
> interfaces instead of one. Personally, I think 12G of heap space is not
> that high, right? Most new laptops can already handle that (emphasis on
> laptop). And if we replace the hash map (the culprit of the high memory
> consumption) with a list/linked list, it would simply degrade lookup to
> an O(n) linear search, which is not too bad either. The current
> DataModel is the result of careful thought and has undergone extensive
> testing; it is easier to expand on top of it than to subvert it.
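The trade-off being debated above can be sketched in a few lines of Java. This is a minimal illustration, not Mahout's actual DataModel API: it contrasts a boxed HashMap (fast lookup, heavy per-entry object overhead) with parallel primitive arrays (~12 bytes per observation for a long key plus a float rating, but O(n) linear search), which is roughly the replacement Peng proposes.

```java
import java.util.HashMap;
import java.util.Map;

public class MemoryTradeoff {
    public static void main(String[] args) {
        int n = 1_000_000;

        // Hash-map variant: O(1) expected lookup, but each boxed
        // Long/Float entry costs tens of bytes of object and table
        // overhead on a typical JVM.
        Map<Long, Float> ratings = new HashMap<>(n);
        for (int i = 0; i < n; i++) {
            ratings.put((long) i, (float) (i % 5) + 1);
        }

        // Primitive-array variant: about 12 bytes per observation
        // (8-byte long key + 4-byte float value), no per-entry objects.
        long[] keys = new long[n];
        float[] values = new float[n];
        for (int i = 0; i < n; i++) {
            keys[i] = i;
            values[i] = (i % 5) + 1;
        }

        long target = 123_456L;
        float fromMap = ratings.get(target);

        // O(n) linear search, the degradation mentioned in the thread.
        float fromScan = Float.NaN;
        for (int i = 0; i < n; i++) {
            if (keys[i] == target) {
                fromScan = values[i];
                break;
            }
        }

        if (fromMap != fromScan) {
            throw new AssertionError("lookup mismatch");
        }
        System.out.println("both lookups agree");
    }
}
```

At this scale the primitive layout keeps 100 million observations near the ~1GB mark, which is why the parent message calls 12G excessive; the cost is paying a linear scan (or a sort plus binary search) instead of a hash probe.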
