I'm thinking about a new algorithm for Julia. I'm mostly concerned with how much needs to fit in *RAM*, and curious about what is considered big in RAM (or out of it).
A. For 2D (or higher-dimensional) arrays, dense or sparse (including non-square), is roughly 2 billion along any single dimension a meaningful limit? Note that for a square dense array, at one byte per entry, you can't get more than about 8.4 million × 8.4 million to fit in RAM with 2015-era x86 CPUs, since their physical addressing is capped at 46 bits (2^46 bytes ≈ 70 TB, and √(2^46) ≈ 8.4 million); a theoretical 4 billion × 4 billion would only fit with full 64-bit addressing. In practice the limit is much lower, set by the RAM actually installed. I do see, however, a map-reduce approach: http://infolab.stanford.edu/~ullman/mmds/book.pdf section 2.6.7, "Case Study: Matrix Multiplication". Would that use much less RAM at any point?

B. I'm aware of billion-row tables, but you usually query them (or kind of "stream" them). How much would be limiting to fit in RAM? Would 2 GB (or say 8 or 16 GB) be limiting? https://books.google.is/books?id=BKEoDAAAQBAJ&pg=PA145&lpg=PA145&dq=big+one+dimensional+dataset&source=bl&ots=qkbpp3Ks_T&sig=ewWSbdVp8MUhQHjMqMWfnQh4Rfs&hl=en&sa=X&redir_esc=y#v=onepage&q=big%20one%20dimensional%20dataset&f=false The three billion DNA <https://en.wikipedia.org/wiki/DNA> base pairs <https://en.wikipedia.org/wiki/Base_pair> seem to blow a 2 GB limit, but not if you need less than one byte per base. I also doubt all chromosomes would be kept in the same array. I can't imagine 2 GB being limiting for UTF-8 text.
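A back-of-envelope check of the square-matrix bound in A (a sketch in Python; it assumes one byte per entry and takes the 46-bit physical addressing figure for 2015-era x86 CPUs from the question itself, not from measurement):

```python
# Largest n such that an n x n one-byte-per-entry array fits in the
# address space of a CPU with the given number of physical address bits.
from math import isqrt

def max_square_side(address_bits: int, bytes_per_entry: int = 1) -> int:
    """Largest n with n*n*bytes_per_entry <= 2**address_bits."""
    return isqrt(2**address_bits // bytes_per_entry)

print(max_square_side(46))  # 8388608, i.e. ~8.4 million
print(max_square_side(64))  # 4294967296, i.e. 4 billion (full 64-bit addressing)
```

This reproduces both numbers in A: 2^46 bytes gives exactly 2^23 ≈ 8.4 million per side, and a full 64-bit address space gives 2^32 = 4 billion.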
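On A's map-reduce question: the appeal of the MMDS scheme is that no worker ever needs a full matrix in memory. A single-machine analogue of that idea is blocked multiplication, where only a few small blocks are resident at a time (each block could be loaded from disk). A rough Python sketch with plain nested lists and a hypothetical block size, not the book's actual MapReduce formulation:

```python
# Blocked matrix multiply: the inner loops only touch an O(bs^2) working
# set (one block each of A, B, and C), so RAM use can be decoupled from n
# if blocks are streamed from disk.
def blocked_matmul(A, B, n, bs):
    """Multiply n x n matrices A and B (lists of lists) with block size bs."""
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, bs):
        for j0 in range(0, n, bs):
            for k0 in range(0, n, bs):
                # only blocks starting at (i0,k0), (k0,j0), (i0,j0) are touched here
                for i in range(i0, min(i0 + bs, n)):
                    for k in range(k0, min(k0 + bs, n)):
                        aik = A[i][k]
                        for j in range(j0, min(j0 + bs, n)):
                            C[i][j] += aik * B[k][j]
    return C
```

So the answer I'd expect is "yes, much less RAM at any one point, at the cost of more passes over the data" — which is the trade-off I'm asking about.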
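On B's DNA example: at 2 bits per base, 3 billion base pairs pack into 0.75 GB, comfortably under 2 GB. A sketch of such packing (my own toy encoding for illustration, not any standard bioinformatics format):

```python
# 2-bit packing of DNA bases: 4 bases per byte, so 3e9 bases -> 750 MB.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"

def pack(seq: str) -> bytearray:
    """Pack a string of A/C/G/T into 2 bits per base."""
    out = bytearray((len(seq) + 3) // 4)
    for i, b in enumerate(seq):
        out[i // 4] |= CODE[b] << (2 * (i % 4))
    return out

def unpack(buf: bytearray, n: int) -> str:
    """Recover the first n bases from a packed buffer."""
    return "".join(BASE[(buf[i // 4] >> (2 * (i % 4))) & 3] for i in range(n))

print(len(pack("ACGT" * 10)))   # 10 bytes for 40 bases
print(3_000_000_000 * 2 // 8)   # 750000000 bytes = 0.75 GB for the genome
```

A real implementation would also need to handle ambiguity codes like N, which is one reason formats in the wild use more than 2 bits per base.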