In message <[EMAIL PROTECTED]>, Prof Brian
Ripley <[EMAIL PROTECTED]> writes
>But Bert's caveats apply: you have 200 problems of size 20,000 since in
>QDA each class's distribution is estimated separately, and a single pass
>will give you the sufficient statistics however large the dataset is.
>
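In other words, QDA only needs each class's count, mean vector, and cross-product matrix, and those can be accumulated one chunk at a time. A minimal sketch of that single pass (the file name, chunk size, and column layout here are all assumptions, not the poster's actual data):

con <- file("elev.dat", open = "r")
p <- 3                                    # number of predictor columns
stats <- list()                           # per-class n, sum(x), sum(x x')
repeat {
    chunk <- tryCatch(
        read.table(con, nrows = 1e6,
                   col.names = c("x1", "x2", "x3", "class")),
        error = function(e) NULL)         # read.table errors at EOF
    if (is.null(chunk)) break
    for (k in as.character(unique(chunk$class))) {
        X <- as.matrix(chunk[chunk$class == k, 1:p])
        s <- stats[[k]]
        if (is.null(s))
            s <- list(n = 0, sum = numeric(p), xx = matrix(0, p, p))
        stats[[k]] <- list(n   = s$n + nrow(X),
                           sum = s$sum + colSums(X),
                           xx  = s$xx + crossprod(X))   # t(X) %*% X
    }
}
close(con)
## Each class's QDA parameters then come straight from its statistics:
## mu = sum/n and Sigma = (xx - n * tcrossprod(mu)) / (n - 1).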
On Tue, 15 Feb 2005, Graham Jones wrote:
In message <[EMAIL PROTECTED]>, r-help-[EMAIL PROTECTED] writes
[Actually quoting Bert Gunter, BTW]
Can someone give me an example (perhaps in a private response, since I'm off
topic here) where one actually needs all cases in a large data set ("large"
being ...
Hi,
Although I agree those cases are relatively rare in STATISTICAL analysis, you
can encounter them in simulation work (with a natural cataclysm, a 5-meter
change in the topography can change all the simulations).
Two ideas (in addition to loading several sections) are:
1- to search for duplicate cases and estimate ...
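For the first idea, a minimal sketch of collapsing duplicates into unique
cases plus frequency weights (the toy data frame d is invented for
illustration); most modelling functions can then take the counts through a
weights argument, so the analysis runs on far fewer rows:

d <- data.frame(x = c(1, 1, 2, 2, 2), y = c(0, 0, 1, 1, 3))
key <- do.call(paste, c(d, sep = "\r"))        # one string key per row
u <- d[!duplicated(key), ]                     # the distinct cases
u$weight <- as.vector(table(key)[key[!duplicated(key)]])
u
##   x y weight
## 1 1 0      2
## 3 2 1      2
## 5 2 3      1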
On Mon, 14 Feb 2005, Berton Gunter wrote:
... read all 200 million rows a pipe dream no matter what platform I'm using?
In principle R can handle this with enough memory. However, 200 million rows
and three columns is 4.8 GB of storage, and R usually needs a few times the
size of the data for working space.
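That figure is just 200e6 rows x 3 columns x 8 bytes per double = 4.8e9
bytes, before R makes any working copies. A single pass that never holds more
than one block sidesteps the problem; a minimal sketch (the file name
elev.xyz and the three-numeric-columns-per-line layout are assumptions):

con <- file("elev.xyz", open = "r")
n <- 0; sums <- numeric(3)
repeat {
    block <- scan(con, what = double(), n = 3e6, quiet = TRUE)  # 1e6 rows
    if (length(block) == 0) break              # end of file
    m <- matrix(block, ncol = 3, byrow = TRUE)
    n    <- n + nrow(m)
    sums <- sums + colSums(m)
}
close(con)
sums / n                                       # column means of the full file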
The purpose of working with the entire (200 million record) data set is to
investigate several interpolation models for creating gridded elevation data.
Most models and algorithms do just that: take a manageable number of "points"
and do the math. My reasoning behind using the entire dataset (wh...