Re: [R] Allocate virtual memory on hard drive

2013-03-11 Thread Carolina Bello
Hi,
you can use
memory.limit(size = 10) to let R use approx. 100 GB of your HDD as virtual
memory (Windows only), provided that much capacity is available on your PC.
As Prof Ripley said, it makes your PC really slow.

gc() is the garbage collector, so it just frees memory that is no longer in use.
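
A minimal sketch of the gc() point, assuming a throwaway object (the name big is hypothetical and only for illustration): rm() drops the reference, and gc() then runs a collection and reports memory use. It frees memory that is no longer referenced; it does not add memory beyond what the system provides.

```r
# Hypothetical object, only for illustration (about 8 MB of doubles).
big <- numeric(1e6)
size_bytes <- as.numeric(object.size(big))  # a little over 8e6 bytes

rm(big)       # drop the only reference to the vector
mem <- gc()   # run a collection; returns a matrix of memory usage

print(mem)    # rows "Ncells" and "Vcells"
```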

look

CB

2013/3/11 Prof Brian Ripley 

> On 11/03/2013 16:45, Jie wrote:
>
>> The vector contains 1.5*10^8 numeric elements. It takes about 3~4 GB in
>> memory.
>> And I would like to find percentiles: 0%, 0.5%, 1%, ... 100%
>> I use 64-bit R and Windows 7 with 24GB RAM.
>>
>
> So:
>
> 1) Try R 3.0.0 alpha.  Many operations on large vectors are more efficient
> there.
>
> 2) You could try --max-mem-size=32G, say.  In my experience Windows
> virtual memory management is too slow to be useful, but you could try.
>
> 3) Add more RAM.  24GB is not a lot these days.
>
> However, I tried this on a Linux box. Such a vector is only just over 1GB
> and the maximum memory usage was 2.9GB.  Have you really told us the true
> story?
>
>  Thank you.
>>
>> Best,
>>
>>
>> On Mon, Mar 11, 2013 at 12:40 PM, jim holtman  wrote:
>>
>>> R runs with data in memory.  What type of system are you running on (32 or
>>> 64 bit)?  How big is your data?  You did not provide much information about
>>> your problem.  Depending on what you want to 'sort', there might be other
>>> ways of doing it.  This gets back to my tag line: "Tell me what you want to
>>> do, not how you want to do it".
>>>
>>> On Mon, Mar 11, 2013 at 11:20 AM, Jie  wrote:
>>>

 Dear All,

 I have a long sequence and want to find its quantiles, or sort it first.
 It seems sort() or quantile() reaches the memory limit.
 Is there a way to allocate more memory on an SSD for R at startup, so
 that R can use both RAM and hard drive space?
 Thank you.

 Best wishes,
 Jie

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

>>>
>>>
>>>
>>>
>>> --
>>> Jim Holtman
>>> Data Munger Guru
>>>
>>> What is the problem that you are trying to solve?
>>> Tell me what you want to do, not how you want to do it.
>>>
>>
>>
>
> --
> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
>
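
For reference, the figures in this thread can be checked with a short sketch. The size arithmetic uses the 1.5*10^8 elements stated above at 8 bytes per double. The percentile part runs on a much smaller simulated vector (x is a hypothetical stand-in, not the poster's data) and only illustrates how the 0%, 0.5%, 1%, ..., 100% grid can be requested from quantile().

```r
# Size of the vector described in the thread: 1.5 * 10^8 doubles, 8 bytes each.
n <- 1.5e8
bytes <- n * 8           # 1.2e9 bytes
gib <- bytes / 1024^3    # about 1.12 GiB, i.e. "just over 1GB"

# Percentiles 0%, 0.5%, 1%, ..., 100% on a smaller, simulated stand-in vector.
set.seed(1)
x <- rnorm(1e5)                    # hypothetical data, not the poster's
probs <- seq(0, 1, by = 0.005)     # 201 probabilities
pct <- quantile(x, probs = probs, names = FALSE)
```

quantile() works on a partially sorted copy of the data, so peak memory use is a small multiple of the vector size, which is consistent with the 2.9GB maximum reported above.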



Re: [R] vegdist Error en double(N * (N - 1)/2) : tamaño del vector especificado es muy grande

2013-02-11 Thread Carolina Bello
Brian, thanks for answering me.

The idea is to build a hierarchical clustering of 138037 pixels of 1 km^2
from a study area in the Colombian Andes, as a biogeographical
regionalization. I have distribution models for 89 species, so I have a
matrix with the pixels in the rows, filled with the absence (0)/presence (1)
of each species per pixel. Also, the agglomeration method in hierarchical
clustering needs the whole matrix, so it can't be divided. I ran some tests
with a smaller area and it works. The code:

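library(raster)  # provides stack()
library(vegan)   # provides vegdist()
MODELOS <- stack(list.files(pattern = "\\.tif$"))  # one layer per .tif file
DF <- as.data.frame(MODELOS)   # one row per pixel
DF <- na.omit(DF)              # drop rows (pixels) containing NA
DISTAN <- vegdist(DF[, 2:ncol(DF)], "jaccard")
E1 <- hclust(DISTAN, "ward")   # "ward" is called "ward.D" from R 3.1.0 on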

Thanks a lot for helping me with my problem. I think this is exceeding my
programming capacities =(


Carolina Bello
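
A small-scale version of the workflow described above can be sketched in base R alone. Everything here is an assumption for illustration: the matrix pa is a tiny made-up presence/absence table, and dist(..., method = "binary") is swapped in for vegdist(..., "jaccard") so that no extra package is needed (for 0/1 data the two agree). The point is that the distance object always holds N * (N - 1) / 2 values, which is what stops scaling at 138037 pixels.

```r
# Tiny hypothetical presence (1) / absence (0) matrix: 6 pixels x 4 species.
pa <- matrix(c(1, 1, 0, 0,
               1, 1, 0, 0,
               1, 0, 0, 0,
               0, 0, 1, 1,
               0, 0, 1, 1,
               0, 1, 1, 1),
             ncol = 4, byrow = TRUE,
             dimnames = list(paste0("pixel", 1:6), paste0("sp", 1:4)))

# Base R stand-in for vegdist(..., "jaccard"): for 0/1 data, method = "binary"
# is the Jaccard distance (1 - shared presences / presences in either row).
d <- dist(pa, method = "binary")

n <- nrow(pa)
n_pairs <- n * (n - 1) / 2     # 15 pairwise distances for 6 pixels

# "ward" was renamed "ward.D" in R 3.1.0; older versions need "ward".
tree <- hclust(d, method = "ward.D")
groups <- cutree(tree, k = 2)  # cut the tree into two regions
```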


2013/2/9 Prof Brian Ripley 

> Suppose N = 138037 (you haven't really told us).  A dissimilarity
> half-matrix would have 9 billion elements.  The maximum size of a vector
> in current versions of R is 2 billion.
>
> You will be able to get further in R-devel (3.0.0-to-be) with a 64-bit
> version of R, although as you appear to be using Windows it will be very
> slow and 32GB of RAM is not enough to even store that object.
>
> What do you propose to do with a distance matrix on 140,000 objects?  I
> think you need to re-think whatever that is.
>
>
>
> On 08/02/2013 23:25, Carolina Bello wrote:
>
>> -- Forwarded message --
>> From: 
>> Date: 2013/2/8
>> Subject: vegdist Error en double(N * (N - 1)/2) : tamaño del vector
>> especificado es muy grande
>> To: caro.bell...@gmail.com
>>
>>
>> Message rejected by filter rule match
>>
>>
>>
>> -- Forwarded message --
>> From: caro bello 
>> To: r-help@r-project.org
>> Cc:
>> Date: Fri, 8 Feb 2013 15:18:40 -0800 (PST)
>> Subject: vegdist Error en double(N * (N - 1)/2) : tamaño del vector
>> especificado es muy grande
>> Hi
>> I have some problems with the vegdist function. I want to calculate a
>> distance matrix with Jaccard. I have binary data.
>>
>> The problem is that I have a matrix of 138037 rows (sites) and 89 columns
>> (species). My script is:
>>
>>  rm(list = ls(all = TRUE))
>>
>>  gc()  ## to clear everything left hidden in memory
>>
>>  memory.limit(size = 10)  # gives 1 TB from the HDD in case RAM runs out
>>
>>  library(vegan)  # provides vegdist()
>>
>>  DF <- as.data.frame(MODELOS)
>>
>>  DF <- na.omit(DF)
>>
>>  DISTAN <- vegdist(DF[, 2:ncol(DF)], "jaccard")
>>
>> Almost immediately it produces the error: Error in double(N * (N - 1)/2) :
>> vector size specified is too large
>>
>> I think this is a memory error, but I don't know why, since I have a PC with
>> 32GB of RAM and 1 TB of HDD.
>>
>> I also tried to compute a distance matrix with the function dist from the
>> package proxy. I did:
>>
>>  library(proxy)
>>
>>  vector <- dist(DF, method = "Jaccard")
>>
>> It starts to run, but when it gets to 10 GB of RAM, a window announces that
>> R has encountered an error and will close, so it closes and starts a new
>> session.
>>
>> I really don't know what is going on, much less how to solve it. Can anybody
>> help me?
>>
>> thanks
>>
>> Carolina Bello IAVH-COLOMBIA
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/vegdist-Error-en-double-N-N-1-2-tama-o-del-vector-especificado-es-muy-grande-tp4658010.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>>
>>
>
> --
> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 18

[R] vegdist Error en double(N * (N - 1)/2) : tamaño del vector especificado es muy grande

2013-02-08 Thread Carolina Bello
-- Forwarded message --
From: 
Date: 2013/2/8
Subject: vegdist Error en double(N * (N - 1)/2) : tamaño del vector
especificado es muy grande
To: caro.bell...@gmail.com


Message rejected by filter rule match



-- Forwarded message --
From: caro bello 
To: r-help@r-project.org
Cc:
Date: Fri, 8 Feb 2013 15:18:40 -0800 (PST)
Subject: vegdist Error en double(N * (N - 1)/2) : tamaño del vector
especificado es muy grande
Hi
I have some problems with the vegdist function. I want to calculate a
distance matrix with Jaccard. I have binary data.

The problem is that I have a matrix of 138037 rows (sites) and 89 columns
(species). My script is:

rm(list = ls(all = TRUE))

gc()  ## to clear everything left hidden in memory

memory.limit(size = 10)  # gives 1 TB from the HDD in case RAM runs out

library(vegan)  # provides vegdist()

DF <- as.data.frame(MODELOS)

DF <- na.omit(DF)

DISTAN <- vegdist(DF[, 2:ncol(DF)], "jaccard")

Almost immediately it produces the error: Error in double(N * (N - 1)/2) :
vector size specified is too large

I think this is a memory error, but I don't know why, since I have a PC with
32GB of RAM and 1 TB of HDD.
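
The size behind that error can be worked out directly from the call in the message, double(N * (N - 1)/2), with N = 138037 rows. The sketch below is plain arithmetic under that assumption; the variable names are illustrative only. It gives about 9.5 billion distances, well above the 2^31 - 1 element limit of R versions before 3.0.0, and roughly 71 GiB at 8 bytes each, which is more than the 32GB of RAM mentioned.

```r
N <- 138037                   # rows (sites) in the matrix from the post
n_pairs <- N * (N - 1) / 2    # entries in the half-matrix, computed as a double
max_len_old <- 2^31 - 1       # vector length limit before R 3.0.0

bytes <- n_pairs * 8          # each distance is an 8-byte double
gib <- bytes / 1024^3

cat(format(n_pairs, big.mark = ","), "distances\n")
cat(round(gib, 1), "GiB needed to store them\n")
```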

I also tried to compute a distance matrix with the function dist from the
package proxy. I did:

library(proxy)

vector <- dist(DF, method = "Jaccard")

It starts to run, but when it gets to 10 GB of RAM, a window announces that R
has encountered an error and will close, so it closes and starts a new session.

I really don't know what is going on, much less how to solve it. Can anybody
help me?

thanks

Carolina Bello IAVH-COLOMBIA




--
View this message in context:
http://r.789695.n4.nabble.com/vegdist-Error-en-double-N-N-1-2-tama-o-del-vector-especificado-es-muy-grande-tp4658010.html
Sent from the R help mailing list archive at Nabble.com.
