[R] anova or likelihood ratio test from biglm output

2011-11-02 Thread Chris Howden
(Sorry if this is a repost, I got a bounce reply from the r-help server)



Hi,



I’m using the biglm() function to create some linear models for a very
large data set that lm() can’t fit due to memory issues (the problem is
with the number of interactions; I can fit the main effects model).



I need to determine if the 2-way interactions are necessary or not. Ideally
I’d like to use anova() to get an anova table and a p-value for the
interactions, however it appears that anova is not supported for biglm
objects.



So my next idea was to compare the main effects model with the 2-way
interaction model using a likelihood ratio test. I seem to be able to get
the deviance and residual DF from a biglm object, so I think I should be
able to calculate the LRT and get my p-value if I assume a chi-squared
distribution.
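
A minimal sketch of this comparison, assuming only that the residual sum of squares (the deviance) and the residual degrees of freedom can be read from each fit. For a Gaussian linear model the usual comparison of nested models is an F test, and the chi-squared version is its large-sample approximation. The helper name compare_nested is hypothetical, and lm() on the built-in mtcars data stands in for the biglm fits so that the result can be checked against anova().

```r
# Hypothetical helper (not part of biglm): compare two nested linear models
# using only each model's residual sum of squares and residual df.
compare_nested <- function(rss_small, df_small, rss_big, df_big) {
  df_diff <- df_small - df_big                 # number of extra parameters
  sigma2  <- rss_big / df_big                  # error variance from the larger model
  f_stat  <- ((rss_small - rss_big) / df_diff) / sigma2
  chisq   <- df_diff * f_stat                  # scaled drop in deviance
  list(
    df_diff = df_diff,
    f_stat  = f_stat,
    p_f     = pf(f_stat, df_diff, df_big, lower.tail = FALSE),
    chisq   = chisq,
    p_chisq = pchisq(chisq, df_diff, lower.tail = FALSE)
  )
}

# Stand-in fits: lm() on a small built-in data set instead of biglm().
m_main <- lm(mpg ~ wt + hp, data = mtcars)   # main effects only
m_int  <- lm(mpg ~ wt * hp, data = mtcars)   # adds the 2-way interaction

res <- compare_nested(deviance(m_main), df.residual(m_main),
                      deviance(m_int),  df.residual(m_int))
res$p_f      # agrees with the p-value from anova(m_main, m_int)
res$p_chisq  # chi-squared approximation; close to p_f when residual df is large
```

With a very large data set the residual degrees of freedom are so large that the two p-values should be nearly identical, so the chi-squared assumption costs little in that setting.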



I was wondering if anyone sees any problems with this approach (or would be
kind enough to confirm it)? Or has any better suggestions, ideas or
comments?



Thank you





Chris Howden B.Sc. (Hons) GStat.

Founding Partner

Evidence Based Strategic Development, IP Commercialisation and Innovation,
Data Analysis, Modelling and Training

(mobile) 0410 689 945

(fax) +612 4782 9023

ch...@trickysolutions.com.au


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can R handle a matrix with 8 billion entries?

2011-08-10 Thread Chris Howden
Thanks Corey,



I’ve looked into them before and I don’t think they can help me with this
problem. The Big functions are great for handling and analysing data sets
that are too big for R to store in memory.



However, I believe my problem goes one step beyond that: my distance
matrix has more entries than R’s architecture can address in a single
object, even if I had enough memory to store it.



Again, I’m no expert in this so I may be wrong.



Chris Howden

Founding Partner

Tricky Solutions

Tricky Solutions 4 Tricky Problems

Evidence Based Strategic Development, IP Commercialisation and Innovation,
Data Analysis, Modelling and Training

(mobile) 0410 689 945

(fax / office)

ch...@trickysolutions.com.au



Disclaimer: The information in this email and any attachments to it are
confidential and may contain legally privileged information. If you are not
the named or intended recipient, please delete this communication and
contact us immediately. Please note you are not authorised to copy, use or
disclose this communication or any attachments without our consent. Although
this email has been checked by anti-virus software, there is a risk that
email messages may be corrupted or infected by viruses or other
interferences. No responsibility is accepted for such interference. Unless
expressly stated, the views of the writer are not those of the company. Tricky
Solutions always does our best to provide accurate forecasts and analyses
based on the data supplied, however it is possible that some important
predictors were not included in the data sent to us. Information provided by
us should not be solely relied upon when making decisions and clients should
use their own judgement.





*From:* Corey Dow-Hygelund [mailto:godelsthe...@gmail.com]
*Sent:* Thursday, 11 August 2011 3:00 AM
*To:* Chris Howden
*Cc:* r-help@r-project.org
*Subject:* Re: [R] Can R handle a matrix with 8 billion entries?



You might want to look into the packages bigmemory and biganalytics.

Corey

On Tue, Aug 9, 2011 at 8:38 PM, Chris Howden wrote:

Hi,

I’m trying to do a hierarchical cluster analysis in R with a Big Data set.
I’m running into problems using the dist() function.

I’ve been looking at a few threads about R’s memory and have read the
memory limits section in R help. However I’m no computer expert so I’m
hoping I’ve misunderstood something and R can handle my Big Data set,
somehow. Although at the moment I think my dataset is simply too big and
there is no way around it, but I’d like to be proved wrong!

My data set has 90523 rows of data and 24 columns.

My understanding is that this means the distance matrix has a min of
90523^2 elements which is 8194413529. Which roughly translates as 8GB of
memory being required (if I assume each entry requires 1 byte). I only have
4GB on a 32-bit build of Windows and R. So there is no way that’s going to
work.

So then I thought of getting access to a more powerful computer, and maybe
using cloud computing.

However the R memory limit help mentions  “On all builds of R, the maximum
length (number of elements) of a vector is 2^31 - 1 ~ 2*10^9”. Now as the
distance matrix I require has more elements than this does this mean it’s
too big for R no matter what I do?

Any ideas would be welcome.

Thanks.


Chris Howden




-- 
*The mark of a successful man is one that has spent an entire day on the
bank of a river without feeling guilty about it.*


[R] Can R handle a matrix with 8 billion entries?

2011-08-09 Thread Chris Howden
Hi,

I’m trying to do a hierarchical cluster analysis in R with a Big Data set.
I’m running into problems using the dist() function.

I’ve been looking at a few threads about R’s memory and have read the
memory limits section in R help. However I’m no computer expert so I’m
hoping I’ve misunderstood something and R can handle my Big Data set,
somehow. Although at the moment I think my dataset is simply too big and
there is no way around it, but I’d like to be proved wrong!

My data set has 90523 rows of data and 24 columns.

My understanding is that this means the distance matrix has a min of
90523^2 elements which is 8194413529. Which roughly translates as 8GB of
memory being required (if I assume each entry requires 1 byte). I only have
4GB on a 32-bit build of Windows and R. So there is no way that’s going to
work.

So then I thought of getting access to a more powerful computer, and maybe
using cloud computing.

However the R memory limit help mentions  “On all builds of R, the maximum
length (number of elements) of a vector is 2^31 - 1 ~ 2*10^9”. Now as the
distance matrix I require has more elements than this does this mean it’s
too big for R no matter what I do?
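
The arithmetic above can be checked with a short sketch. It assumes, beyond what the post states, that each distance is stored as an 8-byte double and that only the lower triangle is kept (n(n-1)/2 values), which is how a dist object is laid out. The variable names are illustrative.

```r
n <- 90523                        # rows in the data set (a double, so no integer overflow)

full_elements <- n^2              # entries in the full square matrix
dist_elements <- n * (n - 1) / 2  # entries in the lower triangle kept by dist()
max_vector_length <- 2^31 - 1     # vector length limit quoted from the R help page

# Memory for the lower triangle alone, in GiB, at 8 bytes per double.
dist_gib <- dist_elements * 8 / 2^30

format(full_elements, big.mark = ",")
format(dist_elements, big.mark = ",")
dist_elements > max_vector_length  # TRUE: even the triangle exceeds the quoted limit
round(dist_gib, 1)                 # roughly 30 GiB
```

So the quoted length limit is exceeded even when only half the matrix is stored. On 64-bit builds from R 3.0.0 onwards atomic vectors can be longer than 2^31 - 1 elements, but the memory itself still has to be available.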

Any ideas would be welcome.

Thanks.


Chris Howden


Re: [R] Is there a better way to parse strings than this?

2011-04-18 Thread Chris Howden
Thanks for the explanation,

I think I understand it now. So to paraphrase all your explanations

To match a literal "..." the regular expression itself needs to be \.\.\.,
where each \ escapes the special meaning of ".". But in order to get a \
into the R string being passed to the function I also need to escape its
special meaning, so I need to use "\\.\\.\\."
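
A small sketch of the two levels of escaping, using one of the variable names from the original question. The object names are illustrative only.

```r
x <- "A5.Brands.bought...Dulux"

# In R source, "\\" is one backslash, so this string holds the six
# characters \.\.\. which is what the regex engine receives.
# Writing "\." directly is rejected by R as an unrecognized escape.
pattern <- "\\.\\.\\."
nchar(pattern)

# Split on three literal dots using the regular expression.
parts_regex <- strsplit(x, pattern)[[1]]

# Equivalent split without any regex escaping, treating "..." literally.
parts_fixed <- strsplit(x, "...", fixed = TRUE)[[1]]

parts_regex
identical(parts_regex, parts_fixed)
```

Using fixed = TRUE avoids the escaping question altogether when the separator is a plain string rather than a pattern.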



Chris Howden


-----Original Message-----
From: h.wick...@gmail.com [mailto:h.wick...@gmail.com] On Behalf Of Hadley Wickham
Sent: Friday, 15 April 2011 11:07 AM
To: Chris Howden
Cc: r-help@r-project.org
Subject: Re: [R] Is there a better way to parse strings than this?

> I was trying strsplit(string,"\.\.\.") as per the suggestion in Venables
> and Ripleys book to "(use '\.' to match '.')", which is in the Regular
> expressions section.
>
> I noticed that in the suggestions sent to me people used:
> strsplit(test,"\\.\\.\\.")
>
>
> Could anyone please explain why I should have used "\\.\\.\\." rather than
> "\.\.\."?

Basically,

 * you want to match .
 * so the regular expression you need is \.
 * and the way you represent that in a string in R is \\.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [R] Is there a better way to parse strings than this?

2011-04-14 Thread Chris Howden
Thanks for the suggestions, they were all exactly what I was looking for.
(I knew there had to be a more elegant way than my brute force method.)

One question though.

I was playing around with strsplit but couldn't get it to work, I realised
my problem was that I was using "." as the string.

I was trying strsplit(string,"\.\.\.") as per the suggestion in Venables
and Ripley's book to "(use '\.' to match '.')", which is in the Regular
expressions section.

I noticed that in the suggestions sent to me people used:
strsplit(test,"\\.\\.\\.")


Could anyone please explain why I should have used "\\.\\.\\." rather than
"\.\.\."?



Chris Howden


-----Original Message-----
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
Sent: Wednesday, 13 April 2011 10:55 PM
To: Chris Howden
Cc: r-help@r-project.org
Subject: Re: [R] Is there a better way to parse strings than this?

On Wed, Apr 13, 2011 at 12:07 AM, Chris Howden wrote:
> Hi Everyone,
>
>
> I needed to parse some strings recently.
>
> The code I've wound up using seems rather clunky, and I was wondering if
> anyone had any suggestions on a better way?
>
> Basically I do the following:
>
> 1) Use substr() to do the parsing
> 2) Use regexpr() to find the location of the string I want to parse on, I
> then pass this onto substr()
> 3) Use nchar() as the stop input to substr() where necessary
>
>
>
> I've got a simple example of the parsing code I used below. It takes
> questionnaire variable names that includes the question and the brand it
> was answered for and then parses it so the variable name and the brand are
> in separate columns. I then use this to restructure the data from
> unstacked to stacked, but that's another story.
>
>> # this is the data set
>> test
> [1] "A5.Brands.bought...Dulux"
> [2] "A5.Brands.bought...Haymes"
> [3] "A5.Brands.bought...Solver"
> [4] "A5.Brands.bought...Taubmans.or.Bristol"
> [5] "A5.Brands.bought...Wattyl"
> [6] "A5.Brands.bought...Other"
>
>> # Where do I want to parse?
>> break1 <-  regexpr('...',test, fixed=TRUE)
>> break1
> [1] 17 17 17 17 17 17
> attr(,"match.length")
> [1] 3 3 3 3 3 3
>
>> # Put Variable name in a variable
>> str1 <- substr(test,1,break1-1)
>> str1
> [1] "A5.Brands.bought" "A5.Brands.bought" "A5.Brands.bought" "A5.Brands.bought"
> [5] "A5.Brands.bought" "A5.Brands.bought"
>
>> # Put Brand name in a variable
>> str2 <- substr(test,break1+3, nchar(test))
>> str2
> [1] "Dulux"               "Haymes"              "Solver"
> [4] "Taubmans.or.Bristol" "Wattyl"              "Other"
>
>

Try this:

> x <- c("A5.Brands.bought...Dulux", "A5.Brands.bought...Haymes",
+ "A5.Brands.bought...Solver")
>
> do.call(rbind, strsplit(x, "...", fixed = TRUE))
     [,1]               [,2]
[1,] "A5.Brands.bought" "Dulux"
[2,] "A5.Brands.bought" "Haymes"
[3,] "A5.Brands.bought" "Solver"
>
> # or
> xa <- sub("...", "\1", x, fixed = TRUE)
> read.table(textConnection(xa), sep = "\1", as.is = TRUE)
                V1     V2
1 A5.Brands.bought  Dulux
2 A5.Brands.bought Haymes
3 A5.Brands.bought Solver


--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



[R] Is there a better way to parse strings than this?

2011-04-12 Thread Chris Howden
Hi Everyone,


I needed to parse some strings recently.

The code I've wound up using seems rather clunky, and I was wondering if
anyone had any suggestions on a better way?

Basically I do the following:

1) Use substr() to do the parsing
2) Use regexpr() to find the location of the string I want to parse on, I
then pass this onto substr()
3) Use nchar() as the stop input to substr() where necessary



I've got a simple example of the parsing code I used below. It takes
questionnaire variable names that includes the question and the brand it
was answered for and then parses it so the variable name and the brand are
in separate columns. I then use this to restructure the data from
unstacked to stacked, but that's another story.

> # this is the data set
> test
[1] "A5.Brands.bought...Dulux"
[2] "A5.Brands.bought...Haymes"
[3] "A5.Brands.bought...Solver"
[4] "A5.Brands.bought...Taubmans.or.Bristol"
[5] "A5.Brands.bought...Wattyl"
[6] "A5.Brands.bought...Other"

> # Where do I want to parse?
> break1 <-  regexpr('...',test, fixed=TRUE)
> break1
[1] 17 17 17 17 17 17
attr(,"match.length")
[1] 3 3 3 3 3 3

> # Put Variable name in a variable
> str1 <- substr(test,1,break1-1)
> str1
[1] "A5.Brands.bought" "A5.Brands.bought" "A5.Brands.bought" "A5.Brands.bought"
[5] "A5.Brands.bought" "A5.Brands.bought"

> # Put Brand name in a variable
> str2 <- substr(test,break1+3, nchar(test))
> str2
[1] "Dulux"               "Haymes"              "Solver"
[4] "Taubmans.or.Bristol" "Wattyl"              "Other"



Thanks for any and all suggestions


Chris Howden


Re: [R] Memory issues

2011-01-16 Thread Chris Howden
Hi Emmanuel,

Try the following:

1) Remove unnecessary programs from memory; this might give you a larger
contiguous memory block for R.
2) Remove unnecessary data from R's memory. Many of the preceding data
sets you no longer need can be removed with the rm() command. You might
need to run gc() after this to ensure the freed memory is available.
3) Make sure you've assigned as much memory to R as possible using
memory.limit() (Windows builds only).
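
Step 2 can be sketched as follows. The object name old_data and its size are made up for illustration; in practice it would be whichever intermediate data set is no longer needed.

```r
# Illustrative object standing in for a data set that is no longer needed.
old_data <- matrix(0, nrow = 2000, ncol = 2000)   # roughly 30 MB of doubles
print(object.size(old_data), units = "MB")

rm(old_data)          # drop the binding so the memory can be reclaimed
invisible(gc())       # run the garbage collector; gc() also returns a usage table

exists("old_data")    # FALSE
```

This frees memory inside the R session but cannot by itself guarantee one large contiguous block, so the other two steps may still be needed.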




And make sure u have r's

Chris Howden

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Emmanuel Bellity
Sent: Monday, 17 January 2011 4:53 AM
To: r-help@r-project.org
Subject: [R] Memory issues

Hi,

I have read several threads about memory issues in R and I can't seem to
find a solution to my problem.

I am running a sort of LASSO regression on several subsets of a big dataset.
For some subsets it works well, and for some bigger subsets it does not
work, with errors of type "cannot allocate vector of size 1.6Gb". The error
occurs at this line of the code:

   example <- cv.glmnet(x=bigmatrix, y=price, nfolds=3)


It also depends on the number of variables that were included in
"bigmatrix".


I tried on R and R64 for both Mac and R for PC but recently went onto a
faster virtual machine on Linux thinking I would avoid any memory issues. It
was better but still had some limits, even though memory.limit indicates
"Inf".


Is there any way to make this work or do I have to cut a few variables in the
matrix or take a smaller subset of data?

I have read that R is looking for some contiguous bits of memory and that
maybe I should pre-allocate the matrix? Any idea?


Many thanks

Emmanuel



Re: [R] is there a way to update both packages if they occur in 2 libraries?

2010-10-20 Thread Chris Howden
Thanks for the explanation Brian,

I used summary(packageStatus()) to have a look at what was available
in each library, and then deleted all packages that came with R 2.12.0
from my personal library.

And everything now works.
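
The clean-up described above can be sketched as follows. This is only an illustration: find_duplicate_packages is a hypothetical helper built on installed.packages(), and the demo matrix uses made-up paths and version strings rather than output from a real installation.

```r
# Hypothetical helper: list packages that are installed in more than one library.
# By default it inspects the libraries on .libPaths() via installed.packages().
find_duplicate_packages <- function(ip = utils::installed.packages()) {
  pkgs <- ip[, "Package"]
  dup  <- pkgs %in% pkgs[duplicated(pkgs)]
  out  <- ip[dup, c("Package", "LibPath", "Version", "Built"), drop = FALSE]
  out[order(out[, "Package"], out[, "LibPath"]), , drop = FALSE]
}

# Made-up stand-in for installed.packages() output (paths and versions are illustrative).
demo <- cbind(
  Package = c("foreign", "foreign", "MASS"),
  LibPath = c("C:/R/library", "C:/my-library", "C:/R/library"),
  Version = c("0.8-41", "0.8-39", "7.3-8"),
  Built   = c("2.12.0", "2.9.2", "2.12.0")
)

dups <- find_duplicate_packages(demo)
dups   # both copies of 'foreign', with the R version each was built under

# On a real installation: find_duplicate_packages()
```

Any copy in the personal library whose Built field is older than the running R version is a candidate for removal with remove.packages(), using its lib argument to point at that library.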



Chris Howden


-----Original Message-----
From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
Sent: Wednesday, 20 October 2010 10:11 PM
To: Uwe Ligges
Cc: Chris Howden; r-help
Subject: Re: [R] is there a way to update both packages if they occur in 2 libraries?

On Wed, 20 Oct 2010, Uwe Ligges wrote:

>
>
> On 20.10.2010 13:59, Chris Howden wrote:
>> Thanks Uwe,
>>
>> It may operate like that on most peoples machines, but either its not
>> operating like that on mine. Or I have another problem :-(
>>
>> As u can see from my code below I've run update.packages(checkBuilt=TRUE)
>> and my 'private' library is in my LibPaths()...
>>
>> However when I try to load the foreign package I get an error message
>> telling me "package "foreign' was built before R 2.10.0:
>
>
> Ah, I haven't read your original message carefully enough: Package foreign is
> a base package. Base packages should only be in the R base library, not in
> any other library. They cannot be updated via update.packages().

Not quite (and it is a recommended not a base package).  They can be
updated *if updates are available*.

update.packages(checkBuilt=TRUE) cannot update packages that are
not currently on the selected repositories.  This includes

For Windows binaries, the recommended packages (which you have anyway
in .Library) until later versions than those in 2.12.0 are available.

Any packages which have been withdrawn.

Any packages for which binaries are not available for R 2.12.x (and
there are few, see
http://cran.r-project.org/bin/windows/contrib/2.12/ReadMe and those
with ERROR on
http://cran.r-project.org/bin/windows/contrib/checkSummaryWin.html).

A useful check is to run

summary(packageStatus())

which reports packages which are unavailable in each library, directly
via (for the jth library)

summary(packageStatus())$Libs[[j]]$unavailable

>
> Best wishes,
> Uwe
>
>
>> please re-install
>> it". But then if I remove my private library from the search path I can
>> load foreign, so this suggests the problem is with the foreign package
>> in my 'private library'.
>>
>> Furthermore, if I look at the description file for foreign it claims to
>> have been built for R package 2.9.2. (I've copied it below).
>>
>> I'm concluding the issue is with the foreign package in my private library
>> since it claims to have been built for R 2.9.2 & I can get the package to
>> load if I remove my private library from the library search path and load
>> the foreign package from the base library.
>> I'm then concluding the problem is due to it not updating since
>> the description file claims it was built for R version 2.9.2 and due to
>> the error message I'm getting ie "foreign' was built before R
>> 2.10.0: please re-install it"
>>
>>
>> BUT I'm happy to be proven wrong... I just can't think of what else the
>> problem could be?
>>
>>
>>
>> FOREIGN DESCRIPTION FILE
>> Package: foreign
>> Priority: recommended
>> Version: 0.8-39
>> Date: 2010-01-03
>> Title: Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase,
>>  ...
>> Depends: R (>= 2.6.0), stats
>> Imports: methods, utils
>> Maintainer: R-core
>> Author: R-core members, Saikat DebRoy, Roger
>>  Bivand  and others: see COPYRIGHTS file
in
>>  the sources.
>> Description: Functions for reading and writing data stored by
>>  statistical packages such as Minitab, S, SAS, SPSS, Stata,
>>  Systat, ..., and for reading and writing .dbf (dBase) files.
>> LazyLoad: yes
>> License: GPL (>= 2)
>> BugReports: http://bugs.r-project.org
>> Packaged: 2010-01-03 10:24:13 UTC; ripley
>> Repository: CRAN
>> Date/Publication: 2010-01-03 14:06:04
>> Built: R 2.9.2; i386-pc-mingw32; 2010-01-03 23:21:40 UTC; windows
>>
>>
>>
>>
>>
>>
>> Chris Howden
>> Founding Partner
>> Tricky Solutions
>> Tricky Solutions 4 Tricky Problems
>> Evidence Based Strategic Development, IP development, Data Analysis,
>> Modelling, and Traini

Re: [R] is there a way to update both packages if they occur in 2 libraries?

2010-10-20 Thread Chris Howden
Thanks Uwe,

I was wondering if it was something like that.

I'll delete the base packages from my personal library.


And just as a comment... although I'm a rather new user to R (as you may have
guessed), I gather that every now and then popular and necessary packages
are added to base R.

So I'm guessing the problem I was having would occur whenever this
happens and people have the old package in their personal libraries
(which would likely be the case if it's considered good enough to add to
base R).

Not really a 'bug' of R. But something I'll remember!!!

Thanks again.



Chris Howden


-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Wednesday, 20 October 2010 9:38 PM
To: Chris Howden
Cc: r-help
Subject: Re: [R] is there a way to update both packages if they occur in 2 libraries?



On 20.10.2010 13:59, Chris Howden wrote:
> Thanks Uwe,
>
> It may operate like that on most peoples machines, but either its not
> operating like that on mine. Or I have another problem :-(
>
> As u can see from my code below I've run update.packages(checkBuilt=TRUE)
> and my 'private' library is in my LibPaths()...
>
> However when I try to load the foreign package I get an error message
> telling me "package "foreign' was built before R 2.10.0:


Ah, I haven't read your original message carefully enough: Package
foreign is a base package. Base packages should only be in the R base
library, not in any other library. They cannot be updated via
update.packages().

Best wishes,
Uwe


> please re-install
> it". But then if I remove my private library from the search path I can
> load foreign, so this suggests the problem is with the foreign package
> in my 'private library'.
>
> Furthermore, if I look at the description file for foreign it claims to
> have been built for R package 2.9.2. (I've copied it below).
>
> I'm concluding the issue is with the foreign package in my private library
> since it claims to have been built for R 2.9.2 & I can get the package to
> load if I remove my private library from the library search path and load
> the foreign package from the base library.
> I'm then concluding the problem is due to it not updating since
> the description file claims it was built for R version 2.9.2 and due to
> the error message I'm getting ie "foreign' was built before R
> 2.10.0: please re-install it"
>
>
> BUT I'm happy to be proven wrong... I just can't think of what else the
> problem could be?
>
>
>
> FOREIGN DESCRIPTION FILE
> Package: foreign
> Priority: recommended
> Version: 0.8-39
> Date: 2010-01-03
> Title: Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase,
>  ...
> Depends: R (>= 2.6.0), stats
> Imports: methods, utils
> Maintainer: R-core
> Author: R-core members, Saikat DebRoy, Roger
>  Bivand  and others: see COPYRIGHTS file in
>  the sources.
> Description: Functions for reading and writing data stored by
>  statistical packages such as Minitab, S, SAS, SPSS, Stata,
>  Systat, ..., and for reading and writing .dbf (dBase) files.
> LazyLoad: yes
> License: GPL (>= 2)
> BugReports: http://bugs.r-project.org
> Packaged: 2010-01-03 10:24:13 UTC; ripley
> Repository: CRAN
> Date/Publication: 2010-01-03 14:06:04
> Built: R 2.9.2; i386-pc-mingw32; 2010-01-03 23:21:40 UTC; windows
>
>
>
>
>
>
> Chris Howden
> Founding Partner
> Tricky Solutions
> Tricky Solutions 4 Tricky Problems
> Evidence Based Strategic Development, IP development, Data Analysis,
> Modelling, and Training
> (mobile) 0410 689 945
> (fax / office) (+618) 8952 7878
> ch...@trickysolutions.com.au
>
>
> -Original Message-
> From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
> Sent: Wednesday, 20 October 2010 9:15 PM
> To: Chris Howden
> Cc: r-help
> Subject: Re: [R] is there a way to update both packages if they occur in
2
> libraries?
>
> update.packages() updates all packages in all libraries listed in
> .libPaths() unless you specify an explicit library.
>
> It may happen that the version number has not changed and you just want
> to reinstall for your upgraded R. In that case use:
>
> update.packages(checkBuilt=TRUE)
>
> Best,
> Uwe Ligges
>
>
>
> On 20.10.2010 04:07, Chris Howden wrote:
>> Hi everyone,
>>
>>
>>
>> I

Re: [R] is there a way to update both packages if they occur in 2 libraries?

2010-10-20 Thread Chris Howden
Thanks Uwe,

It may operate like that on most people's machines, but either it's not
operating like that on mine, or I have another problem :-(

As you can see from my code below I've run update.packages(checkBuilt=TRUE)
and my 'private' library is in my .libPaths()...

However when I try to load the foreign package I get an error message
telling me "package 'foreign' was built before R 2.10.0: please re-install
it". But then if I remove my private library from the search path I can
load foreign, so this suggests the problem is with the foreign package
in my 'private library'.

Furthermore, if I look at the DESCRIPTION file for foreign it claims to
have been built for R 2.9.2 (I've copied it below).

I'm concluding the issue is with the foreign package in my private library
since it claims to have been built for R 2.9.2 and I can get the package to
load if I remove my private library from the library search path and load
the foreign package from the base library.
I'm then concluding the problem is due to it not updating, since
the DESCRIPTION file claims it was built for R version 2.9.2, and due to
the error message I'm getting, i.e. "'foreign' was built before R
2.10.0: please re-install it".


BUT I'm happy to be proven wrong... I just can't think of what else the
problem could be?



FOREIGN DESCRIPTION FILE
Package: foreign
Priority: recommended
Version: 0.8-39
Date: 2010-01-03
Title: Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase,
...
Depends: R (>= 2.6.0), stats
Imports: methods, utils
Maintainer: R-core 
Author: R-core members, Saikat DebRoy , Roger
Bivand  and others: see COPYRIGHTS file in
the sources.
Description: Functions for reading and writing data stored by
statistical packages such as Minitab, S, SAS, SPSS, Stata,
Systat, ..., and for reading and writing .dbf (dBase) files.
LazyLoad: yes
License: GPL (>= 2)
BugReports: http://bugs.r-project.org
Packaged: 2010-01-03 10:24:13 UTC; ripley
Repository: CRAN
Date/Publication: 2010-01-03 14:06:04
Built: R 2.9.2; i386-pc-mingw32; 2010-01-03 23:21:40 UTC; windows






Chris Howden


-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Wednesday, 20 October 2010 9:15 PM
To: Chris Howden
Cc: r-help
Subject: Re: [R] is there a way to update both packages if they occur in 2 libraries?

update.packages() updates all packages in all libraries listed in
.libPaths() unless you specify an explicit library.

It may happen that the version number has not changed and you just want
to reinstall for your upgraded R. In that case use:

update.packages(checkBuilt=TRUE)

Best,
Uwe Ligges



On 20.10.2010 04:07, Chris Howden wrote:
> Hi everyone,
>
>
>
> I've recently added a private library as a way to manage my R libraries.
And
> I did this by simply copying my old library to a new folder and then
linking
> this to R by setting my R_LIBS  environmental variable in .Renviron.
>
>
>
> However I have run into a problem.
>
>
>
> When I update my packages it is not updating those that are current in
the
> base R library.
>
>
>
> This means I can't load packages that are included in base R, since R is
> looking in my private library first and when it finds the package it tries
> to use it. But it's an outdated version.
>
>
>
> The easiest solution I can think of is to update both libraries, but when I
> run update.packages(lib.loc="private library location", ask = FALSE,
> checkBuilt=TRUE) it's not updating them.
>
>
>
> So I was wondering if there is a way to update all packages that occur in
> all libraries?
>
>
>
>
>
> (Note that I can think of other solutions to my problem, but they are all
> time consuming and defeats the purpose of why I want a private library i.e.
> it makes updating R easier since I don't need to copy over the library
> folder each time nor update any environmental variables. So far the best
> alternative I've come up with is to delete all the duplicate base R
> libraries from my private library)
>
>
>
> If anyone is interested the code I used to understand my problem is below.
>
>
>
>
>
> Thanks everyone
>
>
>
>
>
>
>
>> update.packages(lib.loc="C:\\Program Files\\R\\library", ask = FALSE,
> checkBuilt=TRUE)
>
> --- Please select a CRAN mirror for use in this session ---
>
>
>
>> update.packages(ask = FALSE, checkBuilt = TRUE)
>

[R] is there a way to update both packages if they occur in 2 libraries?

2010-10-19 Thread Chris Howden
Hi everyone,



I’ve recently added a private library as a way to manage my R libraries. And
I did this by simply copying my old library to a new folder and then linking
this to R by setting my R_LIBS  environmental variable in .Renviron.



However I have run into a problem.



When I update my packages it is not updating those that are current in the
base R library.



This means I can’t load packages that are included in base R, since R is
looking in my private library first and when it finds the package it tries
to use it. But it’s an outdated version.



The easiest solution I can think of is to update both libraries, but when I
run update.packages(lib.loc="private library location", ask = FALSE,
checkBuilt=TRUE) it’s not updating them.



So I was wondering if there is a way to update all packages  that occur in
all libraries?





(Note that I can think of other solutions to my problem, but they are all
time consuming and defeat the purpose of having a private library, i.e.
it makes updating R easier since I don’t need to copy over the library
folder each time nor update any environmental variables. So far the best
alternative I’ve come up with is to delete all the duplicate base R
libraries from my private library)



If anyone is interested the code I used to understand my problem is below.





Thanks everyone







> update.packages(lib.loc="C:\\Program Files\\R\\library", ask = FALSE,
checkBuilt=TRUE)

--- Please select a CRAN mirror for use in this session ---



> update.packages(ask = FALSE, checkBuilt = TRUE)



Foreign package won’t load

> library(foreign)

Error: package 'foreign' was built before R 2.10.0: please re-install it



> .libPaths()

[1] "C:\\Program Files\\R\\library"   "C:/PROGRA~1/R/R-212~1.0/library"



> .libPaths("new")



> .libPaths()

[1] "C:/PROGRA~1/R/R-212~1.0/library"



Foreign package will load

> library(foreign)
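
A quick way to see which packages exist in more than one library, and which R version each copy was built under, might look like the sketch below. The helper name find_duplicate_packages is made up for illustration. It relies only on installed.packages(), whose result includes the Package, LibPath, Version and Built columns.

```r
# Hypothetical helper: list packages installed in more than one library,
# with the R version each copy was built under.
find_duplicate_packages <- function(libs = .libPaths()) {
  ip <- installed.packages(lib.loc = libs)
  pkgs <- ip[, "Package"]
  dup <- unique(pkgs[duplicated(pkgs)])
  ip[pkgs %in% dup, c("Package", "LibPath", "Version", "Built"), drop = FALSE]
}

find_duplicate_packages()
```

Any row whose Built value is older than the running R is a candidate for reinstalling or for deleting from the private library.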











Chris Howden

Founding Partner

Tricky Solutions

Tricky Solutions 4 Tricky Problems

Evidence Based Strategic Development, IP development, Data Analysis,
Modelling, and Training

(mobile) 0410 689 945

(fax / office) (+618) 8952 7878

ch...@trickysolutions.com.au

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can't find and install reshape2??

2010-10-18 Thread Chris Howden
Thanks for the ideas,

Just wanted to say that it was because I was using an old version of R (as
U suggested).

I have now updated to v2.12.0 and I can see and load reshape2.

(and I agree with Hadley that it would be nice if there was some way of
getting a more informative error message. However thanks to the helpful R
community I know now what to do if I have a similar problem in the future)

Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP development, Data Analysis,
Modelling, and Training
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
ch...@trickysolutions.com.au

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Bernardo Rangel Tura
Sent: Tuesday, 12 October 2010 6:53 PM
To: r-help
Subject: Re: [R] can't find and install reshape2??

On Mon, 2010-10-04 at 10:27 +0930, Chris Howden wrote:
> Hi everyone,
>
>
>
> I'm trying to install reshape2.
>
>
>
> But when I click on "install package" it's not coming up!?!?! I'm getting
> reshape, but no reshape2?
>
>
>
> I've also tried download.packages(reshape2, destdir="c:\\") &
> download.packages(Reshape2, destdir="c:\\")...but no luck!!!
>
>
>
> Does anyone have any ideas what could be going on?
>
>
>
> Chris Howden
>

Hi Chris,

I have two guesses:

1- You don't have the 'stringr' package installed
2- Your R is outdated

Try these two things and mail me afterwards

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil



Re: [R] merging and working with BIG data sets. Is sqldf the best way??

2010-10-15 Thread Chris Howden
Thanks for the advice Gabor,

I was indeed not starting and finishing with sqldf(). Which was why it was
not working for me. Please forgive a blatantly obvious mistake.


I have tried what U suggested and unfortunately R is still having problems
doing the join. The problem seems to be one of memory since I am receiving
the below error message when I run the natural join using sqldf.
Error: cannot allocate vector of size 250.0 Mb
Timing stopped at: 32.8 0.48 34.79


I have tried it on a subset of the data and it works. So I think it's a
memory issue, being caused by my very large dataset (11 million rows and 2
columns).

I think I may have to admit that R cannot do this (on my machine). And try
doing it in a full blown database such as PostgreSQL.


Unless U (or anyone else) have any other suggestions???

Thanks again for your help.



For anyone who's interested here's all my code and R log.
##
# Info on input data
##
> class(A)
[1] "data.frame"
> class(B)
[1] "data.frame"

> names(A)
[1] "POINTID"  "alistair"
> names(B)
[1] "POINTID"        "alistair_range"

> dim(A)
[1] 11048592        2
> dim(B)
[1] 11048592        2


##
# Tried the join with an index on the entire data set
##
> sqldf()


>  system.time(sqldf("create index ai1 on A(POINTID, alistair)"))
   user  system elapsed
  76.85    0.34   79.67

>  system.time(sqldf("create index ai2 on B(POINTID, alistair_range)"))
   user  system elapsed
  75.43    0.43   77.16

> system.time(sqldf("select * from main.A natural join main.B"))
Error: cannot allocate vector of size 250.0 Mb
Timing stopped at: 32.8 0.48 34.79

> sqldf()
Error in sqliteCloseConnection(conn, ...) :
  RS-DBI driver: (close the pending result sets before closing this
connection)


##
# Also tried the join with an index built from only the variable I intend
# to merge on, since I wasn't exactly sure which index was correct.
##
> sqldf()


> system.time(sqldf("create index ai1 on A(POINTID)"))
   user  system elapsed
  66.67    0.44   69.28

> system.time(sqldf("create index ai2 on B(POINTID)"))
   user  system elapsed
  68.18    0.31   68.73

> system.time(sqldf("select * from main.A natural join main.B"))
Error: cannot allocate vector of size 31.2 Mb
Timing stopped at: 10.56 0.04 10.87

> sqldf()
Error in sqliteCloseConnection(conn, ...) :
  RS-DBI driver: (close the pending result sets before closing this
connection)

##
# and some memory info
##
> memory.size()
[1] 412.6

> memory.size(NA)
[1] 4095



Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP development, Data Analysis,
Modelling, and Training
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
ch...@trickysolutions.com.au


-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
Sent: Friday, 15 October 2010 1:03 PM
To: Chris Howden
Cc: r-help@r-project.org
Subject: Re: [R] merging and working with BIG data sets. Is sqldf the best
way??

On Thu, Oct 14, 2010 at 10:56 PM, Chris Howden
 wrote:
> Thanks for the suggestion and code Gabor,
>
> I've tried creating 2 indices:
>
> 1) just for the variable I intend to merge on
> 2) on the entire data set I am merging (which I think is the one I should
> be using??)
>
> However neither seemed to work. The first was still going after 2 hours,
> and the second after 12 hours, so I stopped the join.
>
> If it's not too much bother I was wondering if U could let me know which
> index I should be using?
>
>
> Or in other words since I plan to merge using POINTID do I create an index
> on
>
> system.time(sqldf("create index ai1 on A(POINTID)"))
> system.time(sqldf("create index ai2 on B(POINTID)"))
>
> or
>
> system.time(sqldf("create index ai1 on A(POINTID,alistair)"))
> system.time(sqldf("create index ai2 on B(POINTID, alistair_range)"))
>
>
>
> I'm now using the following join statement
> system.time(data2 <- sqldf("select * from A natural join B"))
>

If you only ran the three sqldf statements you mentioned in your post
(thereby omitting 2 of the 5 sqldf calls in example 4i):

sqldf("create...")
sqldf("create...")
sqldf("select...")

then what you are doing is to create a database, upload your data to
it, create an index on it, destroy the database, then create a second
database, upload the data to this second database, create a second
index and then destroy that database too and then finally create a
third database, upload the data to it and then do a join without any
indexes.

You must bracket all this with empty sqldf calls as shown in 4i to
force persistence:

sqldf()
sqldf("create...")
s

Re: [R] merging and working with BIG data sets. Is sqldf the best way??

2010-10-14 Thread Chris Howden
Thanks for the suggestion and code Gabor,

I've tried creating 2 indices:

1) just for the variable I intend to merge on
2) on the entire data set I am merging (which I think is the one I should
be using??)

However neither seemed to work. The first was still going after 2 hours,
and the second after 12 hours, so I stopped the join.

If it's not too much bother I was wondering if U could let me know which
index I should be using?


Or in other words since I plan to merge using POINTID do I create an index
on

system.time(sqldf("create index ai1 on A(POINTID)"))
system.time(sqldf("create index ai2 on B(POINTID)"))

or

system.time(sqldf("create index ai1 on A(POINTID,alistair)"))
system.time(sqldf("create index ai2 on B(POINTID, alistair_range)"))



I'm now using the following join statement
system.time(data2 <- sqldf("select * from A natural join B"))


thanks

Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP development, Data Analysis,
Modelling, and Training
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
ch...@trickysolutions.com.au


-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
Sent: Thursday, 14 October 2010 9:02 AM
To: Chris Howden
Cc: r-help@r-project.org
Subject: Re: [R] merging and working with BIG data sets. Is sqldf the best
way??

On Tue, Oct 12, 2010 at 2:39 AM, Chris Howden
 wrote:
> I’m working with some very big datasets (each dataset has 11 million rows
> and 2 columns). My first step is to merge all my individual data sets
> together (I have about 20)
>
> I’m using the following command from sqldf
>
>               data1 <- sqldf("select A.*, B.* from A inner join B
> using(ID)")
>
> But it’s taking A VERY VERY LONG TIME to merge just 2 of the datasets (well
> over 2 hours, possibly longer since it’s still going).

You need to add indexes to your tables.   See example 4i on the sqldf home
page http://sqldf.googlecode.com
This can result in huge speedups for large tables.
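
A minimal sketch of the indexed join on a persistent connection, run here on small stand-in data frames. It assumes the sqldf package is installed. The values in A and B are invented for illustration; the column names follow the ones used elsewhere in this thread.

```r
# Assumes the sqldf package is installed; A and B are small stand-ins.
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  A <- data.frame(POINTID = 1:5, alistair = c(2.1, 3.4, 1.8, 5.0, 4.2))
  B <- data.frame(POINTID = 1:5, alistair_range = c(0.3, 0.7, 0.2, 0.9, 0.5))

  sqldf()                                   # open a persistent connection
  sqldf("create index ai1 on A(POINTID)")   # uploads A, then indexes it
  sqldf("create index ai2 on B(POINTID)")   # uploads B, then indexes it
  # main.A / main.B refer to the tables already in the database,
  # so the data frames are not uploaded a second time.
  merged <- sqldf("select * from main.A natural join main.B")
  sqldf()                                   # close the connection
}
```

Without the two empty sqldf() calls each statement would run against its own throwaway database, so the indexes would be gone by the time the join runs.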

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



[R] merging and working with BIG data sets. Is sqldf the best way??

2010-10-11 Thread Chris Howden
Hi everyone,



I’m working with some very big datasets (each dataset has 11 million rows
and 2 columns). My first step is to merge all my individual data sets
together (I have about 20)
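
For reference, the merge step itself can be sketched in base R as below. This is a sketch on invented toy data: merge() keeps everything in RAM, so it may well not scale to 11 million rows on a memory-limited machine.

```r
# Hypothetical small stand-ins for the ~20 two-column data sets.
datasets <- list(
  data.frame(ID = 1:4, a = c(10, 20, 30, 40)),
  data.frame(ID = 1:4, b = c(0.1, 0.2, 0.3, 0.4)),
  data.frame(ID = 1:4, c = c(5, 6, 7, 8))
)

# Fold the list into one data frame, joining on ID at each step.
merged <- Reduce(function(x, y) merge(x, y, by = "ID"), datasets)
dim(merged)
```

Reduce() keeps the code the same whether there are 3 data sets or 20.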



I’m using the following command from sqldf

   data1 <- sqldf("select A.*, B.* from A inner join B using(ID)")



But it’s taking A VERY VERY LONG TIME to merge just 2 of the datasets (well
over 2 hours, possibly longer since it’s still going).





I was wondering if anyone could suggest a better way, or maybe some
suggestions on how I could tweak my computer set up to speed it up?





I’ve looked at the following packages and this is the only way I’ve found to
actually merge large data sets in R. These packages seem great for accessing
large data sets by avoiding storing them in RAM….but I can’t see how they
can be used to merge data sets together:

·  ff

·  filehash

·  bigmemory



Does anyone have any ideas?



At the moment my best idea is to hand it over to someone with a dedicated
database server and get them to do the merges (and then hope package biglm
can do the modelling)



Thanks for any ideas at all!!







Chris Howden

Founding Partner

Tricky Solutions

Tricky Solutions 4 Tricky Problems

Evidence Based Strategic Development, IP development, Data Analysis,
Modelling, and Training

(mobile) 0410 689 945

(fax / office) (+618) 8952 7878

ch...@trickysolutions.com.au

[[alternative HTML version deleted]]



Re: [R] can't find and install reshape2??

2010-10-11 Thread chris howden
Just wanted to say that I've gone onto the CRAN website and downloaded it
directly from there.

So it's no longer a problem for me.

But it may be one for other people, it is kinda weird I couldn't see it on
the list of packages on 4 mirrors!!

Thanks for your help though.


-Original Message-----
From: chris howden [mailto:tall.chr...@yahoo.com.au] 
Sent: Tuesday, 12 October 2010 3:48 PM
To: 'Jeffrey Spies'; 'David Winsemius'
Cc: 'r-help@r-project.org'
Subject: RE: [R] can't find and install reshape2??

Hi Guys,

Thanks for your suggestions and sorry for the delay in replying, I've been
having one of those weeks.

I feel a little silly not trying the package name input as a character
string, I should have known that. However I have tried your suggestions and
neither worked. The code and error messages are at the bottom of this email
and U can see the reason would appear to be that the "reshape2" package is
not available on the repositories I'm trying to access.

I then tried closing R, reopening it and looking in the following CRAN
mirrors:
Australia
UK(London)
Canada(BC)
USA(AZ)

Reshape 2 was in none of them, my choices were:
ResearchMethods
Reshape
ResistorArray

But no reshape2

Any ideas as to why I can't see reshape2? 

Is it just me or are other people having this problem?

thanks

> download.packages('reshape2', destdir="c:\\")
Warning in download.packages("reshape2", destdir = "c:\\") :
  no package 'reshape2' at the repositories
 [,1] [,2]


> install.packages('reshape2')
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package ‘reshape2’ is not available





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Jeffrey Spies
Sent: Monday, 4 October 2010 10:57 AM
To: Chris Howden
Cc: r-help@r-project.org
Subject: Re: [R] can't find and install reshape2??

The first argument in download.packages should be of type character or
a vector of characters.

This worked for me:

install.packages('reshape2')

as did:

download.packages('reshape2', '~/Downloads/')

Cheers,

Jeff.
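
Whether a name is actually offered by the repositories for the running version of R can be checked along these lines. This is a sketch: the helper name is_on_repos and the tiny listing matrix are made up for illustration. A real check would use the default available.packages() call, which needs network access.

```r
# Hypothetical helper: is a package listed by the repositories for this
# version of R? Names are case-sensitive and must be character strings.
is_on_repos <- function(pkg, available = available.packages()) {
  pkg %in% rownames(available)
}

# A tiny stand-in for the matrix available.packages() returns
# (for illustration only).
listing <- matrix(
  c("ResearchMethods", "reshape", "ResistorArray"),
  ncol = 1,
  dimnames = list(c("ResearchMethods", "reshape", "ResistorArray"), "Package")
)
is_on_repos("reshape", listing)
is_on_repos("reshape2", listing)
```

Because the listing is filtered by R version, a package can be on CRAN and still be missing from the list an older R sees.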

On Sun, Oct 3, 2010 at 8:57 PM, Chris Howden
 wrote:
> Hi everyone,
>
>
>
> I’m trying to install reshape2.
>
>
>
> But when I click on “install package” it’s not coming up!?!?! I’m getting
> reshape, but no reshape2?
>
>
>
> I’ve also tried download.packages(reshape2, destdir="c:\\") &
> download.packages(Reshape2, destdir="c:\\")…but no luck!!!
>
>
>
> Does anyone have any ideas what could be going on?
>
>
>
> Chris Howden
>
> Founding Partner
>
> Tricky Solutions
>
> Tricky Solutions 4 Tricky Problems
>
> Evidence Based Strategic Development, IP development, Data Analysis,
> Modelling, and Training
>
> (mobile) 0410 689 945
>
> (fax / office) (+618) 8952 7878
>
> ch...@trickysolutions.com.au
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



Re: [R] can't find and install reshape2??

2010-10-11 Thread chris howden
Hi Guys,

Thanks for your suggestions and sorry for the delay in replying, I've been
having one of those weeks.

I feel a little silly not trying the package name input as a character
string, I should have known that. However I have tried your suggestions and
neither worked. The code and error messages are at the bottom of this email
and U can see the reason would appear to be that the "reshape2" package is
not available on the repositories I'm trying to access.

I then tried closing R, reopening it and looking in the following CRAN
mirrors:
Australia
UK(London)
Canada(BC)
USA(AZ)

Reshape 2 was in none of them, my choices were:
ResearchMethods
Reshape
ResistorArray

But no reshape2

Any ideas as to why I can't see reshape2? 

Is it just me or are other people having this problem?

thanks

> download.packages('reshape2', destdir="c:\\")
Warning in download.packages("reshape2", destdir = "c:\\") :
  no package 'reshape2' at the repositories
 [,1] [,2]


> install.packages('reshape2')
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package ‘reshape2’ is not available





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Jeffrey Spies
Sent: Monday, 4 October 2010 10:57 AM
To: Chris Howden
Cc: r-help@r-project.org
Subject: Re: [R] can't find and install reshape2??

The first argument in download.packages should be of type character or
a vector of characters.

This worked for me:

install.packages('reshape2')

as did:

download.packages('reshape2', '~/Downloads/')

Cheers,

Jeff.

On Sun, Oct 3, 2010 at 8:57 PM, Chris Howden
 wrote:
> Hi everyone,
>
>
>
> I’m trying to install reshape2.
>
>
>
> But when I click on “install package” it’s not coming up!?!?! I’m getting
> reshape, but no reshape2?
>
>
>
> I’ve also tried download.packages(reshape2, destdir="c:\\") &
> download.packages(Reshape2, destdir="c:\\")…but no luck!!!
>
>
>
> Does anyone have any ideas what could be going on?
>
>
>
> Chris Howden
>
> Founding Partner
>
> Tricky Solutions
>
> Tricky Solutions 4 Tricky Problems
>
> Evidence Based Strategic Development, IP development, Data Analysis,
> Modelling, and Training
>
> (mobile) 0410 689 945
>
> (fax / office) (+618) 8952 7878
>
> ch...@trickysolutions.com.au
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



Re: [R] Memory limit problem

2010-10-11 Thread Chris Howden
Hi Daniel,

There are a number of ways to deal with data without forcing them into
RAM.

If you're comfortable with SQL the easiest way might be to use sqldf to join
them using a SQL select query. Try googling "Handling large(r) datasets in
R" Soren Hojsgaard.

Or if u definitely only want to do a cbind and not a merge U might be able
to use one of the following packages. These store the data on disk (rather
than RAM) and might allow u to cbind them.
    filehash
    ff




Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP development, Data Analysis,
Modelling, and Training
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
ch...@trickysolutions.com.au

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Daniel Nordlund
Sent: Tuesday, 12 October 2010 3:00 PM
To: r-help@r-project.org
Subject: Re: [R] Memory limit problem

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of David Winsemius
> Sent: Monday, October 11, 2010 10:07 PM
> To: Tim Clark
> Cc: r help r-help
> Subject: Re: [R] Memory limit problem
>
>
> On Oct 11, 2010, at 11:49 PM, Tim Clark wrote:
>
> > Dear List,
> >
> > I am trying to plot bathymetry contours around the Hawaiian Islands
> > using the
> > package rgdal and PBSmapping.  I have run into a memory limit when
> > trying to
> > combine two fairly small objects using cbind().  I have increased
> > the memory to
> > 4GB, but am being told I can't allocate a vector of size 240 Kb.  I
> > am running R
> > 2.11.1 on a Dell Optiplex 760 with Windows XP.  I have pasted the
> > error message
> > and summaries of the objects below.  Thanks for your help.  Tim
> >
> >
> >>  xyz<-cbind(hi.to.utm,z=b.depth$z)
> > Error: cannot allocate vector of size 240 Kb
>
> You have too much other "stuff".
> Try this:
>
> getsizes <- function() {z <- sapply(ls(envir=globalenv()),
>  function(x) object.size(get(x)))
> (tmp <- as.matrix(rev(sort(z))[1:10]))}
> getsizes()
>
> You will see a list of the largest objects in descending order. Then
> use rm() to clear out unneeded items.
>
> --
> David,
>
> >
> >> memory.limit()
> > [1] 4000
>
> Seems unlikely that you really have that much space in that 32 bit OS.
<<>

Yeah, without performing some special incantations, Windows XP will not
allocate more than 2GB of memory to any one process (e.g. R).  And even
with those special incantations, at most you will get no more than about
3.2-3.5 GB.  The other thing to remember is that even if you had more than
enough free space, R requires the free space for an object to be
contiguous.  So if memory was fragmented and you didn't have 240KB of
contiguous memory, it still couldn't allocate the vector.
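
A small sketch for finding what is taking up the space before calling rm(). The helper name largest_objects is made up for illustration; it uses only ls(), get() and object.size().

```r
# Hypothetical helper: sizes (in bytes) of the n largest objects in an
# environment, largest first.
largest_objects <- function(n = 10, envir = globalenv()) {
  nms <- ls(envir = envir)
  sizes <- vapply(
    nms,
    function(nm) as.numeric(object.size(get(nm, envir = envir))),
    numeric(1)
  )
  head(sort(sizes, decreasing = TRUE), n)
}

largest_objects(5)
```

After removing the big items with rm(), a call to gc() asks R to release the memory.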

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA




[R] can't find and install reshape2??

2010-10-03 Thread Chris Howden
Hi everyone,



I’m trying to install reshape2.



But when I click on “install package” it’s not coming up!?!?! I’m getting
reshape, but no reshape2?



I’ve also tried download.packages(reshape2, destdir="c:\\") &
download.packages(Reshape2, destdir="c:\\")…but no luck!!!



Does anyone have any ideas what could be going on?



Chris Howden

Founding Partner

Tricky Solutions

Tricky Solutions 4 Tricky Problems

Evidence Based Strategic Development, IP development, Data Analysis,
Modelling, and Training

(mobile) 0410 689 945

(fax / office) (+618) 8952 7878

ch...@trickysolutions.com.au

[[alternative HTML version deleted]]



[R] how to replace NA with a specific score that is dependant on another indicator variable

2010-09-01 Thread Chris Howden
Hi everyone,



I’m looking for a clever bit of code to replace NA’s with a specific score
depending on an indicator variable.



I can see how to do it using lots of if statements but I’m sure there must
be a neater, better way of doing it.



Any ideas at all will be much appreciated, I’m dreading coding up all those
if statements!





My problem is as follows:



I have a data set with lots of missing data:

EG Raw Data Set

Category  variable1  variable2  variable3

   1          5         NA         NA

   1         NA          3          4

   2         NA          7         NA

etc



Now I want to replace the NA’s with the average for each category, so if
these averages were:

EG Averages

Category  variable1  variable2  variable3

   1         4.5        3.2        2.5

   2         3.5        7.4        5.9



So I’d like my data set to look like the following once I’ve replaced the
NA’s with the appropriate category average:

EG Imputed Data Set

Category  variable1  variable2  variable3

   1          5         3.2        2.5

   1         4.5         3          4

   2         3.5         7         5.9

etc
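
One way to do this without any if statements is sketched below on the three example rows, using the category averages from the example as a lookup table. The object names raw, averages and imputed are made up for illustration.

```r
# The three example rows and the table of category averages.
raw <- data.frame(
  Category  = c(1, 1, 2),
  variable1 = c(5, NA, NA),
  variable2 = c(NA, 3, 7),
  variable3 = c(NA, 4, NA)
)
averages <- data.frame(
  Category  = c(1, 2),
  variable1 = c(4.5, 3.5),
  variable2 = c(3.2, 7.4),
  variable3 = c(2.5, 5.9)
)

imputed <- raw
# For each record, the row of 'averages' that holds its category.
rows <- match(raw$Category, averages$Category)
for (v in setdiff(names(raw), "Category")) {
  is_na <- is.na(raw[[v]])
  # Fill only the missing cells with the matching category average.
  imputed[[v]][is_na] <- averages[[v]][rows[is_na]]
}
imputed
```

If the averages should come from the data itself, ave(x, Category, FUN = function(z) mean(z, na.rm = TRUE)) computes a per-category mean for every row and could replace the lookup table.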





Any ideas would be very much appreciated!



thankyou





Chris Howden

Founding Partner

Tricky Solutions

Tricky Solutions 4 Tricky Problems

Evidence Based Strategic Development, IP development, Data Analysis,
Modelling, and Training

(mobile) 0410 689 945

(fax / office) (+618) 8952 7878

ch...@trickysolutions.com.au

[[alternative HTML version deleted]]



Re: [R] data from SpatialGridDataFrame

2010-07-20 Thread chris howden
I'm not that familiar with this type of data.

I just had a similar issue, but had a GIS person do it in Arc view.

But maybe try some of the following functions?
match
%in%

Plus I'll forward U the replies I got to my post

Good luck :-)
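
A rough base R sketch of the match() idea on a regular grid. The grid layout, cell size and point coordinates are invented for illustration. For real sp objects the over() function in package sp is the usual tool, and this sketch does not show it.

```r
# Hypothetical regular grid: 4 columns x 3 rows of unit cells with lower-left
# corner at (0, 0); 'value' stands in for the raster attribute.
ncols <- 4; nrows <- 3; cellsize <- 1
xmin <- 0; ymin <- 0
grid <- expand.grid(col = seq_len(ncols), row = seq_len(nrows))
grid$value <- seq_len(nrow(grid))

# Hypothetical point coordinates.
pts <- data.frame(x = c(0.2, 3.9, 1.5), y = c(0.1, 2.7, 1.0))

# Convert coordinates to cell indices, then look the cells up with match().
pt_col <- floor((pts$x - xmin) / cellsize) + 1
pt_row <- floor((pts$y - ymin) / cellsize) + 1
cell <- match(paste(pt_col, pt_row), paste(grid$col, grid$row))
pts$value <- grid$value[cell]
pts
```

Points that fall outside the grid get NA, because match() finds no cell for them.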

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of kfl...@falw.vu.nl
Sent: Tuesday, 20 July 2010 9:42 PM
To: r-help@r-project.org
Subject: [R] data from SpatialGridDataFrame

Dear All,

I have a raster map of the class 'SpatialPointsDataFrame' and coordinates
of the class 'SpatialPoints'. I would like to retrieve the values that are
contained in the raster map at the specific locations given by the
coordinates.

Can anyone help me out?

Kind regards,
Katrin Fleischer



[R] do the standard R analysis functions handle spatial "grid" data?

2010-07-12 Thread chris howden
Hi everyone,

I'm doing a resource function analysis with radio collared dingos and GIS
info.

The ecologist I'm working with wants to send me the data in a 'grid
format'...straight out of ARCVIEW GIS.

I want to model the data using a GLM and maybe a LOGISTIC model as well. And
I was planning on using the glm and logistic functions in R.


Now I'm pretty sure that these functions require the data to be in a 2-D
spreadsheet format. And for me to call the responses and predictors as
columns from a data.frame (or 2-D matrix)

However I'm being told they can handle the data in a 'grid' format. So I'm
pretty sure this would mean I would be calling the responses and predictors
as 2-d matrices...and I don't think these functions can do that?


Can anyone enlighten me?

Am I right in thinking these functions cannot handle data in a 3-D 'grid'
format and require data to be entered as a 2-d data.frame or matrix?
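
To make the question concrete, this is the sort of reshaping I have in mind: unroll each grid layer into one column of a data.frame, one row per cell, and pass that to glm. It is only a sketch on simulated layers; the names elevation, presence and cells are made up.

```r
set.seed(42)
# Hypothetical 5 x 4 grids: one predictor layer and one response layer.
elevation <- matrix(rnorm(20), nrow = 5)
presence  <- matrix(rbinom(20, size = 1, prob = plogis(elevation)), nrow = 5)

# One row per grid cell; as.vector() unrolls each matrix column by column,
# so the cells stay aligned across layers.
cells <- data.frame(
  presence  = as.vector(presence),
  elevation = as.vector(elevation)
)
fit <- glm(presence ~ elevation, family = binomial, data = cells)
coef(fit)
```

This treats the cells as independent observations, which ignores any spatial correlation between neighbouring cells.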


Are there other special functions out there that can handle this type of
data, and I should be using these instead?

Thanks for your help

Chris Howden
Founding Partner
Tricky Solutions 
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP development, Data Analysis,
Modelling, and Training
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
ch...@trickysolutions.com.au



Re: [R] can't use function vcov with a GAMLSS object??

2009-11-23 Thread Chris Howden
Hi everyone,

I'm trying to use the function vcov to extract the covariance matrix from a
GAMLSS object, but I'm getting some strange errors and I was hoping someone
could help me out. vcov works with the same model for lm and glm objects,
but not for gamlss objects. I've searched various help sites to no avail.

It's very possible that the reason is that vcov itself failed, since I got the
following message in the summary of the model: "summary: vcov has
failed, option qr is used instead"

In which case I was wondering if anyone could help me out by explaining how
I can find the covariance matrix equivalent without using vcov?

The code and error messages I got for vcov are as follows.

> vcov(paper_size_type_income)
The following object(s) are masked _by_ .GlobalEnv :

 child id paper

Error in gamlssNonLinear(family = RG, data = paper3, y = paper, mu.formula =
paper ~  :
  NAs in y - use na.omit()

I then tried using na.omit and got a different error message (even though
there are no NAs in the data set; I checked using table(paper3$paper,
exclude=NULL)).

> temp <- gamlss(na.omit(paper)~size + type + income, family=RG, data=paper3)
GAMLSS-RS iteration 1: Global Deviance = 1160.816
GAMLSS-RS iteration 2: Global Deviance = 1159.963
GAMLSS-RS iteration 3: Global Deviance = 1159.951
GAMLSS-RS iteration 4: Global Deviance = 1159.951
> vcov(temp)
The following object(s) are masked _by_ .GlobalEnv :

 child id paper

Error in inteprFormula.default(formula, .envir = envir, .start = start.v,  :
  covariates in formulae with unknowns must not be factors
check sizecovariates in formulae with unknowns must not be factors
check typecovariates in formulae with unknowns must not be factors
check income
>
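
A note on reading the messages above. The "masked _by_ .GlobalEnv" lines are what attach() prints when workspace objects share names with columns of the attached data, so a stray global object called paper (rather than paper3$paper) may be what the refit inside vcov ends up seeing. That is a guess from the output, not something confirmed for gamlss. A minimal base-R sketch of the checks, using made-up data and a stand-in environment called workspace in place of the real global environment:

```r
# Made-up stand-in for the poster's data frame; values are illustrative only.
paper3 <- data.frame(
  id    = 1:5,
  child = c(0, 1, 2, 0, 1),
  paper = c(2.1, 3.0, 3.4, 1.8, 2.9),
  size  = factor(c("s", "m", "l", "m", "s"))
)

# Stand-in for the global workspace, holding a stray vector named like a column.
workspace <- new.env()
assign("paper", c(2.1, NA, 3.4), envir = workspace)

# 1. Which column names are also objects in the workspace?
#    In a live session, use ls(envir = globalenv()) instead.
masked <- intersect(names(paper3), ls(envir = workspace))
print(masked)

# 2. Which copy holds the NAs: the column or the stray object?
na_in_column <- sum(is.na(paper3$paper))
na_in_stray  <- sum(is.na(get("paper", envir = workspace)))
print(c(column = na_in_column, stray = na_in_stray))

# 3. Drop incomplete rows from the whole data frame, not from one variable
#    inside the formula, so response and predictors keep the same length.
paper3_complete <- na.omit(paper3)
```

If the first check returns any names, removing those objects with rm() (or avoiding attach() altogether) and refitting would be the first thing to try.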

 

 

thanks

Chris Howden
Marketing Scientist
For all your Analysis, Modelling, Experimental Design and Training needs
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
tall.chr...@yahoo.com.au



[R] problem reading in a csv file

2009-06-22 Thread Chris Howden
Morning all,

I'm trying to read in a csv file and R is having some problems. For some
reason it's not 'seeing' all the columns for each row, and as such is not
reading in the file.

I've opened the file in Excel and I can't see any problems with it. All rows
have the correct number of columns.

The code and error messages are below. I've also run
count.fields() and have included that output too.

Any help or suggestions would be much appreciated.

Thanks


<-read.table("RWM Shopper Tracker - RAW DATA - 22JUN09 - Copy.csv",
header=TRUE, sep =",", row.names=NULL)

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
: 
  line 13 did not have 2264 elements


> count.fields("RWM Shopper Tracker - RAW DATA - 22JUN09 - Copy.csv", sep=",")
 [1] 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264  384 2264 2264  384 2264 2264 2264 2264 2264 2264 2264 2264
[23] 2264 2264 2264 2264  152 2264  384 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264
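
The short rows (384 and 152 fields) are the kind of pattern that often comes from a quote or comment character inside a field: read.table() and count.fields() treat both ' and " as quotes and # as a comment marker by default, whereas read.csv() only treats " as a quote and has comments switched off. Whether that is the cause here cannot be told from the output alone. A small sketch with a made-up four-line file, showing how a # shortens a row and how to switch that behaviour off:

```r
# Made-up file: the third line contains a '#' inside a text field.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,comment,score",
             "1,fine,10",
             "2,item #4,20",
             "3,ok,30"), tmp)

# Defaults: everything after '#' is dropped, so line 3 appears short.
n_default <- count.fields(tmp, sep = ",")
print(n_default)

# Only double quotes count as quotes, and comments are disabled.
n_fixed <- count.fields(tmp, sep = ",", quote = "\"", comment.char = "")
print(n_fixed)

# Lines whose field count differs from the header are the ones to inspect.
suspect_lines <- which(n_default != n_default[1])
print(suspect_lines)

# read.csv() uses quote = "\"" and comment.char = "" by default.
dat <- read.csv(tmp)
unlink(tmp)
```

On the real file, looking at the raw text of the suspect lines (for example with readLines()) should show whether an apostrophe or a # sits just before the point where the row appears to stop.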
  

Chris Howden
Marketing Scientist
For all your Analysis, Modelling, Experimental Design and Training needs
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
tall.chr...@yahoo.com.au



[R] Can anyone suggest some R packages for Experimental Designs, specifically for choice and conjoint??? (or is interested in helping me make 1)

2009-05-12 Thread Chris Howden
Afternoon everyone,

I've spent the last week or so looking at all the experimental design
packages I can find in R, with AlgDesign, conf.design and BHH2 being the best
ones I could find.

Unfortunately none of these do a particularly good job for complex designs,
in particular for conjoint or discrete choice (or perhaps they do, and I
can't make them work correctly).

Specifically, the problem is that none of them optimise the design for main
effects or 2-way effects balance. So although the 'd-efficiency' is
optimised, some 2-way interactions are not present in the design (thereby
preventing the interaction from being modelled). The main effects
balance is also quite bad, with some levels being seen twice as often as
others.
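
To make the balance problem concrete, here is a small base-R sketch of the two checks described above: how often each level appears (main effects balance) and whether every pair of levels from two attributes occurs at least once (so the 2-way interaction can be estimated). The attributes, levels and the random 9-run subset are made up for illustration; a real candidate design from AlgDesign or another package would take the place of `design`.

```r
# Hypothetical attributes; the full factorial has 3 * 3 * 2 = 18 runs.
full <- expand.grid(price = c("low", "mid", "high"),
                    brand = c("A", "B", "C"),
                    size  = c("small", "large"))

# Stand-in for a candidate design: a random 9-run fraction.
set.seed(1)
design <- full[sample(nrow(full), 9), ]

# Main effects balance: counts per level of each attribute.
main_balance <- lapply(design, table)
print(main_balance)

# 2-way balance: cross-tabulate every pair of attributes.
attr_pairs <- combn(names(design), 2, simplify = FALSE)
pair_tables <- lapply(attr_pairs,
                      function(p) table(design[[p[1]]], design[[p[2]]]))
names(pair_tables) <- vapply(attr_pairs, paste, character(1), collapse = ":")

# Number of level combinations that never occur, per pair of attributes.
missing_cells <- vapply(pair_tables, function(tab) sum(tab == 0), integer(1))
print(missing_cells)
```

A design with unequal counts in main_balance, or any non-zero entry in missing_cells, shows the kind of imbalance described above, whatever its d-efficiency.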

 

So I was wondering if anyone out there has any experience in using R for
complex design issues, and if so, whether they could point me in the direction of
some good packages? Or maybe help me out with my 'balance' problem?

Thanks for your help

PS: If all else fails I'm thinking about trying to extend AlgDesign to
incorporate balance as a criterion when searching for designs. So I'm just
wondering if there's anyone out there keen to help me (even if it's just
testing out my beta versions).

Chris Howden
Marketing Scientist
For all your Analysis, Modelling, Experimental Design and Training needs
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
tall.chr...@yahoo.com.au


